This document discusses Neptune, a framework for scheduling suspendable tasks for unified stream and batch applications. It introduces coroutines to implement suspendable tasks that can pause and resume efficiently. It also includes a pluggable scheduling layer that can satisfy the diverse latency and throughput requirements of stream and batch jobs through policies like prioritizing stream jobs. The implementation extends Spark to support suspendable tasks and job priorities, showing it can efficiently share resources while meeting latency goals for stream workloads.
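As an illustration of the execution model, here is a minimal Python sketch of how suspendable tasks can be built on coroutine-style generators with a stream-first scheduling policy. Neptune itself uses Scala coroutines inside Spark, so the names and structure below are illustrative assumptions, not its actual API:

```python
import time
from collections import deque

def suspendable_task(name, chunks):
    """A suspendable task: yields at safe points so the scheduler can pause it."""
    for _ in range(chunks):
        time.sleep(0.01)              # simulate work on one chunk of a partition
        yield                         # suspension point between chunks
    print(f"{name} finished")

class StreamFirstScheduler:
    """Resumes latency-critical (stream) tasks before batch tasks at every step."""
    def __init__(self):
        self.stream, self.batch = deque(), deque()

    def submit(self, task, latency_critical=False):
        (self.stream if latency_critical else self.batch).append(task)

    def run(self):
        while self.stream or self.batch:
            queue = self.stream if self.stream else self.batch
            task = queue.popleft()
            try:
                next(task)            # resume the task until its next yield
                queue.append(task)    # still unfinished: requeue
            except StopIteration:
                pass                  # task completed

sched = StreamFirstScheduler()
sched.submit(suspendable_task("batch-job", 5))
sched.submit(suspendable_task("stream-job", 2), latency_critical=True)
sched.run()                           # stream-job completes before batch-job runs
```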
Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A... - IRJET Journal
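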
This document proposes a Multi Queue (MQ) task scheduling algorithm for heterogeneous tasks in cloud computing. It aims to improve upon the Round Robin and Weighted Round Robin algorithms by overcoming their drawbacks. The MQ algorithm splits tasks and resources into separate queues based on size/length and speed. Small tasks are scheduled on slower resources and large tasks on faster resources. The document compares the performance of MQ to Round Robin and Weighted Round Robin algorithms based on makespan, average resource utilization, and load balancing level using CloudSim simulations. The results show that MQ scheduling performs better than the other algorithms in most cases in terms of these metrics.
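A minimal Python sketch of the MQ idea, assuming a simple median split into two task queues and two resource queues (the paper's exact queue construction and CloudSim parameters may differ):

```python
def mq_schedule(tasks, resources):
    """tasks: {name: length (MI)}; resources: {name: speed (MIPS)}.
    Short tasks go to slower resources, long tasks to faster ones."""
    med_len = sorted(tasks.values())[len(tasks) // 2]
    med_speed = sorted(resources.values())[len(resources) // 2]
    small = [t for t, l in tasks.items() if l < med_len]
    large = [t for t in tasks if t not in small]
    slow = [r for r, s in resources.items() if s < med_speed] or list(resources)
    fast = [r for r, s in resources.items() if s >= med_speed]
    plan = {}
    for i, t in enumerate(small):
        plan[t] = slow[i % len(slow)]     # round-robin over the slow queue
    for i, t in enumerate(large):
        plan[t] = fast[i % len(fast)]     # round-robin over the fast queue
    return plan

print(mq_schedule({"t1": 100, "t2": 900, "t3": 300, "t4": 1200},
                  {"vm1": 250, "vm2": 1000, "vm3": 500, "vm4": 2000}))
```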
Performance comparison on Java technologies: a practical approach - csandit
Performance, responsiveness, and scalability are make-or-break qualities for software. Nearly everyone runs into performance problems at one time or another. This paper discusses performance issues faced during a project implemented with Java technologies, the challenges encountered during the project's life cycle, and the mitigation actions performed. It compares three Java technologies and shows how improvements in the application's response time were achieved through statistical analysis. The paper concludes with an analysis of the results.
Evolutionary Multi-Goal Workflow Progress in Shade - IRJET Journal
This document summarizes an evolutionary multi-goal workflow scheduling algorithm for cloud computing environments. It begins by highlighting challenges with applying existing scheduling algorithms to clouds, which differ from traditional heterogeneous environments. It formulates the cloud workflow scheduling problem to optimize makespan and cost simultaneously as a multi-objective optimization problem. The paper then proposes an evolutionary algorithm-based approach using novel encoding, population initialization, fitness evaluation, and genetic operators tailored for this problem. Experimental results show the algorithm can achieve better solutions than existing QoS optimization scheduling algorithms in most cases.
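The core of any such multi-objective formulation is Pareto dominance over the (makespan, cost) pair; a small Python illustration (the candidate schedule tuples are invented for the example):

```python
def dominates(a, b):
    """a, b: (makespan, cost) tuples; both objectives are minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(schedules):
    """Keep only the non-dominated (makespan, cost) trade-offs."""
    return [s for s in schedules
            if not any(dominates(o, s) for o in schedules if o is not s)]

candidates = [(120, 9.5), (100, 14.0), (150, 6.0), (130, 10.0)]
print(pareto_front(candidates))   # (130, 10.0) is dominated by (120, 9.5)
```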
The document discusses various performance measures for parallel computing, including speedup, efficiency, Amdahl's law, and Gustafson's law. Speedup is defined as the ratio of sequential to parallel execution time, and efficiency as speedup divided by the number of processors. Amdahl's law provides an upper bound on speedup based on the fraction of operations that are sequential for a fixed problem size, while Gustafson's law estimates the scaled speedup when the problem size grows with the number of processors. Other topics covered include performance bottlenecks, data races, data race avoidance techniques, and deadlock avoidance using virtual channels.
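Both laws reduce to one-line formulas; a quick Python rendering with a worked example (serial fraction s = 0.1):

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled speedup when the parallel part grows with the processor count."""
    return serial_fraction + (1.0 - serial_fraction) * processors

# With 10% serial work, Amdahl caps speedup below 10x no matter how many cores,
# while Gustafson's scaled speedup keeps growing with the machine:
for n in (8, 64, 1024):
    print(n, round(amdahl_speedup(0.1, n), 2), round(gustafson_speedup(0.1, n), 2))
```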
The document discusses parallel programming and synchronization techniques for concurrent tasks. It covers parallel programming concepts like concurrent execution, guarded commands, and tasks. It also describes synchronization mechanisms like semaphores, messages, and rendezvous that allow tasks to coordinate access to shared resources and data. The document uses Ada as an example programming language to illustrate features for defining and managing concurrent tasks.
The document describes an algorithm called HK scheduling for scheduling real-time tasks on multiprocessors, which aims to combine the advantages of list scheduling and H scheduling. HK scheduling maintains a variable tCK that divides the schedule into two parts: the first part tries to keep at least k processors busy, while the second part uses highest-priority scheduling. The analysis shows that HK scheduling produces better worst-case schedule lengths than H scheduling, while remaining feasible like H scheduling.
This document discusses various load balancing algorithms that can be applied in cloud computing. It begins with an introduction to cloud computing models including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). It then discusses the goals of load balancing in cloud computing. The main part of the document describes and provides examples of several load balancing algorithms: Round Robin, Opportunistic Load Balancing, Minimum Completion Time, and Minimum Execution Time. For each algorithm, it explains the basic approach and provides an example to illustrate how it works.
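Of these, Minimum Completion Time is worth spelling out, since it accounts for both machine speed and current load; a compact Python sketch with invented machine parameters:

```python
def minimum_completion_time(tasks, ready_time, speed):
    """Greedy MCT: send each task to the machine with the earliest completion time.
    tasks: list of (name, length); ready_time/speed: dicts keyed by machine."""
    plan = {}
    for name, length in tasks:
        best = min(speed, key=lambda m: ready_time[m] + length / speed[m])
        plan[name] = best
        ready_time[best] += length / speed[best]   # machine becomes busier
    return plan

plan = minimum_completion_time(
    [("t1", 400), ("t2", 100), ("t3", 800)],
    ready_time={"m1": 0.0, "m2": 0.0},
    speed={"m1": 100.0, "m2": 400.0})
print(plan)   # t2 lands on m1 because m2 is already loaded by t1
```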
A novel methodology for task distribution - ijesajournal
Modern embedded systems are being modeled as Heterogeneous Reconfigurable Computing Systems (HRCS), where reconfigurable hardware, i.e., Field Programmable Gate Arrays (FPGAs), and soft-core processors act as computing elements. An efficient task distribution methodology is therefore essential for obtaining high performance in modern embedded systems. This paper presents a novel task distribution methodology called the Minimum Laxity First (MLF) algorithm, which takes advantage of the runtime reconfigurability of the FPGA to utilize the available resources effectively. The MLF algorithm is a list-based dynamic scheduling algorithm that uses attributes of both tasks and computing resources as a cost function to distribute the tasks of an application across the HRCS. An on-chip HRCS computing platform is configured on a Virtex 5 FPGA using Xilinx EDK. The real-time applications JPEG and OFDM transmitters are represented as task graphs, and the tasks are then distributed, both statically and dynamically, to the HRCS platform to evaluate the performance of the designed task distribution model. Finally, the performance of the MLF algorithm is compared with existing static scheduling algorithms; the comparison shows that MLF achieves more efficient utilization of on-chip resources and also speeds up application execution.
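The laxity ordering at the heart of MLF is simple to state: laxity = deadline - current time - remaining execution time, and the lowest-laxity task is dispatched first. A minimal Python sketch (task names and numbers are invented, and a real HRCS scheduler would also weigh resource attributes in its cost function):

```python
def minimum_laxity_first(tasks, now=0.0):
    """tasks: list of (name, exec_time, deadline).
    Laxity = deadline - now - remaining execution time; lowest laxity goes first."""
    return sorted(tasks, key=lambda t: t[2] - now - t[1])

# Invented task set: (name, execution time, deadline)
print(minimum_laxity_first([("jpeg_dct", 4, 20), ("ofdm_fft", 6, 10), ("quant", 2, 15)]))
# ofdm_fft (laxity 4) is dispatched before quant (13) and jpeg_dct (16)
```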
LOAD BALANCING ALGORITHM TO IMPROVE RESPONSE TIME ON CLOUD COMPUTING - ijccsa
Load balancing techniques in cloud computing can be applied at different levels. There are two main levels: load balancing on physical servers and load balancing on virtual servers. Load balancing on a physical server is the policy of allocating physical servers to virtual machines, while load balancing on virtual machines is the policy of allocating resources from physical servers to the virtual machines for the tasks or applications running on them. Depending on whether the user's request is for SaaS (Software as a Service), PaaS (Platform as a Service), or IaaS (Infrastructure as a Service), a suitable load balancing policy applies. When receiving tasks, the cloud data center must allocate them efficiently so that response time is minimized and congestion is avoided. Load balancing should also be performed between different data centers in the cloud to ensure minimum transfer time. This paper proposes a virtual machine-level load balancing algorithm that aims to improve the average response time and average processing time of the system in a cloud environment. The proposed algorithm is compared with the Avoid Deadlocks [5], Max-min [6], and Throttled [8] algorithms, and the results show that it achieves better response times.
Deadline and Suffrage Aware Task Scheduling Approach for Cloud Environment - IRJET Journal
The document proposes a deadline and suffrage aware task scheduling approach for cloud environments. It discusses limitations of existing approaches that can cause system imbalances. The proposed approach considers both task deadlines and priorities assigned by user votes ("suffrage") to schedule tasks. It was tested using CloudSim simulator and found to outperform the basic min-min approach in reducing completion times and improving resource utilization and provider profits while still meeting task deadlines.
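Since min-min is the baseline being improved upon, a short Python sketch of it may help; etc is the expected-time-to-compute matrix, and all values below are invented:

```python
def min_min(tasks, ready, etc):
    """Min-min baseline. tasks: iterable of names; ready[m]: machine ready time;
    etc[(t, m)]: expected time to compute task t on machine m."""
    schedule, tasks = [], set(tasks)
    while tasks:
        # completion time of each unscheduled task on its best machine
        best = {t: min((ready[m] + etc[(t, m)], m) for m in ready) for t in tasks}
        # schedule the task with the overall minimum completion time first
        t = min(best, key=lambda x: best[x][0])
        finish, m = best[t]
        schedule.append((t, m))
        ready[m] = finish
        tasks.remove(t)
    return schedule

etc = {("a", "m1"): 2, ("a", "m2"): 5,
       ("b", "m1"): 4, ("b", "m2"): 3,
       ("c", "m1"): 6, ("c", "m2"): 1}
print(min_min(["a", "b", "c"], {"m1": 0, "m2": 0}, etc))
```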
Scalable scheduling of updates in streaming data warehouses - IRJET Journal
This document discusses scheduling updates in streaming data warehouses. It proposes a scheduling framework to handle complications like view hierarchies, data consistency, inability to preempt updates, heterogeneous update jobs from different data sources, and transient overload. It models the update problem as a scheduling problem where the objective is to minimize data staleness over time. It then presents several update scheduling algorithms and discusses how performance is affected by different factors based on simulation experiments.
The document proposes a Medium Job High Priority (MJHP) scheduling algorithm for job scheduling in cloud computing. It classifies jobs as high, medium, or low priority based on their computational complexity and level of parallelism. The MJHP algorithm prioritizes jobs with medium computational complexity and high parallelism. It assigns these medium priority jobs to the fastest available resources to optimize computational speed and reduce resource usage. Performance studies show that MJHP outperforms existing algorithms like First Come First Serve, Shortest Job Fastest Resource, and Priority Based Scheduling by achieving the highest throughput in the shortest time.
Lecture 6: Infrastructure & Tooling (Full Stack Deep Learning - Spring 2021) - Sergey Karayev
The document discusses infrastructure and tooling for full stack deep learning. It provides an overview of the different components involved, including compute, data processing, experiment management, deployment, and software engineering practices. Specifically, it covers topics like GPU basics, cloud computing options, development versus training needs, popular programming languages and editors like Python and Jupyter Notebooks, and setting up development environments.
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra... - IRJET Journal
This document discusses a framework for dynamic resource allocation and efficient scheduling strategies in cloud computing platforms for high-performance computing (HPC). It proposes using a parallel genetic algorithm to find optimal allocation of virtual machines to physical resources in order to maximize resource utilization. The algorithm represents the resource allocation problem as an unbalanced job scheduling problem. It uses genetic operators like mutation and crossover to efficiently allocate requests for resources to idle nodes. Compared to a traditional genetic algorithm, the parallel genetic algorithm improves the speed of finding the best allocation and increases resource utilization. Future work could explore implementing dynamic load balancing and using big data concepts on the cloud.
This document provides a summary of a student's seminar paper on resource scheduling algorithms. The paper discusses the need for resource scheduling algorithms in cloud computing environments. It then describes several types of algorithms commonly used for resource scheduling, including genetic algorithms, bee algorithms, ant colony algorithms, workflow algorithms, and load balancing algorithms. For each algorithm type, it provides a brief introduction, overview of the basic steps or concepts, and some examples of applications where the algorithm has been used. The paper was submitted by a student named Shilpa Damor to fulfill requirements for a degree in information technology.
Power consumption in cloud centers is increasing rapidly due to the popularity of cloud computing. High power consumption not only leads to high operational cost, it also leads to high carbon emissions, which is not environment friendly. Cloud centers containing thousands of physical machines/servers are becoming commonplace. In many instances some physical machines host very few active virtual machines; migrating those virtual machines so that lightly loaded physical machines can be shut down, which in turn reduces consumed power, has been studied extensively in the literature. However, recent studies have demonstrated that migration of virtual machines is usually associated with excessive cost and delay. Hence, a new technique was recently proposed that balances load in cloud centers by migrating the extra tasks of overloaded virtual machines. The effectiveness of this task migration technique with respect to server consolidation has not been properly studied in the literature. In this work, the virtual machine task migration technique is extended to address the server consolidation issue. Empirical results reveal that the proposed technique is highly effective in reducing the power consumed in cloud centers.
Little's Law can be used to estimate project lead times and plan project delivery. It applies because projects represent batches of work items flowing through a system. By measuring the throughput of individual work stages, like development, you can calculate the average lead time for a work item and estimate when a project of a given size will be complete. Additional buffers should be included to protect against variability in different stages of work and potential issues. Proper use of Little's Law provides a way to estimate project timelines and costs to aid planning and customer commitments.
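A worked example of the arithmetic, with invented numbers: 60 work items in progress flowing through a stage that finishes 5 items per week gives an average lead time of 12 weeks, and a buffer can be layered on top:

```python
def little_lead_time(wip_items: int, throughput_per_week: float) -> float:
    """Little's Law, W = L / lambda: average lead time equals work in progress
    divided by throughput (items completed per unit time)."""
    return wip_items / throughput_per_week

estimate = little_lead_time(60, 5.0)      # 12.0 weeks on average
buffered = estimate * 1.25                # e.g., a 25% buffer against variability
print(estimate, buffered)                 # 12.0 15.0
```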
Cooperative Task Execution for Apache Spark - Databricks
Apache Spark has enabled a vast assortment of users to express batch, streaming, and machine learning computations, using a mixture of programming paradigms and interfaces. Lately, we observe that different jobs are often implemented as part of the same application to share application logic, state, or to interact with each other. Examples include online machine learning, real-time data transformation and serving, low-latency event monitoring and reporting. Although the recent addition of Structured Streaming to Spark provides the programming interface to enable such unified applications over bounded and unbounded data, the underlying execution engine was not designed to efficiently support jobs with different requirements (i.e., latency vs. throughput) as part of the same runtime. It therefore becomes particularly challenging to schedule such jobs to efficiently utilize the cluster resources while respecting their requirements in terms of task response times. Scheduling policies such as FAIR could alleviate the problem by prioritizing critical tasks, but the challenge remains, as there is no way to guarantee no queuing delays. Even though preemption by task killing could minimize queuing, it would also require task resubmission and loss of progress, leading to wasted cluster resources. In this talk, we present Neptune, a new cooperative task execution model for Spark with fine-grained control over resources such as CPU time. Neptune utilizes Scala coroutines as a lightweight mechanism to suspend task execution with sub-millisecond latency and introduces new scheduling policies that respect diverse task requirements while efficiently sharing the same runtime. Users can directly use Neptune for their continuous applications as it supports all existing DataFrame, DataSet, and RDD operators. We present an implementation of the execution model as part of Spark 2.4.0 and describe the observed performance benefits from running a number of streaming and machine learning workloads on an Azure cluster.
Speaker: Konstantinos Karanasos
Reinforcement learning based multi core scheduling (RLBMCS) for real time sys... - IJECEIAES
This document summarizes a reinforcement learning based multi-core scheduling (RLBMCS) algorithm for real-time systems. The algorithm uses reinforcement learning to dynamically assign task priorities and place tasks in a multi-level feedback queue to schedule tasks across multiple processor cores. It aims to optimize metrics like CPU utilization, throughput, turnaround time, waiting time, response time and deadline meet ratio. Tasks can transition between four states - initial, objective degradation, objective progression, and objective stabilization - based on changes to a multi-objective optimization function. The scheduler acts as the agent and assigns tasks to queues/actions based on task and system states to maximize the optimization function over time.
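The paper's exact state and reward design is more involved, but the skeleton of such an agent can be sketched as a simple epsilon-greedy value learner that chooses a feedback-queue level per task class; everything below is an illustrative assumption, not the RLBMCS algorithm itself:

```python
import random

class RLQueuePlacer:
    """Hypothetical sketch: an epsilon-greedy agent learns which feedback-queue
    level to place each task class into, rewarded by a scheduling-objective score."""
    def __init__(self, levels=4, epsilon=0.1, alpha=0.3):
        self.q = {}                      # (task_class, level) -> value estimate
        self.levels, self.epsilon, self.alpha = levels, epsilon, alpha

    def place(self, task_class):
        if random.random() < self.epsilon:
            return random.randrange(self.levels)              # explore
        values = [self.q.get((task_class, l), 0.0) for l in range(self.levels)]
        return max(range(self.levels), key=values.__getitem__)  # exploit

    def feedback(self, task_class, level, reward):
        key = (task_class, level)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (reward - old)       # incremental update

agent = RLQueuePlacer()
level = agent.place("interactive")
# reward could combine response time, deadline-meet ratio, utilization, etc.
agent.feedback("interactive", level, reward=0.8)
```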
This document discusses various techniques for estimating software project costs, schedules, and sizes. It covers function point analysis, lines of code estimation, productivity models like COCOMO, and probabilistic techniques like PERT estimation. Key approaches mentioned include analogies, decomposition, mathematical models, mean schedule dates, and probability distributions.
The document discusses computer architecture and describes the seven dimensions of an Instruction Set Architecture (ISA). It also defines dependability and its two measures - reliability and availability. Some example performance measurements are provided along with the processor performance equation. Finally, it discusses measuring, reporting, and summarizing computer performance using benchmarks and benchmark suites.
Medea: Scheduling of Long Running Applications in Shared Production Clusters - Panagiotis Garefalakis
MEDEA: Scheduling of Long Running Applications in Shared Production Clusters
EuroSys'18
https://lsds.doc.ic.ac.uk/sites/default/files/medea-eurosys18.pdf
This document proposes a new task scheduling algorithm called Dynamic Heterogeneous Shortest Job First (DHSJF) for heterogeneous cloud computing systems. DHSJF aims to improve performance metrics like reduced makespan and low energy consumption by considering the heterogeneity of resources and workloads. It discusses existing scheduling algorithms like Round Robin, First Come First Serve and their limitations. The proposed DHSJF algorithm prioritizes tasks with the shortest estimated completion time to optimize resource utilization and improve overall performance of the cloud computing system. Simulation results show that DHSJF provides better results for metrics like average waiting time and turnaround time as compared to Round Robin and First Come First Serve scheduling algorithms.
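A hedged sketch of the dispatch rule this implies: at each step, pick the (task, VM) pair with the shortest estimated completion time, so short jobs and fast heterogeneous resources are matched first (names and MIPS figures are invented):

```python
def dhsjf_order(tasks, vm_mips):
    """Sketch of a dynamic heterogeneous shortest-job-first step.
    tasks: {name: length in million instructions}; vm_mips: {vm: speed in MIPS}."""
    ready = {vm: 0.0 for vm in vm_mips}
    order, pending = [], dict(tasks)
    while pending:
        # shortest estimated completion time across all (task, vm) pairs
        t, vm = min(((t, vm) for t in pending for vm in vm_mips),
                    key=lambda p: ready[p[1]] + pending[p[0]] / vm_mips[p[1]])
        order.append((t, vm))
        ready[vm] += pending.pop(t) / vm_mips[vm]
    return order

print(dhsjf_order({"a": 4000, "b": 1000, "c": 16000},
                  {"vm1": 1000.0, "vm2": 4000.0}))
```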
Embedded Linux Conference 2013
https://github.com/ystk/sched-deadline/tree/dlmiss-detection-dev
Real-time systems need to meet deadlines. From this point of view, a system requires two capabilities to achieve determinism: interrupt latency stabilization and processing time reservation. SCHED_DEADLINE provides a feature to reserve CPU time in advance to ensure predictable behavior. However, it lacks a feature for controlling processes that miss their deadlines.
In this presentation, we discuss the requirements for such a feature and show a sample implementation for controlling deadline-missed processes.
The document discusses various methods for measuring computer performance, including MIPS, CPI/IPC, benchmark suites, and speedup. It also covers principles like Amdahl's law, making the common case faster, locality of reference, and exploiting parallelism to improve performance.
A Novel Dynamic Priority Based Job Scheduling Approach for Cloud Environment - IRJET Journal
The document proposes a new dynamic priority-based job scheduling algorithm for cloud environments to optimize the problem of starvation. It assigns priority to jobs based on criteria like CPU requirements, I/O requirements, and job criticality. The algorithm aims to reduce wait time, turnaround time, and increase throughput and CPU utilization. It was tested against the Shortest Job First algorithm in CloudSim simulation software. The results showed improvements in wait time, turnaround time, and total finish time compared to the SJF algorithm.
This document presents a novel approach for scheduling real-time tasks in a heterogeneous multicore processor using fuzzy logic techniques for micro-grid power management. It proposes using two fuzzy logic-based scheduling algorithms: 1) Assign priority to tasks based on their execution time and deadline. 2) Assign higher priority tasks to the high-performance core for execution and lower priority tasks to the low-performance cores. The goal is to increase throughput, improve CPU utilization, and reduce overall power consumption for micro-grid systems. The algorithms were evaluated using test cases with different task parameters in MATLAB, which showed improvements in performance and power reduction.
This document discusses AIOps and its importance for operating Kubernetes at scale. It begins with an introduction of the speaker and then discusses some of the challenges of monitoring and managing infrastructure and applications as they grow in complexity. Specifically, it notes the explosion of metrics from containers and microservices that make problems harder to identify and isolate. It then introduces AIOps as an approach that can help with both reactive and proactive monitoring through techniques like correlation of metrics, what-if analysis, and optimization of resources. Examples are given of how AIOps has been applied at companies to improve performance and utilization through techniques like scheduling, placement, and controlled oversubscription of resources.
Learning Software Performance Models for Dynamic and Uncertain Environments - Pooyan Jamshidi
This document provides background on Pooyan Jamshidi's research related to learning software performance models for dynamic and uncertain environments. It summarizes his past work developing techniques for modeling and optimizing performance across different systems and environments, including using transfer learning to reuse performance data from related sources to build more accurate models with fewer measurements. It also outlines opportunities for using transfer learning to adapt performance models to new environments and systems.
Service Request Scheduling in Cloud Computing using Meta-Heuristic Technique:... - IRJET Journal
This document discusses using the Teaching Learning Based Optimization (TLBO) meta-heuristic technique for service request scheduling between users and cloud service providers. TLBO is a nature-inspired algorithm that mimics the teacher-student learning process and is compared to other meta-heuristic algorithms such as the Genetic Algorithm. The key steps of TLBO involve initializing a population, evaluating fitness, selecting the best solution as the teacher, and updating the population through teacher and learner phases until the termination criterion is met. The document proposes using the number of users and virtual machines as parameters for TLBO scheduling in cloud computing. MATLAB simulation results show the initial and final iterations converging to an optimal scheduling solution.
The midterm presentation document provides an agenda and overview of Team AWE-K2's midterm presentation for their IS480 class. The presentation covers their project storyboard, technical challenges faced with 10 iterations, non-functional requirements around response time, a demonstration of their application, and details on project management including status, highlights, scope, schedule and metrics. Key technical challenges involved working with Google Maps API, retrieving complex data from Drupal, and optimizing performance.
Earlier stage for straggler detection and handling using combined CPU test an... - IJECEIAES
This document summarizes a research paper that proposes a new framework called the combinatory late-machine (CLM) framework to facilitate early detection and handling of straggler tasks in MapReduce jobs. Straggler tasks significantly increase job execution time and energy consumption. The CLM framework combines CPU testing and the Longest Approximate Time to End (LATE) methodology to calculate a straggler tolerance threshold earlier. This allows for prompt mitigation actions. The paper reviews related work on straggler detection techniques and discusses the proposed methodology, which estimates task finish times based on progress scores. It aims to correlate straggler detection with system attributes like resource utilization that could cause delays.
Critical Chain Project Management & Theory of Constraints - Abhay Kumar
Critical Chain Project Management (CCPM) uses aggressive task estimates and buffers to eliminate wasted time from practices like multitasking, student syndrome, and Parkinson's law. It identifies the critical path and adds a project buffer at the end to protect the deadline. CCPM is based on the Theory of Constraints (TOC), which involves identifying, exploiting, subordinating, and elevating constraints. CCPM and TOC are applied in both waterfall and agile projects by aggressively estimating tasks, avoiding multitasking on the critical path, monitoring buffer consumption, and using TOC to resolve impediments.
This document discusses performance analysis and parallel computing. It defines performance metrics like speedup, efficiency, and scalability that are used to evaluate parallel programs. Sources of parallel overhead like synchronization, load imbalance, and communication are described. The document also discusses benchmarks used to evaluate parallel systems like PARSEC and Rodinia. It emphasizes that overall execution time captures a system's real performance and depends on factors like CPU time, I/O, memory access, and interactions between programs.
This document provides an overview of time management and project scheduling concepts. It discusses common scheduling mistakes, the importance of scheduling, and strategies for effective time management. It also covers key project scheduling processes from the PMBOK Guide, including defining activities, sequencing activities, estimating durations, developing the schedule, and controlling the schedule. Methods like critical path analysis, resource leveling, crashing, and float are explained. The document concludes with definitions of important scheduling terms.
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION - cscpconf
An operating system scheduler is expected not to let a processor stay idle if any process is ready or waiting for execution. This problem gains importance as the number of processes always outnumbers the processors by a large margin. It is in this regard that schedulers are given the ability to preempt a running process, following some scheduling algorithm, giving us the illusion that several processes run simultaneously. A process is allowed to utilize CPU resources for a fixed quantum of time (termed the timeslice for preemption) and is then preempted in favor of another waiting process. Each of these process preemptions incurs considerable overhead in CPU cycles, which are a valuable resource for runtime execution. In this work we utilize the historical performance of a scheduler to predict the nature of the currently running process, thereby trying to reduce the number of preemptions. We propose a machine-learning module that predicts a better-performing timeslice, calculated from a static knowledge base and an adaptive reinforcement learning based suggestive module. Results for an "adaptive timeslice parameter" for preemption show good savings in CPU cycles and efficient throughput time.
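As a rough illustration of the idea (not the paper's actual module), a per-process timeslice predictor can be as small as an exponential moving average over observed CPU bursts:

```python
class AdaptiveTimeslice:
    """Hypothetical sketch: adjust a process's timeslice from its burst history
    so it finishes just before preemption, saving context-switch overhead."""
    def __init__(self, base_ms=10.0, alpha=0.25):
        self.base_ms, self.alpha = base_ms, alpha
        self.estimate = {}                 # pid -> predicted CPU burst (ms)

    def next_timeslice(self, pid):
        return self.estimate.get(pid, self.base_ms)

    def observe(self, pid, burst_ms):
        # exponential moving average over observed bursts (the "knowledge base")
        old = self.estimate.get(pid, self.base_ms)
        self.estimate[pid] = old + self.alpha * (burst_ms - old)

ts = AdaptiveTimeslice()
for burst in (12.0, 14.0, 13.0):        # process 42 keeps overrunning 10 ms
    ts.observe(42, burst)
print(round(ts.next_timeslice(42), 2))  # moves toward the observed ~13 ms bursts
```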
Similar to Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications
Accelerating distributed joins in Apache Hive: Runtime filtering enhancements - Panagiotis Garefalakis
Apache Hive is an open-source relational database system that is widely adopted by several organizations for big data analytic workloads. It combines traditional MPP (massively parallel processing) techniques with more recent cloud computing concepts to achieve the increased scalability and high performance needed by modern data-intensive applications. Even though it was originally tailored towards long-running data warehousing queries, its architecture recently changed with the introduction of the LLAP (Live Long and Process) layer. Instead of regular containers, LLAP utilizes long-running executors to exploit data sharing and caching possibilities within and across queries. Executors eliminate unnecessary disk IO overhead and thus reduce the latency of interactive BI (business intelligence) queries by orders of magnitude. However, as container startup cost and IO overhead are now minimized, the need to effectively utilize memory and CPU resources across long-running executors in the cluster is becoming increasingly essential. For instance, in a variety of production workloads, we noticed that the memory-bandwidth cost of eagerly decoding all table columns for every row, even when the row is dropped later on, was starting to overwhelm single-query performance. In this talk, we focus on some of the optimizations we introduced in Hive 4.0 to increase CPU efficiency and save memory allocations. In particular, we describe the lazy decoding (or row-level filtering) and composite bloom-filter optimizations that greatly improve the performance of queries containing broadcast joins, reducing their runtime by up to 50%. Over several production and synthetic workloads, we show the benefit of the newly introduced optimizations as part of Cloudera's cloud-native Data Warehouse engine. At the same time, the community can directly benefit from the presented features as they are 100% open-source!
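As background on the bloom-filter optimization mentioned above, a toy Python Bloom filter shows the mechanism: the join's build side populates the filter from its join keys, and the probe side tests keys before doing any further per-row work. The sizes and hash choices here are arbitrary assumptions, not Hive's implementation:

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter sketch: a broadcast-join runtime filter can ship one of
    these, built from the small side's keys, to skip rows early on the big side."""
    def __init__(self, size_bits=1 << 16, hashes=3):
        self.size, self.hashes = size_bits, hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.blake2b(key.encode(), salt=bytes([i])).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
for k in ("user-1", "user-7"):          # keys from the small (build) side
    bf.add(k)
# The probe side drops rows early, before decoding the remaining columns:
print(bf.might_contain("user-1"), bf.might_contain("user-42"))  # True, almost surely False
```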
- The document discusses a thesis presentation on bridging the gap between serving and analytics in scalable web applications.
- It outlines challenges with resource efficiency and isolation in typical web app designs that separate online and offline tasks.
- The presentation proposes an in-memory web objects model to express both serving and analytics logic as a single distributed dataflow graph to improve resource utilization while maintaining service level objectives.
This document summarizes work on strengthening consistency in the Cassandra distributed key-value store. The researchers replaced Cassandra's replication mechanism with strongly consistent alternatives like Oracle BDB to improve data consistency. They also implemented a new membership protocol to rapidly propagate changes to clients, replacing Cassandra's gossip-based approach. An initial implementation on a cluster of 6 Cassandra nodes showed performance comparable to Cassandra for Yahoo's YCSB benchmark. Future work involves further evaluation of scalability and availability and adding elasticity capabilities.
This master's thesis proposes a distributed key-value store based on replicated LSM trees. The main contributions are a high-performance data replication primitive that combines the ZAB protocol with LSM tree implementation, and a technique for changing replication group leaders prior to heavy compactions to improve write throughput by up to 60%. Evaluation shows the system outperforms Apache Cassandra and Oracle NoSQL. Future work includes adding elasticity and optimizing Zookeeper load balancing.
The document provides an overview of Nagios, an open source network monitoring software. It discusses storage management challenges, what Nagios is, and provides tutorial topics on how to start a Nagios server, write storage service monitoring code, monitor local and remote storage, and handle events. The tutorial covers installing and configuring Nagios, defining hosts and services, writing check commands, installing NRPE for remote monitoring, and using event handlers to automate responses. Additional Nagios resources are also listed.
The document discusses using a wireless sensor network to improve data center management operations. It aims to automatically determine server locations, notify administrators of location changes, and determine server status even if the network is down. The proposed solution uses an auto-configuring Zigbee wireless sensor network and the open-source Nagios distributed monitoring system extended with a wireless sensor plugin to integrate sensor data and correlate events. An evaluation in an office and data center environment found the system could accurately detect server movement and identify failures even during network partitions.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features available on those devices, but many of the features provide convenience and capability while sacrificing security. This best practices guide outlines steps users can take to better protect personal devices and information.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe, by Paige Cruz
Monitoring and observability aren't traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company's observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party. I will share these foundational concepts to build on.
Best 20 SEO Techniques To Improve Website Visibility In SERP by Pixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Full-RAG: A modern architecture for hyper-personalization by Zilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack by shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Programming Foundation Models with DSPy - Meetup Slides by Zilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Unlock the Future of Search with MongoDB Atlas: Vector Search Unleashed, by Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Driving Business Innovation: Latest Generative AI Advancements & Success Story by Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
UiPath Test Automation using UiPath Test Suite series, part 6, by DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI?
Test Automation with generative AI and OpenAI
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications
1. NEPTUNE
Scheduling Suspendable Tasks
for Unified Stream/Batch Applications
SoCC, Santa Cruz, California, November 2019
Panagiotis Garefalakis
Imperial College London
pgaref@imperial.ac.uk
Konstantinos Karanasos
Microsoft
kokarana@microsoft.com
Peter Pietzuch
Imperial College London
prp@imperial.ac.uk
2. Unified application example
[Figure: a unified stream/batch application, where a batch training job iterates over historical data to produce a trained model that a stream inference job uses to serve low-latency responses on real-time data]
3. Evolution of analytics frameworks
[Figure: timeline from 2010 to 2018, showing the evolution from separate batch and stream frameworks, to frameworks with hybrid stream/batch applications, to unified stream/batch frameworks such as Structured Streaming]
4. Requirements
> Latency: Execute inference job with minimum delay
> Throughput: Batch jobs should not be compromised
> Efficiency: Achieve high cluster resource utilization
Stream/Batch application requirements
Challenge: schedule stream/batch jobs to
satisfy their diverse requirements
5. Stream/Batch application scheduling
[Figure: the application code is submitted to the driver, whose DAG scheduler splits each job into stages; the inference (stream) job has two stages of 2 tasks each taking time T, while the training (batch) job has a stage of 4 tasks taking 3T and a stage of 3 tasks taking T]
6. Stream/Batch application scheduling
> Static allocation: dedicate resources to each job
[Figure: per-job executors over time; the stream job finishes early and its dedicated cores sit idle (wasted resources) while the batch job keeps running]
Resources cannot be shared across jobs
7. Stream/Batch application scheduling
> FIFO: first job runs to completion
[Figure: shared executors run all batch tasks first, so stream tasks queue behind the long 3T batch tasks before they can start]
Long batch jobs increase stream job latency
8. Stream/Batch application scheduling
> FAIR: weighted share of resources across jobs
[Figure: shared executors interleave equal shares of stream and batch tasks; stream tasks still experience queuing behind running batch tasks]
Better packing with non-optimal latency
9. Stream/Batch application scheduling
> KILL: avoid queueing by preempting batch tasks
[Figure: shared executors kill running batch tasks so stream tasks start immediately; the killed 3T tasks must later restart from scratch]
Better latency at the expense of extra work
10. Stream/Batch application scheduling
> NEPTUNE: minimize queueing and wasted work!
[Figure: shared executors suspend running batch tasks so stream tasks start immediately, then resume the suspended tasks from where they left off once resources free up]
11. Challenges
> How to minimize queuing for latency-sensitive jobs and wasted work?
Implement suspendable tasks
> How to natively support stream/batch applications?
Provide a unified execution framework
> How to satisfy different stream/batch application requirements and high-level objectives?
Introduce custom scheduling policies
12. NEPTUNE
Execution framework for Stream/Batch applications
Support suspendable tasks
Introduce pluggable scheduling policies
Unified execution framework on top of
Structured Streaming
13. Typical tasks
[Figure: a task runs as a subroutine; its function, context, iterator, intermediate state, and return value all live on the executor's call stack]
> Tasks: apply a function to a partition of data
> Subroutines that run in the executor to completion
> Preemption problem:
> Loss of progress (kill)
> Unpredictable preemption times (checkpointing)
14. Suspendable tasks
[Figure: a coroutine task keeps its function, context, iterator, and state on a separate coroutine stack; call and yield points transfer control between the executor and the task]
> Idea: use coroutines
> Separate stacks to store task state
> Yield points handing over control to the executor
> Cooperative preemption:
> Suspend and resume in milliseconds
> Work-preserving
> Transparent to the user
https://github.com/storm-enroute/coroutines
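To make the mechanism concrete, here is a minimal sketch using the storm-enroute coroutines library linked above; it illustrates the yield/resume pattern under that library's API and is not Neptune's actual task code.

  import org.coroutines._

  // A toy "task" that sums a partition of records, with a cooperative
  // yield point between records where an executor could pause it.
  val sumTask = coroutine { (records: Iterator[Int]) =>
    var sum = 0
    while (records.hasNext) {
      sum += records.next()
      yieldval(sum)        // hand control back to the caller
    }
    sum                    // final result once the task runs to completion
  }

  val frame = call(sumTask(Iterator(1, 2, 3)))
  while (frame.resume) {
    // Between resumes the caller may park the frame for as long as it
    // likes; the task's stack (and thus its progress) is preserved.
  }
  println(frame.result)    // 6: no work was lost across suspensions

Because suspension is just a user-level control transfer to a saved stack, pausing and resuming avoids kernel context switches, which is what makes millisecond-level, work-preserving preemption possible.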
15. Execution framework
> Idea: centralized scheduler with pluggable policies
> Problem: not just assign but also suspend and resume
[Figure: jobs marked with high/low priorities pass through the incrementalizer, optimizer, and DAG scheduler to the task scheduler, whose pluggable scheduling policy launches, suspends, and resumes tasks (running or paused) on executors, informed by executor metrics]
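To illustrate the suspend-and-run flow described above, the following is a hypothetical sketch of the task scheduler's decision logic; every name in it (Task, Priority, launch, and so on) is an assumption for illustration, not Neptune's real interface.

  sealed trait Priority
  case object High extends Priority
  case object Low extends Priority
  final case class Task(id: Long, priority: Priority)

  class TaskScheduler(private var freeSlots: Int) {
    private val suspended = scala.collection.mutable.Queue[Task]()  // oldest first
    private val runningLow = scala.collection.mutable.Buffer[Task]()

    def submit(task: Task): Unit =
      if (freeSlots > 0) launch(task)
      else if (task.priority == High && runningLow.nonEmpty) {
        val victim = runningLow.remove(0)  // a policy would pick the victim
        suspended.enqueue(victim)          // work-preserving: task state is kept
        freeSlots += 1                     // the victim's slot is freed...
        launch(task)                       // ...and the stream task starts now
      }                                    // else: the task waits for a slot

    def onTaskFinished(): Unit = {
      freeSlots += 1
      if (suspended.nonEmpty) launch(suspended.dequeue())  // resume oldest first
    }

    private def launch(task: Task): Unit = {
      freeSlots -= 1
      if (task.priority == Low) runningLow += task
      println(s"running task ${task.id}")
    }
  }

Resuming the oldest suspended task first, as sketched in onTaskFinished, matches the starvation-avoidance behaviour described in the editor's notes.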
16. Scheduling policies
> Idea: policies trigger task suspension and resumption
> Guarantee that stream tasks bypass batch tasks
> Satisfy higher-level objectives, e.g. balancing cluster load
> Avoid starvation by suspending a task only up to a bounded number of times
> Load-balancing (LB): takes executors' memory conditions into account and equalizes the number of tasks per node
> Locality- and memory-aware (LMA): respects task locality preferences in addition to load balancing
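As a sketch of what "pluggable" might look like, consider the following hypothetical policy interface with a load-balancing implementation; the names and fields are assumptions, not Neptune's actual API.

  final case class TaskId(value: Long)
  final case class ExecutorState(id: String,
                                 freeMemoryMb: Long,
                                 runningTasks: Int,
                                 lowPriTasks: List[TaskId])

  trait SchedulingPolicy {
    // Choose which running low-priority tasks to suspend so that
    // `demand` high-priority tasks can start immediately.
    def tasksToSuspend(executors: Seq[ExecutorState], demand: Int): Seq[TaskId]
  }

  // Load-balancing flavour: suspend on the busiest, most memory-pressured
  // executors first, keeping tasks per node roughly equal.
  object LoadBalancingPolicy extends SchedulingPolicy {
    def tasksToSuspend(executors: Seq[ExecutorState], demand: Int): Seq[TaskId] =
      executors
        .sortBy(e => (-e.runningTasks, e.freeMemoryMb))
        .flatMap(_.lowPriTasks)
        .take(demand)
  }

An LMA-style policy would additionally prefer victims on executors that do not hold the incoming tasks' preferred (cached) partitions.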
17. Implementation
> Built as an extension to Apache Spark 2.4.0 (https://github.com/lsds/Neptune)
> Ported all ResultTask, ShuffleMapTask functionality across programming interfaces to coroutines
> Extended Spark's DAG Scheduler to allow job stages with different requirements (priorities)
> Added additional Executor performance metrics as part of the heartbeat mechanism
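The kind of per-executor information such policies need could be carried in the heartbeat roughly as follows; this is a hypothetical sketch and the field names are assumptions, not Neptune's actual metrics.

  // Extra per-executor metrics piggybacked on the heartbeat (illustrative).
  final case class NeptuneExecutorMetrics(
    executorId: String,
    runningTasks: Int,           // load, used by the LB policy
    pausedTasks: Int,            // starvation accounting
    freeHeapMb: Long,            // memory condition, used by the LB policy
    cachedPartitionIds: Seq[Int] // locality hints, used by the LMA policy
  )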
18. Azure deployment
> Cluster
– 75 nodes with 4 cores and 32 GB of memory each
> Workloads
– LDA: ML training/inference application uncovering hidden topics from a group of documents
– Yahoo Streaming Benchmark: ad-analytics on a stream of ad impressions
– TPC-H decision support benchmark
19. Benefit of NEPTUNE in stream latency
> LDA: training (batch) job using all available resources, with a latency-sensitive inference (stream) job using 15% of resources
[Figure: 5th/median/99th-percentile latencies under Static allocation, FIFO, FAIR, KILL, Isolation, and Neptune with LB and LMA policies; FAIR cuts the 99th percentile by 37% over FIFO, KILL improves the median by 54% over FAIR, Neptune-LB is 61% worse than Neptune-LMA at the 99th percentile, and Neptune-LMA comes within 13% of Isolation]
NEPTUNE achieves latencies comparable to the ideal for the latency-sensitive jobs
20. Impact of resource demands on performance
> YSB: increasing stream job resource demands while the batch job uses all available resources
[Figure: stream latency and batch throughput as the stream job's core share grows from 0% (static allocation) to 100% (fully shared); sharing 100% of resources drops batch throughput by only 1.5%]
Efficiently share resources with low impact on throughput
21. Summary
NEPTUNE supports complex unified applications with diverse job requirements!
> Suspendable tasks using coroutines
> Pluggable scheduling policies
> Continuous unified analytics
https://github.com/lsds/Neptune
Thank you! Questions?
Panagiotis Garefalakis
pgaref@imperial.ac.uk
23. Suspension mechanism effectiveness
> TPC-H: task runtime distribution for each query ranges from 100s of milliseconds to 10s of seconds
24. Suspension mechanism effectiveness
> TPC-H: continuously transition tasks from Paused to Resumed states until completion
Suspendable tasks effectively pause and
resume with sub-millisecond latencies
25. Suspension mechanism effectiveness
> TPC-H: continuously transition tasks from Paused to Resumed states until completion
Coroutine tasks have minimal performance overhead by bypassing the OS scheduler
[Figure: pause latency in milliseconds versus parallelism (2 to 64) for Coroutines vs. ThreadSync]
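For contrast, a ThreadSync-style baseline (as described in the editor's notes) can be sketched with standard monitor wait/notify, where every pause and resume goes through the OS scheduler; the class is illustrative, not the benchmark's actual code.

  // Thread-based suspension: blocks the task's thread while paused.
  final class PauseGate {
    private var paused = false
    def pause(): Unit = synchronized { paused = true }
    def unpause(): Unit = synchronized { paused = false; notifyAll() }
    // A task calls this between records; it blocks while the gate is closed.
    def checkpoint(): Unit = synchronized { while (paused) wait() }
  }

Each such transition pays for kernel scheduling of the blocked thread, which is why ThreadSync latency grows with parallelism while coroutine yields stay sub-millisecond.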
26. Demo
> Run a simple unified application with
> A high-priority latency-sensitive job
> A low-priority latency-tolerant job
> Schedule them with default Spark and Neptune
> Goal: show benefit of Neptune and ease of use
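The demo setup could look roughly like the sketch below. Neptune's exact priority-marking API is not shown on the slides, so the "job.priority" property name is an assumption; setLocalProperty itself is standard Spark, and the two jobs are toy stand-ins for the training and inference jobs.

  import org.apache.spark.sql.SparkSession
  import scala.concurrent.Future
  import scala.concurrent.ExecutionContext.Implicits.global

  object NeptuneDemoSketch extends App {
    val spark = SparkSession.builder
      .appName("neptune-demo").master("local[4]").getOrCreate()
    val sc = spark.sparkContext

    // Low-priority, latency-tolerant "training" job.
    Future {
      sc.setLocalProperty("job.priority", "low")   // assumed property name
      sc.parallelize(1 to 10000000).map(x => math.sqrt(x)).sum()
    }

    // High-priority, latency-sensitive "inference" job.
    Future {
      sc.setLocalProperty("job.priority", "high")  // assumed property name
      sc.parallelize(1 to 100).map(_ * 2).collect()
    }

    Thread.sleep(10000)  // let both jobs finish before shutting down
    spark.stop()
  }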
27. Suspendable tasks
Subroutine:

  val collect: (TaskContext, Iterator[T]) => Array[T] =
    (context: TaskContext, itr: Iterator[T]) => {
      val result = new mutable.ArrayBuffer[T]
      while (itr.hasNext) {
        result.append(itr.next)
      }
      result.toArray
    }

Coroutine (yields Int, returns Array[T]):

  val collect = coroutine { (context: TaskContext, itr: Iterator[T]) => {
    val result = new mutable.ArrayBuffer[T]
    while (itr.hasNext) {
      result.append(itr.next)
      if (context.isPaused())
        yieldval(0)   // hand control back to the executor
    }
    result.toArray
  } }
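A hypothetical sketch of how an executor might drive the coroutine version above, using the storm-enroute call/resume/result API; `context` and `partitionIterator` stand in for the real arguments:

  val frame = call(collect(context, partitionIterator))
  while (frame.resume) {
    // The task yielded because context.isPaused() returned true. The executor
    // can set the frame aside, run a higher-priority task, and resume later.
  }
  val output = frame.result  // progress made before suspension is preserved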
Editor's Notes
Hello everyone, I am Panagiotis and this talk is about Neptune: a new task execution framework for modern unified applications
This is joint work with Konstantinos Karanasos from Microsoft and my supervisor Peter Pietzuch at the LSDS group at Imperial
Neptune is a data processing framework; and developers today... (click)
Want to use such frameworks to develop complex ML applications
This figure depicts such a production application implementing a real-time malicious behavior detection service
(c) consisting of a training job and an inference job
The training job using historical data to train a machine learning model
and the inference job performing latency-sensitive inference to detect malicious behavior
(click) using the previously trained model which is shared in memory
This type of unified application design allows such jobs to easily share application state, share application logic, keep results consistent, and even share computation
To support such applications, analytics frameworks such as.. that were traditionally dedicated to either batch or stream processing (click)
evolved to unify different use cases by exposing different APIs (DStreams and RDD API) but using these APIs jobs were still deployed as separate applications (no true unification)
(click) Only recently have frameworks started exposing unified programming interfaces to seamlessly express different types of jobs as part of the same application
Spark Structured streaming and Flink Table API provide such programming interfaces enabling users to develop unified applications
Such unified apps are known by different names, such as hybrid or continuous applications; for the rest of the presentation I will refer to them as stream/batch
However, despite the unified app support from the API point of view, very little is done on the scheduling side to capture and satisfy the diverse job requirements of those applications.
(click) this is a new challenge: how to schedule unified application jobs to satisfy their diverse requirements
To schedule such an application in a data processing framework today, the application code is submitted to a centralized driver using a submit command; the driver then uses the same context to run application jobs through the DAG scheduler, which splits each job into stages
In our example two stages each. Dashed rectangles represent those stages
Every stage consists of tasks running computation, taking time T – some training tasks are more computationally intensive and take longer (3T)
(c) The number of tasks of each stage is defined by the number of partitions it operates on: namely, stages 1 and 2 of the inference job operate on 2 partitions while stage 1 of the training job operates on 4.
For execution..
(c) A typical approach to satisfy requirements of the application is static resource allocation, where we dedicate a portion of resources to each job for example 3/4..
(c) we then schedule every stage of each job using the available resources
(c) until completion (c) (c)
In this scenario: even though stream job latency is low, it is still 2x higher than the optimal
(c) And more importantly we end up with a bunch of wasted resources as in static allocation...
(c) resources cannot be shared across jobs, so they remain unused
A more resource efficient solution would be to use a unified framework such as Spark SS that enables sharing executors across jobs
By default jobs are scheduled in a FIFO fashion where the first submitted job runs to completion
(c) Assuming it is the training is submitted first it will use all resources
(c) until completion, and then the inference stages will be scheduled (c)
(c) However FIFO causes significant queuing delays, especially when the jobs scheduled before the latency-sensitive ones are long
stream job has worse latency than static allocation
An alternative is the FAIR policy, which shares resources equally across jobs (possibly weighted)
(c) FAIR first schedules two tasks from each job stage
(c) and then the next equal share until completion (c)
Even though FAIR packs resources better and reduces the stream job’s response time by
(c) 2T compared to FIFO, it cannot guarantee optimal latency for the stream job as it cannot avoid queuing
(c) so FAIR achieves better packing..
To avoid queuing delays for stream tasks completely, it is possible to use non-work-preserving preemption coupled with a strategy such as FAIR to kill batch tasks when needed
(c) In that scenario we would schedule tasks in a FAIR manner (c)
(c) And when stream tasks NEED to run we would preempt already running batch tasks
(c) Then we would need to restart all killed tasks losing any progress (c)
(c) KILL can achieve better latency BUT at the expense..
Given that batch and stream jobs share the same runtime, can we do better?
What we want is to do instead is
(c) to suspend low-priority batch tasks in favor of higher-priority stream tasks and
(c) resume them when free resources are available
There are a few challenges we need to solve – such as:
Neptune is our new execution framework tackling these challenges, with support for:
Suspendable tasks
Coroutines can suspend latency-tolerant tasks within milliseconds and resume them later without losing progress
(c) Providing a unified execution framework
Users express such applications using existing programming interfaces
(c) Introducing pluggable scheduling policies
To satisfy different requirements of stream/batch applications
Tasks in analytics frameworks such as Spark today are implemented as subroutines – applying a function on a partition of data given as an iterator
(c) To run the task the executor allocates space for the return value of the function in the executor call stack
(c) Adds a function reference
(c) Allocates space for the variables and intermediate state
(c) At the end of the execution the return value is written to the appropriate slot and we return to the executor callsite
However to preempt any of those tasks we either have to kill them, losing progress, or checkpoint the intermediate state with unpredictable latency
Instead in Neptune we implement tasks using lightweight coroutines
(c) Coroutines use a separate stack to store their variables and add yield points after the processing of each record
(c) To run the task the executor now invokes a special call function which creates the coroutine stack instance to store the function reference, variables, and state
(c) To preempt: the executor marks the task context as paused and the task then yields in the next record iteration
the executor can later resume the same task instance or invoke another function
Using coroutines in Neptune provides a transparent..
They support function composition, which is fundamental for more complex logic
The problem now is that we don't just assign tasks to executors but also decide..
Neptune supports pluggable policies deciding which tasks get suspended and when!
Lets see how a application is executed in Neptune:
(c) Users develop an application and mark jobs with priorities as low or high
(c) The app goes through an optimization phase creating a plan in the form of a DAG
(c) The DAG Scheduler computes the execution plan of each job of the application
(c) Tasks ready-to-execute are then passed on to the Task Scheduler
(c) TaskSched immediately executes a task if there are enough free resources
(c) Otherwise, the Policy can decide to suspend running batch tasks to free up resources for stream tasks to start their execution immediately
(c) The oldest suspended task is scheduled for execution when others terminate, to avoid losing progress
Of course there are system challenges: task suspension should be fast and smart, and the scheduling policy can take into account multiple task and cluster properties
For the evaluation, we deployed Neptune in a 75 node azure cluster
For workloads we used…
(a latency-sensitive and a latency-tolerant instance)
First I want to show you how Neptune compares with existing approaches in terms of latency: unified LDA app with a latency-sensitive job and a latency-tolerant job
(c) We first run the stream job in total isolation to get the ideal case for stream latency
(c) We then run LDA with FIFO and FAIR as implemented in Spark. We observe that FAIR reduces the 99th percentile latency by 37% compared to FIFO, but the median is still 2× higher than running in Isolation
(c) Adding preemption to FAIR improves median by 54% compared to FAIR, but its 99th percentile latency is 2x higher (cannot preempt more than a fair share of resources)
(c) We then run LDA with Neptune and two different policies
the NEP-LMA achieves a latency that is just 13% higher than Isolation for the 99th
NEPTUNE without cache awareness (NEP-LB) achieves 61% worse latency for the 99th percentile compared to NEP-LMA
(c) Finally, running the jobs on separate executors gives a tighter distribution but high tail latencies due to executor interference
(c) NEPTUNE achieves latencies comparable to the ideal for the latency-sensitive jobs
Since some applications require more resources than others, we also measured the impact of increasing resources, expressed as cores, on Neptune's performance
We run two instances of YSB..
(c) NEPTUNE maintains low latency across all percentages even though the tasks that must be suspended increase, showing the effectiveness of our preemption mechanism
(c) At the same time, as the number of preempted batch tasks increases, batch throughput takes a hit
In this case however, sharing 100% of the cluster resources only drops throughput by 1.5%
(c) As a result we achieve efficient sharing of resources with low impact on throughput
(c) This figure is also like a glimpse from past to future, with 0% cores being the static allocation approach and 100% being the share-everything approach, which is the future!
NEPTUNE an execution framework that supports…
Implements suspendable tasks that can pause and resume in milliseconds
Supports smart scheduling policies deciding how to suspend
To measure the effectiveness of the suspension mechanism we use TPC-H:
We run the TPC-H benchmark on a cluster of 4 Azure machines and measure the task runtime distribution for each query (in grey).
Queries follow different task runtime distributions ranging from 100s of milliseconds to 10s of seconds,
we re-run the benchmark and continuously transition tasks from PAUSED to RESUMED state until completion while measuring the latency for each transition.
By continuously transitioning between states and triggering yield points, we want to measure the worst case scenario in terms of transition latency for each query tasks.
(c) Although queries follow different task runtime distributions Neptune manages to pause and resume tasks with sub-millisecond latencies
An exception is Q14 for which the 75th percentile for the pause latency is 100 ms – data reside in a single partition and the Parquet reader operates on a partition granularity
TPC-H query 1
ThreadSync relies on the preemption of the OS scheduler and is implemented using thread wait and notify calls
Alternate from PAUSED to RESUMED states while increasing parallelism on the X-axis
As the parallelism increases, ThreadSync latency increases 2.6x at 14 threads, up to 600 ms at 64 threads
OS scheduler must arbitrate between wait and run queues continuously – we bypass OS scheduler
Now let me show you how to use Neptune to run a unified application
I am going to run two jobs, one with low and one with high priority
Goal: Compare Default Spark with Neptune and show effect on latency
To give you an idea of how suspendable tasks are implemented in Neptune
The code snippet on the left shows the implementation of the collect action that receives the taskContext and a record iterator as arguments and returns all dataset elements
(c) The code snippet on the right shows the same logic implemented with coroutines
When the task context is marked as paused it yields a value to the executor
the executor can then resume the same task instance or invoke another task