This was my dissertation project for my Software Engineering degree.
The JVM CPU Cluster Balancer is a scalable proof-of-concept system designed to distribute processes over a network so that multiple tasks run at once, written in a high-abstraction language. Once work is distributed, workers return results to an access server while monitoring their respective CPUs for computational stress in terms of CPU usage. CPUs that exceed a set stress threshold have their processes moved to a less heavily loaded part of the cluster, balancing the work overall.
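The dissertation's own code is not reproduced here, but the core balancing decision described above can be sketched in a few lines of Java. This is a hypothetical illustration: the class name, the 80% threshold, and the migration rule are illustrative assumptions, not taken from the project itself.

```java
// Hypothetical sketch of the balancing decision: given recent CPU-load
// samples (0.0-1.0) from each worker, a stressed worker sheds a process
// to the least-loaded node. Names and threshold are illustrative.
public class ClusterBalancer {
    static final double STRESS_THRESHOLD = 0.80; // assumed 80% CPU usage cut-off

    /** Index of the least-loaded worker: the natural migration target. */
    public static int pickLeastLoaded(double[] cpuLoads) {
        int best = 0;
        for (int i = 1; i < cpuLoads.length; i++) {
            if (cpuLoads[i] < cpuLoads[best]) best = i;
        }
        return best;
    }

    /** Should the worker at this index shed a process? */
    public static boolean isStressed(double[] cpuLoads, int worker) {
        return cpuLoads[worker] >= STRESS_THRESHOLD;
    }

    public static void main(String[] args) {
        double[] loads = {0.95, 0.20, 0.55};
        if (isStressed(loads, 0)) {
            System.out.println("migrate a process from worker 0 to worker "
                    + pickLeastLoaded(loads));
        }
    }
}
```

In a real cluster the loads would come from each worker's monitoring agent rather than a local array, and migration would involve serializing and re-dispatching the process.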
A WebLogic Server cluster consists of multiple WebLogic Server instances running simultaneously and working together to provide increased scalability and reliability. A cluster appears to clients to be a single WebLogic Server instance. Server instances in a cluster can run on the same machine or different machines. Clusters provide high availability through application failover and scalability by adding additional server instances. Key elements of a cluster include load balancing of requests across server instances and replication of HTTP session and EJB states.
JavaOne 2016
JMS is pretty simple, right? Once you’ve mastered topics and queues, the rest can appear trivial, but that isn’t the case. The queuing system, whether ActiveMQ, OpenMQ, or WebLogic JMS, provides many more features and settings than appear in the Java EE documentation. This session looks at some of the important extended features and configuration settings. What would you need to optimize if your messages are large or you need to minimize prefetching? What is the best way to implement time-delayed messages? The presentation also looks at dangerous bugs that can be introduced via simple misconfigurations with pooled beans. The JMS APIs are deceptively simple, but getting an implementation into production and tuned correctly can be a bit trickier.
This document provides an introduction to Apache ActiveMQ. It discusses how ActiveMQ is a Java Message Service (JMS) and message-oriented middleware that provides asynchronous messaging. It supports cross-language clients and features high performance clustering of brokers for scalability and master-slave configurations for persistence and reliability. The document uses an example of sending a "Hello World" message to demonstrate basic usage of ActiveMQ.
The document discusses how WebLogic Server uses a single common thread pool that prioritizes and schedules work. The thread pool size adjusts automatically based on historical throughput statistics to maximize performance while reducing complexity compared to custom thread pools. Work is prioritized according to user-defined rules and run-time metrics like request processing time and rates.
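The prioritization idea behind such a shared pool can be illustrated with a small sketch. This is not WebLogic's actual implementation; it is a minimal, self-contained example assuming a single priority queue handing the highest-priority work out first, with hypothetical task names.

```java
// Illustrative sketch (not WebLogic's implementation) of prioritized work
// scheduling: tasks carry a priority and a single shared queue hands the
// highest-priority work out first.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

public class PrioritizedWorkQueue {
    // Higher priority compares as "smaller" so the queue yields it first.
    record Task(int priority, String name) implements Comparable<Task> {
        public int compareTo(Task other) {
            return Integer.compare(other.priority, this.priority);
        }
    }

    private final PriorityBlockingQueue<Task> queue = new PriorityBlockingQueue<>();

    public void submit(int priority, String name) {
        queue.add(new Task(priority, name));
    }

    /** Drain tasks in priority order (a real pool does this per worker thread). */
    public List<String> drain() {
        List<String> order = new ArrayList<>();
        Task t;
        while ((t = queue.poll()) != null) order.add(t.name());
        return order;
    }

    public static void main(String[] args) {
        PrioritizedWorkQueue q = new PrioritizedWorkQueue();
        q.submit(1, "batch-report");      // low priority
        q.submit(10, "checkout-request"); // high priority
        q.submit(5, "status-ping");
        System.out.println(q.drain()); // highest priority first
    }
}
```

WebLogic additionally resizes the pool from throughput history; this sketch shows only the prioritization half of that design.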
This document provides information about servlets and the servlet API. It defines a servlet as a Java program that runs on a web server and responds to client requests. It discusses how servlets use interfaces like Servlet, ServletConfig, and ServletContext to access configuration and context information. It also describes how HTTPServlet handles HTTP requests and responses using methods like doGet and doPost. The document explains the servlet lifecycle of initialization, processing requests via service(), and destruction. It provides examples of using the HttpRequest and HttpResponse interfaces to read request data and send responses.
Phil Basford - Machine learning at scale with AWS SageMaker (AWSCOMSUM)
The document discusses a machine learning endpoint architecture experiment conducted using Amazon SageMaker. Key aspects covered include:
- The reference architecture used Amazon SageMaker endpoints running Docker containers with inference engines like XGBoost and TensorFlow.
- An experiment tested endpoint scaling and performance under load using Artillery. It found endpoints automatically scaled to two instances and each could handle high request volumes, but starting a new instance took 7 minutes.
- Analysis of CloudWatch logs determined that instances handled load evenly and autoscaled as needed when an instance terminated.
The document discusses servlet fundamentals and the three-tier model for web applications. It describes the three tiers as the client side (web browser), server side (web server/application server), and database (DBMS) tiers. It explains how servlets allow separating the business logic from the user interface, and how they provide dynamic web content through the Java programming language. Common Gateway Interface (CGI) and its drawbacks are also summarized.
The document discusses reactive programming and how it can help solve issues with asynchronous programming. Reactive programming uses observable data streams and asynchronous operations to make asynchronous code more readable and maintainable. It introduces reactive programming concepts like observables, operators, and schedulers. Examples are given of how reactive extensions like RxJS can simplify asynchronous tasks like autocomplete and handling click events. The benefits of building reactive systems and microservices using a reactive approach are also covered.
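The talk's examples use RxJS, but the observable-stream idea it describes translates directly to the JVM. The sketch below is an assumption-laden illustration using the JDK's own Flow API (not RxJS): a publisher pushes events, a subscriber transforms them, and completion is signalled asynchronously.

```java
// A JVM-flavored sketch of an observable data stream using the JDK Flow API
// (the talk itself uses RxJS). Events are pushed asynchronously; the
// subscriber transforms each one and completion closes the stream.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

public class ReactiveSketch {
    /** Push items through a publisher and collect the transformed results. */
    public static List<String> collect(String... items) {
        List<String> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
                public void onNext(String item) { received.add(item.toUpperCase()); }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete() { done.countDown(); }
            });
            for (String item : items) publisher.submit(item);
        } // close() triggers onComplete on the subscriber
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(collect("click", "scroll"));
    }
}
```

The same shape scales to the autocomplete example from the talk: keystrokes become the published items and the subscriber debounces and queries as they arrive.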
This document provides information about Java Database Connectivity (JDBC) and how to connect Java applications to databases. It discusses the four types of JDBC drivers, the interfaces in the JDBC API including DriverManager, Connection, Statement, and ResultSet. It also provides examples of registering drivers, establishing a database connection, executing queries, and closing the connection in five steps.
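The five steps that summary lists can be sketched concretely. The JDBC URL, table, and credentials below are placeholders, not a real database; note also that since JDBC 4.0 the driver is registered automatically via the service loader, so step one needs no explicit `Class.forName` call.

```java
// A minimal sketch of the five JDBC steps: (1) register the driver
// (automatic since JDBC 4.0), (2) open a connection, (3) create a
// statement, (4) execute a query, (5) close everything. The URL, table,
// and credentials are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JdbcFiveSteps {
    /** Helper for step 2: build a JDBC URL (the scheme is vendor-specific). */
    public static String jdbcUrl(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        String url = jdbcUrl("localhost", 3306, "testdb"); // placeholder DB
        // Steps 2-4 inside try-with-resources, which also performs step 5:
        // connection, statement, and result set are closed automatically.
        try (Connection con = DriverManager.getConnection(url, "user", "pass");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT name FROM employees")) {
            while (rs.next()) {
                System.out.println(rs.getString("name"));
            }
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```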
A practical look at the different strategies to deploy an application to Kubernetes. We list the pros and cons of each strategy and define which one to adopt depending on real world examples and use cases.
IBM IMPACT 2014 AMC-1866 Introduction to IBM Messaging Capabilities (Peter Broadhurst)
IBM Messaging provides market-leading capabilities for anywhere-to-anywhere integration across mobile, cloud, and enterprise platforms - from the simplest pair of applications requiring basic connectivity and data exchange, to the most complex business process management environments. Come to this session to understand the value and rationale of message/queuing and the IBM Messaging family of products; its key features and functions; and how it can be used to build a secure, flexible, and scalable messaging backbone for a business.
IBM IMPACT 2014 - AMC-1883 - Where's My Message - Analyze IBM WebSphere MQ Re... (Peter Broadhurst)
Every MQ infrastructure team member has been asked the question, and most developers who have worked with MQ have asked it: "Where is my message?". In this session we look into the tools that MQ provides to find your messages. We demonstrate how to analyze the MQ recovery log on distributed platforms to find out what happened to your persistent messages, with the assistance of a new tool. We also look at how to trace the route messages take through your MQ infrastructure, and how to generate and analyze activity reports showing the behavior of MQ applications.
Apache ActiveMQ - Enterprise messaging in action (dejanb)
This document provides an overview of Apache ActiveMQ, an open source messaging platform. It discusses key ActiveMQ concepts like topics, queues, and messaging protocols. It also covers ActiveMQ enterprise features such as high availability, clustering, security, and monitoring. The document concludes by discussing ActiveMQ performance tuning, scaling, and future plans.
Service Discovery in Distributed System with DCOS & Kubernetes - Sahil Sawhney (Knoldus Inc.)
This PDF walks you through how service discovery can facilitate inter-service communication in a distributed environment. It targets how service discovery is achieved in Kubernetes and DC/OS, two leading distributed infrastructure platforms.
IBM Managing Workload Scalability with MQ Clusters (IBM Systems UKI)
This document discusses various clustering scenarios for WebSphere MQ, beginning with a simple initial setup and expanding in complexity. It addresses scenarios like workload balancing, high availability during failures, and location dependencies when applications and services are distributed across data centers separated by large distances. Key points covered include using queue aliases, cluster workload priorities, and the AMQSCLM monitoring tool to help direct messages to available instances of services and ensure responses can be routed properly even if client or queue manager failures occur.
Continuous deployment of polyglot microservices: A practical approach (Juan Larriba)
This document discusses a practical approach to continuous deployment of polyglot microservices. It introduces the author and describes how traditional companies are adopting DevOps practices. The approach focuses on being continuous, using multiple programming languages as needed, immutable infrastructure with containers, reliability through functional testing, automated deployments, and practical architecture. Kubernetes and OpenShift are discussed as platform options. Lessons learned include that Kubernetes alone often fits needs better than OpenShift, and external service discovery can replace ingress controllers when using an external router.
The document discusses clustering unikernel microservices in networking. It proposes abstracting the internal networking of unikernels working closely together so that multiple unikernel instances appear as a single host in the network layer. This is achieved by using tools like Open vSwitch and SDN controllers to define network flows that connect unikernel instances within groups and to the outside network in a secure manner. A Python script and JSON configuration file are used to automate deployment of the unikernel clusters and their network configuration.
IBM IMPACT 2014 - AMC-1882 Building a Scalable & Continuously Available IBM M... (Peter Broadhurst)
This document provides an overview of designing a scalable and highly available IBM MQ infrastructure. Key points include:
- Using a client/server architecture with MQ deployed separately from applications provides flexibility and allows MQ to be treated as critical infrastructure similar to a database.
- Each sender should connect to two queue managers and each receiver should have two listeners concurrently attached to provide redundancy and no single point of failure.
- Other topics covered include synchronous request/response, publish/subscribe messaging, limitations for ordered messages, and integrating with IBM Integration Bus.
The document emphasizes an active/active design philosophy with minimum two queue managers and discusses workload management strategies for sending and receiving messages across multiple queue managers.
IBM MQ - High Availability and Disaster Recovery (MarkTaylorIBM)
IBM MQ provides capabilities to keep data safe and businesses running in the event of failures. This includes solutions for high availability (HA) and disaster recovery (DR) whether running on-premises or in hybrid cloud environments. HA aims to keep systems running through failures while DR focuses on recovering after an HA failure. Key HA technologies in IBM MQ include queue manager clusters, queue sharing groups, multi-instance queue managers, and HA clusters. These solutions provide redundancy to prevent single points of failure and enable fast failover. DR requires replicating data to separate sites which IBM MQ supports through various backup and replication features.
The document provides an overview of containers and Kubernetes. It discusses the need for containers due to microservices and infrastructure as code. It then covers technical details of containers like Dockerfiles, images, and registries. It also discusses Kubernetes and its components like kube-apiserver, etcd, and kubelet. Finally, it covers Kubernetes concepts like pods, services, deployments, and how they are configured.
The mysqlnd replication and load balancing plugin (Ulf Wendel)
The mysqlnd replication and load balancing plugin makes using MySQL Replication from PHP much easier. The plugin takes care of read/write splitting, load balancing, failover, and connection pooling. Lazy connections, a feature useful beyond replication, help reduce MySQL server load. Like any other mysqlnd plugin, it operates mostly transparently from an application's point of view and can be used in a drop-in style.
Grokking TechTalk #16: React stack at Lozi (Grokking VN)
Speaker: Thinh Nguyen - Web Developer @ Lozi.vn
Bio: I don't know what I've done, but people keep blaming me when they can't access the website ¯\_(ツ)_/¯
Description: How we build Lozi website with React and a team of 2
This document discusses parameters for tuning the performance of WebLogic servers. It covers OS-level TCP parameters, JVM heap size and GC logging parameters, WebLogic server-level parameters like work managers, execute queues, and stuck threads, and JDBC and JMS pool parameters. It also provides an overview of different types of garbage collection in the HotSpot JVM.
This document provides an overview of server-side web programming and different technologies used to create dynamic web pages, including Common Gateway Interface (CGI), servlets, and JavaServer Pages (JSP). CGI allows building dynamic web sites by running programs on the server that can generate HTML responses. Servlets provide a Java-based alternative to CGI with improved performance, portability, and security. Servlets use a request-response model and are executed by a servlet container. JSP is a technology that simplifies web page programming by mixing static elements like HTML with scripting code.
This document summarizes a presentation about Alpakka, a Reactive Enterprise Integration library for Java and Scala based on Reactive Streams and Akka Streams. Alpakka provides connectors to various data sources and messaging systems that allow them to be accessed and processed using Akka Streams. Examples of connectors discussed include Kafka, MQTT, JMS, Elasticsearch and various cloud platforms. The document also provides an overview of Akka Streams and how they allow building responsive, asynchronous and resilient data processing pipelines.
Container Orchestration with Docker Swarm and Kubernetes (Will Hall)
This presentation covers the basics of what container orchestration is, providing the pros and cons of Docker Swarm, Kubernetes, and Amazon ECS, and outlining the terms and tools you will need to use them successfully.
A Survey of Performance Comparison between Virtual Machines and Containers (Prashant Desai)
Since the onset of cloud computing and its inroads into infrastructure as a service, virtualization has become of peak importance in the field of abstraction and resource management. However, the additional layers of abstraction that virtualization provides come at a trade-off between performance and cost in a cloud environment where everything is on a pay-per-use basis. Containers, which are perceived to be the future of virtualization, were developed to address these issues. This paper scrutinizes the performance of a conventional virtual machine and contrasts it with containers. We cover the critical assessment of each parameter and its behavior when subjected to various stress tests. We discuss the implementations and their performance metrics to help draw conclusions on which one is ideal for the desired needs. After assessing the results and discussing the limitations, we conclude with prospects for future research.
Power-Efficient Programming Using Qualcomm Multicore Asynchronous Runtime Env... (Qualcomm Developer Network)
The need to support both today’s multicore performance and tomorrow’s heterogeneous computing has become increasingly important. Qualcomm® Multicore Asynchronous Runtime Environment (MARE) provides powerful and easy-to-use abstractions to write parallel software. This session will provide a deep dive into the concepts of power-efficient programming and how to use Qualcomm MARE APIs to get energy and thermal benefits for Android apps. Qualcomm Multicore Asynchronous Runtime Environment is a product of Qualcomm Technologies, Inc.
Learn more about Qualcomm Multicore Asynchronous Runtime Environment: https://developer.qualcomm.com/MARE
Watch this presentation on YouTube:
https://www.youtube.com/watch?v=RI8yXhBb8Hg
The document proposes an architecture for a mashup container to execute mashups in virtualized environments according to the Platform as a Service (PaaS) model. It describes a monolithic solution where a single virtual machine runs a complete service execution platform (SEP) to execute mashup sessions with negligible performance degradation. Fault tolerance is provided by replicating the SEP virtual machine using the hypervisor-based fault tolerance mechanism of virtualized environments. Testing showed this approach enabled zero-downtime failover with only a small increase in latency and resource usage.
Kubecon 2023 EU - KServe - The State and Future of Cloud-Native Model ServingTheofilos Papapanagiotou
KServe is a cloud-native open source project for serving production ML models built on CNCF projects like Knative and Istio. In this talk, we’ll update you on KServe’s progress towards 1.0, the latest developments, such as ModelMesh and InferenceGraph, and its future roadmap. We’ll discuss the Kubernetes design patterns used in KServe to achieve the core ML inference capability, as well as the design philosophy behind KServe and how it integrates the CNCF ecosystem so you can walk up and down the stack to use features to meet your production model deployment requirements. The well-designed InferenceService interface encapsulates the complexity of networking, lifecycle, server configurations and allows you to easily add serverless capabilities to model servers like TensorFlow Serving, TorchServe, and Triton on CPU/GPU. You can also turn on full service mesh mode to secure your InferenceServices. We’ll walk through different scenarios to show how you can quickly start with KServe and evolve to a production-ready setup with scalability, security, observability, and auto-scaling acceleration using CNCF projects like Knative, Istio, SPIFFE/SPIRE, OpenTelemetry, and Fluid.
A Distributed Control Law for Load Balancing in Content Delivery NetworksSruthi Kamal
1. The document presents a novel load balancing algorithm for content delivery networks that aims to minimize load imbalance and metric movement costs.
2. It proposes estimating system state through probability distributions of node capacities and load to help peers schedule transfers without centralized control.
3. Each peer independently manipulates partial system information and reassigns virtual servers based on the approximated system state.
Efficient Resource Allocation to Virtual Machine in Cloud Computing Using an ...ijceronline
The focus of the paper is to generate an advance algorithm of resource allocation and load balancing that can deduced and avoid the dead lock while allocating the processes to virtual machine. In VM while processes are allocate they executes in queue , the first process get resources , other remains in waiting state .As rest of VM remains idle . To utilize the resources, we have analyze the algorithm with the help of First-Come, First-Served (FCFS) Scheduling, Shortest-Job-First (SJR) Scheduling, Priority Scheduling, Round Robin (RR) and CloudSIM Simulator.
Cloud computing is the set of distributed computing nodes. It is the use of computing resources that are delivered as a service over a network. Virtualization plays a crucial role in cloud computing. Typically VMs are offered in different types, each type have its own characteristics which includes number of CPU cores, amount of main memory, etc. and cost. Presently, static algorithms are being used for scheduling VM instances in cloud. Instead of these, an algorithm is proposed here which dynamically detects the load and then schedules the tasks. The main purpose of the proposed scheduling strategy is to find the minimally loaded computational node. Upon receiving task requests from the clients, server has to schedule these to a minimally loaded node among all available computing nodes.
This document provides an agenda and overview for an event-driven architecture workshop. The workshop will cover topics including event-driven architecture, change data capture, cloud-native integration, how cloud architectures have evolved, modern application elements like APIs, events and data, container-based application development, and integration patterns. Hands-on labs will allow attendees to work with technologies like Apache Kafka and Red Hat AMQ Streams.
Cloud vendor lock-in is one of
the major problems in cloud computing where the
customer is locked to a particular vendor so that it
will be difficult to migrate from one cloud to the
other. The problem is that once an app has been
developed based on a particular cloud service
provider’s API that apps is bound to that provider
as a result of which migration from one cloud to
the other becomes more complex because of
changes in architectures of different cloud service
vendors[2]. The problem can be solved by
providing a standardized way of interacting with
cloud service providers taking many factors into
consideration and by isolating each individual
module involved in the cloud service provider’s
API and bringing out the common things and
uniting them together so that in future any CSP
will have to obey that specific standards and build
their APIs without the need of creating a new
standard that makes migration from/to that CSP
complex.
RESEARCH ON DISTRIBUTED SOFTWARE TESTING PLATFORM BASED ON CLOUD RESOURCEijcses
In order to solve the low efficiency problem of large-scale distributed software testing , CBDSTP(
Cloud-Based Distributed Software Testing Platform) is put forward.This platform can provide continous
integration and automation of testing for large software systems, which can make full use of resources on
the cloud clients, achieving testing result s in the real environment and reasonable allocating testing jobs,
to resolve the Web application software configuration test, compatibility test and distributed test problems,
to reduce costs, improve efficiency. Through making MySQL testing on this prototype system, the
verification is made for platform architecture and job allocation effectiveness.
Genetic Algorithm for task scheduling in Cloud Computing EnvironmentSwapnil Shahade
This document proposes a modified genetic algorithm to schedule tasks in cloud computing environments. It begins with an introduction and background on cloud computing and task scheduling. It then describes the standard genetic algorithm approach and introduces the modified genetic algorithm which uses Longest Cloudlet to Fastest Processor and Smallest Cloudlet to Fastest Processor scheduling algorithms to generate the initial population. The implementation and results show that the modified genetic algorithm reduces makespan and cost compared to the standard genetic algorithm.
The document discusses load balancing algorithms for cluster computing environments. It proposes a fully centralized and partially distributed algorithm (FCPDA) that dynamically maps jobs to communicators (groups of processors) to improve response time and performance. The algorithm allows a communicator to take on additional jobs if it completes its initial job early. This approach aims to better balance the workload compared to other algorithms and reduce overall job completion time.
This summary provides the key details about a proposed load balancing algorithm in 3 sentences:
The document proposes a fully centralized and partially distributed load balancing algorithm that dynamically distributes tasks from a master processor to slave processors organized into communicators. The master processor monitors the workload and response time of each communicator to dynamically map additional tasks as communicators complete their work, improving resource utilization and response time. The algorithm forms a matrix to track the workload and response time of each communicator for different task types to aid the master processor in optimally balancing the load over time.
Building A Linux Cluster Using Raspberry PI #1!A Jorge Garcia
This document summarizes a project to build symmetric and asymmetric multiprocessing platforms using Raspberry Pi clusters. Sobel edge detection was used as a target application to analyze performance. Symmetric multiprocessing was achieved using a Raspberry Pi 2's four cores, while an asymmetric cluster was built by connecting four Raspberry Pis over Ethernet. Both SMP and ASMP programs achieved 4-10x better performance than sequential programs. The project explored design choices and parallelization challenges. A hybrid openMP and MPI program achieved better performance than a sequential program on high-end servers, showing low-cost parallel platforms can outperform expensive sequential ones.
The document describes a proposed grid computing framework that aims to make grid computing easier to deploy, use, and maintain. The framework would accept computational problems from users, distribute tasks to client machines based on dependencies and load balancing, collect and compile results from clients, and present outputs to the user. The framework is intended to address concerns with existing grid middleware being complicated and not accessible to all, and will be open source, Linux-based, and work on a moderately sized local area network.
Chat application through client server management system project.pdfKamal Acharya
This project focused on creating a chatting application with communication environment. The objective of our project is to build a chatting system to facilitate the communication between two or more clients to obtain an effective channel among the clients themselves. For the application itself, this system can serve as a link to reach out for all clients. The design of the system depends on socket concept where is a software endpoint that establishes bidirectional communication between a server program and one or more client programs. Languages that will be used for the development of this system: Java Development Kit (JDK): is a development environment for building applications and components using the Java programming language.
Node.js is an open-source JavaScript runtime environment that allows building scalable server-side and networking applications. It uses asynchronous, event-driven, non-blocking I/O which makes it lightweight and efficient for data-intensive real-time applications that run across distributed devices. Some key features of Node.js include excellent support for building RESTful web services, real-time web applications, IoT applications and scaling to many users. It uses Google's V8 JavaScript engine to execute code outside of a browser.
Three types of service model: SAAS, PAAS, IAAS
Four types of deployment model: Public, Private, Hybrid And community Cloud.
During the load balancing process, few issues are yet to be fully addressed. Couple of them are:
Some of the nodes are overutilized or some of the nodes are underutilized
Improper workload in Cloud environment results into overhead in resource utilization and in turn inefficient usage of energy
response time of jobs
communication cost of servers
maintain cost of VMs,
throughput and overload of any single node.
By addressing the concern of load balancing, we aim to address multiple facets of Cloud viz. (a) resource utilization (b) CPU time (c) Migration time.
Problem statement
Problem raised while dealing with load balancing
How to minimize the CPU time
How to increase the resource utilization &
How to decrease the energy consumption and Migration time etc.
Node.js is an event-driven, asynchronous JavaScript runtime that allows JavaScript to be used for server-side scripting. It uses an event loop model that maps events to callbacks to handle concurrent connections without blocking. This allows Node.js applications to scale to many users. Modules in Node.js follow the CommonJS standard and can export functions and objects to be used by other modules. The event emitter pattern is commonly used to handle asynchronous events. Node.js is well-suited for real-time applications with intensive I/O operations but may not be the best choice for CPU-intensive or enterprise applications.
The document discusses various models for distributed systems including architectural, interaction, failure, software layer, and process models. It describes key aspects of distributed systems like client-server, peer-to-peer, and mobile code architectures. Requirements for distributed designs like performance, quality of service, caching, and dependability are covered. Fundamental models for message passing between processes are defined including states, configurations, computation and delivery events.
An investigation into Cluster CPU load balancing in the JVM
1. An Investigation into Cluster CPU load balancing
in the JVM
Calum James Beck
Submitted in partial fulfilment of
the requirements of Edinburgh Napier University
for the Degree of Bachelor of Engineering with
Honours in Software Engineering
School of Computing
3. Abstract
The JVM CPU Cluster Balancer is a scalable, proof-of-concept system designed
to distribute processes over a network to perform multiple tasks at once, in a
language of high abstraction. Once distributed, workers return results to an
access server, all while monitoring their respective CPUs for computational stress
in terms of CPU usage. CPUs incurring a set level of stress then have their
processes moved to a less heavily loaded area of the cluster, balancing work overall.
The system works by enrolling Universal Clients (CPUs waiting for work) with an
access server, which then requests processes to be sent from the user's desired
Process Server. Each process arrives in the form of a Process Definition
complying with the Agent interface, self-contained in a single object. During run time,
the Process Definition object acts as a subtype of the process manager,
assuming responsibility for saving and restoring the state of the process.
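To make the Agent idea concrete, here is a minimal, hypothetical Java sketch of such a self-contained process definition: a serializable object that can stop its own runnable logic so that it can be moved while its state travels with it. The interface name and methods (step, suspend) are illustrative only, not the project's actual API.

```java
import java.io.Serializable;

public class AgentSketch {

    // Hypothetical contract for a movable process definition: it is
    // serializable (so the whole object can cross the network) and can
    // suspend its own runnable logic on request.
    interface Agent extends Serializable {
        void step();            // run one unit of the process's work
        void suspend();         // stop runnable logic so the object can move
        boolean isSuspended();
    }

    // A toy process definition: counts up until suspended. The counter is
    // the "state" the Process Definition is responsible for preserving.
    static class CountingAgent implements Agent {
        private static final long serialVersionUID = 1L;
        int count = 0;
        private boolean suspended = false;

        public void step() { if (!suspended) count++; }
        public void suspend() { suspended = true; }
        public boolean isSuspended() { return suspended; }
    }

    public static void main(String[] args) {
        CountingAgent a = new CountingAgent();
        a.step(); a.step(); a.step();
        a.suspend();
        a.step();                    // ignored once suspended
        System.out.println(a.count); // prints 3
    }
}
```

Because the object, not the thread, owns the state, pausing is just a flag flip; the JVM thread itself is never serialized (see Section 6.3).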
Each Client has four Process Nodes to which it can delegate work. The selected
Process Node then connects to the received process using two internal channels
and runs it under an instance of a Process Manager. During runtime, the Client also
runs a Node Monitor which tracks the CPU usage of the Client in real
time. When a set percentage of stress (CPU usage) is met, the Universal Client
informs the server that an alternative node, on a different machine, is needed
to finish the instance of work.
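The Node Monitor's trigger logic can be sketched as below. In the real system the usage figure comes from the Sigar API; here the reading is an injected DoubleSupplier so the threshold behaviour can be shown (and tested) in isolation. Class and method names are illustrative, not the project's own.

```java
import java.util.function.DoubleSupplier;

// Minimal sketch of the Node Monitor idea: poll a CPU-usage source and
// signal when a configured stress threshold is crossed.
public class NodeMonitorSketch {

    private final DoubleSupplier cpuUsage; // returns usage in [0.0, 1.0]
    private final double threshold;

    public NodeMonitorSketch(DoubleSupplier cpuUsage, double threshold) {
        this.cpuUsage = cpuUsage;
        this.threshold = threshold;
    }

    /** One polling step: true means "ask the server for an alternative node". */
    public boolean overStressed() {
        return cpuUsage.getAsDouble() >= threshold;
    }

    public static void main(String[] args) {
        // Fake readings standing in for successive real samples.
        double[] samples = {0.20, 0.45, 0.92};
        int[] i = {0};
        NodeMonitorSketch monitor =
            new NodeMonitorSketch(() -> samples[i[0]++], 0.85);

        System.out.println(monitor.overStressed()); // false (0.20)
        System.out.println(monitor.overStressed()); // false (0.45)
        System.out.println(monitor.overStressed()); // true  (0.92)
    }
}
```

Injecting the usage source also sidesteps a practical problem discussed later: raw CPU readings are noisy, so the threshold check can be tuned or smoothed without touching the monitoring code.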
The Process Definition then stops its runnable logic. The server searches through
the enrolled Clients and sends the address of an under-utilised CPU in the cluster to
the requesting Node. A dynamic TCP/IP channel is then created between the
node and the foreign Process Manager. The process object is then serialized,
allowing it to be transferred in its paused state and resumed at the new client.
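The migration step above amounts to a serialization round trip: the paused object is flattened to bytes, sent across the channel, and rebuilt at the new client with its state intact. A minimal sketch using standard Java serialization follows; the Task class and helper names are illustrative stand-ins for the project's Process Definition.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class MigrationSketch {

    // A toy stand-in for a Process Definition, paused mid-work.
    static class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        int progress;                      // state that must survive the move
        Task(int progress) { this.progress = progress; }
    }

    // Flatten the paused object to bytes, as if writing it to the channel.
    static byte[] serialize(Task t) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(t);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object "at the new client" with its state intact.
    static Task deserialize(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Task) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Task paused = new Task(42);           // stopped at work unit 42
        byte[] wire = serialize(paused);      // crosses the dynamic channel
        Task resumed = deserialize(wire);     // arrives at the new client
        System.out.println(resumed.progress); // prints 42: resumes where it left off
    }
}
```

Note this only works because the state lives in plain serializable fields; a live Thread cannot be serialized, which is why the design moves paused objects rather than running threads.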
The system is developed using pre-set processes to ensure repeatability of
results, and runs entirely on any system with a JVM.
This project results in a working system which can distribute work based on CPU
stress, but concludes that, to be considered complete, more functionality
needs to be added before the system has an adequate practical application.
Page | 3
4. The Java language, JCSP, the Groovy scripting language and the Sigar
Application Programming Interface (API), which provides Java bindings to native
C system libraries, have been used in this project. All code was written and
compiled using the Eclipse Mars IDE.
5. Contents
1 INTRODUCTION
1.1 Background
1.2 Aims and Objectives
1.3 Scope and Limitations
1.4 Structure of Dissertation
2 BACKGROUND, KEY COMPONENTS AND THEORY
2.1 Data and Task Parallelism
2.2 Hoare's Communicating Sequential Processes (CSP)
2.3 Channels
2.4 Groovy
2.5 Communicating Sequential Processes for Java (JCSP)
2.6 Channel Mobility in JCSP
3 METHODOLOGY
3.1 Monitoring CPU Usage
3.2 Process Creation and Distribution
3.3 Process Movement associated Methods
4 INITIAL EXPERIMENTS
4.1 Monitoring CPU usage
5 ARCHITECTURAL DESIGN
5.1 Central Repository
5.2 Ring System with Travelling Agents
5.3 Work & Node Manager System
5.4 Network Structure Analysis
6 INTRODUCING PROCESS MOVEMENT
6.1 Java Memory Model
6.2 Moving processes within a JVM
6. 6.3 Thread Serialization impossible with current JVM
6.4 Adapting Process definitions as Agents
6.5 Sending process definitions in current state
7 PROTOTYPE
7.1 Design
7.2 Components
7.3 Experiment Setup
7.4 Results
7.5 Comparative Analysis
7.6 Local Concurrency Vs Distributed
8 CONCLUSION
8.1 Has the Project met its Aim and Objectives?
8.2 Deployment Analysis and Critique
8.3 Further Research and Work
8.4 Reflective Statements
9 REFERENCES
A. Searched Terms
B. Meeting Diagrams
C. Github analytics
8. List of Figures
FIGURE 1. BASIC CONCEPT OF PROCESS MIGRATION
FIGURE 2. JAVA BEANS STRUCTURE
FIGURE 3. BASIC JNI INTERFACE PROCESS
FIGURE 4. VISUAL REPRESENTATION OF VALUE GENERATOR
FIGURE 5. SERVER-CLIENT PATTERN DIAGRAM
FIGURE 6. VISUAL REPRESENTATION OF AGENT RUNNING IN PROCESS MANAGER
FIGURE 7. MK I: HOST NODE SYSTEM DIAGRAM
FIGURE 8. NODE RING NETWORK DIAGRAM
FIGURE 9. WORK AND NODE MANAGER NETWORK DIAGRAM
FIGURE 10. LOGICAL VIEW OF JAVA MEMORY RELATIONS (JENKOV, N.D.)
FIGURE 11. JAVA MEMORY MODEL INTERACTION WITH CPU MEMORY MODEL (JENKOV, N.D.)
FIGURE 12. ORDER OF EVENTS FOR CONNECTING TO AGENT
FIGURE 13. METHOD AND CONTENTS OF PROCESS (THIS)
FIGURE 14. FINAL PROTOTYPE, SERVER-CLIENT NETWORK
FIGURE 15. ANY2ONE CHANNEL CONCEPT
FIGURE 16. INTERNAL CONNECTION MECHANISMS OF AGENT
FIGURE 17. SERVER INTERACTION DIAGRAM FOR PROTOTYPE
FIGURE 18. TABLE OF EXPERIMENT RESULTS
FIGURE 19. TEST RESULTS GRAPH; CPU USAGE AND TIME SPENT
List of Screenshots
SCREENSHOT 1. WINDOWS 10 TASK MANAGER AND RESOURCE
MANAGER............................................................................................................29
SCREENSHOT 2. CONSOLE LOG: BASE READING OF CPU USAGE ON
CLIENT 1...............................................................................................................31
SCREENSHOT 3. CONSOLE LOG: CLIENT 2 AFFECTING CLIENT 1 CPU
READINGS............................................................................................................32
SCREENSHOT 4. CLIENT INITIALISING UI.......................................................51
SCREENSHOT 5. SERVER NOT STARTED OR CRASHED ERROR MESSAGE
...............................................................................................................................51
SCREENSHOT 6. CONSOLE LOG: NODE REGISTERED ON SERVER..........52
SCREENSHOT 7. BASIC USER UI......................................................................52
SCREENSHOT 8. CONSOLE LOG: NODE SHOWING READY.......................52
SCREENSHOT 9. CONSOLE LOG: NODE DOING WORK AND RELEASING
PROCESS NODE 1 WHEN FINISHED................................................................53
SCREENSHOT 10. CONSOLE LOG: WHEN PROCESS 4 STARTS, CPU IS
HIGH (62%), AGENT IS CONTACTED (I AM READING), THE PROCESS IS
DISCONNECTED, SENT (LETS GO) AND PROCESS NODE 4 IS RELEASED
...............................................................................................................................54
SCREENSHOT 11. CONSOLE LOG: SERVER DELETES ADDRESS..............54
Acknowledgements
Firstly, I would like to profusely thank Professor Jon Kerridge who has been an
invaluable source of confidence and knowledge throughout this whole project. He
has been a guide and kept me steadfast in what needed to be completed through
challenging times.
Secondly, I’d like to thank Doctor Kevin Chalmers who has always been
compassionate and a nurturing presence throughout my time in University, from
my first to fourth year.
I would also like to personally thank Charlotte Leask for her constant support and
eternal patience throughout the whole process.
1 Introduction
As computing approaches the finite limits of physical enhancement, the aim is to
continue increasing speeds by finding new methods of surpassing these
limitations.
In the past, the first step in augmenting any computer's speed and performance
has been reducing transistor size and thereby increasing clock speeds. Intel
co-founder Gordon E. Moore observed that the number of transistors able to fit
on a processor would double roughly every two years, fundamentally increasing the
speed of computers for at least the next decade. This model of thought is still cited
regularly in the computing industry today; however, it was initially stated in 1965,
and much has changed since then.
The problems we are met with today are distance, heat and conduction. The physical
distance between cache memory and cores keeps shrinking; transmission is
approaching near-instantaneous speeds, and this comes with another set of problems.
Heat is generated when a CPU core is pushed to compute at the rates we demand,
requiring ever more intricate ways to cool the system, and much of this can come
down to poor allocation of resources.
We hence need to look at how we balance our work. Software needs to reflect the
modern multitasking environment that we have come to expect and hence must
change in order to cope with increasing demand, as hardware cannot be relied on to
be the sole supporter in this venture. I plan to build a system which allows proper
allocation of available resources and increases the efficiency of hardware use in
order to achieve a faster, more reliable system. 1
This project endeavours to meet these needs with a system which distributes
processes over a cluster of computers, regulating work based on CPU load. This is a
1. Taken from IPO
means of using idle CPUs without exceeding a threshold that impedes the user's
everyday use.
The final product aims to be a proof of concept that load balancing is possible in a
high-level language, in a portable environment. Hence, it demonstrates the means and
capabilities required to further develop a fully automated system for everyday users
with access to multiple Java-compatible devices.
1.1 Background
Most processing-enhancing implementations fall under cloud computing: outsourcing
processing to external data centres, platform services or application hosting, whilst
remotely managing computer resources (Winias & Brown, n.d.). However, not all
businesses have access to scalable hardware architectures, which are expensive
to build, run and maintain.
Shifting focus to performance, creating efficient software diminishes the need for in-
depth management of system architectures and is a fundamental code of conduct
for emerging professional IT bodies (such as the British Computer Society).
However, different programming languages support different levels of control over a
system. Languages of high abstraction do not fundamentally afford the same level of
efficiency that low-level languages can attain, while low-level languages are
platform-specific and do not lend themselves to portable methods.
Taking advantage of current user environments, rather than reimplementing code
or hardware, is logically the most cost-effective and least disruptive route. This
can be done by effectively managing processing loads, maximising the capability of
available processing resources.
Utilising idle CPU resources on a network of computers (a cluster) can
substantially speed up overall processing. In order to do so, these
resources must be directed to work together towards a common goal (i.e. task
parallelism).
Many current systems, such as IncrediBuild, implement this parallel design for build
environments, working with low-level code to facilitate high-level build concepts.
(Xoreax Software Ltd., n.d.) With high-profile clients such as Microsoft, Google,
IBM and Disney using their product to maximise system use, this task distribution
method is clearly proven to work.
However, for the average user or start-up business, system specifics might still prove
elusive. So why not implement this distribution system in a portable, high-level
language?
Java is a widely used platform, compiled to bytecode and run in a virtual
machine aiming for multi-platform portability. According to Oracle, 97% of enterprise
desktops run Java, alongside 3 billion mobile phones worldwide (Oracle, 2015).
Building a system in Java allows the opportunity to port to multiple platforms with
relative ease, greatly increasing the potential for networked devices joining the
system.
It should be noted that, in researching this area, very little has been published on
the subject of load balancing in high-level languages in a cluster environment within
the last 6–10 years. Appendix A documents the search criteria used and the
relevancy of results.
1.2 Aims and Objectives
The aim of this project is to distribute and regulate processes over multiple CPUs in
a cluster setting using the Java programming language, with the Java Virtual
Machine (JVM) as the environment. This involves monitoring CPU usage in real time,
stopping processes which overload a given terminal, and then moving them to
CPUs experiencing less stress in the cluster.
Figure 1. Basic concept of process migration
The main objectives required to create such a system, in practice, are outlined below:
1) Monitor CPU usage incurred by an instance of JVM.
2) Processes must have a way to be interrupted and saved in their current state.
3) Processes need to have a way to move and reinitialise at different nodes, on
different CPUs.
This report documents the steps taken to achieve these goals from inception to
completion. This project aims to provide a system which endeavours to successfully
manage load over several terminals in a cluster, using a language with a high level of
abstraction: Java.
1.3 Scope and Limitations
In order to provide a proof-of-concept system within the project's allotted time, certain
areas of the project had to be kept within reasonable limitations. In this case, a
limited number of processes are programmed and sent automatically over the cluster
to ensure that overload can be attained with a reasonable degree of certainty. This
means the system does not yet afford user input and runs fairly autonomously.
In addition, to show the scalability of the system, it must be ensured that the
computer which will be distributing tasks runs at a proficient speed to facilitate access
from multiple user-end nodes with, preferably, one underperforming CPU.
As the system relies on communication, many options of transmission are available,
but these are kept to TCP/IP network protocols only. This form of communication was
chosen as it is a proven, reliable and widely used method which is supported by
virtually all operating systems and platforms that Java can run on.
This project will also be using Groovy, a JVM scripting language, which
facilitates the use of 'Communicating Sequential Processes for Java' (JCSP). This
allows the manipulation of threads at a low level with high-level abstraction, resulting
in a parallelised system, and can use TCP/IP protocols as its main mechanism for
communication between systems.
As the project is created to prove that Java can be utilised with a capacity to
distribute and balance a system over a cluster, all aspects of the system will be
implemented in Java, within the constraints of the JVM, whilst maintaining a high level
of abstraction in the source code. Other programming languages will only be
considered when it is conceptually and physically impossible to implement the
requisites for completion with the author's current knowledge and skills.
1.4 Structure of Dissertation
The structure of this document is as follows:
• Section 2 introduces the methodology, the theory and the practises behind the
message passing mechanics of the system which revolves around the JCSP.
• Section 3 discusses the methods implemented throughout the project as well
as the decisions made as a result of research to reach the finished
prototype.
• Section 4 will present the initial experiments conducted. This documents the
limitations and barriers which had to be overcome in order to develop a
functioning prototype.
• Section 5 describes the main incarnations of the system and how each
implementation led to a better system.
• Section 6 provides the mechanics behind moving processes and the
difficulties faced in doing so.
• Section 7 elaborates on and demonstrates the prototype system; reviewing
design and implementation as well as experimentation with the system.
• Section 8 details the results and evaluation of the system, and project. Section
8 concludes with a critical evaluation of the project covered by the paper
including short comings of the project and possible avenues of work on the
system which can be undertaken in the future.
2 Background, Key Components and Theory
Throughout this report, the majority of components described have been taught
through, and defined by “Using Concurrency and Parallelism Effectively” I & II
(Kerridge, 2014), which builds upon Hoare’s Communicating Sequential Processes
(CSP) theory. Unless explicitly referenced otherwise, these are the main sources of
information disclosed herein. In this section, the basic elements from which the
prototype product is derived are explained.
2.1 Data and Task Parallelism
One of the driving forces in this project is concurrency and parallelism. Task
parallelism allows the user to run multiple processes simultaneously on the one CPU
or over a network. Sequential code follows a specified order, so programmers don’t
tend to think about the order of events in a system once it has been coded and
compiled.
In order to process tasks moving around the intended system, processes will have to
be fairly autonomous and removed from the main body of code. This means that
concurrent and parallel code will have to stop and synchronise with each other on
transfer, interact in a timely manner so as not to disrupt running processes, and
finish in an expected order despite being intrinsically non-deterministic: processes
run on different platforms, at different speeds, all whilst the possibility of migration
plays an active role.
2.2 Hoare’s Communicating Sequential Processes (CSP)
Hoare's CSP concepts (Hoare, 2004) dictate that everything encapsulated in code
can be broken down into algebraic expressions. In this way, everything within
programming can be reduced to simple, understandable functions, rules and
patterns.
All code can thus be reduced to smaller chunks which can be moved around
to suit the success of the formula: what you see is what you get. The following
mechanisms facilitate this concept and form the basis of the end prototype.
2.2.1 Process
A Process is a piece of code that can be executed in parallel with other processes. A
network of processes form a solution to a single problem, with processes
communicating with each other using Channels (detailed in 2.3). Processes typically
contain repeating sequences of sequential code with communication interspersed.
Any process that is idle consumes no processor resources.
2.2.2 Timer
A Timer is a means of introducing time management into processes. Timers can be
read to find the current time and introduce delays or alarms for future events. They
can also be used in ALTs as guards for reading channels.
2.2.3 Alternatives (ALT)
An Alternative (ALT) allows selection of one ready guard from several possible
guards. Guards come in three types: input communications, timers, and SKIPs, and
dictate how a process should proceed. A guard is ready if input is available, an
alarm time has passed, or a SKIP is defined as a guard; SKIPs are always ready and
allow guards to run continuously.
The ALT waits until a guard is ready and then undertakes the associated code. If
exactly one guard is ready, its associated code is executed. If more than one is
ready, one is selected according to predefined options and its code is obeyed. These
options include priority selection, when several are ready, or fair, turn-based
selection.
2.3 Channels
Channels are a core mechanism of the system described in this report, as the main
aim is to send processes over a cluster network. A Channel is a one-way,
point-to-point, unbuffered connection between two processes. Channels synchronise
the processes to pass data from one to another and do not use polling or loops to
determine their status, meaning no processing is consumed during transactions.
The first process attempts to communicate and goes idle when synchronising. The
second process attempting to communicate will then discover the situation,
undertake the data transfer and then both processes will continue in parallel, or
concurrently if they were both executed on the same processor. It does not matter
which process attempts communication first as the mechanism is symmetric.
When communication between processors takes place, the underlying system
creates a copy of the data object and begins transferal. As such, objects containing
process logic can be transferred, to be executed by a Process Manager, and run
asynchronously, which will form the basis of the project.
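This rendezvous behaviour can be illustrated in plain Java, without the JCSP library, using `java.util.concurrent.SynchronousQueue`, whose zero-capacity handoff approximates an unbuffered channel. This is only an analogy for the CSP semantics described above, not the actual JCSP Channel API; the class and method names here are the editor's illustration.

```java
import java.util.concurrent.SynchronousQueue;

public class RendezvousDemo {
    // Emulates an unbuffered CSP channel: put() blocks until a matching
    // take() arrives, so writer and reader synchronise on the transfer.
    public static String exchange() throws InterruptedException {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        Thread writer = new Thread(() -> {
            try {
                channel.put("data"); // idles here until the reader is ready
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        String received = channel.take(); // both sides then continue in parallel
        writer.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(exchange());
    }
}
```

As in CSP, it does not matter which side arrives first: whichever thread reaches the queue first simply waits for its partner.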
2.4 Groovy
The Groovy scripting language allows the programmer to write concurrent systems
with a high level of abstraction and is underpinned by the four basic principles
detailed above.
2.5 Communicating Sequential Processes for Java (JCSP)
JCSP is based on Hoare's basic algebraic functions, allowing virtual connections to
be created via NetChannelLocation structures sent between nodes. Using Java allows
the programmer the ability to send objects via serialisation methods; breaking down
the components into sequences of bytes to be transferred (Chalmers, Kerridge, &
Romdhani, A critique of JCSP Networking, 2008).
With this framework, objects containing code definitions can be sent along with a
control signal to recreate the object at the receiving end. Communicating Sequential
Processes for Java is the cornerstone of this project and allows us to build upon
Hoare’s concepts to create a simple to understand communication network.
2.6 Channel Mobility in JCSP
Channel Mobility refers to the dynamic capabilities that can be found when creating
self-propagating NetChannels and other communication models in this project.
Channels afford us a robustness of connection between the input and output ends
whilst allowing sufficient models to support the ubiquitous nature of the intended
system.
(Chalmers, Investigating Communicating Sequential Processes For Java To Support
Ubiquitous Computing, 2008)
As the project does not endeavour to change these underlying mechanisms, only a
high-level overview is presented here. However, it can be stated that channel
mobility is paramount to attaining, transferring and moving processes successfully.
3 Methodology
The main aim of the system is to create a way to send processes from one node to
another running in the same computing cluster. This would be initiated by rising CPU
usage at each terminal. As such, there were three main problem areas which needed
to be addressed:
1. How to get CPU usage at any given time from within a JVM runtime.
2. How to create and deliver processes around a dynamic network.
3. How to stop a given process when CPU usage reaches a predetermined amount
and send it to an underused node in the network.
3.1 Monitoring CPU Usage
At the time of writing, there were no pure Java APIs available to gather CPU
information. The investigation therefore turned to fact-finding: gathering as much
system data as possible from within Java.
3.1.1 MBeans
MBeans are managed Java objects, similar to JavaBeans, which can represent a
device, an application or any resource that needs to be managed.
Figure 2. Java Beans Structure
This means we can monitor any of the resources being used by an instance of the
JVM. However, as an MBean can be any type of object and can expose attributes of
any type, each client has to implement class definitions each time an MBean
is called, which can lead to high overheads when repeatedly queried.
3.1.2 OperatingSystemMXBean
MXBean is native to Java (1.6 upwards) and allows the user to utilize an MBean with
a reduced set of types, meaning there is not a requirement for model-specific
classes. This makes the MBean accessible by any local or remote clients; essentially
conforming to an interface.
OperatingSystemMXBean allows the user access to an interface developed for
retrieving system properties about the operating system on which the JVM is running.
This includes free memory of the computer, memory allocated to the JVM, and CPU
time dedicated to a task. MXBeans were the only native mechanism provided by Java
Management Extensions (JMX) which could facilitate the objectives mentioned.
3.1.3 Java Native Interface (JNI)
The Java Native Interface is a native programming interface that is part of the Java
Software Development Kit. JNI allows Java code to use fragments and libraries
written in other languages such as C and C++.
While Java breaks code down into Objects to be interpreted, C allows for the use of
procedural code which is compiled and breaks down into functions. The JNI connects
Java Class Methods with C functions, fundamentally allowing the programmer to call
C functions at any given time.
Figure 3. Basic JNI interface process
This allows the user access to lower levels of programming and can read values,
such as CPU usage, from assembly code. Although this approach seems the most
enticing, it can lead to the destabilisation of a JVM instance through subtle C errors.
Writing small scripts may not pose a huge problem, but garbage collection is not
handled by the JVM in these instances, and a basic understanding of memory
allocation is also required. Additionally, using the JNI results in a system which
is not wholly portable, as the code written in C is platform-specific.
3.2 Process Creation and Distribution
As one of the main prerequisites of this system, a network architecture had to be
designed to facilitate communication.2
This section focuses on how data and work are spawned to test the proof-of-concept
system.
It should be noted that although the aim of the project is a proof of concept, the ideal
system would spawn multiple instances of work which would accumulate to a large
amount of CPU usage in order to adequately balance the system.
3.2.1 Value Generator
Here, different volumes of data are generated by a data generator and sent to a
Node to be processed. The perceived complexity of the data should be proportional
to the increase in CPU usage created.
Figure 4. Visual Representation of Value Generator
2. Development and iterative design are documented in Section 5.
This would require a fixed process at each node initialisation to manipulate the
randomly generated data sets being produced by the generators. All interactions are
handled by channel interactions, as shown above in Figure 4.
3.2.2 Random Process selection
In this instance, each node would have access to pre-set process definitions which
would generate varying loads. At run time, a timer would be initiated requesting a
random process to run, one of which would create a large spike in CPU usage. This
would allow reproduction of an overloaded state with a high degree of certainty
during demonstrations. The structure would be similar to the above but would not
require the DataGenerator, as the initial input would remain the same.
3.2.3 Server Hosted Process Definitions
This method would require processes being hosted remotely at a specific IP location,
and requested by the client when needed. The client would access a server which
would have the network locations of all the relevant process servers. The request
would then be forwarded, with the client's location, to the process location and sent
via a TCP/IP channel back to the client.
Figure 5. Server-Client Pattern Diagram
This would work by using objects containing serializable process definitions sent over
channels.
3.3 Process Movement and Associated Methods
As copying large amounts of data around a system would prove inefficient,
notwithstanding the large overheads in processing and memory allocation, the system
has to handle data manipulation locally, within one processor.
This means that processes in their entirety have to be sent between nodes, to
complete the full process, sharing as little data during computation as possible. The
aim is to send initial parameters and results.
The methods implemented for this aspect of the system heavily rely on the JCSP
API, the functions underpinning the Groovy parallel mechanisms. Hence, the
definitions and descriptions pertaining to JCSP methods below are based on, and
paraphrased from, the API
specifications hosted by the University of Kent at Canterbury (htt1). Implementing
process movement is covered in more detail in Chapter 6, documenting limitations
and boundaries.
3.3.1 JCSP Process Manager
The ProcessManager class enables a CSProcess to be spawned concurrently with
the process doing the spawning. This means we can have multiple processes
running and allows the nodes in the system to deal with multiple processes being
sent on the same channel.
Dealing with processes as they arrive allows the system to adhere to a client-server
pattern, making the chances of deadlock in the system (pertaining to this area) very
slim.
3.3.2 Process Definition Serialisation in Objects
In order to take advantage of the Process Manager capabilities, process definitions
need to be designed as CSProcesses. To do so, a process is defined in
its entirety and encapsulated in an object.
In doing so, we ensure the object's class implements two interfaces: CSProcess
and Serializable.
3.3.2.1 CSProcess
According to the JCSP documentation, “a CSP process is a component that
encapsulates data structures and algorithms for manipulating that data” (htt). This
basically means the data involved is private and cannot be accessed outside the
object itself.
Essentially, each instance of the process is alive, executing its own algorithms on its
own data, and its actions are defined by a single run method. To avoid race-hazards,
the processes in this system do not require outside data or interaction with other
running threads. Only primitive data types will be sent to activate switches or request
new data. No procedures outside of defined data manipulation take place within the
Process Manager.
3.3.2.2 Serializable
A serializable class implements the java.io.Serializable interface, which allows the
class and its subtypes to be serialized for communication transfer. The interface
itself does not have any methods but serves only to identify the semantics of being
serializable.
It should be noted here that CS classes (not classes implementing this interface)
such as CSTimer do not conform to serializable semantics and will be covered later
in this document.
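The serialise-transfer-execute cycle described above can be sketched in plain Java. The `SquareTask` class below is hypothetical; in the actual system the class would also implement JCSP's CSProcess with its run method, but here `Runnable` stands in so the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ProcessDefDemo {
    // A hypothetical process definition: Serializable so its bytes can be
    // sent over a channel, with run() carrying the encapsulated algorithm.
    static class SquareTask implements Serializable, Runnable {
        private static final long serialVersionUID = 1L;
        final int input;
        int result;
        SquareTask(int input) { this.input = input; }
        public void run() { result = input * input; }
    }

    // Serialise the definition to bytes and back, as JCSP networking does
    // under the hood, then execute it at the "receiving end".
    public static int roundTrip(int input) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new SquareTask(input));
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            SquareTask copy = (SquareTask) in.readObject();
            copy.run(); // the reconstructed copy runs on its own data
            return copy.result;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(7)); // prints 49
    }
}
```

Note that the received object is a copy, not a reference, which is exactly the property the cluster relies on when moving work between JVMs.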
3.3.3 Agents
The Agent interface extends both CSProcess and Serializable, but also adds
connect and disconnect methods. These are used to connect input and output
channels from the internal mechanisms of the sent process definition's resources to
an outside host.
Figure 6. Visual Representation of Agent running in Process Manager
The agent has two channels by which it connects to the host during runtime. This
means the data inside the agent CSProcess can be influenced from outside the
Process Manager. By exploiting the Agent interface, we can enable communication
from outside threads during run time, giving agents access to two different code
structures.
4 Initial Experiments
In order to evaluate which methods would lead to a successful system, the
methodologies aforementioned were investigated and implemented in different
circumstances, testing for compatibility with the project.
4.1 Monitoring CPU usage
Monitoring CPU usage would take place in two stages: designing code which would
generate high usage, and code which could report CPU usage as a percentage.
Results would be compared in conjunction with the Task Manager and Resource
Manager native to Windows 10.
Screenshot 1. Windows 10 Task Manager and Resource Manager
4.1.1 Creating Work
Creating work consisted of two different functions which would alternate
intermittently to test increases in CPU usage. Small work creates an int value,
comprising a basic multiplication operation followed by a timer to create time
between operations. CSTimers, as part of JCSP, act as guards for the code, as in an
ALT, meaning there is no processing wasted during execution.
For larger CPU usage, a more complicated problem has been run to generate more
work, creating a long variable, as seen below:
// Deliberately expensive, meaningless computation used purely to load the CPU.
// Math.pow returns a double, and several literals exceed the int range, so the
// result is stored as a double and long-suffixed literals are used.
double j = Math.pow(Math.pow(60339 * 339398 / 2 * 33323, 2348958), 30000000000d)
    * Math.pow(Math.pow(454339L * 339765645398L / 26 * 354563323,
        234845645958d), 3000004564500000d);
4.1.2 Monitoring Work
A basic system was implemented to create expected, repeatable workloads on the
CPU that could be measured to inspect whether monitoring usage was successful.
The system of operations is shown as a process diagram in figure 7 below.
Figure 7. Test Process Diagram
The process is simple: a timer is set for a predetermined time in which a process of
high CPU usage is implemented. CPU usage at this point would be verified by
addressing the task manager seen in screenshot 1.
4.1.3 Accessing CPU Usage
Measuring CPU usage is difficult to achieve in Java. Firstly, in order for this project
to succeed, we need to distinguish the actual work being done on a processor as
opposed to the memory usage of the JVM. The latter is easily accomplished with
native Java commands, but as any Java program is essentially interpreted by the
system as a 'process', it cannot access the necessary tools to gain CPU usage
insight in the likeness of the Task Manager (Screenshot 1).
4.1.3.1 Native Monitoring
There are ways to obtain CPU usage which do not offer real time performance
monitoring but can be based on timed events. For multi-threaded tasks,
ThreadMXBean methods can give you the CPU usage and user time for any running
thread. However, using operatingSystemMXBeans (explained in Chapter 2, Figure 2)
only returns the CPU usage of all running JVMs combined (i.e. it cannot distinguish
between processes with different PIDs). In Screenshot 2, we can see the relationship
between two JVMs working concurrently.
Screenshot 2. Console Log: Base Reading of CPU usage on Client 1
Client 1 (right) is using independent code to monitor itself whilst Client 2 (left) is
waiting for work. operatingSystemMXBeans return CPU use with 1 being 100% usage
and 0 being 0%. At the point of monitoring, the system sits at 12% usage.
Screenshot 3. Console Log: Client 2 affecting Client 1 CPU readings
However, as new processes are started in Client 2, Client 1 continues to report high
CPU consumption, proportional to the work of Client 2, despite having no work itself.
operatingSystemMXBeans are further influenced by any other Java application
running. Hence, a way to distinguish between running JVMs had to be identified.
It should be mentioned that as of Java 9, there is a new process API that allows the
user to get the current process ID. However, at the time of writing this was still in
beta testing, and Java 8 was opted for due to its comparative stability.
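For reference, the Java 9 API mentioned above makes the PID a one-liner; the sketch below (class name invented for illustration) shows what was not yet available on Java 8 and thus motivated the JNI and Sigar investigations that follow.

```java
public class PidDemo {
    // Java 9+ only: the ProcessHandle API exposes the current JVM's PID
    // directly, removing the need for the JNI techniques described below.
    public static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("JVM PID: " + currentPid());
    }
}
```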
4.1.3.2 JNI interface
C affords the low-level access to physical components needed to identify a
JVM's process identifier (PID). PIDs are numbers which uniquely identify a process
while it runs and are used in Linux, Unix, Mac OS X and Windows.
The problem, however, is that system calls are defined differently on each OS.
Language libraries need to be recompiled for the specific target operating system to
utilise the particular underlying components of the operating system (the kernel).
As this research was beginning to deviate from the original project scope, delving
further into low-level code, an API was imported to give multi-platform compatibility.
4.1.4 Sigar API
Sigar is a multi-platform API for Java and other languages. It allows the user to
monitor per-process memory, CPU, credential info, state, arguments and other
relevant information (MacEachern, n.d.). By incorporating Sigar, the program can
produce percentages based on the amount of CPU usage attributed to the PID of a
JVM.
4.1.5 Transferring Objects
By connecting two nodes with a TCP/IP connection, we can send an object very
easily. Implementing the Serializable interface, an empty object is sent to another
node at a defined IP. This was to ensure objects were being sent and not references.
If a read was successful, a printed statement would display in the eclipse console
stating “Success!”.
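A minimal sketch of this check (class and field names are illustrative): the object is serialized to bytes, as it would be when written to a net channel, then read back. The deserialized copy is a distinct object carrying equal state, confirming that a value rather than a reference is transferred.

```java
import java.io.*;

public class TransferDemo {

    public static class Payload implements Serializable {
        public int value;
        public Payload(int value) { this.value = value; }
    }

    // Serialize and deserialize in memory, mimicking a write at one node
    // and a read at another.
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);                   // sending node
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();             // receiving node
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Payload original = new Payload(42);
        Payload copy = roundTrip(original);
        System.out.println(copy != original && copy.value == 42
                ? "Success!" : "Failure");
    }
}
```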
4.1.6 Running Process Definitions
As process definitions can be contained within objects, a simple system can be
created using two nodes and instances of a Process Manager.
Process definitions are sent using a timer, testing one process running and then two
concurrently, and the task manager is consulted to ensure processes are being run
correctly.
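A plain-Java analogue of that test can be sketched as follows (the prototype runs definitions with JCSP's ProcessManager; here an ExecutorService and Runnable stand in for it, and the class names are illustrative):

```java
import java.io.Serializable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DefinitionRunner {

    // A process definition: an ordinary serializable object whose work
    // lives in run() and whose progress lives in its own fields.
    public static class CountDefinition implements Runnable, Serializable {
        public final int limit;
        public volatile int progress;
        public CountDefinition(int limit) { this.limit = limit; }
        public void run() {
            for (int i = 0; i < limit; i++) progress = i + 1;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService manager = Executors.newFixedThreadPool(2);
        CountDefinition first = new CountDefinition(1000);
        CountDefinition second = new CountDefinition(1000);
        manager.submit(first);            // one definition running...
        manager.submit(second);           // ...then two concurrently
        manager.shutdown();
        manager.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(first.progress + " / " + second.progress);
    }
}
```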
5 Architectural Design
Throughout the project, many different systems were designed to monitor processes
and set up an architecture of communication which could theoretically facilitate this.
The various designs are presented and critically evaluated below.
5.1 Central Repository
This design attempts to meet the aim of process movement. Each node has a
Process Node which creates and runs a process on the attached Process Manager.
The results are then sent to a host node which keeps track of them.
Figure 7. MK I: Host Node System Diagram
Each node would monitor the CPU usage of the JVM. Once a certain level is met, the
process would then be packed and sent to another node.
5.1.1 Central Repository - Issues
The problem with this system is that all channels must be created on initialisation,
leaving no room for scalability. It is also essentially working on a ring topology and is
more suited to a single system. This network is easily set up in a single JVM as well,
meaning only references are passed rather than the actual objects.
Although good for initial tests (scaled back to two nodes and a Host Node), the main
drawback of this design is the ring element itself. This design was expanded upon to
work with Agents below, where the problems of ring networks are explored in more detail.
5.2 Ring System with Travelling Agents
The Agent System opens up the network, allowing communication over different
JVMs. The processes are no longer spawned within the node, but sent by a manager
as Process Definitions.
The Manager then runs the process whilst a monitor reviews CPU usage. When
needed, an Agent is created, with the relevant process.
Figure 8. Node Ring Network Diagram
5.2.1 Ring and Agents - Issues
It was during this iteration that the underlying principles of threads were explored in
more detail, and threads were found to be non-serializable, meaning the running
process could not be sent with the Agent in its current state. This meant the design
would fundamentally not work: the agent could carry the process definition, but only
in its unedited state, not mid-processing.
Not only that but, when dealing with task parallelism, ring systems are inherently
prone to deadlock. As processes are created at nodes, the communication between
ring elements proved to be non-deterministic, due to the uncertainty as to which
processes were being spawned where, when they exceeded the pre-set CPU usage,
and when they needed to be moved.
If too many events were triggered, all of the processes involved in the ring would be
attempting to output at the same time, resulting in deadlock. In a non-uniform network,
where computer architectures differ (providing varying computational power),
this problem would become more prevalent.
To alleviate this, we could have nodes probe the ring first with empty packets, waiting
for them to return, but this would result in half the network activity on the ring being
empty data packets; a detriment to efficiency.
5.3 Work & Node Manager System
The Work and Node Manager design took the ring element out and introduced
client-server properties.
The problem with this design is that the servers are very closely coupled and can end
up forming a closed system. The final prototype changed this.
Figure 9. Work and Node Manager Network Diagram
5.4 Network Structure Analysis
In order to minimise incidents of deadlock, the client-server pattern seemed the most
logical to implement. A server-oriented network permitted:
• Decreased chance of deadlock
• Process Discovery
o Nodes receive a complete set of required processes
o Allowing dynamic amendment of process definitions
• Process Control
o User not restricted to only one choice
o Timing of process delivery
• Centralised repository for client lists and results
• Scalability
o Users added by location (IP) rather than assigned place
6 Introducing Process Movement
Process movement was easily implemented when it occurred on the same physical
machine, as in the first two prototypes. However, complexity increases when the
functionality is extended to a network.
Process Definitions are easily sent in a static state, but getting the state of a process
in execution requires finding all relevant data saved in the JVM.
6.1 Java Memory Model
In order for Java to be architecture neutral, it is built to operate and exist solely within
memory (RAM). Hence, to mimic a computer's infrastructure, the JVM inherently
includes its own memory model.
The Java memory model divides memory between thread stacks and the heap. It can
be seen logically in figure 10.
Figure 10. Logical view of Java Memory Relations (Jenkov, n.d.)
Each thread running in the JVM has its own stack, which contains information about
which methods have been called, the point of execution and the local variables for
those methods. The local variables consist of primitive types and are fully stored
within the thread stack. Hence, they cannot be seen by any other components of the
JVM during execution.
The heap contains all objects created in the Java application. The main point of
contention for moving processes in the JVM is the fact that all manipulation occurs
within a thread stack. If the object containing the process definition being worked on
is moved (even if the thread is suspended during processing), it will be moved in its
original, unedited state.
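The consequence can be shown in a few lines (names are illustrative): work accumulated in a method's local variable is invisible outside the running thread until it is written back to a field of a heap object.

```java
public class StackVsHeap {

    public static class Work {
        public volatile int saved;          // lives on the heap, visible to all

        public void compute() {
            int running = 0;                // lives on this thread's stack only
            for (int i = 1; i <= 100; i++) running += i;
            saved = running;                // only now visible elsewhere
        }
    }

    public static void main(String[] args) {
        Work w = new Work();
        w.compute();
        System.out.println(w.saved);        // 5050
    }
}
```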
6.2 Moving processes within a JVM
As all classes exist within a single JVM during runtime, initial tests for moving
processes were misleading. Simply suspending a thread and calling that thread in
another class leads to a seemingly successful process manoeuvre.
This is achieved by suspending the process manager (essentially a concurrent
thread) and sending it through a channel. In this case, as the channel connects two
host processes within a single JVM, it is only the thread reference which is
communicated, meaning the thread has technically remained in the same place and is
only being restarted, just by another process.
6.3 Thread Serialization Impossible with the Current JVM
Each method run in a Java program has a stack frame associated with it. The stack
frame holds the state of a method in three sets of data: the method's local
variables, the method's execution environment and the method's operand stack.
It would stand to reason that by copying the values at suspension, copying a thread
could be achieved. However, the thread object would be allocated with none of the
native implementation. The JVM emulates a machine for each instance a Java
program is started, and a thread run on one of these machines becomes intricately
tied into the internal mechanisms of the machine. The context of operations is simply
lost.
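The barrier is easy to confirm: java.lang.Thread does not implement Serializable, so writing one to an object stream fails with a NotSerializableException (sketch below; the helper name is illustrative).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class ThreadSerializationCheck {

    // Attempts to serialize the object; NotSerializableException (a
    // subclass of IOException) is caught and reported as false.
    public static boolean isSerializable(Object obj) {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSerializable(new Thread()));   // false
        System.out.println(isSerializable("a string"));     // true
    }
}
```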
Reading the locations of the threads on the physical machine would prove difficult as
well. Not only would this require a separate language to access the data, but memory
allocation would have to be monitored from inside the JVM as well as outside.
Hardware memory does not distinguish between the heap and the threads; hence,
parts of a thread stack can be present in CPU caches as well as in CPU registers.
Figure 11. Java Memory model interaction with CPU Memory Model (Jenkov, n.d.)
Also, Java relies on C procedures for some of its native methods. If the stack were to
be copied, it may contain native Java methods that, in turn, have called C
procedures. This indicates a complicated mixture of Java constructs and C pointers
would have to be recorded.
At this point, not only does this increase the amount of data to be transferred at once
over a network, it goes against the ethos of this investigation: to find a solution with
high abstraction. This is also why reconstructing bytecode (the instructions used by the
JVM, which resemble assembler instructions) and monitoring the JVM instruction set
have not undergone further investigation.3
3 Using the Java class file disassembler proved to be a cumbersome method of determining the
sequence of events, and was essentially the lowest-level format possible with Java.
6.4 Adapting Process definitions as Agents
In order to move processes, we have to look at the object itself which is being edited.
As the supertype class, the Process Manager, is not serializable, the subtype object
must assume responsibility for saving and restoring.
As the process definitions already contain a run function, the system must be
amended to stop the internal code from executing and to retrieve the edited values.
This means each process object must be created as a new instance, so as to keep
track of its own local variables, and must have a method of communicating with the
host process whilst running concurrently.
Adapting the processes to conform to an Agent interface introduces two new
methods which allow this: connect and disconnect (agent seen in figure 6). The
host is fitted with two new channels, generated at run time, which allow the agent to
connect when received. The basic order of events can be seen in Figure 12.
Figure 12. Order of Events for connecting to Agent
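A hedged sketch of that interface and a minimal agent follows (all names are assumptions of this sketch, and plain blocking queues stand in for the prototype's channel ends): connect is called by the host on reception, before run starts, and disconnect drops the channel ends before the agent is packed and moved.

```java
import java.io.Serializable;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AgentSketch {

    public interface Agent extends Runnable, Serializable {
        // Called by the host when the agent arrives, before run() starts.
        void connect(BlockingQueue<Object> fromHost, BlockingQueue<Object> toHost);
        // Called before the agent is serialized and sent onward.
        void disconnect();
    }

    public static class CountingAgent implements Agent {
        public int progress;                             // travels with the agent
        private transient BlockingQueue<Object> toHost;  // does not travel

        public void connect(BlockingQueue<Object> fromHost,
                            BlockingQueue<Object> toHost) {
            this.toHost = toHost;
        }
        public void disconnect() { toHost = null; }
        public void run() {
            progress++;
            if (toHost != null) toHost.offer("progress: " + progress);
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Object> hostIn = new LinkedBlockingQueue<>();
        BlockingQueue<Object> hostOut = new LinkedBlockingQueue<>();
        CountingAgent agent = new CountingAgent();
        agent.connect(hostOut, hostIn);    // host wires the agent on reception
        agent.run();
        System.out.println(hostIn.poll()); // progress: 1
        agent.disconnect();                // now safe to pack and move
    }
}
```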
6.5 Sending process definitions in current state
In Java, objects can refer to themselves simply by calling "this", meaning that once the
internal code has been paused and the variables saved, the object itself can be
packaged and written to a channel as a serializable object, to be run by a new
process manager.
Figure 13. Method and Contents of Process (this)
This way, as long as the process definition contains all the run code required, the
state of the process is reflected in the object state. This meets the requirements for
process movement and is a main part of the prototype's design.
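A sketch of the idea (field names and pause point are illustrative): the task saves its loop position in its own fields, ships `this` as bytes exactly as it would down a net channel, and the deserialized copy resumes where the original stopped.

```java
import java.io.*;

public class MovableTask implements Runnable, Serializable {

    public int next;      // saved state: the index to resume from
    public int total;     // saved state: work accumulated so far

    public void run() {
        for (; next < 10; next++) {
            total += next;
            if (next == 4) { next++; return; }  // pause point: stop and move
        }
    }

    // The object packs itself by writing `this` to a byte stream.
    public byte[] pack() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(this);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static MovableTask unpack(byte[] data) {
        try {
            return (MovableTask) new ObjectInputStream(
                    new ByteArrayInputStream(data)).readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        MovableTask task = new MovableTask();
        task.run();                               // works 0..4, then pauses
        MovableTask moved = unpack(task.pack());  // "sent" to the next node
        moved.run();                              // resumes at 5
        System.out.println(moved.total);          // 45 = 0 + 1 + ... + 9
    }
}
```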
7 Prototype
7.1 Design
The final implementation extends the client-server design by adding Process Nodes
to the Universal Client, so multiple instances of a sent Process can be run
concurrently whilst connecting to their respective host.
It is based on the six paradigms for code mobility (Chalmers, Kerridge, & Romdhani,
2007):
• Client-server
o Client executes code on the server.
• Remote evaluation
o Remote node downloads code then executes it.
• Code on demand
o Clients download code as required.
• Process migration
o Processes move from one node to another.
• Mobile agents
o Programs move based on their own logic.
• Active networks
o Packets reprogram the network infrastructure.
In the case of this design, agents are being manipulated as a means of internal
communication as well as movement. The final design is seen in figure 14.
Figure 14. Final Prototype, Server-Client Network
The Universal Node comprises a Node Monitor which periodically checks the
CPU usage of the JVM it is running in. In order to do so, a concurrent thread is
spawned at run time with the sole purpose of returning the current CPU usage.
Using Sigar, the CPU usage is checked every 10 milliseconds and, if it is above a
certain threshold, a new node request is sent to the Access Server.
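The monitor loop can be sketched as follows (the names and the hand-off behaviour are illustrative assumptions; the CPU probe is injected so that Sigar, an MXBean, or a test stub can supply the readings):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.DoubleSupplier;

public class UsageMonitor implements Runnable {

    private final DoubleSupplier probe;   // current CPU usage, 0.0 - 1.0
    private final double threshold;
    private final Runnable onOverload;    // e.g. send a new-node request
    private final AtomicBoolean running = new AtomicBoolean(true);

    public UsageMonitor(DoubleSupplier probe, double threshold,
                        Runnable onOverload) {
        this.probe = probe;
        this.threshold = threshold;
        this.onOverload = onOverload;
    }

    public void run() {
        while (running.get()) {
            if (probe.getAsDouble() > threshold) {
                onOverload.run();         // ask the Access Server for a node
                running.set(false);       // hand off, then stop monitoring
            }
            try { Thread.sleep(10); }     // 10 ms poll period
            catch (InterruptedException e) { return; }
        }
    }
}
```

In the prototype this would run on its own thread, started alongside the Process Nodes, e.g. `new Thread(new UsageMonitor(probe, 0.6, request)).start()`.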
The Node Monitor has four Process Nodes which are connected by two one2one
Channels. Each Process node runs a Process Manager for incoming processes to
connect with. At any given point, if either the Client or Server are waiting, they do so
idle, consuming no processing power.
Process movement is handled mostly by the nodes, to avoid over-reliance on the servers
involved. If a process has to be stopped, it is sent directly from the Process Manager
running it and transferred straight to a new Client, rather than via the Access Server.
This essentially allows the system to move processes in the most direct manner
conceived.
The system conforms to a client-server pattern between the Universal Node and the
Access Server. They are connected at initialisation by an any2net (toAccess) and a
numberedNet2One (processRecieve) Channel. This is also true for the relationship
between the Access Server and the Process Servers; however, there is only one
connection for interaction, as the Process Servers have nothing to return.
7.2 Components
Detailed below are all the components which connect the system together, as well as
their role in the whole process.
7.2.1 Nodes
In the context of this system, Nodes are autonomous, concurrently running
processes. They control connectivity to the process locations, deal with work and
monitor CPU usage.
7.2.2 Node Monitor
The Node Monitor initialises the user system and creates a connection to the Access
Server, adding its IP and port location on connection and removing that location when
disconnecting. Currently, the server address is hard-coded, but any server with the
same infrastructure could be added and defined by the user.
It self-monitors its respective instance of a JVM for CPU usage and keeps track of
which process nodes are in use.
The Node Monitor requests processes to be run and delegates the work to the
available Process Nodes asynchronously.
It can also stop Process Nodes from continuing work when the CPU load is too high. It
then selects the last Node activated, requests another Universal Client location from
the Access Server and sends the location to the Process Node.
7.2.3 Process Nodes
Process Nodes receive process definitions and put them to work using a Process
Manager. Each Process Node provides channel ends to which the Process
definitions can connect, facilitating interaction between the received process
definition and the host.
This connection allows the Process Node to inform Processes to stop and move
when a new channel location is received, as well as to alert the Node Manager when
a process has finished.
7.2.3.1 Process Manager
The Process Manager (detailed in section 3.3.1) runs the processes received
concurrently.
7.2.4 Channels
Channels comprise two channel ends:
• A channel input, where data is read into the system component
• A channel output, where data is written out of the system component
Channels in this system are generally one-to-one connections. The only exception is
the stop line from Process Node to Process Manager. This is an any2one
connection, where the input can come from any node but the output is a specific
channel end.
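A plain-Java picture of the any2one idea (JCSP supplies this channel type directly; here a shared blocking queue stands in for it, and the names are illustrative): any number of writers share the output end, while exactly one reader owns the input end.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class Any2OneDemo {

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> stopLine = new LinkedBlockingQueue<>();

        // Several writers share the output end of the stop line...
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            new Thread(() -> stopLine.offer("stop from node " + id)).start();
        }

        // ...while a single reader owns the input end.
        for (int i = 0; i < 3; i++) {
            System.out.println(stopLine.take());
        }
    }
}
```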
Figure 15. Any2One channel concept
7.2.5 Net Channel
Net Channels work in the same way as regular Channels but the output is directed to
a designated port at a new IP address.
7.2.5.1 Automatic Net Channels
Generated during runtime, Automatic Net Channels create a Channel Input on-the-fly
and use input IP addresses as their location.
7.2.6 Servers
The Servers keep track of the Clients available and allow the Client hosts to initialise,
waiting for processes to run.
7.2.6.1 Access Server
The Access Server holds the IP locations of the Process Servers, and connects users to
the processes requested. The Process Servers' IP addresses are stored and connected
to whenever an instance of the associated process is requested by the user.
This server deals with user access requests (capabilities; in this system, an
interface), process requests, find-other-client requests and client dismissals.
7.2.6.2 Access Manager
The Access Manager registers newly initialised Nodes onto the server and keeps
track of active clients. This is the basis for finding new client locations when a client
becomes overloaded.
7.2.6.3 Process Servers
The Process Servers provide Process Definitions with an IP address and port at which
they are accessible to the Access Server. The Access Server must know these
locations at initialisation in order to incorporate them into the Client capabilities.
However, the Process Definitions themselves can be amended and adapted during
runtime, as the location is the only parameter needed between requests.
7.2.7 Process Definitions
Process Definitions are objects with their own self-contained logic and variables
activated by a run method. They conform to the CSProcess and Serializable
interfaces.
7.2.7.1 Agent Definitions
Agents afford the same capabilities as other Process Definitions but introduce connect
and disconnect methods. This allows Processes to travel with channels defined,
connecting on reception. It is up to the host process to establish the channel
connections.
7.2.7.2 Agent Channels
The Agent Channels allow the host process to connect to the internal logic being run
by the Process Manager. The channels are defined in the host process to then be
connected on reception of the Agent (before running the agent process definition),
during host run time by the connect method.
The input and output to the Agent, and the input and output from the host, are then
connected together as seen in figure 16.
Figure 16. Internal Connection Mechanisms of Agent
7.2.8 Request Identification
Request objects allow the Access Server to react in the manner required to process
the data received in the correct way. The simplest is ClientRequestData, which
dictates that the string sent within the object corresponds to the service needed
(i.e. "Process Spawn" requires service B) and carries the address of the requesting client.
Other requests comprise simple IP addresses that need to be interpreted in
different ways. Address locations were packed into distinct objects to differentiate
the contexts in which they were to be treated. These include:
• ClientLocation
o Registers the Client and sends capabilities
• LeaveRequest
o Removes Client details from Access Server
• NodeRequest
o Request another Client be found with different IP to send processes to
• NewRequest
o The same as a Node request, but used exclusively by the Process Manager,
and includes the Node's ID
7.2.9 Implementation
The system runs in the following manner.
Migration
• Process Servers are initialised, followed by the Access Server, at set
IPs
• The Universal Client then instantiates itself with a base IP address and a
randomly generated port. It starts four connected Process Managers.
Screenshot 4. Client Initialising UI
• If the port matches another, an error message is shown asking the user to try
again (range 1–10,000)
Screenshot 5. Server Not Started or Crashed error message
• The Client connects to the Server and the Server enrols the client into its list
Screenshot 6. Console log: Node registered on server
• The Server then sends back the Client capabilities
Screenshot 7. Basic user UI
• The Client can then choose different processes to call
o "Ready" is shown in the console; as the system does not need to show
the general public its workings, the Eclipse console is used to
monitor transactions
Screenshot 8. Console Log: Node showing ready
• The service needed and the IP of the Client are then sent to the Access
Server, which then relays these values to the process server needed.
• The Process Server then sends the process to the requesting node directly
• The Universal Client node then assigns the work to one of its free Process
Nodes and marks that node as unavailable
Screenshot 9. Console log: Node doing work and releasing Process Node 1 when finished
• When the first process is received, the Node Manager spawns a new
thread to monitor the CPU usage.
• The process (agent) is then connected to the Process Node and the
process is run.
• Once finished, the Process Node is released to work again
• At any given point, a new node can become active
Stopping and Moving
• Once a node consumes too much CPU, the Manager notifies the
server that it needs a new node.
• Another node is chosen and the address returned to the requesting node
• The manager then selects an active node (the last process manager started)
and sends a message with the new address to the Process Node.
• The Process Node then interprets that type of object and stops the Agent
whilst simultaneously letting the Node Monitor know it can release that
Process Node
Screenshot 10. Console Log: When Process 4 starts, CPU is high (62%), the agent is contacted (I
am reading), the Process is disconnected and sent (LETS GO), and Process Node 4 is released
• The Agent then packs itself and sends itself to the next node where it
continues
• When the node is closed, the server is alerted and removes it from its
active clients
Screenshot 11. Console Log: Server deletes address
To clarify, a server interaction diagram has been created to reflect the order of events,
shown in Figure 17.
Figure 17. Server Interaction Diagram for Prototype
7.3 Experiment Setup
In order to test the validity of the system, the work described in 4.1.2 was completed
20,000 times per run, for a total of 20 runs, and the duration of each run was timed.
The CPU usage was also recorded using MXBeans (for accuracy) and averaged. The
experiment was conducted on computers with the specifications below.
Hardware
• CPU - i7 4770 @ 3.4GHz
• RAM - 16GB DDR3
• GPU - NVIDIA NVS 510 (2047 MB)
• OS - Windows 7 Professional 64-bit
• Network Speed - 1GB/s
7.4 Results
These experiments were conducted 12 times in each of the following configurations,
and the results averaged, ignoring the two polar outlying values:
1. A single computer running the processes sequentially, with processes hosted
locally.
2. A single computer running the processes concurrently over 4 process nodes,
with processes hosted on process servers.
3. Two computers running the load-balancing system, with processes from process
servers.
The results are detailed below (Figure 18).
Workers                          Average time taken   CPU Usage
1 CPU: 1 Sequential Worker       24.36 seconds        12%
1 CPU: 4 Concurrent Workers      10.18 seconds        87%
2 CPUs: 8 Concurrent Workers     8.78 seconds         46%
Figure 18. Table of Experiment Results
7.5 Comparative Analysis
By visualising the data collected, we can see the correlation between the number of
CPUs, time taken and work done.
Figure 19. Test results Graph; CPU Usage and Time Spent
Speed:
• Increasing workers increases the speed of the work
o This is not proportional to the number added, but is a vast improvement
o A directly proportional speed-up was never expected, due to communication
overheads
• Adding an additional CPU caused a minor increase compared to increasing native
resources
o Due to synchronisation and distribution times, limited by connection
protocols (the network speed is very fast)
o A speed-up is still apparent
CPU Usage
• CPU usage for a single process is very low
o To be expected, as the CPU is doing the least amount of work at a time
during execution of the test
• CPU usage increases seven times over for 4 workers
o Although more CPU usage was expected to be consumed, it was
not expected to grow this much
• The CPU added for balancing reduces CPU usage to almost half
o Considering the difference between the sequential and concurrent
methods, almost halving the stress is a great result
7.6 Local Concurrency Vs Distributed
The results trend toward better performance in terms of time and processing
consumption. Performance does not, however, scale proportionally as more CPUs are
added. It was assumed going into the experiments that there would be a boundary on
performance based solely on communication times.
Judging from the sharp change in CPU usage, however, we can conclude that the
system does balance the load whilst increasing processing efficiency. This is
logical: more workers, doing more things.
For small amounts of work, however, sequential processing will yield better results,
due to the small values saved and the little processing needed compared to
moving data around a network. That said, small amounts of work are not what
the system was designed for.
8 Conclusion
8.1 Has the Project met its Aim and Objectives?
The aim of this project was to create a system which can distribute work and regulate
set work over multiple computers, ensuring CPU usage does not exceed a specified
threshold on each terminal.
As the tests in 7.5 show, the functionality to facilitate regulation does exist in the
current prototype. The main objectives stated in 1.2 are recapped and addressed
below:
1) A method of monitoring CPU usage must be implemented in the JVM over multiple
CPUs.
The Sigar API (and Java MXBeans to a certain extent) afford this functionality. By
spawning a thread in the Universal Client's Node Monitor, the monitoring
function remains active throughout execution. It is not affected by other
events and allows constant vigilance.
Although this project did set out to complete everything at a high level, there
was one barrier which could not be dealt with otherwise. It can be argued,
though, that most of the Java native methods run via the JNI are themselves
C code, so this still conforms to implementation within the JVM.
2) Processes must have a way to be interrupted and saved in their current state.
With the system sending process definitions, the running position of a
process using a Process Manager is reflected in the state of the object. By
delegating saving responsibility to the subtype in process management, we
can essentially pick up the work from a previously running instance.
As explored in chapter 6, it is impossible to serialize and send threads using
high-level techniques; given that, this method yields a large amount of efficiency,
provided variables are saved in a tolerable fashion.
3) Processes need to have a way to move and reinitialise at different nodes on
different CPUs.
Using the Serializable interface, Channels, Process Managers and objects
containing process definitions, this aspect of the system has been successfully
implemented and has been rigorously studied.
It can be concluded that the aims and objectives have been accomplished. The system
outlined at inception has been completed as a proof-of-concept, functional system,
as long as the user controls the processes introduced. However, during development
and implementation, more aspects were identified which need to be addressed
in order to label this project finished.
8.2 Deployment Analysis and Critique
8.2.1 CPU Monitoring Critique
With user supervision, the system can be seen to send, receive, run, stop and move
processes. The CPU monitoring gives adequate coverage and timely response to
spikes in CPU usage. Ideally, MXBeans should be used if deployment can be
guaranteed in an environment with no other instances running on the JVM, as the
results tend to be more accurate.
Using the JNI and C results in CPU polling roughly once for every thousand
instructions, and gives insight into process CPU usage at that instant.
The information available via Sigar (CPU usage time) does not update continuously
and, being instantaneous, can sometimes return 0, making viable readings even more
infrequent. However, the frequency and scope of accuracy are still adequate for this
system to function.
8.2.2 Process Movement Critique
Within the time constraints, the project was built to prove that active process
migration could be achieved, and the mechanics and theory behind the actual
process movement are sound. However, user-end process management requires
more work.
The problem pertains to the number of Process Nodes at each Universal Client. As
each process definition needs a manager to connect to, a single Process Manager does
not suffice for the intended process interaction. So, if more than four processes are sent,
the Manager Node has no way of dealing with the excess processes read.
At this point the client-server environment breaks down, as the Client is no longer
waiting for input, and a deadlock can occur if a Process Node is in a busy state at the
point of reception. Having redundant nodes on the system which receive the overflowing
processes could relieve nodes in this case, as could simply instantiating more Process
Nodes at run time.
Adapting this aspect of the system really depends on whether the user intends to
regulate large amounts of work in a cluster, or wants to use the program
in the background of home systems to automate smaller projects. The scalability
options of the system in these respects are a great resource.
8.3 Further Research and Work
Aside from user testing, small patches and implementing a targeted application (such
as distributed ray tracing), the identified improvements in functionality are listed
below.
8.3.1 Process Interaction
Currently, once processes are distributed, each process sent must be a standalone
procedure. For processes to interact, the main server would have to be more involved,
keeping note of which processes have been distributed where. The list of current
clients could be expanded into a list of lists, containing the Node address as well as the
current processes. If we consider one process at each node for simplicity, cross-process
interaction could be implemented by doing the following:
Figure 20. Node interaction diagram
1) A Client would request, from the server, additional data relevant to the process
being run.
2) The Server, knowing which processes are running in the overall system, would
find a node running the needed data and halt its procedure.
3) The required node would confirm it is ready to set up a connection with the other
node. The requesting node must initiate the setup, to a node which is currently
paused, due to the nature of channels having to have an input end set up as a
prerequisite for communication.
4) The new node address would be sent to the initial client, where the relevant Net
Channels would be automatically created, similar to those used when moving
processes, for the transfer and control mechanics.
5) The nodes would then act like a client and server. The server node would then
send an initiation signal, causing the client node to run and the transfer to begin.
With the current infrastructure of the implemented prototype, and some configuration,
this new system could be successfully implemented. The framework of this design is
not hard to implement in theory, but the semantics and order of communication would
have to be thoroughly deliberated upon.
8.3.2 Process Node Quantities
This is simply allowing the user to define how many Process Nodes they would like to
initialise. In order to keep processing limits within a reasonable window, the user's
processing capabilities would have to be assessed, limiting the number of concurrent
processes.
This would also require either the user or the developer to have prior knowledge of the
estimated processing power that each individual process can consume, otherwise
the system could spend a lot of time moving processes.
8.3.3 User Defined Processes
Implementing user processes would involve two specific points of contention:
1) Methods would have to be adapted to conform to Agent classes
2) Code must be runnable.
This means code would have to be scanned or tested during run time to ensure all
aspects are serializable. This could be done by creating a Test Node comprising a
try-catch system which returns exceptions when they are met.
Having runnable code is the main function of the CSProcess class, so methods
would have to be identified at input. This could include an interface which asks for
variables and the associated process separately.
Another method, which would involve having some knowledge of the system, would be
implementing wrapper classes to affix the required connect methods for
Agents, provided the user understands CSProcesses.
8.3.4 Extended Network to Internet
This method is easily implementable, but does not conform to the aims of this report.
By simply changing the node and server IP addresses from local to public IPs,
the system's scale can be opened up to users in any location.
The problem then lies with security. There are currently no security measures in
place during communication. Although the mechanisms of the system are not
commonplace in Java, objects are still a universally used data type.
8.3.5 Automated Process Delivery
As the system stands, the universal clients are tasked with acquiring processes. This was implemented to regulate the speed of requests and to allow easier debugging.
Automated process delivery can be implemented by keeping track, at the Server, of how many processes are running at each node.
If the Server records a Client node with free Process Nodes, it can continue to send more processes to the underloaded area. Polling for CPU usage at the completion of tasks, to indicate whether more processes are needed, would produce a well-balanced system overall, but would also result in higher volumes of traffic.
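The Server-side bookkeeping described above might look like the following sketch; the class name and reporting protocol are hypothetical, not taken from the prototype.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative Server-side planner: track how many processes each Client
// node is running and pick the least loaded node for the next delivery.
public class DeliveryPlanner {
    private final Map<String, Integer> running = new HashMap<>();

    /** A Client node reports its current process count (e.g. at task completion). */
    public void report(String node, int processCount) {
        running.put(node, processCount);
    }

    /** Returns the node with the fewest running processes, or null if no
     *  node has reported yet. */
    public String leastLoadedNode() {
        String best = null;
        int min = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : running.entrySet()) {
            if (e.getValue() < min) {
                min = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Whether nodes report on every task completion or on a timer is exactly the traffic trade-off discussed above.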
As previously stated, these options should be aligned to the chosen application of the system and could be exposed as user controls at initialisation.
8.4 Reflective Statements
During this project there have been multiple setbacks: some which could have been avoided and some which were unforeseeable. As with most large projects, developers will never be truly happy with what they have accomplished. Despite meeting the initial aims of this investigation and being relatively pleased with the finished product, there are still areas which could have been addressed sooner and shortcomings which will not be repeated in the future.
1) Progress trail
The first objective during this project was establishing a method of progress monitoring. In turn, a blog was created to document progress. However, the first incarnation was hacked after two weeks.
This was a major setback in the project and resulted in a decline in adequate tracking. In the future, security implementation for a public web space will be adhered to.
More importantly, a structured, documented development diary will be a higher priority in the future. Keeping track of developments and meetings would have led to a much more streamlined approach and a better implementation overall. This also pertains to the week 7 report, which took place in the form of a viva voce in the Napier Games Lab in the first week of December.
2) Inadequate background understanding
Going into this project, I believed I had sufficient understanding of the fundamental concepts and technologies involved to create this system. Searching for previous attempts at the problem proved fruitless (see Appendix A), indicating there was not a lot of reading on the subject. The IPO, although it shares the same conceptual ethos, talks about accessing system hardware from a high-level language, and was naïve in considering some of the goals for the time permitted and the level of work expected.
However, we never know the depth of our own ignorance. This proved true when, halfway through the project, I realised that a thread, the main method of running work, was not serializable.
With future development, I will ensure that I read not only papers on implementation, but also technical documentation on processes and data types, to ensure I grasp the conceptual limitations as well as the technical limitations.
In summation, I have learned that preparation and process are just as important as the actual development.
3) Time Management
For some of the project, personal circumstances dictated a lack of work, but time management could have been much better from the start. Work days were established as Wednesdays, but this was not particularly adhered to at the start of the project. A Gantt chart was drafted but, after personal circumstances interfered, it was not reviewed until over half the allotted time had transpired.
More tests into the efficiency of the system can still be run and should be considered part of further work.
9 References
1. Austin, P., & Welch, P. (2008). CSP for JavaTM (JCSP) 1.1-rc4 API
Specification. Retrieved from CSP for Java:
https://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp-1.1-rc4/jcsp-doc/
2. Austin, P., & Welch, P. (2008). Interface CSProcess. Retrieved from CSP for
Java: https://www.cs.kent.ac.uk/projects/ofa/jcsp/jcsp-1.1-rc4/jcsp-
doc/org/jcsp/lang/CSProcess.html
3. Chalmers, K. (2008). Investigating Communicating Sequential Processes For
Java To Support Ubiquitous Computing. Edinburgh Napier University.
Retrieved April 22, 2016, from
https://www.researchgate.net/publication/239568086_INVESTIGATING_COM
MUNICATING_SEQUENTIAL_PROCESSES_FOR_JAVA_TO_SUPPORT_U
BIQUITOUS_COMPUTING
4. Chalmers, K., Kerridge, J. M., & Romdhani, I. (2007, July 8-11). Mobility in
JCSP: New Mobile Channel and Mobile Process Models. Retrieved 04 24,
2016, from ResearchGate: https://www.researchgate.net
5. Chalmers, K., Kerridge, J. M., & Romdhani, I. (2008). A critique of JCSP Networking. The thirty-first Communicating Process Architectures Conference (pp. 7-10). York: P.H. Welch et al. doi:10.3233/978-1-58603-907-3-27
6. Doallo, R., Expósito, R. R., Ramos, S., Taboada, G. L., & Touriño, J. (2013,
May 1). Java in the High Performance Computing arena: Research, practice
and experience. Science of Computer Programming, 78(5), 425-444.
Retrieved April 22, 2016, from
http://www.sciencedirect.com/science/article/pii/S0167642311001420
7. Doallo, R., Taboada, G. L., & Juan, T. (2009, April). F-MPJ: scalable Java
message-passing communications on parallel systems. The Journal of
Supercomputing, 60(1), 117-140. Retrieved April 22, 2016, from
http://link.springer.com/article/10.1007/s11227-009-0270-0
8. Funika, W., Godowski, P., & Pęgiel, P. (2008). A Semantic-Oriented Platform for Performance Monitoring of Distributed Java Applications. Computational Science – ICCS 2008, 5103, 233-242. Retrieved April 22, 2016, from http://link.springer.com/chapter/10.1007/978-3-540-69389-5_27#page-1
9. Hoare, C. A. R. (2004). Communicating Sequential Processes. Prentice Hall International. Retrieved April 22, 2016, from http://www.usingcsp.com/cspbook.pdf
10. Islam, N., & Shoaib, S. (2002, June 24). US Patent No. US 7454458 B2. Retrieved April 22, 2016, from https://www.google.com/patents/US7454458
11. Jenkov, J. (n.d.). Retrieved from http://tutorials.jenkov.com/java-concurrency/java-memory-model.html
12. Kerridge, J. (2014). Using Concurrency and Parallelism Effectively - 2nd edition. BookBoon.
13.Lam, K. T., Luo, Y., & Wang, C.-L. (2010). Adaptive sampling-based profiling
techniques for optimizing the distributed JVM runtime. Parallel & Distributed
Processing (IPDPS), 2010 IEEE International Symposium on (pp. 1-11).
Atlanta: IEEE. doi:10.1109/IPDPS.2010.5470461
14. Lemos, J., Simão, J., & Veiga, L. (2011). A²-VM: A Cooperative Java VM with Support for Resource-Awareness and Cluster-Wide Thread Scheduling. On the Move to Meaningful Internet Systems: OTM 2011, 7044, 302-320. Retrieved April 22, 2016, from http://link.springer.com/chapter/10.1007%2F978-3-642-25109-2_20
15. MacEachern, D. (n.d.). (C. Technologies, Producer, & Hyperic) Retrieved from
https://support.hyperic.com/display/SIGAR/Home
16. Meddeber, M., & Yagoubi, B. (2010, September 22). Distributed Load
Balancing Model for Grid Computing. ARIMA Journal, 12. Retrieved April 22,
2016, from http://arima.inria.fr/012/pdf/Vol.12.pp.43-60.pdf
17. Olivier, S. (2008). Scalable Dynamic Load Balancing Using UPC. 2008 37th
International Conference on Parallel Processing. Portland: IEEE. Retrieved
April 22, 2016
18. Oracle. (2015, 02 14). Learn About Java Technology. Retrieved from Java: http://java.com/en/about/
19. Oracle. (2016). Interface OperatingSystemMXBean. Retrieved from Java™
Platform, Standard Edition 7:
https://docs.oracle.com/javase/7/docs/api/java/lang/management/OperatingSy
stemMXBean.html
20. Shaw, B. (n.d.). Retrieved from http://www.codeproject.com/Articles/30422/How-the-Java-Virtual-Machine-JVM-Works
21. Winias, T. B., & Brown, J. S. (n.d.). Retrieved from
http://www.johnseelybrown.com/cloudcomputingpapers.pdf
22. Xoreax Software Ltd. (2016). Incredibuild. Retrieved from Incredibuild Beyond
Acceleration: https://www.incredibuild.com/
Appendix
A. Searched Terms
All results from 2005 were considered for inclusion.
Some later searches returned “0 Relevant” because their results duplicated those of earlier searches.
Checked as of 22/04/2016
• “Load Balancing in Java” :
o “Distributed Load Balancing Model for Grid Computing” (Meddeber & Yagoubi, 2010) – Focuses on modelling topologies of balancing with basic information on system implementation
o “Scalable Dynamic Load Balancing Using UPC” (Olivier, 2008) – Uses
Unified Parallel C
o “Method and system for application load balancing” (US Patent No. US
7454458 B2, 2002) – Patent for similar system with no implementation.
Only conceptual with ambiguity in implementation.
• “CPU load balancing in Java” :
o “A Semantic-Oriented Platform for Performance Monitoring of
Distributed Java Applications” (Funika, Godowski, & Pęgiel, 2008) –
Platform for monitoring resources for online Java technologies
• “Java cluster computing”
o “Java in the High Performance Computing arena: Research, practice
and experience” (Doallo, Expósito, Ramos, Taboada, & Touriño, 2013)
– Looks into the methods facilitating the possibilities of High
Performance code using Java (Shared memory model, MPI etc...)
o “F-MPJ: scalable Java message-passing communications on parallel systems” (Doallo, Taboada, & Juan, 2009) – Different MPI implementation document
• “Load balancing cluster computing Java” : 0 Relevant
• “CPU balancing cluster Java” : 0 Relevant
• “Load balancing cluster JVM” :
o “A²-VM: A Cooperative Java VM with Support for Resource-Awareness and Cluster-Wide Thread Scheduling” (Lemos, Simão, & Veiga, 2011) – Cluster infrastructure for Cloud computing systems
o “Adaptive sampling-based profiling techniques for optimizing the distributed JVM runtime” (Lam, Luo, & Wang, 2010) – Builds a system based on global variables for the cluster, paying close attention to thread stacks
• “Load balancing cluster JCSP” : 0 Relevant
• “Load balancing asynchronous cluster Java” : 0 Relevant
• “CPU monitoring load balance cluster Java” : 0 Relevant
• “Cluster process sending Java” : 0 Relevant
Initial Project Overview
SOC10101 Honours Project (40 Credits)
Title of Project: CPU Load Balancer
Overview of Project Content and Milestones
The Main Deliverable(s):
I intend to create a system which monitors CPU core usage over a cluster of
computers and calls another terminal to take on more load when one is starting to
reach maximum capacity; increasing speed and efficiency overall.
The system will implement the use of Agents which will move around the system,
arriving at each node (processor or core in this case) and connect to their main
processing stack to ascertain the current efficiency. Once finished, the Agent
disconnects and then moves itself on to the next core in the system. Using multiple
agents will be a goal for the project and attaining basic concurrency will be the first
milestone event.
As such, the system will be designed and implemented using the Groovy 2.3 libraries for Java. This allows the user to easily manipulate threads at a high level through the predominant use of message passing. It is not certain whether a hybrid of message passing and shared memory will be attainable, as it is noted that pure message passing has a large overhead for copying messages from one process to another. This is not a problem at a high level of programming, but at CPU or even GPU instruction speeds it is worth noting that it is not certain whether this will have a positive or negative impact.
Testing of the system will include the use of software metrics to ensure results are as expected in certain situations, such as the coherency of specific function calls at the point of load shifting. CPU usage will be constantly observed and compared across different methodologies, and will be documented and collated in full throughout the whole report.
The final product will be discreet during use and will not increase processing overhead between operations when Agents are idle or during their transit between nodes. It will be easy to initiate and close, with a basic visual monitoring system for the user, including concrete feedback for changes or problems. It should automatically detect the number of cores in use and be proficient over different architectures, although Intel-based chips will be the basis for development. It is not obvious at the moment whether hyper-threading can be used in conjunction, but it will be documented when attempted.
The Target Audience for the Deliverable(s):
As the system will spread over multiple computers, it will be hindered by physical constraints and the associated speed ramifications. Hence, as a proof of concept, the system will handle large computation problems which are not I/O dependent. As such, the system will be used to aid with large computations, or by those in need of makeshift data farms.
The Work to be Undertaken:
• Design a system which allows concurrent processing in a cluster computing
environment
• Dealing with interaction with other devices over a network
o Adapting system to work on Mobile Devices
• Comparative analysis of communication methods (i.e. Ethernet, Wi-Fi etc.)
o Analysis of result output in correlation with message passing
parameters
• Comparative tests on different hardware architectures
Additional Information / Knowledge Required:
• Java Language
o Groovy library knowledge
• Concurrent and Parallel architecture knowledge
• Fundamental Android understanding (for mobile development)
• CPU usage metrics
Information Sources that Provide a Context for the Project:
Background and Rationale:
Computer hardware has evolved, and so has the amount we attempt to implement at any given point. From the initial single-core processors to the octa-cores of today, engineers have strived to build the most powerful computers with ever greater speeds. However, over time it has become apparent that the implementation methods we have been working from, and towards, are starting to level off. In the past, the first step in augmenting any computer in terms of speed and performance has been reducing transistor size and thereby increasing speed. Co-founder of Intel Gordon E. Moore stated that the number of transistors able to fit on a processor would double every 18 months, fundamentally increasing the speed of computers for at least the next decade. This model of thought is still used regularly in the computing industry today; however, it was initially stated in 1965 and, since then, many things have changed.
The problems we are met with today are distance, heat and conduction. The physical distance between cache memory and cores is becoming smaller and smaller; we are approaching almost instantaneous transmission, and this comes with another set of problems. Heat is generated when a CPU core is pushed to compute at the rates we demand, requiring ever more intricate ways to cool the system, and this can all come down to bad allocation of resources.
We therefore need to look at how we balance our work. Software needs to reflect the modern multitasking environment that we have come to expect and must change in order to cope with increasing demand, as hardware cannot be relied upon to be the sole supporter in this venture. I plan to build a system which allows a proper allocation of the available resources and increases the efficiency of hardware use, in order to achieve a faster, more reliable system.
The Importance of the Project:
This project will be a proof of concept for using multiple computers in a personal environment to complete large computational problems discreetly, with little impact on performance as a whole.
The Key Challenge(s) to be Overcome:
The initial challenge will be to ascertain whether an agent can become active when CPU usage reaches a certain level on a terminal. On activation, the agent will report to a central repository of addresses and move to a new terminal with lower CPU usage. From here it should be able to display a message on that machine. This will be done as outlined below:
• Use a Monte Carlo algorithm to process a large computation
o Create an Agent to look at CPU usage
o CPU usage should report high
o Have the Agent report to another resource
o Println “I am overloaded”
o Then build an event handler that has access to the channel which is waiting for input from the processor
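The activation decision in the steps above could be sketched using the standard OperatingSystemMXBean (see reference 19). The class name and the idea of normalising load by core count are illustrative assumptions, not the dissertation's actual implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical sketch: decide whether an Agent should activate by comparing
// observed CPU load against a configured threshold.
public class OverloadDetector {
    private final double threshold;

    /** threshold is a fraction of capacity, e.g. 0.8 for 80%. */
    public OverloadDetector(double threshold) {
        this.threshold = threshold;
    }

    /** Pure decision logic: load is a fraction of capacity (0.0 to 1.0). */
    public boolean isOverloaded(double load) {
        return load >= threshold;
    }

    /** Samples the 1-minute system load average, normalised by core count.
     *  getSystemLoadAverage() may return a negative value on platforms
     *  (e.g. Windows) where it is unavailable. */
    public double sampleLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage() / os.getAvailableProcessors();
    }
}
```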
From here, we can then move on to moving key data. The intention is to create a central repository of agents which then looks for a node which does not have an agent active. From here we can move resources to the new processor.
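The Monte Carlo computation named in the steps above could be any embarrassingly parallel workload; a simple pi estimator is sketched here as an illustrative stand-in, not the dissertation's actual workload.

```java
import java.util.Random;

// Illustrative CPU-bound workload: estimate pi by sampling random points in
// the unit square and counting those inside the quarter circle.
public class MonteCarloPi {

    /** Returns an estimate of pi; a fixed seed makes the run repeatable. */
    public static double estimate(long samples, long seed) {
        Random rng = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        return 4.0 * inside / samples;
    }
}
```

Such a loop keeps a core busy with negligible I/O, which is exactly the property the target-audience section asks of the test workload.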
The biggest challenge to overcome, if the above system is completed in due course, is to implement it on a single CPU. Using cores would be the ultimate goal to spread even usage on one terminal but, in choosing Java as the main platform, the JVM involved gives little potential for working across cores. Using a different language could be an answer, but would require a large amount of research and development. For the time being, what is detailed in the main deliverables is the main aim.