Since its introduction in the mid-1990s, Java has gained popularity and has had a
great impact on IT-based sciences, engineering and commercial applications.
Java is designed to be simple, object-oriented and user-friendly, so that it is easy to
learn, to program and to develop into applications. It is also designed for creating
highly secure and robust software applications on various platforms in heterogeneous,
distributed networks. To run a Java application on different hardware and software
platforms, the Java source code is first compiled by the Java compiler into an
architecture-neutral intermediate format, i.e. bytecodes. The generated
machine-independent bytecodes are then interpreted by the Java Virtual Machine (JVM)
into machine code for execution.
The JVM is a specification of an abstract machine for which a Java compiler can
generate code. It has a stack-based architecture. A specific implementation of the JVM
for each hardware or software platform allows the same bytecodes to run on these
different platforms.
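The stack-based execution model mentioned above can be illustrated with a toy interpreter. This is a minimal sketch only: the class name StackMachineSketch and the three opcodes (PUSH, ADD, MUL) are hypothetical and are not real JVM bytecodes. It shows how operands are pushed onto an operand stack and consumed by instructions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack machine; the opcodes are hypothetical, not real JVM bytecodes. */
class StackMachineSketch {
    static final int PUSH = 0; // push the next value in the program
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product

    /** Interprets the program and returns the value left on top of the stack. */
    static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter
        while (pc < program.length) {
            int opcode = program[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(program[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        System.out.println(run(program)); // prints 20
    }
}
```

Because the program is only a sequence of integers, the same program can be interpreted on any platform that provides the interpreter, which is the portability argument made for bytecodes.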
[Diagram: myfiles.java, class files, Java Virtual Machine, native method invocation,
Host Operating System]
Diagram 1 Java Virtual Machine
Java also supports concurrency through its multithreading mechanism, which allows
instructions to execute in parallel. Threads are provided by the Thread class in the
language library. A thread is a section of code that executes independently of other
threads of control within a single program. Multithreading therefore creates the
potential for parallel programming in Java.
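As an illustration of explicit thread use, the following minimal sketch splits a summation across several java.lang.Thread instances and waits for all of them with join(). The class and method names (ThreadSketch, parallelSum) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class ThreadSketch {
    /** Sums 1..n by splitting the range across the given number of threads. */
    static long parallelSum(int n, int threadCount) {
        AtomicLong total = new AtomicLong();
        Thread[] workers = new Thread[threadCount];
        int chunk = (n + threadCount - 1) / threadCount; // ceiling division
        for (int i = 0; i < threadCount; i++) {
            final int from = i * chunk + 1;
            final int to = Math.min(n, (i + 1) * chunk);
            workers[i] = new Thread(() -> {
                long partial = 0;
                for (int k = from; k <= to; k++) {
                    partial += k;
                }
                total.addAndGet(partial); // thread-safe accumulation
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1000, 4)); // prints 500500
    }
}
```

Each worker touches shared state only once, through the atomic counter, so the threads interact very little. Threads of this kind are the natural candidates for distribution across nodes.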
Computer clustering is a method of grouping multiple computers through a fast local
area network to form a collaborative supercomputer environment. Clusters are
configured differently for different purposes, such as High-Availability Clusters,
Load-Balancing Clusters, High-Performance Clusters and Grid Computing. An application
running on such a group of computers behaves as if it were running on a single
terminal, while the workload is shared among the clustered computers.
At present, the conventional JVM still lacks support for the cluster environment.
It runs as an instance on a local system, and its workload cannot be shared
with other JVM instances within the cluster.
This project intends to design a platform, based on the Jikes Research Virtual Machine
(JikesRVM), that allows a single Java application to run on multiple terminals within
a cluster. The programmer makes explicit use of Java threads in the application, and
the platform distributes these threads among the terminals without the programmer
being aware of it. The workload can thus be distributed and shared among the
terminals in the cluster to achieve higher performance. In addition, benchmarking
suites and tools will be developed to measure the performance of the system.
During the execution of the Java application, the threads are distributed and executed
within the cluster through TCP connections. The application is therefore expected to
achieve higher performance on the networked JVMs than on a single JVM.
[Diagram: a Java class file on the main node; Java threads are distributed to the
peer node through a TCP connection]
Diagram 2 Basic model of the workload distribution
The primary idea of this project is as follows. First, the Java class file is run on a
modified version of the JVM, in this case the JikesRVM. The main node is the node
where the multithreaded application is started. Before it starts to distribute the
workload, it creates sessions with the JVMs running on the peer nodes. These peer
nodes wait for incoming requests from the main node.
The main node and the peer nodes communicate through TCP connections. The setup of
the communication channels, I/O redirection, etc. is completed before threads are
migrated and executed.
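The session setup described above can be sketched with plain java.net sockets. This is an illustrative sketch, not the proposed JikesRVM implementation. The class name SessionSketch, the HELLO/READY handshake messages and the use of the loopback interface are all assumptions made for the example. Both roles run in one process so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SessionSketch {
    /** Peer node: accepts one session and answers the hypothetical HELLO with READY. */
    static Thread startPeer(ServerSocket listener) {
        Thread peer = new Thread(() -> {
            try (Socket session = listener.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         session.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(new OutputStreamWriter(
                         session.getOutputStream(), StandardCharsets.UTF_8), true)) {
                String request = in.readLine();
                out.println("HELLO".equals(request) ? "READY" : "ERROR");
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        peer.start();
        return peer;
    }

    /** Main node: opens a session to the peer and returns the peer's reply. */
    static String openSession(InetAddress host, int port) {
        try (Socket session = new Socket(host, port);
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     session.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     session.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println("HELLO");
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs both roles in one process over the loopback interface. */
    static String demo() {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = startPeer(listener);
            String reply = openSession(listener.getInetAddress(), listener.getLocalPort());
            peer.join();
            return reply;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints READY
    }
}
```

In a real deployment the peer would keep the session open and wait for further requests. The one-shot handshake here only shows that the channel is established before any work is sent.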
Justification of study
With the ever-increasing popularity of the World Wide Web, high-performance facilities
are shifting from supercomputers to networks of workstations. Networks of computers
are usually deployed to achieve higher performance and availability than a single
computer. They are also more cost-effective than a single computer of comparable
speed and availability. Meanwhile, cluster computing has become the norm for serving
high-workload commercial applications. Efficient workload balancing and thread
migration are therefore expected to play an important role and to be widely
adopted in distributed systems.
In the past, the performance of the Java programming language was much worse than
that of other programming languages such as C or C++. However, improvements in
just-in-time (JIT) compilers have boosted the performance of Java programs and
enabled them to perform on par with C and C++ programs. Java has also emerged as a
solution that unites Web, cluster, multiprocessor and uniprocessor computing. The
Java programming language is therefore now broadly used in high-performance
distributed computing, especially for server applications.
Java offers a wide variety of interfaces and extensions for parallel and distributed
programming. This makes Java suitable as a language for High Performance Computing.
Owing to its platform independence, Java is suitable for developing highly secure and
robust server applications that run within a cluster in which each terminal may
have a different hardware or software specification.
A lightweight, transparent and efficient Java thread migration mechanism
implemented at the JVM level can help shorten the execution time of multithreaded
Java applications in a cluster environment. It automatically exploits parallelism in
the application by distributing threads, objects and classes within the cluster.
Moreover, server applications are mostly multithreaded, with each thread servicing a
different client and with limited interaction between threads. It is therefore
believed that server applications can gain a great improvement in performance from an
efficient thread migration mechanism in a cluster.
1. To study and investigate existing workload distribution and thread migration
2. To study the performance of the platform running Java multithreaded application.
3. To design a platform where Java Virtual Machine within a cluster collaborates and
works together by sharing the workload.
Prior studies show that several projects have applied Java technology to cluster and
high-performance computing, especially in designing a distributed or
cluster-aware JVM.
Java supports threads and provides concurrency constructs at the language level for
thread-based parallel computing. As cluster computing is becoming important in
high-performance computing, it is worth studying the possibility of extending the
conventional JVM to execute tasks concurrently in a cluster, so that the execution of
a single multithreaded Java program spans multiple machines.
Several existing research projects attempt to implement Java in distributed
environments:
Cluster Virtual Machine for Java (cJVM) 
This project was started by the IBM Haifa Research Lab in 1999. The main objective of
cJVM is to provide a single system image (SSI) of a conventional JVM while
running on a cluster, so that a Java application can run on cJVM without any code
modification. cJVM maintains a distributed heap among the JVMs within the cluster.
It uses a master-proxy model for object creation and a method shipping technique
for transparent remote object access. Workload is distributed within the cluster
by means of remote thread creation. cJVM runs on a cluster of IBM IntelliStations
running Windows NT, connected via a Myrinet switch. However, the project has not
been active since 2000. cJVM reportedly achieves 80% higher efficiency when running
on four cluster nodes by applying a large set of optimizations addressing caching,
locality of execution and object placement.
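The general idea behind a master-proxy model with method shipping can be sketched in a single process using java.lang.reflect.Proxy. This is not cJVM's actual implementation. The names MasterProxySketch, Counter and MasterCounter are hypothetical, and the "shipping" step is simulated by a local reflective call where a real system would send the invocation over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class MasterProxySketch {
    /** Hypothetical shared object type. */
    interface Counter {
        int increment();
    }

    /** The master copy: holds the real state on the node that created the object. */
    static class MasterCounter implements Counter {
        private int value;

        @Override
        public synchronized int increment() {
            return ++value;
        }
    }

    /** Builds a proxy that forwards every call to the master and records its name. */
    static Counter proxyFor(Counter master, List<String> shippedCalls) {
        InvocationHandler shipper = (proxy, method, args) -> {
            // A real system would serialize the call and send it to the master's node.
            shippedCalls.add(method.getName());
            try {
                return method.invoke(master, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the master method threw
            }
        };
        return (Counter) Proxy.newProxyInstance(
                Counter.class.getClassLoader(), new Class<?>[] {Counter.class}, shipper);
    }

    static int demo() {
        Counter master = new MasterCounter();
        List<String> shipped = new ArrayList<>();
        Counter proxy = proxyFor(master, shipped);
        proxy.increment();        // executed on the master through the proxy
        master.increment();       // executed directly on the master
        return proxy.increment(); // both callers observe the same state
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3
    }
}
```

The caller holds only a Counter reference and cannot tell whether it is the master or a proxy, which is the transparency property that a single system image aims for.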
Distributed JVM (dJVM) 
This is a project of the Department of Computer Science at The Australian National
University. Its objective is to provide a distributed JVM on a cluster that hides the
nature of the cluster from the Java application, i.e. an SSI. The project is based on
the JikesRVM, and its cluster, Bunyip, consists of 96 nodes with 192 processors
running the Linux operating system. This project is the first implementation of
the JikesRVM in a distributed environment.
JESSICA is a project of the Department of Computer Science at The University of Hong
Kong. It is a Java-based solution for integrating computing resources in a
heterogeneous environment. This implementation also aims to hide the distributed
nature of the cluster from the application. Instead of the distributed heap used by
cJVM, it uses the concepts of a global thread space and a global object
space, which is a subspace created through the support of a cluster-enabled
infrastructure, i.e. Distributed Shared Memory (DSM). The workload is distributed
by means of thread migration.
Some important issues have to be addressed when designing either the workload
distribution mechanism in the JVM or the architecture of the cluster-aware JVM:
• Single System Image. Many studies address the importance of a single system
image (SSI) in their implementations. John Zigman et al. and Yariv
Aridor et al. claim that their implementations hide the cluster from the
application, i.e. the application sees a traditional virtual machine, while the
system itself is aware of the cluster. To bridge the cluster with Java’s
multithreading programming model, the implementation of M.J.M. Ma et al.
encapsulates system resources across the cluster in a single layer of
abstraction. The user application running on this layer therefore sees the
encapsulated resources as a single entity. All the migration and distribution
mechanisms work without the awareness of the end user or the application.
• Lightweight. In high-performance computing, overall performance is very
sensitive to overheads. Any overhead generated at run time decreases the
performance of the system. Overheads may arise from message passing, class
loading, object migration, etc. The runtime overheads, in terms of both time
and space, needed to support thread migration should therefore be taken
seriously and minimized when designing the distributed JVM.
• Transparent. The implementation of the mechanism in the distributed JVM
should not introduce any special explicit migration call to Java threads. The
migration operation should be transparent to the Java threads. Transparent
thread migration also makes a migrated thread appear to other Java threads
exactly like a traditional JVM thread running on the local system.
Transparency further means that a migrated thread has no way to determine
on which node it is executing.
• Balancing. Threads must be distributed so as to utilize less loaded nodes, and
the workload must be spread evenly across all the machines in the cluster.
Only by maintaining a balanced load among the nodes in the cluster will the
system achieve its maximum gain in performance.
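The balancing requirement above can be illustrated with a simple greedy placement rule: each new thread goes to the node that currently carries the least load. This is a minimal sketch under stated assumptions, not the decision algorithm of any of the cited systems. The class name BalancingSketch and the idea of a known per-thread cost are hypothetical simplifications.

```java
import java.util.Arrays;

class BalancingSketch {
    /**
     * Assigns each thread to the node with the smallest current load.
     * Returns, for every thread, the index of the chosen node.
     */
    static int[] assign(int[] threadCosts, int nodeCount) {
        long[] load = new long[nodeCount];
        int[] placement = new int[threadCosts.length];
        for (int t = 0; t < threadCosts.length; t++) {
            int target = 0;
            for (int n = 1; n < nodeCount; n++) {
                if (load[n] < load[target]) {
                    target = n; // strictly lighter node found; ties keep the lower index
                }
            }
            load[target] += threadCosts[t];
            placement[t] = target;
        }
        return placement;
    }

    public static void main(String[] args) {
        int[] costs = {5, 3, 3, 2, 2}; // hypothetical estimated cost of each thread
        System.out.println(Arrays.toString(assign(costs, 3))); // prints [0, 1, 2, 1, 2]
    }
}
```

In this example every node ends with a load of 5. In practice the cost of a thread is not known in advance, so a real mechanism would have to rely on measured load rather than estimates.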
In this research, the system will be deployed on IBM’s Jikes Research Virtual
Machine (RVM). The RVM is an open source project licensed under the Common Public
License (CPL), which has been approved by the Open Source Initiative (OSI) as a fully
certified open source license. It is therefore free, open source and freely
redistributable. This JVM aims to provide research communities with a flexible open
test bed for prototyping new virtual machine technologies. It includes the latest
virtual machine technologies for dynamic compilation, adaptive optimization, garbage
collection, thread scheduling and synchronization. It has been deployed on many
platforms, such as IA-32 Linux, PowerPC 32 and 64 AIX, PowerPC 32 and 64 Linux, and
PowerPC 32 OS X.
The JikesRVM was previously developed under the Jalapeño research project at IBM’s
Watson lab from December 1997 to October 2001. It was then open sourced by IBM in
2001. A distinguishing characteristic of the JikesRVM compared with other JVMs is
that it is implemented in Java. At its first release, the aim of the project was to
produce a virtual machine for Java servers written in the Java language. Owing to
this unique characteristic, the transformation and optimization mechanisms developed
can be applied both to the application and to the JVM itself. The JVM is
self-bootstrapped by running its Java code on itself, without requiring a second
virtual machine. This implementation gives the JVM an additional degree of
portability across different platforms.
The underlying operating system that hosts the JikesRVM in this research project is
Linux, which is widely used among research communities. All the terminals will
therefore use Linux as the host operating system for the JikesRVM.
As this project aims to design a platform running within a cluster, the cluster
consists of a collection of homogeneous machines (same operating system and
architecture) connected locally by a network switch. Three Pentium PCs running Fedora
Core Linux will form the cluster in this project. These PCs are connected through a
switch, and it is assumed that, other than the interconnecting network, there are no
physically shared resources between them.
For this project, the distribution system is built on the existing JikesRVM
(version 2.4.6 and above). The JikesRVM will be modified by adding extra classes or
modifying existing classes to fit the requirements of the project. This ensures
that the modified JikesRVM can run on the cluster and that the workload can be
distributed and shared among the terminals. More workload can be injected into any
of the terminals running the modified version of the JikesRVM, and the workload will
be migrated and distributed autonomously according to a set of defined rules.
First of all, communication mechanisms have to be implemented in the JikesRVM so
that the JVMs can communicate within the cluster. All communication between the JVMs
is done through TCP connections. Then, a lightweight and transparent Java thread
migration mechanism will be implemented at the JVM level. This thread migration
mechanism is the means of migrating the workload of a thread from one JVM to another
within the cluster.
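The essence of migration, capturing execution state on one node and resuming it on another, can be sketched at the application level with standard Java serialization. This is only an analogy under stated assumptions: a JVM-level mechanism would capture a thread's stack frames inside the virtual machine, whereas this sketch keeps the progress of a hypothetical SumTask in ordinary fields. The byte array stands in for the data that would travel over the TCP connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MigrationSketch {
    /** Hypothetical unit of work whose progress lives in fields, so it can be shipped. */
    static class SumTask implements Serializable {
        private static final long serialVersionUID = 1L;
        final int limit; // sum the integers 1..limit
        int next = 1;    // next integer to add
        long sum = 0;    // partial result so far

        SumTask(int limit) {
            this.limit = limit;
        }

        /** Runs at most maxSteps additions; returns true when the task is finished. */
        boolean run(int maxSteps) {
            for (int i = 0; i < maxSteps && next <= limit; i++) {
                sum += next++;
            }
            return next > limit;
        }
    }

    /** Captures the task state as bytes (what would be sent to the peer node). */
    static byte[] capture(SumTask task) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(task);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds the task from the captured bytes (what the peer node would do). */
    static SumTask restore(byte[] state) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(state))) {
            return (SumTask) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static long demo() {
        SumTask local = new SumTask(100);
        local.run(40);                            // partial progress on the source node
        SumTask remote = restore(capture(local)); // state "migrates" as bytes
        remote.run(Integer.MAX_VALUE);            // finishes on the destination node
        return remote.sum;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5050
    }
}
```

The task resumes from step 41 on the "destination" side rather than restarting, which is the property a thread migration mechanism has to preserve.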
The workload can be shared and distributed among the machines in the cluster
automatically, so that the performance of each machine is optimized. A mechanism
will be implemented in the platform to balance the workload among the terminals.
This mechanism is a decision-making algorithm that determines how the threads are
distributed according to a set of context it
The scalability of the final work will also be considered during the research. This
means that any workstation can join the cluster at any time without any
reconfiguration of the whole cluster. Each terminal can join or leave the cluster to
share the workload within the cluster, to some degree of scalability. The degree of
the scalability is
Benchmarking tools will be developed in this project to measure the performance
of the platform. They consist of computationally intensive applications whose
workload can be injected into the JVM and whose results can be collected for analysis.
These applications include database searching, data encryption and decryption, etc.
The main criterion for benchmarking the platform is the elapsed time to complete the
work.
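Measuring the elapsed completion time of a workload can be sketched with System.nanoTime(). This is a minimal illustration, not the proposed benchmarking suite. The names BenchmarkSketch, Timing and countPrimes are hypothetical, and prime counting is used only as a stand-in for a computationally intensive application.

```java
import java.util.function.LongSupplier;

class BenchmarkSketch {
    /** Result of one timed run: the workload's answer and the elapsed wall-clock time. */
    static final class Timing {
        final long result;
        final long elapsedNanos;

        Timing(long result, long elapsedNanos) {
            this.result = result;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Runs the workload once and measures its elapsed time. */
    static Timing time(LongSupplier workload) {
        long start = System.nanoTime();
        long result = workload.getAsLong();
        return new Timing(result, System.nanoTime() - start);
    }

    /** Stand-in CPU-bound workload: counts the primes below limit by trial division. */
    static long countPrimes(int limit) {
        long count = 0;
        for (int n = 2; n < limit; n++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= n; d++) {
                if (n % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Timing t = time(() -> countPrimes(200_000));
        System.out.printf("primes=%d elapsed=%.3f ms%n", t.result, t.elapsedNanos / 1e6);
    }
}
```

Running the same workload with the migration mechanism enabled and disabled, and comparing the two elapsed times, gives the comparison that the benchmarking stage calls for. A single run is noisy on a JIT-compiled JVM, so repeated runs after a warm-up would be needed for reliable figures.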
Duration: Jun 2006 – May 2008 (2 Years)
Timeline: (refer to Appendix A)
 Gosling and McGilton (May 1996). The Java Language Environment. Sun
Microsystems Computer Company.
 Tim Lindholm and Frank Yellin (1999). The Java Virtual Machine Specification
(Second Edition). Sun Microsystems Computer Company.
 JikesRVM. JikesRVM HomePage. http://jikesrvm.org/
 IBM Haifa Labs. Cluster Virtual Machine for Java.
 Yariv Aridor, Michael Factor and Avi Teperman. cJVM: a Single System Image
of a JVM on a Cluster. In International Conference on Parallel Processing, pages
 Y. Aridor, M. Factor, A. Teperman, T. Eilam and A. Schuster. A High Performance
Cluster JVM Presenting a Pure Single System Image. In JAVA Grande, 2000.
 The Australian National University. Department of Computer Science (DCS),
Towards High-performance and Fault-tolerant Distributed Java Implementation.
 John Zigman and Ramesh Sankaranarayana. Designing a distributed JVM on a
cluster. In Proceedings of the 17th High Performance and Large Scale Computing
Conference, Nottingham, United Kingdom, 2003.
 System Research Group, Department of Computer Science, The University of
Hong Kong. JESSICA: Java-Enabled Single-System Image Computer
 M.J.M. Ma, C.-L. Wang and F.C.M. Lau. JESSICA: Java-Enabled Single-System-
Image Computing Architecture. Journal of Parallel and Distributed Computing,
Vol. 60, No. 10, 1194-1222, October 2000.
 Yariv Aridor, Michael Factor and Avi Teperman. Implementing Java on Clusters.
In Proceedings of the 7th International Euro-Par Conference Manchester on
Parallel Processing, pages 722-731, Rhodes, Greece, 2001.
 Wenzhang Zhu, Cho-Li Wang, and Francis C. M. Lau. Lightweight Transparent
Java Thread Migration for Distributed JVM. In International Conference on
Parallel Processing, pages 465-472, Kaohsiung, Taiwan, October 2003.
 M. J. M. Ma, C. L. Wang, and F. C. M. Lau. Delta execution: A preemptive Java
thread migration mechanism. Cluster Computing, Vol. 3, No. 2, pages 83-94,
 Alpern, B., Attanasio, C. R., Barton, J. J., Burke, M. G., Cheng, P., Choi, J.,
Cocchi, A., Fink, S. J., Grove, D., Hind, M., Hummel, S. F., Lieber, D., Litvinov,
V., Mergen, M. F., Ngo, T., Russell, J. R., Sarkar, V., Serrano, M. J., Shepherd, J.
C., Smith, S. E., Sreedhar, V. C., Srinivasan, H., and Whaley, J. The Jalapeño
virtual machine. IBM Systems Journal, Vol. 39, Issue 1, pages 211-238, Jan.
 Red Hat, Inc. Fedora Project, sponsored by Red Hat. http://fedora.redhat.com/
Lam Hai Shuan
Faculty of Engineering
63100 Cyberjaya, Selangor.
Dr. Somnuk Phon-Amnuaisuk
Faculty of Information Technology
63100 Cyberjaya, Selangor.
Milestones 2006 2007 2008
Part 1: Literature Review and Study
  Java Programming Language
  Thread and Multithreading
  Distributed Computing - Grid Computing and
  JVM and Distributed JVM
  Study on JikesRVM
Part 2: Research Proposal
  Outline and Writing
  Editing and Finalizing
Part 3: Design and Implementation
  3.1 Design Model for Migration of Thread Object
    Basic migration model
    Migration of thread involving no other object
    Migration of thread involving primitive type
    Migration of thread involving other object (reference and array)
    Other object cases such as during runtime
  3.2 Cooperation between master and slave
    Object synchronization between master and
    Basic Communication model between master
    Communication model for 1 master and 1 slave
    Communication model for 1 master and many
    Communication model for many master and
  3.3 Decision Making of Thread Migration
  3.4 Fault Tolerance and Error handling
    Identifying the error may occurred during
    Calculate fault tolerance
    Implementing error checking function to avoid
Part 4: Benchmarking and Fine Tuning
  Calculate and fine tune the performance of
  Benchmarking and result analysis
  Time difference between enabling and disabling migration mechanism
Part 5: Thesis Writing
  Design and Implementation
  Editing and Finalizing