Research Projects Supported by
HKU's High Performance Computing Facilities

Department of Computer Science

Researcher:

Dr. Cho-Li Wang, Associate Professor (clwang@cs.hku.hk)

Project Title:

PG-JESSICA: A Profile-Guided Distributed Java Virtual Machine

Project Description:

JESSICA (Java-Enabled Single-System-Image Computing Architecture) has been our ongoing research project since 1998 and has earned us an international reputation. Its goal is to establish a Java-based computing environment that enables programmers to make use of distributed computing resources as if they were a single system.

Project Significance:

JESSICA itself is a distributed Java virtual machine (DJVM) that supports parallel execution of a multithreaded Java application in a networked cluster environment. The software implementation has evolved over the years to incorporate state-of-the-art runtime techniques and our own research ideas. Since its second generation, JESSICA has featured a lightweight dynamic thread migration mechanism that exploits cluster-wide thread-level parallelism to achieve high performance computing (HPC). Java threads can freely move across node boundaries to aggregate the computation power of cluster nodes and achieve a more balanced workload. The system also implements a distributed shared heap and global I/O support, which allow migrated threads to access data objects in distributed memory or perform remote I/O as if the resources were available locally. This clustering approach is fully transparent to the user, but it requires strong runtime optimization techniques to sustain scalability for applications with intensive object sharing.

In our current version, Profile-Guided JESSICA (or simply PG-JESSICA), we have built a special profiler subsystem into JESSICA. The profiler runs on every node, samples objects in the heap, and scans thread stacks for frequently used objects at runtime. The online profiling results collected from each node, once combined, depict the global inter-thread sharing profile, which guides the cluster-wide thread scheduler toward thread migrations that improve locality. Most of the communication costs can be saved by collocating highly correlated threads. We also keep the cost of the profiler small by adaptively tuning the profiler’s sampling rate to strike a balance between precision and overhead.
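
As a rough illustration of how per-node samples might be combined into an inter-thread sharing profile, the sketch below counts, for every pair of threads, how many sampled objects both threads touched, and then picks the most strongly correlated pair as a collocation candidate. It is not PG-JESSICA code: the class SharingProfileSketch, the Access type, the method names, and the sample data are all hypothetical, and the real profiler and scheduler are certainly more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from JESSICA itself.
class SharingProfileSketch {

    // One sampled observation: a thread touched a sampled heap object.
    static final class Access {
        final int threadId;
        final long objectId;

        Access(int threadId, long objectId) {
            this.threadId = threadId;
            this.objectId = objectId;
        }
    }

    // Merge the samples from all nodes and count, per thread pair,
    // the number of sampled objects that both threads accessed.
    static Map<String, Integer> buildSharingProfile(List<List<Access>> perNodeSamples) {
        Map<Long, Set<Integer>> threadsByObject = new HashMap<>();
        for (List<Access> nodeSamples : perNodeSamples) {
            for (Access a : nodeSamples) {
                threadsByObject.computeIfAbsent(a.objectId, k -> new TreeSet<>()).add(a.threadId);
            }
        }
        Map<String, Integer> sharedCounts = new TreeMap<>();
        for (Set<Integer> threads : threadsByObject.values()) {
            List<Integer> ids = new ArrayList<>(threads); // sorted, so keys are "low-high"
            for (int i = 0; i < ids.size(); i++) {
                for (int j = i + 1; j < ids.size(); j++) {
                    sharedCounts.merge(ids.get(i) + "-" + ids.get(j), 1, Integer::sum);
                }
            }
        }
        return sharedCounts;
    }

    // Return the thread pair with the highest shared-object count, or null if none share.
    static String mostCorrelatedPair(Map<String, Integer> profile) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : profile.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    // Made-up samples from two nodes, for illustration only.
    static List<List<Access>> sampleData() {
        List<Access> node0 = Arrays.asList(
                new Access(1, 100L), new Access(1, 101L), new Access(2, 100L));
        List<Access> node1 = Arrays.asList(
                new Access(2, 101L), new Access(2, 102L), new Access(3, 102L));
        return Arrays.asList(node0, node1);
    }

    public static void main(String[] args) {
        Map<String, Integer> profile = buildSharingProfile(sampleData());
        System.out.println(profile);                     // {1-2=2, 2-3=1}
        System.out.println(mostCorrelatedPair(profile)); // 1-2
    }
}
```

In this toy data, threads 1 and 2 share two sampled objects while threads 2 and 3 share one, so a scheduler following this heuristic would prefer to place threads 1 and 2 on the same node. A real system would also have to weigh migration cost and load balance, which this sketch ignores.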
