Stack Allocation and Synchronization Optimizations for Java Using Escape Analysis

Copyright © (2003) by Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distrubuted for profit or commericial advantage. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

This article presents a framework for escape analysis for Java to determine (1) if an object
is not reachable after its method of creation returns, so that the object can be allocated on the
stack, and (2) if an object is reachable only from a single thread during its lifetime, allowing
unnecessary synchronization operations on that object to be removed. We introduce a new
program abstraction for escape analysis, the connection graph, that is used to establish reachability
relationships between objects and object references. We show that the connection graph can be
succinctly summarized for each method such that the same summary information may be used
in different calling contexts without introducing imprecision into the analysis. We present an
interprocedural algorithm that uses the above property to efficiently compute the connection
graph and identify the non-escaping objects for methods and threads. We prove the correctness
of this algorithm. Finally, we describe additional analysis (not yet incorporated in our current
implementation) that can be used to eliminate redundant storage synchronization operations
associated with locks in Java. The experimental results, from a prototype implementation of our
framework in the IBM High Performance Compiler for Java, are very promising. The percentage
of objects that may be allocated on the stack exceeds 70% of all dynamically created objects in
the user code in three out of the ten benchmarks (with a median of 19%), 11% to 92% of all
mutex lock operations on objects created in user code are eliminated in those ten programs (with
a median of 51%), and the overall execution time reduction ranges from 2% to 23% (with a median
of 7%) on a 333 MHz PowerPC workstation with 128 MB memory.

By: Jong-Deok Choi, Manish Gupta, Mauricio J. Serrano, Vugranam C. Sreedhar, Samuel P. Midkiff

Published in: ACM Transactions on Programming Languages and Systems , volume 25, (no 6), pages 876-910 in 2003

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

rc22340.pdf

Questions about this service can be mailed to reports@us.ibm.com .