A brief discussion of Java inter-application sharing and isolation taxonomy

Pete Soper

August 17, 2004

(minor edits August 21, 2004)

I've been viewing sharing and isolation as a broad spectrum of implementation possibilities for a while and, with the rest of the JSR-121 expert group, trying to keep the isolation APIs[1] general enough to avoid arbitrarily reducing choices by implementers. But I need to try to paint some of the spectrum by mentioning a few concrete implementation approaches and briefly explore how one might compare them with respect to isolation and metadata sharing.

The first metric of interest to supporters of JSR-121 is of course isolation, by which I mean the degree to which one application or app component sharing some part of the JRE infrastructure is likely to remain unaware of another. In the extreme case this is the degree to which one app could even detect the existence of another app, such as by measuring resource availability. At the other end of the spectrum are concerns about the kinds of unintended sharing consequences so familiar to users of classloaders ("I set this static and 'boing', the time zone changed for every applet in the system!").

So the first metric, "isolation", might go something like this, with a "score" that is assigned just to suggest the likely ordering of merit of a few possible implementation approaches:

Degree of Isolation (a higher score is "better")

  Score  Approach
  0      (a) classloaders ("business as usual")
  1      (b) replication of statics plus disjoint heap roots for per-isolate state: apps coresident in same JRE instance
  2      (c) (b) + automatic native code isolation via clever OS linking loader scoping tricks
  3      (d) app per JRE instance
  4      (e) app per JRE instance + resource mgmt
  5      (f) (e) in Solaris Zone, BSD Jail, etc.

Recall that one guarantee of the JSR-121 specification is perfect isolation of Java state. Classloader approaches only eliminate visibility of some classes: others, those of the boot classpath for instance, are still shared. So a score of zero is assigned to classloaders for the isolation metric.
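
The boot classpath point can be illustrated with a small sketch. It assumes a standard JRE in which `java.util.TimeZone` is loaded by the bootstrap loader. The class name `SharedBootClassDemo` and the two "apps" are hypothetical stand-ins for illustration and are not part of JSR-121.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.TimeZone;

// Hypothetical demo: two class loader scopes still share boot classpath classes.
final class SharedBootClassDemo {

    /** True if two unrelated loaders resolve java.util.TimeZone to the same Class object. */
    static boolean loadersShareTimeZoneClass() {
        try (URLClassLoader scopeA = new URLClassLoader(new URL[0]);
             URLClassLoader scopeB = new URLClassLoader(new URL[0])) {
            Class<?> seenByA = scopeA.loadClass("java.util.TimeZone");
            Class<?> seenByB = scopeB.loadClass("java.util.TimeZone");
            return seenByA == seenByB;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** "App A" sets the default zone; returns the zone ID that "app B" then observes. */
    static String zoneSeenByOtherApp(String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId)); // app A's "private" tweak
            return TimeZone.getDefault().getID();              // app B reads the shared static state
        } finally {
            TimeZone.setDefault(saved); // restore so the demo has no lasting effect
        }
    }

    public static void main(String[] args) {
        System.out.println("same TimeZone class: " + loadersShareTimeZoneClass());
        System.out.println("app B sees zone:     " + zoneSeenByOtherApp("Pacific/Honolulu"));
    }
}
```

Because both loaders delegate `java.*` classes to the same bootstrap loader, the statics inside those classes are one copy for the whole JRE instance, which is why separate classloader scopes alone do not deliver the isolation guarantee.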

There are many more increments of isolation than these six: this is an extremely abbreviated list of possibilities. But it makes sense that with an arbitrary score of 5 one could presumably employ expert hackers to try to do naughty things with one application and they might be frustrated and ineffective attempting to detect or communicate with other applications. With a score of zero one applet can covertly send messages to another, monitor its activities, or suffer from its misbehavior quite easily.

The second metric, and one dear to the hearts of more and more Java implementers, is sharing. By this I mean "metadata sharing", NOT sharing of mutable application state such as the program variables or mutable objects. What I mean instead is sharing of "the other stuff" that is part of the execution environment and necessary to the application's running but that a clever implementation can arrange to be shared without applications being affected or even being aware of the optimization. A frequently used term for this is "metadata" but one might include certain immutable objects too, assuming the necessary heap redesign to support their sharing.

Now sharing of this kind (what Graham Hamilton called "inter-application sharing" in his JavaOne keynote in June) has a bearing on performance, and performance is a perennial hot topic with any execution platform. And although inter-application sharing may ultimately be totally orthogonal to isolation, where one goes the other always seems to become highly visible and important. So, for example, big rule number one with the class data sharing feature in Tiger was making sure that no isolation property J2SE users rely on was compromised in a way that would cause a surprise, and being able to detect and disable sharing in any case where there was a risk of creating such a surprise at the point the application was run.
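
As a rough illustration of "detecting" sharing from the application side, the sketch below checks whether the running JVM reports class data sharing. It assumes a HotSpot-style JVM that includes the word "sharing" in the `java.vm.info` system property (the same text `java -version` prints); other JVMs may report nothing useful, so this is a heuristic at best. The class and method names are hypothetical.

```java
// Hypothetical helper: heuristic check for HotSpot class data sharing.
final class SharingCheck {

    /** True if a java.vm.info-style string mentions sharing, e.g. "mixed mode, sharing". */
    static boolean mentionsSharing(String vmInfo) {
        return vmInfo != null && vmInfo.contains("sharing");
    }

    public static void main(String[] args) {
        String vmInfo = System.getProperty("java.vm.info"); // may be null on some JVMs
        System.out.println("java.vm.info = " + vmInfo);
        System.out.println("sharing appears active: " + mentionsSharing(vmInfo));
    }
}
```

On HotSpot, launching with `-Xshare:off` disables class data sharing and `-Xshare:auto` uses it when possible, so running the sketch under each flag is one way to see the difference.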

Here's a simple "degrees of sharing" table, again with arbitrary weighted scores suggesting an ordering that seems sensible to me at the moment:

Degree of Sharing (a higher score is "better")

  Score  Approach
  0      (a) app per JRE instance
  1      (b) (a) + Tiger class data sharing
  2      (c) (b) + native code sharing
  3      (d) very early fork of JRE
  4      (e) JRE fork after many classes loaded
  5      (f) apps in separate classloader scopes
  6      (g) replication of statics plus disjoint heap roots for per-isolate state: apps coresident in same JRE instance

Approach (e) presumes custom heap design and support for library reinitialization to maintain sharing of shareable immutable objects and the "semantic contract" a user expects (this contract is discussed next). With this work it's possible that approach (e) might come close to the sharing levels of (f). There are many more combinations for this list and many more distinct sharing techniques. I'm also leaving out many details even with the list of approaches above. For instance IBM's persistent reusable JVM that caches class metadata includes native code sharing between instances of itself and might score in the 3-4 range, while the JanosVM research platform ("not really Java"[tm]) which has other specialized support for metadata sharing might score a 5 or 6.

Another important metric is the "execution semantic contract", by which I mean the degree to which a new application gets an execution environment that matches expectations. A high quality JRE will closely match what the user requests and the needs of the application with an execution environment that honors the implicit contract. For example, an application executed on a freshly booted computer might get just exactly what is expected, while an app launched with others inside its own classloader scope might be affected by artifacts the other apps have generated. An app that wanted to override a classpath but had its requested classpath simply appended to the search rules already in place for existing apps resident in the JRE might get a rude surprise. Here is a case where the requirements of J2ME and J2SE have sharp contrasts. With J2SE one can launch a Java application with myriad option combinations while with J2ME the choice is something closer to whether you feel the urge to tap your stylus on an application's icon or not. That is, the user gets the launcher capabilities J2ME provides and is happy, while J2SE users frequently place large demands on the launcher to set up a custom environment.
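
The classpath surprise can be sketched with a toy model of first-match-wins search. This is not a real class loader: the jar names, the class name `com.example.Parser`, and the `ClasspathOrderSketch` class are all hypothetical, and the only assumption carried over from real JREs is that class path entries are searched in order.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of first-match-wins class path search; not a real class loader.
final class ClasspathOrderSketch {

    /** Returns the first entry (in search order) that provides className, or null. */
    static String resolve(Map<String, Set<String>> searchPath, String className) {
        for (Map.Entry<String, Set<String>> entry : searchPath.entrySet()) {
            if (entry.getValue().contains(className)) {
                return entry.getKey();
            }
        }
        return null;
    }

    /** Builds a search path; appendRequest=true mimics appending the app's path to the resident one. */
    static Map<String, Set<String>> searchPath(boolean appendRequest) {
        Set<String> parser = new HashSet<>(Arrays.asList("com.example.Parser"));
        Map<String, Set<String>> path = new LinkedHashMap<>(); // preserves search order
        if (appendRequest) {
            path.put("resident.jar", parser); // already in place for other apps
        }
        path.put("app.jar", parser);          // what the new app asked for
        return path;
    }

    public static void main(String[] args) {
        System.out.println("appended:   " + resolve(searchPath(true), "com.example.Parser"));
        System.out.println("overridden: " + resolve(searchPath(false), "com.example.Parser"));
    }
}
```

In the appended case the resident entry shadows the one the application requested, which is exactly the kind of contract violation a launcher that honors a classpath override avoids.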

Here's an arbitrary ranking of the degree to which a semantic contract might be honored by different implementation approaches:

Degree of Semantic Compliance (a higher score is "better")

  Score  Approach
  0      (a) apps in separate classloader scopes
  1      (b) very early fork of JRE
  1-2    (c) JRE fork after many classes loaded but fall back to app per JRE instance if needed to avoid contract violation
  2      (d) replication of statics plus disjoint heap roots for per-isolate state: apps coresident in same JRE instance but fall back to app per JRE instance if needed to avoid violation
  2      (e) app per JRE instance

So given this ranking we can say that an unsurprising J2SE implementation must have a minimum score of 2 in terms of this metric.
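
Applying that threshold to the ranking is simple arithmetic, sketched below. The scores are the ones from the semantic compliance ranking, keyed by approach letter. One assumption is mine: approach (c) is scored 1-2, and the sketch conservatively uses its lower bound. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: which approaches meet the minimum semantic compliance score?
final class SemanticComplianceFilter {

    static final int J2SE_MINIMUM = 2;

    /** Worst-case semantic compliance score per approach letter. */
    static Map<String, Integer> worstCaseScores() {
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("a", 0); // apps in separate classloader scopes
        scores.put("b", 1); // very early fork of JRE
        scores.put("c", 1); // late JRE fork with fallback: scored 1-2, lower bound assumed
        scores.put("d", 2); // replicated statics with fallback
        scores.put("e", 2); // app per JRE instance
        return scores;
    }

    /** Returns the approach letters whose score is at least the given minimum. */
    static List<String> meetingMinimum(Map<String, Integer> scores, int minimum) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            if (entry.getValue() >= minimum) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(meetingMinimum(worstCaseScores(), J2SE_MINIMUM));
    }
}
```

Under the lower-bound assumption only (d) and (e) qualify; if (c) reliably reaches 2 through its fallback, it would qualify as well.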

A final metric to discuss here is ease of implementation. This is important because it bears on what sharing and isolation techniques are practical for modest Java implementations and what techniques might require a large number of wizards a long time to create and maintain.

Here is a short list of scores for "degree of implementation difficulty". This is just a suggested ordering: the jury is still out for some of these approaches.

Degree of Implementation Difficulty (a higher score means harder to implement)

  Score  Approach
  0      (a) app per JRE instance
  1      (b) apps in separate classloader scopes
  2      (c) very early fork of JRE
  3      (d) replication of statics plus disjoint heap roots for per-isolate state: apps coresident in same JRE instance, transparent native code isolation
  4      (e) (d) but abstraction of global native state to per-isolate instances
  5      (f) JRE fork after many classes loaded (i.e. lots of classes and JRE state to reinitialize)

Judging f to be harder than e might seem like a tough call. But to go through the JDK native code and arrange for global state to be per-isolate seems easier than going through all the JDK Java code and adding hooks for reinitialization or abstracting the state out entirely so it can be "reset" on behalf of a new application. It might also be easier to track down native state that is not per-isolate by a combination of conventions and custom tools, while determining and maintaining the correctness of class reinitialization seems more challenging to get right and a major headache to keep right. Taken to its logical conclusion, the approach SAP took with PAVM might be best when combined with aggressive metadata sharing.[2]
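
The idea of turning global state into per-isolate state can be sketched in Java terms, even though the paragraph above is mostly about native code. Everything here is hypothetical: `PerIsolate`, the integer isolate IDs, and the time zone example are illustrations only, not the JSR-121 API and not how any particular JRE does it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: one logical "global" with a separate copy per isolate.
final class PerIsolate<T> {

    private final Map<Integer, T> copies = new ConcurrentHashMap<>();
    private final Supplier<T> initialValue;

    PerIsolate(Supplier<T> initialValue) {
        this.initialValue = initialValue;
    }

    /** Returns this isolate's copy, creating it from the initial value on first use. */
    T get(int isolateId) {
        return copies.computeIfAbsent(isolateId, id -> initialValue.get());
    }

    /** Changes only the given isolate's copy; value must be non-null. */
    void set(int isolateId, T value) {
        copies.put(isolateId, value);
    }

    public static void main(String[] args) {
        PerIsolate<String> defaultZone = new PerIsolate<>(() -> "UTC");
        defaultZone.set(1, "Pacific/Honolulu");   // isolate 1 changes "its" global
        System.out.println(defaultZone.get(1));
        System.out.println(defaultZone.get(2));   // isolate 2 still sees the initial value
    }
}
```

The contrast with reinitialization is that nothing has to be "reset" for a new application: a new isolate simply gets a fresh copy the first time it touches the state.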

For the sake of brevity (and because of my ignorance), as with the other metrics, this list omits interesting work related to sharing and isolation implementation and glosses over a large number of details that have to be considered by Java implementers considering optimizations. There are also many other metrics making up a full taxonomy of Java implementations. So this discussion is not complete and is unlikely to be completely fair and balanced yet. But it's hopefully a start at developing a means of comparing Java implementations in terms of metadata sharing and application state isolation.



1. An easy to read overview of isolation is available on the JSR-121 interest list web page.
2. A brief description of SAP's PAVM system is in the interest list bibliography.