Sonntag, 5. August 2018

Hands on Java 11's constantdynamic

With the intention of making the JVM more appealing to dynamic languages, the seventh version of the platform had introduced invokedynamic to its instruction set. Java developers do not normally take note of this feature as it is hidden in Java byte code. In short, by using invokedynamic it has become possible to delay the binding of a method call until its first invocation. This technique is for example used by the Java language to implement lambda expressions which are only manifested on demand upon their first use. Doing so, invokedynamic has evolved into an essential language feature which I have described in detail in a previous blog posting. With constantdynamic a similar mechanism was introduced to Java 11, only that it delays the creation of a constant value. This posting describes the purpose and the inner workings of this feature and shows how to generate code that makes use of this new instruction using the Byte Buddy library.

What are constant values in Java?


Before Java 5, constant values in a Java programs could only be strings or of a primitive type. Those constants were built into the language as literals and are even assumed by the javac compiler to reduce the size of a class file. For example, in the following code snippet the value of the only field is never actually read but instead copied to its use site during compilation:

class ConstantSample {
  final String field = “foo”;
  void hello() {
    System.out.print(field);
  }
}

Instead of reading the field within the hello method, the generated byte code will contain a direct reference to the constant value foo. As a matter of fact, the above class will not ever attempt to read the field’s value what can be validated by altering it using Java reflection after which invoking hello would still print foo.

To represent such constant values, any Java class file contains a constant pool which can be thought of as a table that writes out any constant values that exist within the scope of a class. This implies constants that are used within methods or as field values but also other immutable information that describes a class such as the class’s name or names of invoked methods and their declaring type names. Once a value is recorded in the class’s constant pool, values can be referenced by an offset pointing to a specific entry within the constant pool. Doing so, values that are repeated throughout a class only need to be stored once because an offset can of course be referenced multiple times.

Therefore, when the field is read in the above source code, javac emits a byte code that refers to the offset of the value foo in the constant pool instead of emitting a read instruction for the field. This can be done as the field is declared final where javac ignores the edge-case of a reflective value change. By emitting an instruction to read a constant, javac also saves some bytes compared to an instruction for a field read. This is what makes this optimization lucrative, especially since string and numeric values are fairly commonly in any Java class. Smaller class files help the Java runtime to load classes quicker and an explicit notion of constantness helps the the JVM’s JIT and AOT compilers to apply further optimizations.

The described reuse of offsets for the same constant also implies an identity of reused values. As a consequence of representing an equal string value by a single instance the following statement will assert true in Java:

assert “foo” == “foo”;

Under the hood, both values of foo point to the same constant pool offset in the defining class’s constant pool. Additionally, the JVM even deduplicates constant strings across classes by interning strings that are found in constant pools.

Limitations of constant pool storage


Such tabular representation of values within a class file’s constant pool works well for simple values such as strings and numeric primitives. But at the same time, it can have non-intuitive consequences when javac is not discovering a value as being constant. For example, in the following class the only field’s value is not treated as a constant within the hello method:

class NoConstantSample {
  final String field = “foo”.toString();
  void hello() {
    System.out.print(field);
  }
}

While the toString method is trivial for strings, this circumstance remains unknown to javac which does not evaluate Java methods. Therefore, the compiler can no longer emit a constant pool value as the input to the print statement. Instead, it must emit a field read instruction of the field which requires additional bytes as it was mentioned before. This time, if the field’s value was changed by using reflection, invoking hello would therefore also print the updated value.

Of course, this example is contrived. But it is not difficult to imagine how limiting the classical approach to constants in Java plays out in practice. For example, imagine an integer value that is defined as Math.max(CONST_A, CONST_B). Of course, the maximum of two compile-time constants would itself be constant. Yet, due to javac’s inability of evaluating Java methods, the derived value is not discovered as a constant but only computed at runtime.

Another problem of declaring constant values in a class file’s constant pool, is its limitation to simple values. Strings and numerical values are of course trivial to represent, but more complex Java objects require more flexibility than the classical approach. To support additional constants, the Java class file format already added class literal constants in Java 5 where values such as String.class would no longer be compiled to a call to Class.forName("java.lang.String") but to a constant pool entry containing a class reference. And also the Java 7 release added new constant pool types to the class file specification to allow for a constant representation of MethodType and MethodHandle instances.

In contrast to strings, classes and primitive values, the Java programming language does however not offer a literal for creating those latter constants. Rather, the possibility for such constants was added to better support invokedynamic instructions where javac required an efficient way of representation. In essence, a lambda expression is described by the lambda’s expressions type signature - a MethodType - and a reference to its implementation - a MethodHandle. If both values had to be created as explicit, non-constant arguments for each call to a lambda expression, the performance overhead of using such expressions would surely have outweighed their benefit.

While this solution eased some intermediate pain, it implied a dissatisfying perspective onto Java’s future with regards of adding further constant types. A constant pool entry’s type is encoded by a single byte what severely limits the total number of possible constant types in a class file. As an additional hassle, changes to the class file format require a cascading adjustment of any tool that processes class files what makes a more generic approach for expressing constant values desirable. By introducing constantdynamic, such a mechanism is finally supported by the Java virtual machine with the upcoming release of Java 11.

Introducing dynamic constants


A dynamic constant is not created by processing a literal expression but by invoking a so-called bootstrap method that produces the constant value as its result. This is fairly similar to the invokedynamic instruction that binds method call sites by invoking a bootstrap method during runtime where a pointer to a target implementation for the dynamically bound call site is returned. As key difference, a bootstrapped constant is however immutable whereas dynamically bound method calls can be redirected to another implementation at a later point.

In essence, bootstrap methods are nothing more but Java methods with some requirements to their signature. As a first argument, any bootstrapping method receives a MethodHandles.Lookup instance that is automatically provided by the JVM. Such lookups give access with the privileges of the class that a particular instance of the class represents. For example, when MethodHandles.lookup() is invoked from any class, the caller-sensitive method returns an instance that for example allows for reading private fields of the calling class what would not be possible for a lookup instance that was created from within another class. In the case of a bootstrap method, the lookup represents the class that defines the dynamic constant under creation rather then the class that is declaring the boostrap method. Doing so, the bootstrap methods can access the same information as if the constant was created from within the constant-defining class itself. As a second argument, the bootstrap method receives the constant’s name and as a third argument, it receives the constants expected type. A bootstrap method must be static or a constructor where the constructed value represents the constant.

In many cases, none of these three arguments are required for implementing a bootstrap method but their existence allows for the implementation of more generic bootstrapping mechanisms to facilitate allow for the reuse of bootstrap methods for the creation of multiple constants. If desired, the last two arguments can also be omitted when declaring a bootstrap method. Declaring a MethodHandles.Lookup type as the first parameter is however required. This is done to allow potentially allowing further invocation modes in the future where the first parameter serves as a marker type. This is another difference to invokedynamic which allows allows for the omission of the first parameter.

With this knowledge, we can now express the previous maximum of two constants that was previously mentioned as a derived constant. The value is computed trivially by the following bootstrap method:

public class Bootstrapper {
  public static int bootstrap(MethodHandles.Lookup lookup, String name, Class type) {
    return Math.max(CONST_A, CONST_B);
  }
}

Since the lookup instance that is the first argument comes with the privileges of the class that defines the constant, it would also be possible to acquire the values of CONST_A and CONST_B by using this lookup, even if they were not normally visible to the bootstrap method, for example because they were private. The class’s javadoc explains in detail what API needs to be used for locating a field and to read their values.

In order to create a dynamic constant, a bootstrap method must be referenced within a class’s constant pool as an entry of type dynamic constant. As of today, the Java language has no way of creating such an entry and to my knowledge no other language is currently making use of this mechanism either. For this reason, we will look into creating such classes using the code generation library Byte Buddy later in this article. In Java pseudo code which hints constant pool values in comments, a dynamic constant and its bootstrap method would however be referred to as follows:

class DynamicConstant {
  // constant pool #1 = 10
  // constant pool #2 = 20
  // constant pool #3 = constantdyamic:Bootstrapper.bootstrap/maximum/int.class
  final int CONST_A = [constant #1], CONST_B = [constant #2];
  void hello() {
    System.out.print([constant #3]);
  }
}

Once the hello method is executed for the first time, the JVM would resolve the specified constant by invoking the Bootstrapper.bootstrap method with maximum as the constant name and int.class as the requested type for the created constant. After receiving a result from the bootstrap method, the JVM would then substitute any reference to the constant with this result and never invoke the bootstrap method again. This would also be true if the dynamic constant was referenced at multiple sites.

Avoiding custom bootstrap methods


For most cases, creating a dynamic constant does not require the implementation of an individual bootstrap method. To cover the majority of use cases, the JVM-bundled class java.lang.invoke.ConstantBootstraps already implements several generic bootstrap methods that can be used for the creation of most constants. As center piece, the class’s invoke method allows to define a constant by providing a method reference as a factory for a constant value. To make such a generic approach work, bootstrap methods are capable of receiving any number of additional arguments which must themselves be constant values. Those arguments are then included as references to other constant pool entries while describing the entry of the dynamic constant.

Doing so, the above maximum can rather be computed by providing a handle to the Math.max method and the two constant values of CONST_A and CONST_B as additional arguments. The implementation of the invoke method in ConstantBootstraps will then invoke Math.max using the two values and return the result where the bootstrap method is roughly implemented as follows:

class ConstantBootstraps {
  static Object invoke(MethodHandles.Lookup lookup, String name, Class type,
          MethodHandle handle, Object[] arguments) throws Throwable {
    return handle.invokeWithArguments(arguments);
  }
}

When additional arguments are provided to a bootstrap method, they are assigned in their order to every additional method parameter. To allow for more flexible bootstrap methods such as the invoke method above, the last parameter can also be of an Object array type to receive any excess arguments, in this case the two integer values. If a bootstrap method does not accept a provided argument, the JVM will not invoke the bootstrap method but throw a BootstrapMethodError during the failed constant resolution.

Using this approach, the pseudo code to using ConstantBootstraps.invoke would no longer required an individual bootstrap method and rather look as in the following pseudo code:

class AlternativeDynamicConstant {
  // constant pool #1 = 10
  // constant pool #2 = 20
  // constant pool #3 = MethodHandle:Math.max(int,int)
  // constant pool #4 = constantdyamic:ConstantBootstraps.invoke/maximum/int.class/#3,#1,#2
  final int CONST_A = [constant #1], CONST_B = [constant #2];
  void hello() {
    System.out.print([constant #4]);
  }
}

Nested dynamic constants


As mentioned, the arguments of a bootstrap method are required to be other constant pool entries. With dynamic constants being stored in the constant pool, this allows for nesting dynamic constants what makes this feature even more flexible. This comes with the intuitive limitation that the initialization of dynamic constants must not contain a circles. For example, the following bootstrap methods would be called from top to bottom if the Qux value was resolved:

static Foo boostrapFoo(MethodHandles.Lookup lookup, String name, Class type) {
  return new Foo();
}

static Bar boostrapBar(MethodHandles.Lookup lookup, String name, Class type, Foo foo) {
  return new Bar(foo);
}

static Qux boostrapQux(MethodHandles.Lookup lookup, String name, Class type, Bar bar) {
  return new Qux(bar);
}

When the JVM is required to resolve the dynamic constant for Qux, it would first resolve Bar what would again trigger a previous initialization of Foo as each value depends on the previous.

Nesting dynamic constants can also be required when expressing values that are not supported by static constant pool entry types such as a null reference. Before Java 11, a null value could only be expressed as a byte code instruction but not as a constant pool value where the byte code neither implied a type for null. To overcome this limitation, java.lang.invoke.ConstantBootstraps offers several convenience methods such as nullValue that allows bootstrapping a typed null value as a dynamic constant instead. This null value can then be supplied as an argument to another bootstrap method this method expected null as an argument. Similarly, it is not possible to express a primitive type literal such as int.class in the constant pool which can only represent reference types. Instead, javac translates for example int.class to a read of the static Integer.TYPE field which resolves its value of int.class on startup by a native call into the JVM. Again, ConstantBootstraps offers the primitiveType bootstrap method to represent such values easily as dynamic constants instead.

Why should one care about constant values?


All of the above might sound like a technical finesse that does not add much to the Java platform beyond what static fields already provide. However, the potential of dynamic constants is large but still unexplored. As the most obvious use case, dynamic constants can be used to properly implement lazy values. Lazy values are typically used to represent expensive objects only on-demand when they are used. As of today, lazy values are often implemented by using so-called double checked locking, a pattern that is for example implemented by the scalac compiler for its lazy keyword:

class LazyValue {
  volatile ExpensiveValue value;
  void get() {
    T value = this.value;
    if (value == null) {
      synchronized (this) {
        value = this.value;
          if (value == null) {
            value = new ExpensiveValue();
          }
       }
     }
     return value;
  }
}

The above construct requires a volatile read on every read despite the fact that the value never changes once it is initialized. This implies an unnecessary overhead which can be avoided by expressing the lazy value as a dynamic constant that is only bootstrapped if it is ever used. Especially in the Java core libraries this can be useful for delaying the initialization of many values that are never used, for example in the Locale class which initializes values for any supported language despite the fact that most JVMs only ever use the running machines standard language. By avoiding the initialization of such excess values, the JVM can boot up quicker and avoid using memory for dead values.

Another important use case is the availability of constant expressions to optimizing compilers. It is easy to imagine why compilers prefer processing constant values over mutable values. For example, if a compiler can combine two constants, the result of this combination can permanently replace the previous values. This would of course not be possible if the original values could change over time. And while a just-in-time compiler might still assume that mutable values are factually constant at runtime, an ahead-of-time compiler is dependent on some explicit notion of constantness. By assuring that bootstrap methods are side-effect free, future Java version could for example allow for their compile-time evaluation where constantdynamic could serve as a light-weight macro mechanism to widen the scope of native images written in Java using Graal.

Will I ever work with this feature?


When invokedynamic was introduced in Java 7, this new byte code feature was unused from the perspective of the Java language. However, as of Java 8 invokedynamic instructions can be found in most class files as an implementation of lambda expressions. Similarly, Java 11 does not yet use the constantdynamic feature but one can expect that this will change in the future.

During the latest JVMLS several potential APIs for exposing constantdynamic were already discussed (which would also make invokedynamic accessible via an API). This would be especially useful for library authors to allows them to better resolve critical execution paths but could also unlock some potential to improve javac’s constant detection, for example to widen the scope of non-capturing lambda expressions where field or variable access could be substituted by reading a constant value if a constant value was discovered during compilation. Finally, this new mechanism offers potential for future language enhancements such as a lazy keyword that avoids the overhead of the current equivalents in alternative JVM languages.

The constantdynamic feature can also be useful to Java agents that often need to enhance existing classes with additional information. Java agents cannot normally alter a classes by for example adding static fields as this can both interfere with reflection-based frameworks and since class format changes are forbidden on most JVMs when redefining an already loaded class. Neither restriction does however apply to dynamic constants that are added during runtime where a Java agent can now easily tag classes with additional information.

Creating dynamic constants using Byte Buddy


Despite the lack of language support for constantdynamic, JVMs of version 11 are already fully capable of processing class files that contain dynamic constants. Using the byte code generation library Byte Buddy, we can create such class files and load them into an early access build of the JVM.

In Byte Buddy, dynamic constants are represented by instances of JavaConstant.Dynamic. For convenience, Byte Buddy offers factories for any bootstrap method that is declared by the java.lang.invoke.ConstantBoostraps class such as the invoke method that was discussed previously.

For an easy example, the following code creates a subclass of Callable and defines the return value of the call method as a dynamic constant of the sample class. To bootstrap the constant, we are supplying the constructor of Sample to the mentioned invoke method:

public class Sample {
  public static void main(String[] args) throws Throwable {
    Constructor<? extends Callabable<?>> loaded = new ByteBuddy()
      .subclass(Callable.class)
      .method(ElementMatchers.named("call"))
      .intercept(FixedValue.value(JavaConstant.Dynamic.ofInvocation(Sample.class.getConstructor())))
    .make()
    .load(Sample.class.getClassLoader())
    .getLoaded()
    .getConstructor();

    Callable<?> first = loaded.newInstance(), second = loaded.newInstance();
    System.out.println("Callable instances created");
    System.out.println(first.call() == second.call());
  }
  
  public Sample() { 
    System.out.println("Sample instance created"); 
  }
}

If you run the code, note how only one instance of Sample is created as it was explained in this article. Also note how the instance is only created lazily upon the first invocation of the call method and after the creation of the Callable instances.

To run the above code, you currently have to run Byte Buddy with -Dnet.bytebuddy.experimental=true to unlock support for this feature. This changes once Java 11 is finalized and ready for release where Byte Buddy 1.9.0 will be the first version to support Java 11 out of the box. Also, there are still some rough edges in the latest Byte Buddy release when dealing with dynamic constants. Therefore, it is best to build Byte Buddy from the master branch or to use JitPack. To find more about Byte Buddy, visit bytebuddy.net.