Samstag, 23. November 2013

A declarative content parser for Java

Recently, I worked on a project that required me to parse several files which came in their own file formats. To make things worse, the file format changed quite often such that the related code had to be adjusted quite often. In my opinion, object-oriented languages such as Java are not necessarily the sharpest tools for dealing with file parsing. For this reason, I tried to solve this problem with a declarative approach, the result of which I published on GitHub and on Maven Central.

This miniature tool provides help with parsing a custom content format by creating Java beans that extract this data where each bean resembles a single row of content of a specified source. The mapping of the input to a Java bean is based on regular expressions what allows great flexibility. However, the syntax of regular expressions is enhanced by properties expressions that allow the direct mapping of bean properties within this regular expression describing these contents.
This tool is intended to be as light-weight as possible and comes without dependencies. It is however quite extensible as demonstrated below.

A simple example

The tool is used by a single entry point, an instance of BeanTransformer which can read content for a specified bean. By doing so, the parser only needs to build patterns for a specified bean a single time. Therefore, this tool performs equally good as writing native content parsers, once it is set up.
As an example for the use of this tool, imagine you want to read in data from some sample file sample.txt containing the following data in an imaginary format:

##This is a first value##foo&&2319949,
##This is a second value##bar&&741981,
##This is a third value##&&998483,

This tool would now allow you to directly translate this file by declaring the following Java bean:

@MatchBy("##@intro@##@value@&&@number@,")
class MyBean {

    private String intro;

    @OptionalMatch
    private String value;

    private int number;

    // Getters and setters...
}

The @name@ expressions each represent a property in the bean which will be filled by the data found at this particular point within the regular expression. The expression can be escaped by preceeding the first @ symbol with backslashes as for example \\@name@. A back slash can be escaped in the same manner. With calling BeanTransformer.make(MyBean.class).read(new FileReader("./sample.txt")) you would receive a list of MyBean instances where each instance resembles one line of the file. All properties would be matched to the according property name that is declared in the bean.

Matching properties

The @MatchBy annotation can also be used for fields within a bean that is used for matching. This tool will normally discover a pattern by interpreting the type of a field that is referenced in the expression used for parsing the content. Since the @number@ expression in the example above references a field of type int, the field would be matched against the regular expression [0-9]+, representing any non-decimal number. All other primitive types and their wrapper types are also equipped with a predefined matching pattern. All other types will by default be matched by a non-greedy match-everything pattern. After extracting a property, the type of the property will be tried to be instantiated by:
  • Invoking a static function with the signature valueOf(String) defined on the type.
  • Using a constructor taking a single String as its argument
This default behavior can however be changed. A field can be annotated with @ResolveMatchWith which requires a subtype of a PropertyDelegate as its single argument. An instance of this class will be instantiated for transforming expressions from a string to a value and reverse. The subclass needs to override the only constructor of PropertyDelegate and accept the same types as this constructor. An optional match can be declared by annotating a field with @OptionalMatch. If no match can be made for such a field, the PropertyDelegate's setter will never be invoked (when using a custom PropertyDelegate).

Dealing with regular expressions

Always keep in mind that @MatchBy annotation take regular expressions as their arguments. Therefore, it is important to escape all special characters that are found in regular expressions such as for example .\\*,[](){}+?^$. Also, note that the default matching patterns for non-primitive types or their wrappers is non-greedy. This means that the pattern @name@ would match the line foo by only the first letter f. If you want to match the full line, you have to declare the matching expression as ^@name@$ what is the regular expression for a full line match. Be aware that using regular expressions might require you to define a @WritePattern which is described below. Using regular expressions allows you to specify matching constraints natively. Annotating a field with @MatchBy("[a-z]{1,5]") would for example allow for only matching lines where the property is represented by one to five lower case characters. Configuring mismatch handling is described below.

Writing beans

Similar to reading contents from a source, this utility allows to write a list of beans to a target. Without further configuration, the same pattern as in @MatchBy will be used for writing where the property expressions are substituted by the bean values. This can however result in distorted output since symbols of regular expressions are written as they are. Therefore, a user can define an output pattern by declaring @WritePattern. This pattern understands the same type of property expressions such as @name but does not use regular expressions. Remember that regular expressions must therefore not be escaped when a @WritePattern is specified. A property expression can however be escaped in the same manner as in a @MatchBy statement.

Handling mismatches

When a content source is parsed but a single line cannot be matched to the specified expression, the extraction will abort with throwing a TransformationException. Empty lines will however be skipped. This behavior can be configured by declaring a policy within @Skip. That way, a non-matched line can either ignored, throw an exception or be ignored for empty lines. An empty line a the end of a file is always ignored.

Builder

The BeanTransformer offers a builder which allows to specify different properties. Mostly, it allows to override properties that were declared in a specific bean such as the content pattern provided by @MatchBy, a @Skip policy or the @WritePattern. This allows the reuse of a bean for different content sources that contain the same properties but differ in their display. Also, this allows to provide a pattern at run time.

Performance considerations

All beans are constructed and accessed via field reflection. (It is therefore not required to define setters and getters for beans. A default constructor is however required. In the process, primitive types are accessed as such and not wrapped in order to avoid such overhead. Java reflection is usually considered to be slower than conventional access. However, modern JVMs such as the HotSpot JVM are efficient in detecting the repetitive use of reflection and compile such access into native byte code. Therefore, this tool should not perform worse than a hand-written matcher once the BeanTransformer is set up.

Extension points

Besides providing your own PropertyDelegates, it is possible to implement a custom IDelegationFactory which is responsible for creating (custom) PropertyDelegates for any field. The default implementation SimpleDelegationFactory provides an example for such an implementation. That way, it would be for example possible to automatically create suitable patterns for bean validation (JSR-303) annotations.

The code is licensed under the Apache Software License, Version 2.0.

Mittwoch, 13. November 2013

cglib: The missing manual

The byte code instrumentation library cglib is a popular choice among many well-known Java frameworks such as Hibernate (not anymore) or Spring for doing their dirty work. Byte code instrumentation allows to manipulate or to create classes after the compilation phase of a Java application. Since Java classes are linked dynamically at run time, it is possible to add new classes to an already running Java program. Hibernate uses cglib for example for its generation of dynamic proxies. Instead of returning the full object that you stored ina a database, Hibernate will return you an instrumented version of your stored class that lazily loads some values from the database only when they are requested. Spring used cglib for example when adding security constraints to your method calls. Instead of calling your method directly, Spring security will first check if a specified security check passes and only delegate to your actual method after this verification. Another popular use of cglib is within mocking frameworks such as mockito, where mocks are nothing more than instrumented class where the methods were replaced with empty implementations (plus some tracking logic).

Other than ASM - another very high-level byte code manipulation library on top of which cglib is built - cglib offers rather low-level byte code transformers that can be used without even knowing about the details of a compiled Java class. Unfortunately, the documentation of cglib is rather short, not to say that there is basically none. Besides a single blog article from 2005 that demonstrates the Enhancer class, there is not much to find. This blog article is an attempt to demonstrate cglib and its unfortunately often awkward API.

Enhancer

Let's start with the Enhancer class, the probably most used class of the cglib library. An enhancer allows the creation of Java proxies for non-interface types. The Enhancer can be compared with the Java standard library's Proxy class which was introduced in Java 1.3. The Enhancer dynamically creates a subclass of a given type but intercepts all method calls. Other than with the Proxy class, this works for both class and interface types. The following example and some of the examples after are based on this simple Java POJO:

public class SampleClass {
  public String test(String input) {
    return "Hello world!";
  }
}

Using cglib, the return value of test(String) method can easily be replaced by another value using an Enhancer and a FixedValue callback:

@Test
public void testFixedValue() throws Exception {
  Enhancer enhancer = new Enhancer();
  enhancer.setSuperclass(SampleClass.class);
  enhancer.setCallback(new FixedValue() {
    @Override
    public Object loadObject() throws Exception {
      return "Hello cglib!";
    }
  });
  SampleClass proxy = (SampleClass) enhancer.create();
  assertEquals("Hello cglib!", proxy.test(null));
}

In the above example, the enhancer will return an instance of an instrumented subclass of SampleClass where all method calls return a fixed value which is generated by the anonymous FixedValue implementation above. The object is created by Enhancer#create(Object...) where the method takes any number of arguments which are used to pick any constructor of the enhanced class. (Even though constructors are only methods on the Java byte code level, the Enhancer class cannot instrument constructors. Neither can it instrument static or final classes.) If you only want to create a class, but no instance, Enhancer#createClass will create a Class instance which can be used to create instances dynamically. All constructors of the enhanced class will be available as delegation constructors in this dynamically generated class.

Be aware that any method call will be delegated in the above example, also calls to the methods defined in java.lang.Object. As a result, a call to proxy.toString() will also return "Hello cglib!". In contrast will a call to proxy.hashCode() result in a ClassCastException since the FixedValue interceptor always returns a String even though the Object#hashCode signature requires a primitive integer.

Another observation that can be made is that final methods are not intercepted. An example of such a method is Object#getClass which will return something like "SampleClass$$EnhancerByCGLIB$$e277c63c" when it is invoked. This class name is generated randomly by cglib in order to avoid naming conflicts. Be aware of the different class of the enhanced instance when you are making use of explicit types in your program code. The class generated by cglib will however be in the same package as the enhanced class (and therefore be able to override package-private methods). Similar to final methods, the subclassing approach makes for the inability of enhancing final classes. Therefore frameworks as Hibernate cannot persist final classes.


Next, let us look at a more powerful callback class, the InvocationHandler, that can also be used with an Enhancer:

@Test
public void testInvocationHandler() throws Exception {
  Enhancer enhancer = new Enhancer();
  enhancer.setSuperclass(SampleClass.class);
  enhancer.setCallback(new InvocationHandler() {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) 
        throws Throwable {
      if(method.getDeclaringClass() != Object.class && method.getReturnType() == String.class) {
        return "Hello cglib!";
      } else {
        throw new RuntimeException("Do not know what to do.");
      }
    }
  });
  SampleClass proxy = (SampleClass) enhancer.create();
  assertEquals("Hello cglib!", proxy.test(null));
  assertNotEquals("Hello cglib!", proxy.toString());
}


This callback allows us to answer with regards to the invoked method. However, you should be careful when calling a method on the proxy object that comes with the InvocationHandler#invoke method. All calls on this method will be dispatched with the same InvocationHandler and might therefore result in an endless loop. In order to avoid this, we can use yet another callback dispatcher:

@Test
public void testMethodInterceptor() throws Exception {
  Enhancer enhancer = new Enhancer();
  enhancer.setSuperclass(SampleClass.class);
  enhancer.setCallback(new MethodInterceptor() {
    @Override
    public Object intercept(Object obj, Method method, Object[] args, MethodProxy proxy)
        throws Throwable {
      if(method.getDeclaringClass() != Object.class && method.getReturnType() == String.class) {
        return "Hello cglib!";
      } else {
        proxy.invokeSuper(obj, args);
      }
    }
  });
  SampleClass proxy = (SampleClass) enhancer.create();
  assertEquals("Hello cglib!", proxy.test(null));
  assertNotEquals("Hello cglib!", proxy.toString());
  proxy.hashCode(); // Does not throw an exception or result in an endless loop.
}


The MethodInterceptor allows full control over the intercepted method and offers some utilities for calling the method of the enhanced class in their original state. But why would one want to use other methods anyways? Because the other methods are more efficient and cglib is often used in edge case frameworks where efficiency plays a significant role. The creation and linkage of the MethodInterceptor requires for example the generation of a different type of byte code and the creation of some runtime objects that are not required with the InvocationHandler. Because of that, there are other classes that can be used with the Enhancer:
  • LazyLoader: Even though the LazyLoader's only method has the same method signature as FixedValue, the LazyLoader is fundamentally different to the FixedValue interceptor. The LazyLoader is actually supposed to return an instance of a subclass of the enhanced class. This instance is requested only when a method is called on the enhanced object and then stored for future invocations of the generated proxy. This makes sense if your object is expensive in its creation without knowing if the object will ever be used. Be aware that some constructor of the enhanced class must be called both for the proxy object and for the lazily loaded object. Thus, make sure that there is another cheap (maybe protected) constructor available or use an interface type for the proxy. You can choose the invoked constructed by supplying arguments to Enhancer#create(Object...).
  • Dispatcher: The Dispatcher is like the LazyLoader but will be invoked on every method call without storing the loaded object. This allows to change the implementation of a class without changing the reference to it. Again, be aware that some constructor must be called for both the proxy and the generated objects.
  • ProxyRefDispatcher: This class carries a reference to the proxy object it is invoked from in its signature. This allows for example to delegate method calls to another method of this proxy. Be aware that this can easily cause an endless loop and will always cause an endless loop if the same method is called from within ProxyRefDispatcher#loadObject(Object).
  • NoOp: The NoOp class does not what its name suggests. Instead, it delegates each method call to the enhanced class's method implementation.

At this point, the last two interceptors might not make sense to you. Why would you even want to enhance a class when you will always delegate method calls to the enhanced class anyways? And you are right. These interceptors should only be used together with a CallbackFilter as it is demonstrated in the following code snippet:

@Test
public void testCallbackFilter() throws Exception {
  Enhancer enhancer = new Enhancer();
  CallbackHelper callbackHelper = new CallbackHelper(SampleClass.class, new Class[0]) {
    @Override
    protected Object getCallback(Method method) {
      if(method.getDeclaringClass() != Object.class && method.getReturnType() == String.class) {
        return new FixedValue() {
          @Override
          public Object loadObject() throws Exception {
            return "Hello cglib!";
          };
        }
      } else {
        return NoOp.INSTANCE; // A singleton provided by NoOp.
      }
    }
  };
  enhancer.setSuperclass(MyClass.class);
  enhancer.setCallbackFilter(callbackHelper);
  enhancer.setCallbacks(callbackHelper.getCallbacks());
  SampleClass proxy = (SampleClass) enhancer.create();
  assertEquals("Hello cglib!", proxy.test(null));
  assertNotEquals("Hello cglib!", proxy.toString());
  proxy.hashCode(); // Does not throw an exception or result in an endless loop.
}

The Enhancer instance accepts a CallbackFilter in its Enhancer#setCallbackFilter(CallbackFilter) method where it expects methods of the enhanced class to be mapped to array indices of an array of Callback instances. When a method is invoked on the created proxy, the Enhancer will then choose the according interceptor and dispatch the called method on the corresponding Callback (which is a marker interface for all the interceptors that were introduced so far). To make this API less awkward, cglib offers a CallbackHelper which will represent a CallbackFilter and which can create an array of Callbacks for you. The enhanced object above will be functionally equivalent to the one in the example for the MethodInterceptor but it allows you to write specialized interceptors whilst keeping the dispatching logic to these interceptors separate.

 

How does it work?


When the Enhancer creates a class, it will set create a private static field for each interceptor that was registered as a Callback for the enhanced class after its creation. This also means that class definitions that were created with cglib cannot be reused after their creation since the registration of callbacks does not become a part of the generated class's initialization phase but are prepared manually by cglib after the class was already initialized by the JVM. This also means that classes created with cglib are not technically ready after their initialization and for example cannot be sent over the wire since the callbacks would not exist for the class loaded in the target machine.

Depending on the registered interceptors, cglib might register additional fields such as for example for the MethodInterceptor where two private static fields (one holding a reflective Method and a the other holding MethodProxy) are registered per method that is intercepted in the enhanced class or any of its subclasses. Be aware that the MethodProxy is making excessive use of the FastClass which triggers the creation of additional classes and is described in further detail below.

For all these reasons, be careful when using the Enhancer. And always register callback types defensively, since the MethodInterceptor will for example trigger the creation of additional classes and register additional fields in the enhanced class. This is specifically dangerous since the callback variables are also stored in the enhancer's cache: This implies that the callback instances are not garbage collected (unless all instances and the enhancer's ClassLoader is, what is unusual). This is in particular dangerous when using anonymous classes which silently carry a reference to their outer class. Recall the example above:


@Test
public void testFixedValue() throws Exception {
  Enhancer enhancer = new Enhancer();
  enhancer.setSuperclass(SampleClass.class);
  enhancer.setCallback(new FixedValue() {
    @Override
    public Object loadObject() throws Exception {
      return "Hello cglib!";
    }
  });
  SampleClass proxy = (SampleClass) enhancer.create();
  assertEquals("Hello cglib!", proxy.test(null));
}

The anonymous subclass of FixedValue would become hardly referenced from the enhanced SampleClass such that neither the anonymous FixedValue instance or the class holding the @Test method would ever be garbage collected. This can introduce nasty memory leaks in your applications. Therefore, do not use non-static inner classes with cglib. (I only use them in this blog entry for keeping the examples short.)

Finally, you should never intercept Object#finalize(). Due to the subclassing approach of cglib, intercepting finalize is implemented by overriding it what is in general a bad idea. Enhanced instances that intercept finalize will be treated differently by the garbage collector and will also cause these objects being queued in the JVM's finalization queue. Also, if you (accidentally) create a hard reference to the enhanced class in your intercepted call to finalize, you have effectively created an noncollectable instance. This is in general nothing you want. Note that final methods are never intercepted by cglib. Thus, Object#wait, Object#notify and Object#notifyAll do not impose the same problems. Be however aware that Object#clone can be intercepted what is something you might not want to do.

Immutable bean

cglib's ImmutableBean allows you to create an immutability wrapper similar to for example Collections#immutableSet. All changes of the underlying bean will be prevented by an IllegalStateException (however, not by an UnsupportedOperationException as recommended by the Java API). Looking at some bean

public class SampleBean {
  private String value;
  public String getValue() {
    return value;
  }
  public void setValue(String value) {
    this.value = value;
  }
}

we can make this bean immutable:

@Test(expected = IllegalStateException.class)
public void testImmutableBean() throws Exception {
  SampleBean bean = new SampleBean();
  bean.setValue("Hello world!");
  SampleBean immutableBean = (SampleBean) ImmutableBean.create(bean);
  assertEquals("Hello world!", immutableBean.getValue());
  bean.setValue("Hello world, again!");
  assertEquals("Hello world, again!", immutableBean.getValue());
  immutableBean.setValue("Hello cglib!"); // Causes exception.
}

As obvious from the example, the immutable bean prevents all state changes by throwing an IllegalStateException. However, the state of the bean can be changed by changing the original object. All such changes will be reflected by the ImmutableBean.

Bean generator

The BeanGenerator is another bean utility of cglib. It will create a bean for you at run time:

@Test
public void testBeanGenerator() throws Exception {
  BeanGenerator beanGenerator = new BeanGenerator();
  beanGenerator.addProperty("value", String.class);
  Object myBean = beanGenerator.create();
  
  Method setter = myBean.getClass().getMethod("setValue", String.class);
  setter.invoke(myBean, "Hello cglib!");
  Method getter = myBean.getClass().getMethod("getValue");
  assertEquals("Hello cglib!", getter.invoke(myBean));
}

As obvious from the example, the BeanGenerator first takes some properties as name value pairs. On creation, the BeanGenerator creates the accessors
  • <type> get<name>()
  • void set<name>(<type>)
for you. This might be useful when another library expects beans which it resolved by reflection but you do not know these beans at run time. (An example would be Apache Wicket which works a lot with beans.)

Bean copier

The BeanCopier is another bean utility that copies beans by their property values. Consider another bean with similar properties as SampleBean:

public class OtherSampleBean {
  private String value;
  public String getValue() {
    return value;
  }
  public void setValue(String value) {
    this.value = value;
  }
}

Now you can copy properties from one bean to another:

@Test
public void testBeanCopier() throws Exception {
  BeanCopier copier = BeanCopier.create(SampleBean.class, OtherSampleBean.class, false);
  SampleBean bean = new SampleBean();
  myBean.setValue("Hello cglib!");
  OtherSampleBean otherBean = new OtherSampleBean();
  copier.copy(bean, otherBean, null);
  assertEquals("Hello cglib!", otherBean.getValue());  
}

without being restrained to a specific type. The BeanCopier#copy mehtod takles an (eventually) optional Converter which allows to do some further manipulations on each bean property. If the BeanCopier is created with false as the third constructor argument, the Converter is ignored and can therefore be null.

Bulk bean

A BulkBean allows to use a specified set of a bean's accessors by arrays instead of method calls:

@Test
public void testBulkBean() throws Exception {
  BulkBean bulkBean = BulkBean.create(SampleBean.class, 
      new String[]{"getValue"}, 
      new String[]{"setValue"}, 
      new Class[]{String.class});
  SampleBean bean = new SampleBean();
  bean.setValue("Hello world!");
  assertEquals(1, bulkBean.getPropertyValues(bean).length);
  assertEquals("Hello world!", bulkBean.getPropertyValues(bean)[0]);
  bulkBean.setPropertyValues(bean, new Object[] {"Hello cglib!"});
  assertEquals("Hello cglib!", bean.getValue());
}

The BulkBean takes an array of getter names, an array of setter names and an array of property types as its constructor arguments. The resulting instrumented class can then extracted as an array by BulkBean#getPropertyBalues(Object). Similarly, a bean's properties can be set by BulkBean#setPropertyBalues(Object, Object[]).

Bean map

This is the last bean utility within the cglib library. The BeanMap converts all properties of a bean to a String-to-Object Java Map:

@Test
public void testBeanGenerator() throws Exception {
  SampleBean bean = new SampleBean();
  BeanMap map = BeanMap.create(bean);
  bean.setValue("Hello cglib!");
  assertEquals("Hello cglib", map.get("value"));
}

Additionally, the BeanMap#newInstance(Object) method allows to create maps for other beans by reusing the same Class.

Key factory 

The KeyFactory factory allows the dynamic creation of keys that are composed of multiple values that can be used in for example Map implementations. For doing so, the KeyFactory requires some interface that defines the values that should be used in such a key. This interface must contain a single method by the name newInstance that returns an Object. For example:

public interface SampleKeyFactory {
  Object newInstance(String first, int second);
}

Now an instance of a a key can be created by:

@Test
public void testKeyFactory() throws Exception {
  SampleKeyFactory keyFactory = (SampleKeyFactory) KeyFactory.create(Key.class);
  Object key = keyFactory.newInstance("foo", 42);
  Map<Object, String> map = new HashMap<Object, String>();
  map.put(key, "Hello cglib!");
  assertEquals("Hello cglib!", map.get(keyFactory.newInstance("foo", 42)));
}

The KeyFactory will assure the correct implementation of the Object#equals(Object) and Object#hashCode methods such that the resulting key objects can be used in a Map or a Set. The KeyFactory is also used quite a lot internally in the cglib library.

Mixin

Some might already know the concept of the Mixin class from other programing languages such as Ruby or Scala (where mixins are called traits). cglib Mixins allow the combination of several objects into a single object. However, in order to do so, those objects must be backed by interfaces:

public interface Interface1 {
  String first();
}

public interface Interface2 {
  String second();
}

public class Class1 implements Interface1 {
  @Override 
  public String first() {
    return "first";
  }
}

public class Class2 implements Interface2 {
  @Override 
  public String second() {
    return "second";
  }
}

Now the classes Class1 and Class2 can be combined to a single class by an additional interface:

public interface MixinInterface extends Interface1, Interface2 { /* empty */ }

@Test
public void testMixin() throws Exception {
  Mixin mixin = Mixin.create(new Class[]{Interface1.class, Interface2.class, 
      MixinInterface.class}, new Object[]{new Class1(), new Class2()});
  MixinInterface mixinDelegate = (MixinInterface) mixin;
  assertEquals("first", mixinDelegate.first());
  assertEquals("second", mixinDelegate.second());
}

Admittedly, the Mixin API is rather awkward since it requires the classes used for a mixin to implement some interface such that the problem could also be solved by non-instrumented Java.

String switcher

The StringSwitcher emulates a String to int Java Map:
@Test
public void testStringSwitcher() throws Exception {
  String[] strings = new String[]{"one", "two"};
  int[] values = new int[]{10, 20};
  StringSwitcher stringSwitcher = StringSwitcher.create(strings, values, true);
  assertEquals(10, stringSwitcher.intValue("one"));
  assertEquals(20, stringSwitcher.intValue("two"));
  assertEquals(-1, stringSwitcher.intValue("three"));
}

The StringSwitcher allows to emulate a switch command on Strings such as it is possible with the built-in Java switch statement since Java 7. If using the StringSwitcher in Java 6 or less really adds a benefit to your code remains however doubtful and I would personally not recommend its use.

Interface maker

The InterfaceMaker does what its name suggests: It dynamically creates a new interface.

@Test
public void testInterfaceMaker() throws Exception {
  Signature signature = new Signature("foo", Type.DOUBLE_TYPE, new Type[]{Type.INT_TYPE});
  InterfaceMaker interfaceMaker = new InterfaceMaker();
  interfaceMaker.add(signature, new Type[0]);
  Class iface = interfaceMaker.create();
  assertEquals(1, iface.getMethods().length);
  assertEquals("foo", iface.getMethods()[0].getName());
  assertEquals(double.class, iface.getMethods()[0].getReturnType());
}

Other than any other class of cglib's public API, the interface maker relies on ASM types. The creation of an interface in a running application will hardly make sense since an interface only represents a type which can be used by a compiler to check types. It can however make sense when you are generating code that is to be used in later development.

Method delegate

A MethodDelegate allows to emulate a C#-like delegate to a specific method by binding a method call to some interface. For example, the following code would bind the SampleBean#getValue method to a delegate:

public interface BeanDelegate {
  String getValueFromDelegate();
}

@Test
public void testMethodDelegate() throws Exception {
  SampleBean bean = new SampleBean();
  bean.setValue("Hello cglib!");
  BeanDelegate delegate = (BeanDelegate) MethodDelegate.create(
      bean, "getValue", BeanDelegate.class);
  assertEquals("Hello world!", delegate.getValueFromDelegate());
}

There are however some things to note:
  • The factory method MethodDelegate#create takes exactly one method name as its second argument. This is the method the MethodDelegate will proxy for you.
  • There must be a method without arguments defined for the object which is given to the factory method as its first argument. Thus, the MethodDelegate is not as strong as it could be.
  • The third argument must be an interface with exactly one argument. The MethodDelegate implements this interface and can be cast to it. When the method is invoked, it will call the proxied method on the object that is the first argument.
Furthermore, consider these drawbacks:
  • cglib creates a new class for each proxy. Eventually, this will litter up your permanent generation heap space
  • You cannot proxy methods that take arguments.
  • If your interface takes arguments, the method delegation will simply not work without an exception thrown (the return value will always be null). If your interface requires another return type (even if that is more general), you will get a IllegalArgumentException


Multicast delegate

The MulticastDelegate works a little different than the MethodDelegate even though it aims at similar functionality. For using the MulticastDelegate, we require an object that implements an interface:

public interface DelegatationProvider {
  void setValue(String value);
}

public class SimpleMulticastBean implements DelegatationProvider {
  private String value;
  public String getValue() {
    return value;
  }
  public void setValue(String value) {
    this.value = value;
  }
}

Based on this interface-backed bean we can create a MulticastDelegate that dispatches all calls to setValue(String) to several classes that implement the DelegationProvider interface:

@Test
public void testMulticastDelegate() throws Exception {
  MulticastDelegate multicastDelegate = MulticastDelegate.create(
      DelegatationProvider.class);
  SimpleMulticastBean first = new SimpleMulticastBean();
  SimpleMulticastBean second = new SimpleMulticastBean();
  multicastDelegate = multicastDelegate.add(first);
  multicastDelegate = multicastDelegate.add(second);

  DelegatationProvider provider = (DelegatationProvider)multicastDelegate;
  provider.setValue("Hello world!");

  assertEquals("Hello world!", first.getValue());
  assertEquals("Hello world!", second.getValue());
}

Again, there are some drawbacks:
  • The objects need to implement a single-method interface. This sucks for third-party libraries and is awkward when you use CGlib to do some magic where this magic gets exposed to the normal code. Also, you could implement your own delegate easily (without byte code though but I doubt that you win so much over manual delegation).
  • When your delegates return a value, you will receive only that of the last delegate you added. All other return values are lost (but retrieved at some point by the multicast delegate). 


Constructor delegate

A ConstructorDelegate allows to create a byte-instrumented factory method. For that, that we first require an interface with a single method newInstance which returns an Object and takes any amount of parameters to be used for a constructor call of the specified class. For example, in order to create a ConstructorDelegate for the SampleBean, we require the following to call SampleBean's default (no-argument) constructor:

public interface SampleBeanConstructorDelegate {
  Object newInstance();
}

@Test
public void testConstructorDelegate() throws Exception {
  SampleBeanConstructorDelegate constructorDelegate = (SampleBeanConstructorDelegate) ConstructorDelegate.create(
    SampleBean.class, SampleBeanConstructorDelegate.class);
  SampleBean bean = (SampleBean) constructorDelegate.newInstance();
  assertTrue(SampleBean.class.isAssignableFrom(bean.getClass()));
}


Parallel sorter

The ParallelSorter claims to be a faster alternative to the Java standard library's array sorters when sorting arrays of arrays:

@Test
public void testParallelSorter() throws Exception {
  Integer[][] value = {
    {4, 3, 9, 0},
    {2, 1, 6, 0}
  };
  ParallelSorter.create(value).mergeSort(0);
  for(Integer[] row : value) {
    int former = -1;
    for(int val : row) {
      assertTrue(former < val);
      former = val;
    }
  }
}

The ParallelSorter takes an array of arrays and allows to either apply a merge sort or a quick sort on every row of the array. Be however careful when you use it:
  •  When using arrays of primitives, you have to call merge sort with explicit sorting ranges (e.g. ParallelSorter.create(value).mergeSort(0, 0, 3) in the example. Otherwise, the ParallelSorter has a pretty obvious bug where it tries to cast the primitive array to an array Object[] what will cause a ClassCastException.
  • If the array rows are uneven, the first argument will determine the length of what row to consider. Uneven rows will either lead to the extra values not being considered for sorting or a ArrayIndexOutOfBoundException.
Personally, I doubt that the ParallelSorter really offers a time advantage. Admittedly, I did however not yet try to benchmark it. If you tried it, I'd be happy to hear about it in the comments.


Fast class and fast members

The FastClass promises a faster invocation of methods than the Java reflection API by wrapping a Java class and offering similar methods to the reflection API:

@Test
public void testFastClass() throws Exception {
  FastClass fastClass = FastClass.create(SampleBean.class);
  FastMethod fastMethod = fastClass.getMethod(SampleBean.class.getMethod("getValue"));
  MyBean myBean = new MyBean();
  myBean.setValue("Hello cglib!");
  assertTrue("Hello cglib!", fastMethod.invoke(myBean, new Object[0]));
}

Besides the demonstrated FastMethod, the FastClass can also create FastConstructors but no fast fields. But how can the FastClass be faster than normal reflection? Java reflection is executed by JNI where method invocations are executed by some C-code. The FastClass on the other side creates some byte code that calls the method directly from within the JVM. However, the newer versions of the HotSpot JVM (and probably many other modern JVMs) know a concept called inflation where the JVM will translate reflective method calls into native version's of FastClass when a reflective method is executed often enough. You can even control this behavior (at least on a HotSpot JVM) with setting the sun.reflect.inflationThreshold property to a lower value. (The default is 15.) This property determines after how many reflective invocations a JNI call should be substituted by a byte code instrumented version. I would therefore recommend to not use FastClass on modern JVMs, it can however fine-tune performance on older Java virtual machines.

cglib proxy

The cglib Proxy is a reimplementation of the Java Proxy class mentioned in the beginning of this article. It is intended to allow using the Java library's proxy in Java versions before Java 1.3 and differs only in minor details. The better documentation of the cglib Proxy can however be found in the Java standard library's Proxy javadoc where an example of its use is provided. For this reason, I will skip a more detailed discussion of the cglib's Proxy at this place.

A final word of warning

After this overview of cglib's functionality, I want to speak a final word of warning. All cglib classes generate byte code which results in additional classes being stored in a special section of the JVM's memory: The so called perm space. This permanent space is, as the name suggests, used for permanent objects that do not usually get garbage collected. This is however not completely true: Once a Class is loaded, it cannot be unloaded until the loading ClassLoader becomes available for garbage collection. This is only the case the Class was loaded with a custom ClassLoader which is not a native JVM system ClassLoader. This ClassLoader can be garbage collected if itself, all Classes it ever loaded and all instances of all Classes it ever loaded become available for garbage collection. This means: If you create more and more classes throughout the life of a Java application and if you do not take care of the removal of these classes, you will sooner or later run of of perm space what will result in your application's death by the hands of an OutOfMemoryError. Therefore, use cglib sparingly. However, if you use cglib wisely and carefully, you can really do amazing things with it that go beyond what you can do with non-instrumented Java applications.

Lastly, when creating projects that depend on cglib, you should be aware of the fact that the cglib project is not as well maintained and active as it should be, considering its popularity. The missing documentation is a first hint. The often messy public API a second. But then there are also broken deploys of cglib to Maven central. The mailing list reads like an archive of spam messages. And the release cycles are rather unstable. You might therefore want to have a look at javassist, the only real low-level alternative to cglib. Javassist comes bundled with a pseudo-java compiler what allows to create quite amazing byte code instrumentations without even understanding Java byte code. If you like to get your hands dirty, you might also like ASM on top of which cglib is built. ASM comes with a great documentation of both the library and Java class files and their byte code.

Note that these examples only run with cglib 2.2.2 and are not compatible with the newest release 3 of cglib. Unfortunately, I experienced the newest cglib version to occasionally produce invalid byte code which is why I considered an old version and also use this version in production. Also, note that most projects using cglib move the library to their own namespace in order to avoid version conflicts with other dependencies such as for example demonstrated by the Spring project. You should do the same with your project when making use of cglib. Tools such like jarjar can help you with the automation of this good practice.