Bytecode manipulation is a powerful tool in the arsenal of the Java developer. It can be used for tasks from compiling alternative programming languages to run in a JVM, to creating new classes on the fly at runtime, to instrumenting classes for performance analysis, to debugging, to altering or enhancing the capabilities of existing compiled classes. Traditionally, however, this power has come at a price: modifying bytecode has required an in-depth knowledge of the class file structure and has necessitated very low-level programming techniques. These costs have proven too much for most developers, and bytecode manipulation has been largely ignored by the mainstream.

The goal of the serp bytecode framework is to tap the full power of bytecode modification while lowering its associated costs. The framework provides a set of high-level APIs for manipulating all aspects of bytecode, from large-scale structures like class member fields to the individual instructions that comprise the code of methods. Advanced manipulation still requires some understanding of the class file format, and especially of the JVM instruction set, but the framework makes it as easy as possible to enter the world of bytecode development.

There are several other excellent bytecode frameworks available. Serp excels, however, in the following areas:

  • Ease of use. Serp provides very high-level APIs for all normal bytecode modification functionality. Additionally, the framework contains a large set of convenience methods to make code that uses it as clean as possible. From overloading its methods so that you do not have to make type conversions, to providing shortcuts for operations like adding default constructors, serp tries to take the pain out of bytecode development.
  • Power. Serp does not hide any of the power of bytecode manipulation behind a limited set of high-level functions. In addition to its available high-level APIs, which themselves cover the functionality all but the most advanced users will ever need, serp gives you direct access to the low-level details of the class file and constant pool. You can even switch back and forth between low-level and high-level operations; serp maintains complete consistency of the class structure at all times. A change to a method descriptor in the constant pool, for example, will immediately change the return values of all the high-level APIs that describe that method.
  • Constant pool management. In the class file format, all constant values are stored in a constant pool of shared entries to minimize the size of class structures. Serp gives you direct access to the constant pool, but most users will never need it; serp's high-level APIs completely abstract management of the constant pool. Any time a new constant is needed, serp will automatically add it to the pool while ensuring that no duplicates ever exist. Serp also does its best to manipulate the pool so that the effects of changing a constant are as expected. For example, changing one instruction to use the string "bar" instead of "foo" will not affect other instructions that use the string "foo", but changing the name of a member field will instantly change all instructions that reference that field to use the new name.
  • Instruction morphing. Dealing with the individual instructions that make up method code is the most difficult part of bytecode manipulation. To facilitate this process, most serp instruction representations have the ability to change their underlying low-level opcodes on the fly as you modify the parameters of the instruction. For example, accessing the constant integer value 0 requires the opcode iconst_0, while accessing the string constant "foo" requires a different opcode, ldc, followed by the constant pool index of "foo". In serp, however, there is only one instruction, constant. This instruction has setValue methods which use the given value to automatically determine the correct opcodes and arguments -- iconst_0 for a value of 0 and ldc plus the proper constant pool index for the value of "foo".
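
The constant pool handling and instruction morphing described in the last two items can be illustrated with a small, self-contained model. The sketch below does not use serp's API: MorphingConstant, poolIndex, and encoding are hypothetical names, and the pool is simplified to a plain list. A real class file stores a string constant as a CONSTANT_String entry pointing at a separate CONSTANT_Utf8 entry, and needs ldc_w for pool indices above 255. Only the opcode selection rules are taken from the JVM instruction set.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a morphing "constant" instruction; this is not serp's own class. */
final class MorphingConstant {
    private final List<Object> pool;   // simplified constant pool shared by all instructions
    private String encoding = "nop";   // mnemonic plus operand, as a disassembler might print it

    MorphingConstant(List<Object> pool) {
        this.pool = pool;
    }

    /** Picks the most compact JVM encoding for an int constant. */
    MorphingConstant setValue(int value) {
        if (value == -1) {
            encoding = "iconst_m1";
        } else if (value >= 0 && value <= 5) {
            encoding = "iconst_" + value;
        } else if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            encoding = "bipush " + value;
        } else if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            encoding = "sipush " + value;
        } else {
            encoding = "ldc #" + poolIndex(value);
        }
        return this;
    }

    /** String constants always live in the pool and are loaded with ldc. */
    MorphingConstant setValue(String value) {
        encoding = "ldc #" + poolIndex(value);
        return this;
    }

    /** Returns the 1-based index of the constant, adding it only if it is not already pooled. */
    private int poolIndex(Object constant) {
        int position = pool.indexOf(constant);
        if (position < 0) {
            pool.add(constant);
            position = pool.size() - 1;
        }
        return position + 1; // class file constant pool indices start at 1
    }

    String encoding() {
        return encoding;
    }

    public static void main(String[] args) {
        List<Object> pool = new ArrayList<>();
        MorphingConstant first = new MorphingConstant(pool).setValue("foo");
        MorphingConstant second = new MorphingConstant(pool).setValue("foo");
        System.out.println(first.encoding() + " / " + second.encoding());

        first.setValue("bar");   // only this instruction is re-pointed
        second.setValue(0);      // morphs from ldc to a one-byte opcode
        System.out.println(first.encoding() + " / " + second.encoding());
    }
}
```

In this model, two instructions that load "foo" share a single pool entry, changing one of them to "bar" leaves the other untouched, and calling setValue(0) replaces the ldc encoding with iconst_0 without the caller ever naming an opcode.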

Serp is not ideally suited to all applications. Here are a few disadvantages of serp:

  • Speed. Serp is not built for speed. Though there are plans for performing incremental parsing, serp currently fully parses class files when a class is loaded, which is a slow process. Also, serp's insistence on full-time consistency between the low-level and high-level class structures slows down both accessor and mutator methods. These factors are less of a concern, though, when creating new classes at runtime (rather than modifying existing code), or when using serp as part of the compilation process. Serp excels in both of these scenarios.
  • Memory. Serp's high-level structures for representing class bytecode are very memory-hungry.
  • Multi-threaded modifications. The serp toolkit is not thread-safe. Multiple threads cannot safely make modifications to the same classes at the same time.
  • Project-level modifications. Changes made in one class in a serp project are not yet automatically propagated to other classes. However, there are plans to implement this, as well as plans to allow operations to modify bytecode based on specified patterns, similar to aspect-oriented programming.
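
Because the toolkit is not thread-safe, callers that share class structures between threads have to serialize their own modifications. The following is a minimal sketch of that idea under stated assumptions: UnsafeClassModel is a hypothetical stand-in for any non-thread-safe bytecode structure, not a serp class, and SerializedEdits simply funnels every change through one lock.

```java
import java.util.ArrayList;
import java.util.List;

/** Funnels all edits to a shared, non-thread-safe model through a single lock. */
final class SerializedEdits {
    private final UnsafeClassModel model = new UnsafeClassModel();
    private final Object lock = new Object();

    void addField(String name) {
        synchronized (lock) {
            model.addField(name);
        }
    }

    int fieldCount() {
        synchronized (lock) {
            return model.fieldCount();
        }
    }

    /** Starts several writer threads and returns the number of fields recorded. */
    static int runDemo(int threads, int perThread) {
        SerializedEdits edits = new SerializedEdits();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    edits.addField("f" + id + "_" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return edits.fieldCount();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000));
    }
}

/** Hypothetical class model with no internal synchronization (not part of serp). */
final class UnsafeClassModel {
    private final List<String> fields = new ArrayList<>();

    void addField(String name) {
        fields.add(name);
    }

    int fieldCount() {
        return fields.size();
    }
}
```

A single coarse lock is the simplest choice here; confining each class to one thread would avoid the contention but requires partitioning the work up front.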

License: BSD License