The difference between loading and initializing classes in Java with a curious example

Hello! Today's article covers some intricacies of loading and initializing classes, plus a little about performance (just a bit, at the very end).

The article was prompted by a question on StackOverflow. Open it, but don't rush to read the answer 😉

Here is the question itself:

I’m making an API wrapper library that supports multiple versions of the API. A public class was added in the recent API version. I’m trying to compile the wrapper against the latest API version and make it check in run time if the new class exists. And I’m trying to avoid reflection and instead catch NoClassDefFoundError and set a flag accordingly. It works until I add a method that returns the class which the non-existent class extends. Then my library fails to load. I mean: BaseClass exists; ChildClass does not exist; the method uses ChildClass internally. If the method returns BaseClass the library fails to load. If the method returns Object the library loads and the error is deferred and can be caught.

To reproduce the effect of a missing class, we remove ChildClass from the application's classpath ourselves, and unexpectedly get a NoClassDefFoundError:

java.lang.NoClassDefFoundError: snippet/TestLoading$ChildClass
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:377)
    at snippet.TestLoading.loadMyClass(TestLoading.java:31)
    at snippet.TestLoading.main(TestLoading.java:25)
Caused by: java.lang.ClassNotFoundException: snippet.TestLoading$ChildClass
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:606)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
    ... 4 more

What is unexpected here is that the application code satisfies none of the class loading conditions that the reader has probably been asked about in interviews.

I'm sure everyone has been asked something like “tell me about class loading” at least once, and has answered roughly as follows:
1) class loading in Java is lazy
2) the virtual machine itself determines the order in which the JRE classes it needs are loaded and initialized
3) user classes (i.e. those that are not part of the JRE) are loaded on first use: when an instance is created or a static method or field is accessed

If I had been asked in an interview, I would have answered the same way (and I would have been wrong). Even experienced developers often confuse loading a class with initializing it. (Linking is performed between the two; it in turn includes verification, preparation and, sometimes, resolution.) At first glance there is not much difference: if we create an object or access static fields or methods, the class has to be loaded anyway (strictly speaking, loading only means bringing the binary representation of the class into memory and creating a Class object for it). Why separate these concepts, then? How can loading be performed without subsequent linking and initialization?

It turns out that without a clear distinction between these concepts we cannot answer the question posed at the beginning of the article.

Let's open section 12.2 of the Java Language Specification (JLS):

Loading refers to the process of finding the binary form of a class or interface with a particular name, perhaps by computing it on the fly, but more typically by retrieving a binary representation previously computed from source code by a Java compiler, and constructing, from that binary form, a Class object to represent the class or interface (§1.4).

Section 12.4.1 tells us about initialization:

A class or interface T will be initialized immediately before the first occurrence of any one of the following:

  • T is a class and an instance of T is created.
  • a static method declared by T is invoked.
  • a static field declared by T is assigned.
  • a static field declared by T is used and the field is not a constant variable (§4.12.4).

I emphasize: the conditions for initializing a class are strictly defined, which cannot be said about the conditions for loading a class.
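The difference is easy to observe from plain Java. The sketch below is only an illustration: the names LoadVsInitDemo and Lazy are made up for this example and do not come from the code under discussion. It loads a nested class with the three-argument Class.forName, passing false for the initialize parameter, then reads a constant variable, and only then assigns a static field.

```java
// Illustrative sketch; LoadVsInitDemo and Lazy are hypothetical names.
class LoadVsInitDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Lazy {
        static final int CONSTANT = 42;  // constant variable (JLS §4.12.4), inlined by javac
        static int counter;              // ordinary static field
        static { LOG.append("init;"); }  // runs only when Lazy is initialized
    }

    static String run() {
        LOG.setLength(0);
        LOG.append("start;");
        try {
            // Load the class but ask the VM not to initialize it (second argument).
            Class.forName(LoadVsInitDemo.class.getName() + "$Lazy", false,
                    LoadVsInitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        LOG.append("loaded;");
        int k = Lazy.CONSTANT;   // reading a constant variable does not initialize Lazy
        LOG.append("constant;");
        Lazy.counter = k;        // assigning a static field initializes Lazy (first time only)
        LOG.append("assigned;");
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call in a fresh JVM the static initializer fires only between "constant;" and "assigned;", that is, after the class has long been loaded.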

In our code, there are two references to ChildClass, but none of the class initialization conditions from JLS §12.4.1 are satisfied at runtime. Nevertheless, the VM loads the class, which at first glance makes no sense.

Separately, I note that the exception is thrown precisely from Class.forName(BaseClassReturner.class.getName()); the call Class.forName(ObjectReturner.class.getName()) runs without errors, although both BaseClassReturner and ObjectReturner refer to ChildClass.
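For readers who want to try this themselves, here is a hypothetical reconstruction of the test program. It is pieced together from the class and method names that appear in the text and the stack trace, it uses the simplified form of getObject(), and the original listing may well differ; the name TestLoadingSketch and the boolean return value of loadMyClass are my own additions.

```java
// Hypothetical reconstruction; the original listing may differ.
class TestLoadingSketch {
    static class BaseClass {}
    static class ChildClass extends BaseClass {}

    public static class ObjectReturner {
        static {
            System.out.println("loaded: " + ObjectReturner.class.getName());
        }
        public Object getObject() {
            return new ChildClass();
        }
    }

    public static class BaseClassReturner {
        static {
            System.out.println("loaded: " + BaseClassReturner.class.getName());
        }
        public BaseClass getObject() {
            return new ChildClass();
        }
    }

    /** Returns true if the class could be loaded, linked and initialized. */
    static boolean loadMyClass(Class<?> clazz) {
        System.out.println("loading: " + clazz.getName() + "...");
        try {
            Class.forName(clazz.getName()); // the one-argument form also initializes
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            e.printStackTrace();            // NoClassDefFoundError is a LinkageError
            return false;
        }
    }

    public static void main(String[] args) {
        loadMyClass(ObjectReturner.class);
        loadMyClass(BaseClassReturner.class);
    }
}
```

With all classes in place both calls succeed. To imitate the missing class, compile the file and delete TestLoadingSketch$ChildClass.class from the output directory before running.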

Thus, we see three strange things:

  • class loading that happens somewhere in the bowels of the VM and is triggered by some mechanism that works even though the target class is never used at runtime
  • an apparent dependency of class loading (eager/lazy) on the signature of a method (which is never called)
  • a NoClassDefFoundError escaping into user code (although the documentation of Class.forName(String) does not mention it at all)

Let's figure it out. First, let's simplify the original version of BaseClassReturner.getObject(), making it as similar as possible to ObjectReturner.getObject():

public static class BaseClassReturner {
  static {
    System.out.println("loaded: " + BaseClassReturner.class.getName());
  }

  public BaseClass getObject() {
    return new ChildClass();
  }
}

Bytecode:

// access flags 0x1
public getObject()Lorg/example/TestLoading$BaseClass;
 L0
  LINENUMBER 49 L0
  NEW org/example/TestLoading$ChildClass
  DUP
  INVOKESPECIAL org/example/TestLoading$ChildClass.<init> ()V
  ARETURN
 L1
  LOCALVARIABLE this Lorg/example/TestLoading$BaseClassReturner; L0 L1 0
  MAXSTACK = 2
  MAXLOCALS = 1

Compare with the bytecode of ObjectReturner.getObject():

// access flags 0x1
public getObject()Ljava/lang/Object;
 L0
  LINENUMBER 43 L0
  NEW org/example/TestLoading$ChildClass
  DUP
  INVOKESPECIAL org/example/TestLoading$ChildClass.<init> ()V
  ARETURN
 L1
  LOCALVARIABLE this Lorg/example/TestLoading$ObjectReturner; L0 L1 0
  MAXSTACK = 2
  MAXLOCALS = 1

Apart from the type of this, the only difference between the methods is the return type, so let's focus on that.

Initially, I went down the wrong path, thinking something like this: since BaseClassReturner.getObject() returns BaseClass, the virtual machine must check for the presence of all its descendants by loading them.

This assumption is clearly erroneous. First, when loading or initializing a class, the virtual machine loads or initializes its parent classes, not its children. Second, it would be absurd to scan the entire classpath looking for subclasses.

You can empirically test that the hypothesis is wrong by rewriting the code as follows:

public static class BaseClassReturner {
  static {
    System.out.println("loaded: " + BaseClassReturner.class.getName());
  }

  public BaseClass getObject() {
    return null;
  }
}

Now no exception is thrown, which proves that there is no inheritance hierarchy check.
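The direction of the dependency (superclass first, subclasses never) can also be observed directly. A minimal sketch, with hypothetical class names A, B, C and D that have nothing to do with the original program:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; all class names here are hypothetical.
class InitOrderDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class A {
        static { EVENTS.add("A"); }
        static void touchA() {}
    }
    static class B extends A {
        static { EVENTS.add("B"); }
    }

    static class C {
        static { EVENTS.add("C"); }
    }
    static class D extends C {
        static { EVENTS.add("D"); }
        static void touchD() {}
    }

    static List<String> run() {
        A.touchA(); // initializes A; its subclass B is left alone
        D.touchD(); // initializes D, and its superclass C before it
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [A, C, D]
    }
}
```

B never shows up in the list: using a parent class does not drag its children along.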

So we need some additional data. Let's run the program with the VM flag -Xlog:class+init,class+load to get a detailed class loading and initialization log:

loading: org.example.TestLoading$ObjectReturner...
[0.393s][info][class,init] Start class verification for: org.example.TestLoading$ObjectReturner
[0.393s][info][class,init] End class verification for: org.example.TestLoading$ObjectReturner
[0.393s][info][class,init] 770 Initializing 'org/example/TestLoading$ObjectReturner' (0x0000000800067450)
loaded: org.example.TestLoading$ObjectReturner
[0.397s][info][class,load] org.example.TestLoading$BaseClassReturner source: file:/C:/Users/STsypanov/IdeaProjects/test/target/classes/
loading: org.example.TestLoading$BaseClassReturner...
[0.397s][info][class,init] Start class verification for: org.example.TestLoading$BaseClassReturner
[0.398s][info][class,init] 771 Initializing 'java/lang/ReflectiveOperationException'(no method) (0x0000000800004028)
[0.398s][info][class,init] 772 Initializing 'java/lang/ClassNotFoundException'(no method) (0x0000000800004288)
[0.398s][info][class,init] 773 Initializing 'java/lang/LinkageError'(no method) (0x00000008000044f8)
[0.398s][info][class,init] 774 Initializing 'java/lang/NoClassDefFoundError'(no method) (0x0000000800004758)
[0.398s][info][class,init] Verification for org.example.TestLoading$BaseClassReturner has exception pending 'java.lang.NoClassDefFoundError org/example/TestLoading$ChildClass'
[0.398s][info][class,init] End class verification for: org.example.TestLoading$BaseClassReturner
[0.398s][info][class,load] java.lang.Throwable$PrintStreamOrWriter source: jrt:/java.base
[0.398s][info][class,load] java.lang.Throwable$WrappedPrintStream source: jrt:/java.base
[0.398s][info][class,init] 775 Initializing 'java/lang/Throwable$PrintStreamOrWriter'(no method) (0x00000008000a0ed8)
[0.398s][info][class,init] 776 Initializing 'java/lang/Throwable$WrappedPrintStream'(no method) (0x00000008000a10f0)
java.lang.NoClassDefFoundError: org/example/TestLoading$ChildClass
[0.399s][info][class,init] 777 Initializing 'java/lang/StackTraceElement'(no method) (0x0000000800010858)
[0.399s][info][class,load] java.lang.StackTraceElement$HashedModules source: jrt:/java.base
[0.399s][info][class,init] 778 Initializing 'java/lang/StackTraceElement$HashedModules' (0x00000008000a1320)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:315)
at org.example.TestLoading.loadMyClass(TestLoading.java:29)
at org.example.TestLoading.main(TestLoading.java:23)
Caused by: java.lang.ClassNotFoundException: org.example.TestLoading$ChildClass
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
... 4 more

The interesting part is this:

[0.397s][info][class,load] org.example.TestLoading$BaseClassReturner source: file:/C:/Users/STsypanov/IdeaProjects/test/target/classes/
loading: org.example.TestLoading$BaseClassReturner...
[0.397s][info][class,init] Start class verification for: org.example.TestLoading$BaseClassReturner
[0.398s][info][class,init] 771 Initializing 'java/lang/ReflectiveOperationException'(no method) (0x0000000800004028)
[0.398s][info][class,init] 772 Initializing 'java/lang/ClassNotFoundException'(no method) (0x0000000800004288)
[0.398s][info][class,init] 773 Initializing 'java/lang/LinkageError'(no method) (0x00000008000044f8)
[0.398s][info][class,init] 774 Initializing 'java/lang/NoClassDefFoundError'(no method) (0x0000000800004758)
[0.398s][info][class,init] Verification for org.example.TestLoading$BaseClassReturner has exception pending 'java.lang.NoClassDefFoundError org/example/TestLoading$ChildClass'
[0.398s][info][class,init] End class verification for: org.example.TestLoading$BaseClassReturner

Note that before the class can be initialized it has to be verified, and it is verification that ends with NoClassDefFoundError (a LinkageError). So the trouble originates in the check of the class's contents described in section 4.10 of the Java Virtual Machine Specification (JVMS), which says the following about the areturn instruction:

An areturn instruction is type safe iff the enclosing method has a declared return type, ReturnType, that is a reference type, and one can validly pop a type matching ReturnType off the incoming operand stack.

The return type is described as follows:

If the method returns a reference type, only an areturn instruction may be used, and the type of the returned value must be assignment compatible with the return descriptor of the method (§4.3.3).
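Why would an assignment-compatibility check load anything? Below is a rough model of such a check, written in Java. It is a simplification based on my reading of the assignability rules in JVMS §4.10, not HotSpot's actual code, and the names AssignabilityModel, isAssignable and check are made up for this sketch. The point is that a target of java.lang.Object can be decided from the names alone, while any other class target forces the verifier to load the source class and walk its superclasses.

```java
// Simplified model of a verifier-style assignability check; NOT HotSpot's real code.
class AssignabilityModel {

    static boolean isAssignable(String from, String to) throws ClassNotFoundException {
        // Same name, or a target of Object: decidable without loading anything.
        if (from.equals(to) || to.equals("java.lang.Object")) {
            return true;
        }
        ClassLoader loader = AssignabilityModel.class.getClassLoader();
        // Load (but do not initialize) the target to see what kind of type it is.
        Class<?> target = Class.forName(to, false, loader);
        if (target.isInterface()) {
            return true; // for assignments the verifier treats interfaces like Object
        }
        // Only now is the source class needed: this is where a missing class surfaces.
        Class<?> source = Class.forName(from, false, loader);
        return target.isAssignableFrom(source);
    }

    /** Convenience wrapper that reports the outcome as text. */
    static String check(String from, String to) {
        try {
            return isAssignable(from, to) ? "assignable" : "not assignable";
        } catch (ClassNotFoundException e) {
            return "missing class";
        }
    }

    public static void main(String[] args) {
        System.out.println(check("no.such.ChildClass", "java.lang.Object"));
        System.out.println(check("no.such.ChildClass", "java.lang.Number"));
    }
}
```

In this model a non-existent class is happily "assignable" to Object, while the same class checked against java.lang.Number fails because it has to be loaded first. That mirrors the difference between ObjectReturner and BaseClassReturner.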

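So the culprit is the verification of the loaded class's bytecode. To prove that the ChildClass value on the operand stack is assignment compatible with the declared return type BaseClass, the verifier has to load ChildClass, whereas for a return type of Object any reference type is trivially compatible and nothing needs to be loaded. Now let's run the original program with -noverify, which disables this check. And lo and behold: the exception is no longer thrown.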

Together with -XX:TieredStopAtLevel=1, this flag is set by default when Spring Boot applications are launched from IntelliJ IDEA, which significantly speeds up startup.

Practical conclusion: nothing prevents us from using this trick with large, slow docker-compose files that contain many Java applications (both our own and the various ZooKeepers, Eurekas, Elastics, etc.).
In production, this flag should be used very carefully (or not at all), especially if your application creates a lot of classes on the fly: in addition to the obvious security problems, it can even crash the virtual machine.
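One way to make sure the flag does not slip into production unnoticed is a small check at startup. The sketch below is hypothetical (the VerificationGuard class and its method are made up for this example) and assumes that the JVM reports the flags verbatim through RuntimeMXBean.getInputArguments(); it only looks for the -noverify and -Xverify:none spellings.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup guard; class and method names are made up for this sketch.
class VerificationGuard {

    /** True if the given JVM arguments switch bytecode verification off. */
    static boolean verificationDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-noverify") || jvmArgs.contains("-Xverify:none");
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to main), as reported by the runtime.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (verificationDisabled(jvmArgs)) {
            System.err.println("WARNING: bytecode verification is disabled");
        }
    }
}
```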

Conclusions:

  • loading a class is not always followed by its verification and initialization
  • bytecode verification can trigger class loading
  • in some cases, bytecode verification can be disabled to speed up application startup

Discussion on the core-libs-dev mailing list: https://mail.openjdk.org/pipermail/core-libs-dev/2023-May/106219.html
