Runtime efficiency of Spring applications: the current state of affairs and plans for the future

In light of the recent release of Spring Framework 6.1 and Spring Boot 3.2, we'd like to share an overview of the efforts the Spring team is making to enable developers to optimize the runtime performance of their applications.

In this article we will look at Spring MVC with Virtual Threads, optimized container deployment with GraalVM Native Image, scaling to zero with Project CRaC, and a preview of Spring AOT combined with Project Leyden.

Context

Let's start with the most important question: why should we care about improving the performance of applications deployed in the cloud? The first reason is cost optimization. Everyone wants to pay less for hosting. However, choosing cheaper hosting usually means getting less CPU time, memory, and other resources, which forces our applications to run leaner. We also live in a world where application deployment is likely tied to Kubernetes and containers, and their use requires careful attention to JVM startup time, warm-up time, and memory management.

The Spring team's goal is to provide a variety of options (some of which can be combined) to optimize the footprint and scalability of millions of Spring applications. Our goal is to minimize the number of changes you need to make to your Spring application to take advantage of these improvements. Unfortunately, achieving these goals usually involves trade-offs, which we will try to describe in as much detail as possible. We hope this article gives you enough information to understand clearly how the topics covered apply to your application, and also gives you an idea of what trade-offs you'll have to make in each case.

The general requirement to benefit from improvements in runtime efficiency is to upgrade to Spring Boot 3, which is based on Spring Framework 6, which in turn uses Java 17 as the base version and requires migrating from Java EE (javax package) to Jakarta EE (jakarta package). After upgrading, you gain access to a set of new features that improve the runtime performance of your applications.

Spring MVC with Virtual Threads on JDK 21

Let's start with a recently released technology, available as of Java 21. Virtual Threads are designed to let server applications written in the simple and popular thread-per-request style scale with near-optimal hardware utilization, at a reduced cost.

Virtual Threads make blocking I/O operations less expensive and are ideal for Spring Web MVC applications using servlets. Spring MVC can take full advantage of this innovation, for example, on Tomcat or Jetty configured to use Virtual Threads. In most cases, you won't even need to change existing code to do this. In addition, this method naturally adapts to optimal performance without requiring fine-tuning of the thread pool configuration.
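As a plain-JDK illustration of why this helps (independent of Spring itself), the snippet below is a minimal sketch: on JDK 21, a thousand blocking tasks can run concurrently on virtual threads without exhausting a fixed thread pool, much like blocking servlet request handlers on a Virtual Threads-enabled Tomcat or Jetty.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class VirtualThreadsDemo {
    public static void main(String[] args) {
        // One virtual thread per task: each blocking sleep parks the virtual
        // thread and frees its carrier thread, so thousands of "blocked"
        // tasks cost very little.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 1_000).forEach(i -> executor.submit(() -> {
                Thread.sleep(Duration.ofMillis(100)); // simulated blocking I/O
                return i;
            }));
        } // close() waits for all submitted tasks to complete
        System.out.println("1000 blocking tasks completed");
    }
}
```

Running the same workload on a classic fixed pool of platform threads would either need a thousand OS threads or serialize the sleeps.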

We also received feedback from the Spring community asking us not to force developers into a difficult choice between RestTemplate and reactive WebClient. So we decided to implement a “Virtual Threads-friendly modern HTTP client” called RestClient (which is of course also an attractive option without Virtual Threads) in Spring Framework 6.1. Spring Cloud Gateway and its related infrastructure can benefit from using Virtual Threads just as much as Spring MVC.
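To give an idea of the programming model, here is a sketch of RestClient usage, assuming Spring Framework 6.1 (spring-web) is on the classpath; the URL is a placeholder for illustration:

```java
import org.springframework.web.client.RestClient;

public class RestClientSketch {
    public static void main(String[] args) {
        // A synchronous, fluent API: blocking calls like this are cheap
        // when the request runs on a virtual thread.
        RestClient restClient = RestClient.create();

        String body = restClient.get()
                .uri("https://api.example.com/greetings") // placeholder URL
                .retrieve()
                .body(String.class);

        System.out.println(body);
    }
}
```

The fluent style will feel familiar to WebClient users, but without requiring a reactive runtime.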

So what does this mean for WebFlux and the reactive stack?

We deliberately decided to have different solutions for the blocking and reactive stacks in order to get the maximum benefit from the WebFlux reactive server while keeping the Spring Web MVC stack (the most commonly used stack on start.spring.io, with a standard blocking thread architecture) as compact as possible. Virtual Threads are great for improving the scalability of traditional web applications using Spring MVC on a servlet container. WebFlux, on the other hand, provides an optimal reactive stack, integrating seamlessly with Netty's I/O infrastructure, delivering similar runtime benefits, and using a different programming model.

When you need concurrency at the application level (such as sending multiple remote HTTP requests, potentially streaming, and then combining the results), the Structured Concurrency approach from Project Loom may provide an interesting low-level building block in the future, but it is not the kind of API a typical Spring application developer would use directly (and it is still in preview). For such cases, WebFlux and reactive APIs like Reactor currently provide an unmatched advantage, along with Kotlin Coroutines and their Flow approach, which offers an interesting blend of imperative and declarative styles. RSocket is another example of a major benefit of the reactive interaction model.

Note that you are not required to choose just one, since Spring MVC also provides optional support for the reactive model. So, if you only need threads to run in parallel in some cases, you can simply use a Spring MVC stack with Virtual Threads configured and seamlessly include, for example, interaction with a reactive WebClient in your web controllers, and Spring MVC will adapt the reactive return values to asynchronous Servlet responses. This support for reactivity in Spring MVC is completely optional, and Reactor and Reactive Streams are only needed when actually using reactive endpoints, and when the HTTP stack is based on a Servlet container such as Tomcat or Jetty (not Netty).
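For example, a hypothetical controller in a blocking Spring MVC application can still return a reactive type (this sketch assumes both spring-webmvc and spring-webflux are on the classpath; the backend URL is a placeholder):

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
class QuoteController {

    // Placeholder backend URL, for illustration only
    private final WebClient webClient = WebClient.create("https://quotes.example.com");

    @GetMapping("/quote")
    Mono<String> quote() {
        // Spring MVC adapts the returned Mono to an asynchronous
        // Servlet response; no reactive server is required.
        return webClient.get().uri("/random").retrieve().bodyToMono(String.class);
    }
}
```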

We expect that Virtual Threads will become the most popular choice for those using Spring MVC on Java 21+ with fairly typical web workloads. In general, the Java ecosystem still needs to adapt more fully to Virtual Threads in order to avoid pinning of carrier threads (for example, in common JDBC driver implementations), but even this issue is expected to be resolved soon. To evaluate Virtual Threads, make sure you are using Spring Boot version 3.2 or higher, set the property spring.threads.virtual.enabled to true, and use the latest available versions of libraries and drivers.
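In Spring Boot 3.2, opting in boils down to a single line in application.properties:

```properties
# Run Tomcat request processing (and other supported infrastructure)
# on virtual threads; requires Java 21+
spring.threads.virtual.enabled=true
```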

Optimized container deployment with Spring and GraalVM Native Image

We continue to improve the GraalVM Native Image support that first appeared in Spring Boot 3. The main use case is building an optimized container image using Buildpacks, containing a minimal operating system base layer and your application compiled into a native executable by Spring AOT transformations and the GraalVM native image compiler. No JVM distribution is required.
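With the Spring Boot build plugins, producing such an image is typically a one-liner (shown here for Maven; the Gradle equivalent is the bootBuildImage task):

```shell
# Build an optimized container image containing a native executable
# (uses Buildpacks under the hood; no local GraalVM installation required)
mvn -Pnative spring-boot:build-image
```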

This approach lets you deploy minimal containers that start at peak performance within a few tens of milliseconds (typically 50 times faster to start than on a standard JVM), while also consuming less memory.

GraalVM keeps pace with new Java features and, for example, provides first-class support for Virtual Threads: see Josh Long's recent blog post, "All together now".

GraalVM's excellent runtime characteristics come with some trade-offs. Compiling a native image takes minutes, not seconds. To properly handle reflection, proxies, and other dynamic JVM behavior, additional metadata is required. Spring ships a significant amount of this metadata, but any real project will likely need additional entries to work correctly (for example, for dependencies specific to your organization). Finally, the combination of Spring AOT transformations and GraalVM native image compilation requires freezing the classpath and the Spring Boot bean conditions at build time. You will still be able to change the database URL or password at runtime through configuration, but you will not be able to change the database type or do anything else that changes the structure of the Spring beans.

Historically, another disadvantage has been limited peak performance due to the lack of just-in-time (JIT) compilation, but the advent of Oracle GraalVM licensed under the GraalVM Free Terms and Conditions (see the restrictions it imposes) challenges that assumption. You can subscribe to Buildpacks RFC #294 to track potential future support for this distribution, and also try it now with your production Spring Boot applications, using this simple Dockerfile as a starting point.

With instant startup and immediate peak performance, native Spring Boot applications can scale to zero. Let's figure out what this means.

Scaling to zero

Scaling to zero is a kind of generalization of the serverless concept. Such workloads can run not only on serverless cloud platforms, but also on any Kubernetes cluster or cloud platform that can scale a service to zero when there are no requests to process. On Kubernetes, you can use solutions such as Knative or KEDA to scale to zero. And you can scale to zero with any type of application, including a traditional web application. The most important characteristic of serverless architecture is not the technical implementation, but the pay-as-you-go model it enables.

There are various use cases where scaling to zero can be interesting. The JVM is great when we need to develop high-traffic web applications, but let's be honest: we also have to develop a lot of small line-of-business applications that are typically not used 24/7. Why should we pay for the time when no one is using them? There are also staging environments, which are typically used only for short periods, and microservice architectures, where caching can allow parts of the system to be down for an extended period. And let's not forget high availability, which forces us to keep two instances of each service running in case of an emergency, because our applications start too slowly to recover from a failure quickly.

But how do you scale to zero for projects that cannot accept the trade-offs required for a platform-specific GraalVM image?

JVM Checkpoint Recovery: How to Scale to Zero with Spring and Project CRaC

CRaC is an OpenJDK project, developed by the Azul Systems team and also supported by the AWS Lambda and IBM OpenLiberty teams, that defines a new Java API allowing you to checkpoint and restore an application running on the HotSpot JVM. It is based on CRIU, a project that implements checkpoint/restore functionality on Linux.

The principle is this: you run your application almost as usual, but on a version of the JDK that has CRaC functionality enabled. Then at some point, potentially after the JVM has warmed up under the workload, you initiate a checkpoint save via an API call, executing a jcmd command, calling an HTTP endpoint, etc.

After this, the image of the running, warmed-up JVM in RAM is serialized to disk, allowing you to very quickly restore it later, potentially on another machine with a similar operating system and CPU architecture. The restored process retains all the features of the HotSpot JVM, including further JIT optimizations at runtime.
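Concretely, with a CRaC-enabled JDK the flow looks roughly like this (the snapshot path and jar name below are placeholders):

```shell
# 1. Start the application, telling the JVM where to store the snapshot
java -XX:CRaCCheckpointTo=/tmp/checkpoint -jar myapp.jar

# 2. Once warmed up, trigger the checkpoint from another terminal
jcmd myapp.jar JDK.checkpoint

# 3. Later (possibly on another machine with a similar OS and CPU
#    architecture), restore the warmed-up process in tens of milliseconds
java -XX:CRaCRestoreFrom=/tmp/checkpoint
```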

It's worth noting that checkpoint/restore fits very well with the stop and start phases of the Spring application context lifecycle. CRaC support in Spring Framework 6.1 mainly comes down to mapping the CRaC and Spring lifecycles onto each other; the rest of the work is not CRaC-specific and mostly consists of Spring lifecycle improvements aimed at more gracefully closing and reopening sockets, files, and thread pools. The goal is to support multiple stop and start cycles in addition to the usual single start and stop.
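For code that manages its own resources outside Spring's lifecycle, the org.crac API looks roughly like this (a sketch assuming the org.crac compatibility library is on the classpath; the pool itself is hypothetical):

```java
import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;

// Closes a hypothetical connection pool before the checkpoint is taken
// and recreates it when the process is restored from the snapshot.
class PoolCracResource implements Resource {

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) throws Exception {
        // close sockets, files, and thread pools here
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) throws Exception {
        // reopen them here
    }
}

// Somewhere during startup, register the resource with the global context:
// Core.getGlobalContext().register(new PoolCracResource());
```

Spring Boot performs the equivalent dance automatically for the resources it manages.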

Like GraalVM, Project CRaC allows an application to scale to zero, starting in a few tens of milliseconds even on small servers: roughly 50 times faster than a regular JVM cold start, on par with a GraalVM native image. But let's look at the trade-offs this approach involves.

The first trade-off is that CRaC requires you to run your application once, ahead of time, before it goes live. Should you run it on your CI/CD platform? With or without access to production services? This trade-off raises many non-trivial questions.

The second trade-off involves the need to close and correctly recreate any resources associated with sockets, files, and thread pools according to the CRaC lifecycle. Spring Boot takes care of this for you within the scope it supports. But some libraries do not yet support this, so it may take time before all the technologies you use are covered.

The third trade-off is, in our opinion, the most unpleasant. It may be tempting to create an offline image ready for restoration, but any sensitive information loaded into memory before the checkpoint will be serialized into the snapshot files, which may lead to a leak. For example, the password for a production database could end up on disk.

A potential solution to this issue is to perform the checkpoint without any production environment configuration and then refresh the application configuration during restore. This can be done using Spring Cloud Context and the @RefreshScope annotation. The Spring team may look into this topic in the future to see if it makes sense to add first-class support. You can also adopt a strategy of creating and storing snapshot files as encrypted content directly in your Kubernetes platform, even if this requires deeper platform integration.

The last key characteristic is that CRaC only runs on Linux and requires fine-grained configuration of Linux capabilities to run without privileged mode.

Don't forget that we are at the very beginning of the CRaC project's history, and Spring Boot 3.2 is the very first version to support it. Some of these limitations may be lifted as checkpoint/restore technology evolves along with Spring's support for it. Please also refer to the Spring Framework documentation and https://github.com/sdeleuze/spring-boot-crac-demo if you want to try this technology yourself.

A sneak peek at the future of OpenJDK with Spring AOT and Project Leyden

We saw two ways to scale our Spring-based applications to zero, using GraalVM or CRaC. Each of them forces us to make some compromises. What if there was another way to improve the runtime performance of Spring Boot applications with fewer limitations?

You may have heard of Project Leyden, a new OpenJDK project that aims to improve the startup time, time to peak performance, and footprint of Java programs.

We recommend watching this talk by Brian Goetz if you want to know more.

Project Leyden recently introduced the concept of "premain" optimizations (essentially Class Data Sharing + AOT on steroids), and interestingly, the Java Platform team found significant synergies with Spring Ahead-Of-Time optimizations, which were originally created to support GraalVM native images but can already provide around 15% faster startup on the JVM.
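If an application has been packaged with AOT processing enabled (for example via the Spring Boot plugin's process-aot goal), those optimizations can be switched on for a regular JVM run with a system property; the jar name below is a placeholder:

```shell
# Run on a plain JVM with Spring AOT-generated optimizations enabled
java -Dspring.aot.enabled=true -jar target/myapp.jar
```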

While "premain" optimizations are still largely experimental (currently living in an experimental branch of the Leyden GitHub repository), the Spring team was recently able to measure a 2x to 4x faster startup of the Spring Petclinic demo project by combining Spring AOT on the JVM with optimizations from Project Leyden, along with faster warm-up, and virtually no limitations.

In their current form, unlike GraalVM and CRaC, these optimizations do not enable scaling to zero because they do not allow applications to launch in tens of milliseconds. However, if we get significant improvements in JVM startup and warm-up time with virtually no limitations, this solution has the potential for widespread use and can be combined with other optional Leyden features still under development. We're excited to announce that we've started a collaboration between the Java Platform Group and the Spring team to see how far we can push the boundaries of what's possible with Project Leyden's premain approach. Combined with Spring AOT's JVM-specific improvements, we expect further optimizations applicable to a wide range of Spring applications. We will share more information in the coming months.

Check out the repository if you want to try it yourself.

Conclusion

Feedback from the Spring community around the world has been a key source of inspiration for the Spring team, along with pragmatic collaborations with companies such as Oracle, Bellsoft, Azul and many others.

We're working hard to support new features while minimizing the impact on Spring application development by providing clear upgrade paths for many types of applications. This is the biggest challenge, but also the most rewarding aspect of our strategic infrastructure efforts.

And one last thing. We welcome feedback on what you are most interested in using in your organization and for your projects. Do you think zero-scaling and the pay-as-you-use model are worth the limitations of using GraalVM or CRaC? Is the reduced memory usage provided by GraalVM's platform-specific image a key benefit for you? Do you think Spring AOT on JVM combined with Project Leyden has high potential? What is your point of view on Virtual Threads? Please tell us about it!

