Problem with N + 1 queries in JPA and Hibernate
Ahead of the course “Highload Architect” we invite you to attend an open lesson on the topic “Patterns of horizontal scaling of repositories”…
In the meantime, we share the traditional translation of a useful article.
Introduction
In this article, I will explain what the N + 1 query problem is with JPA and Hibernate and how best to fix it.
The N + 1 problem is not specific to JPA and Hibernate, and you may encounter it when using other data access technologies.
What is the N + 1 problem
The N + 1 problem occurs when the data access framework executes N additional SQL queries to get the same data that can be obtained from a single SQL query.
The larger the value of N, the more queries will be executed and the greater the impact on performance. And although slow request log may help you find slow queries, but it will not find the N + 1 problem, since each separate additional query is fast enough.
The problem lies in the execution of many additional requests, which in total are already being executed for a significant time, affecting performance.
Consider the following database tables: post (posts) and post_comments (comments on posts), which are related by a relationship one-to-many:
Let’s insert four lines into the post table:
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 1', 1)
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 2', 2)
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 3', 3)
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 4', 4)
And in the table post_comment
four child records:
INSERT INTO post_comment (post_id, review, id)
VALUES (1, 'Excellent book to understand Java Persistence', 1)
INSERT INTO post_comment (post_id, review, id)
VALUES (2, 'Must-read for Java developers', 2)
INSERT INTO post_comment (post_id, review, id)
VALUES (3, 'Five Stars', 3)
INSERT INTO post_comment (post_id, review, id)
VALUES (4, 'A great reference book', 4)
N + 1 problem with plain SQL
As already mentioned, the N + 1 problem can arise when using any data access technology, even when using SQL directly.
If you choose post_comments
using the following SQL query:
List<Tuple> comments = entityManager.createNativeQuery("""
SELECT
pc.id AS id,
pc.review AS review,
pc.post_id AS postId
FROM post_comment pc
""", Tuple.class)
.getResultList();
And later, decide to get the title of the associated post for each comment (post_comment):
for (Tuple comment : comments) {
String review = (String) comment.get("review");
Long postId = ((Number) comment.get("postId")).longValue();
String postTitle = (String) entityManager.createNativeQuery("""
SELECT
p.title
FROM post p
WHERE p.id = :postId
""")
.setParameter("postId", postId)
.getSingleResult();
LOGGER.info(
"The Post '{}' got this review '{}'",
postTitle,
review
);
}
You will get an N + 1 problem because instead of one SQL query, you did five (1 + 4):
SELECT
pc.id AS id,
pc.review AS review,
pc.post_id AS postId
FROM post_comment pc
SELECT p.title FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
SELECT p.title FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
SELECT p.title FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
SELECT p.title FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'
It is very easy to fix this problem with N + 1 queries. All you need to do is retrieve all the data you need with one SQL query, like this:
List<Tuple> comments = entityManager.createNativeQuery("""
SELECT
pc.id AS id,
pc.review AS review,
p.title AS postTitle
FROM post_comment pc
JOIN post p ON pc.post_id = p.id
""", Tuple.class)
.getResultList();
for (Tuple comment : comments) {
String review = (String) comment.get("review");
String postTitle = (String) comment.get("postTitle");
LOGGER.info(
"The Post '{}' got this review '{}'",
postTitle,
review
);
}
This time, only one SQL query is executed and all the data that we want to use in the future is returned.
N + 1 problem with JPA and Hibernate
When using JPA and Hibernate there are several ways to get the N + 1 problem, so it is very important to know how to avoid such situations.
Consider the following classes that map to the post and post_comments tables:
JPA mapping looks like this:
@Entity(name = "Post")
@Table(name = "post")
public class Post {
@Id
private Long id;
private String title;
//Getters and setters omitted for brevity
}
@Entity(name = "PostComment")
@Table(name = "post_comment")
public class PostComment {
@Id
private Long id;
@ManyToOne
private Post post;
private String review;
//Getters and setters omitted for brevity
}
FetchType.EAGER
Using explicit or implicit FetchType.EAGER
for JPA associations is a bad idea because a lot more data will be loaded than you need. Moreover, the strategy FetchType.EAGER
also prone to N + 1 problems.
Sorry associations @ManyToOne
and @OneToOne
by default use FetchType.EAGER
, so if your mappings look like this:
@ManyToOne
private Post post;
You are using FetchType.EAGER
and every time you forget to specify JOIN FETCH
when loading entities PostComment
using JPQL Query or Criteria API:
List<PostComment> comments = entityManager
.createQuery("""
select pc
from PostComment pc
""", PostComment.class)
.getResultList();
You are facing N + 1 problem:
SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4
Note the additional SELECT queries that came up because before returning the list of entities PostComment
it is necessary to extract the association with post
…
In contrast to the defaults used in the method find
of EntityManager
, JPQL Queries and Criteria API explicitly specify a fetch plan, which Hibernate cannot change by automatically applying JOIN FETCH
… So you have to do it manually.
If you don’t need an association with post
then out of luck: using FetchType.EAGER
there is no way to avoid getting it. So the default is better to use FetchType.LAZY
…
But if you want to use the association with post
, then you can use JOIN FETCH
to avoid the N + 1 problem:
List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
join fetch pc.post p
""", PostComment.class)
.getResultList();
for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}
This time Hibernate will execute one SQL query:
SELECT
pc.id as id1_1_0_,
pc.post_id as post_id3_1_0_,
pc.review as review2_1_0_,
p.id as id1_0_1_,
p.title as title2_0_1_
FROM
post_comment pc
INNER JOIN
post p ON pc.post_id = p.id
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'
Learn more about why the strategy should be avoided FetchType.EAGER
, read in this article…
FetchType.LAZY
Even if you explicitly switch to using FetchType.LAZY
for all associations, then you can still run into the N + 1 problem.
This time the association with post is mapped like this:
@ManyToOne(fetch = FetchType.LAZY)
private Post post;
Now when you ask PostComment
:
List<PostComment> comments = entityManager
.createQuery("""
select pc
from PostComment pc
""", PostComment.class)
.getResultList();
Hibernate will execute one SQL query:
SELECT
pc.id AS id1_1_,
pc.post_id AS post_id3_1_,
pc.review AS review2_1_
FROM
post_comment pc
But if you later refer to this lazy-load association with post
:
for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}
You will get an N + 1 query problem:
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'
Since the association with post
is loaded lazily, accessing this association will execute an additional SQL query to get the required data.
Again, the solution is to add JOIN FETCH
to request JPQL
:
List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
join fetch pc.post p
""", PostComment.class)
.getResultList();
for(PostComment comment : comments) {
LOGGER.info(
"The Post '{}' got this review '{}'",
comment.getPost().getTitle(),
comment.getReview()
);
}
And as in the example with FetchType.EAGER
, this JPQL query will generate one SQL query.
Even if you use FetchType.LAZY
and do not reference the child associations of a bidirectional relationship @OneToOne
, you can still get N + 1.
More on how to overcome the N + 1 c problem @OneToOne
-associations, read in this article…
Second level cache
The N + 1 problem can also occur when using a second-level cache to process collections or query results.
For example, if you run the following JPQL query using the query cache:
List<PostComment> comments = entityManager.createQuery("""
select pc
from PostComment pc
order by pc.post.id desc
""", PostComment.class)
.setMaxResults(10)
.setHint(QueryHints.HINT_CACHEABLE, true)
.getResultList();
If a PostComment
is not in the second-level cache, then N requests will be executed to get each individual PostComment
:
-- Checking cached query results in region: org.hibernate.cache.internal.StandardQueryCache
-- Checking query spaces are up-to-date: [post_comment]
-- [post_comment] last update timestamp: 6244574473195524, result set timestamp: 6244574473207808
-- Returning cached query results
SELECT pc.id AS id1_1_0_,
pc.post_id AS post_id3_1_0_,
pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 3
SELECT pc.id AS id1_1_0_,
pc.post_id AS post_id3_1_0_,
pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 2
SELECT pc.id AS id1_1_0_,
pc.post_id AS post_id3_1_0_,
pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 1
Only entity IDs are stored in the query cache PostComment
… Thus, if entities PostComment
are not in the cache, they will be fetched from the database and you will receive N additional SQL queries.