Problem with N + 1 queries in JPA and Hibernate

Ahead of the course “Highload Architect” we invite you to attend an open lesson on the topic “Patterns of horizontal scaling of repositories”

In the meantime, we share the traditional translation of a useful article.


Introduction

In this article, I will explain what the N + 1 query problem is with JPA and Hibernate and how best to fix it.

The N + 1 problem is not specific to JPA and Hibernate, and you may encounter it when using other data access technologies.

What is the N + 1 problem

The N + 1 problem occurs when the data access framework executes N additional SQL queries to get the same data that can be obtained from a single SQL query.

The larger the value of N, the more queries will be executed and the greater the impact on performance. And although slow request log may help you find slow queries, but it will not find the N + 1 problem, since each separate additional query is fast enough.

The problem lies in the execution of many additional requests, which in total are already being executed for a significant time, affecting performance.

Consider the following database tables: post (posts) and post_comments (comments on posts), which are related by a relationship one-to-many:

Let’s insert four lines into the post table:

INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 1', 1)
  
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 2', 2)
  
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 3', 3)
  
INSERT INTO post (title, id)
VALUES ('High-Performance Java Persistence - Part 4', 4)

And in the table post_comment four child records:

INSERT INTO post_comment (post_id, review, id)
VALUES (1, 'Excellent book to understand Java Persistence', 1)
  
INSERT INTO post_comment (post_id, review, id)
VALUES (2, 'Must-read for Java developers', 2)
  
INSERT INTO post_comment (post_id, review, id)
VALUES (3, 'Five Stars', 3)
  
INSERT INTO post_comment (post_id, review, id)
VALUES (4, 'A great reference book', 4)

N + 1 problem with plain SQL

As already mentioned, the N + 1 problem can arise when using any data access technology, even when using SQL directly.

If you choose post_comments using the following SQL query:

List<Tuple> comments = entityManager.createNativeQuery("""
    SELECT
        pc.id AS id,
        pc.review AS review,
        pc.post_id AS postId
    FROM post_comment pc
    """, Tuple.class)
.getResultList();

And later, decide to get the title of the associated post for each comment (post_comment):

for (Tuple comment : comments) {
    String review = (String) comment.get("review");
    Long postId = ((Number) comment.get("postId")).longValue();
 
    String postTitle = (String) entityManager.createNativeQuery("""
        SELECT
            p.title
        FROM post p
        WHERE p.id = :postId
        """)
    .setParameter("postId", postId)
    .getSingleResult();
 
    LOGGER.info(
        "The Post '{}' got this review '{}'",
        postTitle,
        review
    );
}

You will get an N + 1 problem because instead of one SQL query, you did five (1 + 4):

SELECT
    pc.id AS id,
    pc.review AS review,
    pc.post_id AS postId
FROM post_comment pc
 
SELECT p.title FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
    
SELECT p.title FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
     
SELECT p.title FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
     
SELECT p.title FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

It is very easy to fix this problem with N + 1 queries. All you need to do is retrieve all the data you need with one SQL query, like this:

List<Tuple> comments = entityManager.createNativeQuery("""
    SELECT
        pc.id AS id,
        pc.review AS review,
        p.title AS postTitle
    FROM post_comment pc
    JOIN post p ON pc.post_id = p.id
    """, Tuple.class)
.getResultList();
 
for (Tuple comment : comments) {
    String review = (String) comment.get("review");
    String postTitle = (String) comment.get("postTitle");
 
    LOGGER.info(
        "The Post '{}' got this review '{}'",
        postTitle,
        review
    );
}

This time, only one SQL query is executed and all the data that we want to use in the future is returned.

N + 1 problem with JPA and Hibernate

When using JPA and Hibernate there are several ways to get the N + 1 problem, so it is very important to know how to avoid such situations.

Consider the following classes that map to the post and post_comments tables:

JPA mapping looks like this:

@Entity(name = "Post")
@Table(name = "post")
public class Post {
 
    @Id
    private Long id;
 
    private String title;
 
    //Getters and setters omitted for brevity
}
 
@Entity(name = "PostComment")
@Table(name = "post_comment")
public class PostComment {
 
    @Id
    private Long id;
 
    @ManyToOne
    private Post post;
 
    private String review;
 
    //Getters and setters omitted for brevity
}

FetchType.EAGER

Using explicit or implicit FetchType.EAGER for JPA associations is a bad idea because a lot more data will be loaded than you need. Moreover, the strategy FetchType.EAGER also prone to N + 1 problems.

Sorry associations @ManyToOne and @OneToOne by default use FetchType.EAGER, so if your mappings look like this:

@ManyToOne
private Post post;

You are using FetchType.EAGER and every time you forget to specify JOIN FETCH when loading entities PostComment using JPQL Query or Criteria API:

List<PostComment> comments = entityManager
.createQuery("""
    select pc
    from PostComment pc
    """, PostComment.class)
.getResultList();

You are facing N + 1 problem:

SELECT
    pc.id AS id1_1_,
    pc.post_id AS post_id3_1_,
    pc.review AS review2_1_
FROM
    post_comment pc
 
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4

Note the additional SELECT queries that came up because before returning the list of entities PostComment it is necessary to extract the association with post

In contrast to the defaults used in the method find of EntityManager, JPQL Queries and Criteria API explicitly specify a fetch plan, which Hibernate cannot change by automatically applying JOIN FETCH… So you have to do it manually.

If you don’t need an association with postthen out of luck: using FetchType.EAGER there is no way to avoid getting it. So the default is better to use FetchType.LAZY

But if you want to use the association with post, then you can use JOIN FETCHto avoid the N + 1 problem:

List<PostComment> comments = entityManager.createQuery("""
    select pc
    from PostComment pc
    join fetch pc.post p
    """, PostComment.class)
.getResultList();
 
for(PostComment comment : comments) {
    LOGGER.info(
        "The Post '{}' got this review '{}'",
        comment.getPost().getTitle(),
        comment.getReview()
    );
}

This time Hibernate will execute one SQL query:

SELECT
    pc.id as id1_1_0_,
    pc.post_id as post_id3_1_0_,
    pc.review as review2_1_0_,
    p.id as id1_0_1_,
    p.title as title2_0_1_
FROM
    post_comment pc
INNER JOIN
    post p ON pc.post_id = p.id
     
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
 
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
 
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
 
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

Learn more about why the strategy should be avoided FetchType.EAGER, read in this article

FetchType.LAZY

Even if you explicitly switch to using FetchType.LAZY for all associations, then you can still run into the N + 1 problem.

This time the association with post is mapped like this:

@ManyToOne(fetch = FetchType.LAZY)
private Post post;

Now when you ask PostComment:

List<PostComment> comments = entityManager
.createQuery("""
    select pc
    from PostComment pc
    """, PostComment.class)
.getResultList();

Hibernate will execute one SQL query:

SELECT
    pc.id AS id1_1_,
    pc.post_id AS post_id3_1_,
    pc.review AS review2_1_
FROM
    post_comment pc

But if you later refer to this lazy-load association with post:

for(PostComment comment : comments) {
    LOGGER.info(
        "The Post '{}' got this review '{}'",
        comment.getPost().getTitle(),
        comment.getReview()
    );
}

You will get an N + 1 query problem:

SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 1
-- The Post 'High-Performance Java Persistence - Part 1' got this review
-- 'Excellent book to understand Java Persistence'
 
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 2
-- The Post 'High-Performance Java Persistence - Part 2' got this review
-- 'Must-read for Java developers'
 
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 3
-- The Post 'High-Performance Java Persistence - Part 3' got this review
-- 'Five Stars'
 
SELECT p.id AS id1_0_0_, p.title AS title2_0_0_ FROM post p WHERE p.id = 4
-- The Post 'High-Performance Java Persistence - Part 4' got this review
-- 'A great reference book'

Since the association with post is loaded lazily, accessing this association will execute an additional SQL query to get the required data.

Again, the solution is to add JOIN FETCH to request JPQL:

List<PostComment> comments = entityManager.createQuery("""
    select pc
    from PostComment pc
    join fetch pc.post p
    """, PostComment.class)
.getResultList();
 
for(PostComment comment : comments) {
    LOGGER.info(
        "The Post '{}' got this review '{}'",
        comment.getPost().getTitle(),
        comment.getReview()
    );
}

And as in the example with FetchType.EAGER, this JPQL query will generate one SQL query.

Even if you use FetchType.LAZY and do not reference the child associations of a bidirectional relationship @OneToOne, you can still get N + 1.

More on how to overcome the N + 1 c problem @OneToOne-associations, read in this article

Second level cache

The N + 1 problem can also occur when using a second-level cache to process collections or query results.

For example, if you run the following JPQL query using the query cache:

List<PostComment> comments = entityManager.createQuery("""
    select pc
    from PostComment pc
    order by pc.post.id desc
    """, PostComment.class)
.setMaxResults(10)
.setHint(QueryHints.HINT_CACHEABLE, true)
.getResultList();

If a PostComment is not in the second-level cache, then N requests will be executed to get each individual PostComment:

-- Checking cached query results in region: org.hibernate.cache.internal.StandardQueryCache
-- Checking query spaces are up-to-date: [post_comment]
-- [post_comment] last update timestamp: 6244574473195524, result set timestamp: 6244574473207808
-- Returning cached query results
  
SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 3
  
SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 2
  
SELECT pc.id AS id1_1_0_,
       pc.post_id AS post_id3_1_0_,
       pc.review AS review2_1_0_
FROM post_comment pc
WHERE pc.id = 1

Only entity IDs are stored in the query cache PostComment… Thus, if entities PostComment are not in the cache, they will be fetched from the database and you will receive N additional SQL queries.


More about the course “Highload Architect”

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *