SQL Server useBulkCopyForBatchInsert configuration property

Imagine having a tool that can automatically detect JPA and Hibernate performance issues. Wouldn’t that be just awesome?

Well, Hypersistence Optimizer is that tool! And it works with Spring Boot, Spring Framework, Jakarta EE, Java EE, Quarkus, or Play Framework.

So, enjoy spending your time on the things you love rather than fixing performance issues in your production system on a Saturday night!

Introduction

In this article, we are going to see how the SQL Server useBulkCopyForBatchInsert configuration property works when using JDBC, JPA, or Hibernate.

SQL Server PreparedStatement batching

When using JPA and Hibernate, the generated SQL statements are going to be executed using the JDBC PreparedStatement because prepared statements increase the likelihood of statement caching, and you to avoid SQL injection attacks.

By default, when persisting several Post entities:

for (long i = 1; i <= 10; i++) {
    postRepository.persist(
        new Post()
            .setId(i)
            .setTitle(String.format("Post no. %d", i))
    );
}

Hibernate generates the following SQL INSERT statements:

Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 1, 1)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 2, 2)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 3, 3)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 4, 4)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 5, 5)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 6, 6)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 7, 7)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 8, 8)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 9, 9)]
Query:["insert into post (title, id) values (?, ?)"], Params:[(Post no. 10, 10)]

By default, Hibernate uses the executeUpdate method of the JDBC PreparedStatement, so there’s no batching involved.

To enable batching, we need to provide the following Hibernate configuration properties:

spring.jpa.properties.hibernate.jdbc.batch_size=10
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true

When rerunning the previous example that was persisting 10 Post entities, Hibernate is going to execute a single JDBC INSERT:

Type:Prepared, Batch:True, QuerySize:1, BatchSize:10, 
Query:["
    insert into post (title, id) values (?, ?)
"], 
Params:[
    (Post no. 1, 1), (Post no. 2, 2), (Post no. 3, 3), 
    (Post no. 4, 4), (Post no. 5, 5), (Post no. 6, 6), 
    (Post no. 7, 7), (Post no. 8, 8), (Post no. 9, 9), 
    (Post no. 10, 10)
]

If you are using the IDENTITY entity identifier strategy, Hibernate won’t be able to batch the insert statement automatically.

Check out this article.

So, using the default SQL Server JDBC Driver settings, a single statement was sent to the database server.

However, when we inspect the SQL Server query log:

SELECT
    deqs.execution_count [Execution Time],
    dest.TEXT AS [Statement]
 FROM 
    sys.dm_exec_query_stats AS deqs
    CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
ORDER BY 
    deqs.last_execution_time

We can see that our INSERT is executed 10 times as if it were run in a for-loop:

| Execution Count | Statement                                                                    |
|-----------------|------------------------------------------------------------------------------|
| 2               | select next value for post_SEQ                                               |
| 10              | (@P0 varchar(8000),@P1 bigint)insert into post (title, id) values (@P0, @P1) |

Even if JDBC batching helped us reduce 9 network roundtrips as we called the database server only once and not 10 times, the statement is still executed 10 times on the database side.

SQL Server useBulkCopyForBatchInsert JDBC Driver setting

According to the SQL Server JDBC Driver documentation, we can use the useBulkCopyForBatchInsert to transform a batch of INSERT statements into a single multi-value INSERT.

We can enable this setting on the SQLServerDataSource, like this:

SQLServerDataSource dataSource = getSQLServerDataSource();

dataSource.setRewriteBatchedStatements(true);

When we rerun the previous test case that inserts 10 Post entities, we can see that now the SQL Server JDBC Driver uses a SQLServerBulkCopy operation instead of executing a PreparedStatement:

SQL Server useBulkCopyForBatchInsert configuration property

The SQLServerBulkCopy is not logged as prepared statements in the dm_exec_query_stats Dynamic Management View, so you will no longer see the operation logged there.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Depending on the INSERT statements being executed, you may see a performance improvement when enabling the useBulkCopyForBatchInsert setting.

However, prior to enabling it, take a look at the associated GitHub issues to make sure that there is no open issue that might affect you.

Since this feature is rather new, it’s important to have a solid suite of integration tests that can validate both the effectiveness and the efficiency of this setting. Therefore, the test suite should not fail after enabling this setting, and the performance metrics should confirm the expected performance improvement.

Transactions and Concurrency Control eBook

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.