Spring Batch – a user experience

July 2021

After working with Spring Batch for a while and enjoying its simplicity for performing high-volume and high-performance batch jobs, I started being called “mister-data-loader” in the daily standup. After such honors I’d like to share my experience with the framework.

So, what exactly is Spring Batch, and where or when would you use it? It’s an open-source, lightweight framework built around reusable components, and it is easy to pick up for anyone who has some experience with the Spring Framework in general. It also benefits from the usual Spring advantages, such as separation of concerns and ease of configuration. In short: it aims to make repetitive processing of large volumes of data, and all the overhead that comes with it, a lot easier than building it from the ground up. A common use case is periodically syncing data from an external datastore into your own database, so that your application can read the data locally instead of paying the performance cost of calling the external source every time.

A batch process usually consists of reading data from a source, processing it into another format and writing it to a destination; in other words, a data loader. Spring Batch does this mostly using “chunk-oriented” processing (Figure 1), which means that items are read and processed one by one in memory until the number of items equals a predefined chunk size. The entire chunk is then written at once. As I’ve found, this way of working can require some customization from the developer, for example if you retrieve all your data at once from an API call rather than item by item.
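To make the chunk idea concrete, here is a minimal plain-Java sketch of the loop, written without Spring Batch itself. The class and method names (`ChunkSketch`, `run`) are made up for illustration, and the real framework adds transactions, restartability and much more around this loop.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of chunk-oriented processing; not Spring Batch code.
class ChunkSketch {

    /** Reads, processes and writes items chunk by chunk; returns the number of chunks written. */
    public static <I, O> int run(Iterator<I> source, Function<I, O> processor,
                                 Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        while (source.hasNext()) {
            // Read items one by one until the chunk is full or the source is exhausted.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && source.hasNext()) {
                items.add(source.next());
            }
            // Process each item; a null result means "filter this item out".
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                O result = processor.apply(item);
                if (result != null) {
                    processed.add(result);
                }
            }
            // Write the whole chunk at once.
            writer.accept(processed);
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        Function<String, String> toUpper = s -> s.toUpperCase();
        Consumer<List<String>> collect = chunk -> written.add(chunk);

        int chunks = run(List.of("a", "b", "c", "d", "e").iterator(), toUpper, collect, 2);
        System.out.println(chunks + " chunks: " + written); // prints "3 chunks: [[A, B], [C, D], [E]]"
    }
}
```

With five items and a chunk size of 2, the writer is called three times, the last time with a partial chunk.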

To give an idea of what Spring Batch looks like in actual code, I’ll walk through the configuration of a simple job at a high level. As should be clear by now, Spring Batch’s main components are an ItemReader, an ItemProcessor and an ItemWriter, each of which can be configured by including a bean of that type in your configuration. You just implement their read(), process() or write() method and you are good to go. Furthermore, Spring Batch provides a large number of preconfigured ItemReaders and ItemWriters that can be used with minimal configuration, such as JdbcPagingItemReader, KafkaItemReader, FlatFileItemWriter, JsonFileItemWriter, … You can then combine all of these, together with any callback interfaces (listeners, used for logging, tracing, …) and Spring’s PlatformTransactionManager, into a Step bean.
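As a rough sketch of such a configuration, assuming Spring Batch 4.x (the StepBuilderFactory style that was current when this was written) and plain String items: the class name, bean names, sample data and chunk size below are all made up for illustration, and the job repository and transaction manager are assumed to come from @EnableBatchProcessing.

```java
import java.util.Arrays;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class; assumes Spring Batch 4.x on the classpath.
@Configuration
@EnableBatchProcessing
public class LoadStepConfiguration {

    // Reader: serves items from an in-memory list (stand-in for a real source).
    @Bean
    public ItemReader<String> reader() {
        return new ListItemReader<>(Arrays.asList("alice", "bob", "carol"));
    }

    // Processor: transforms one item at a time.
    @Bean
    public ItemProcessor<String, String> processor() {
        return item -> item.toUpperCase();
    }

    // Writer: receives a whole chunk of processed items at once.
    @Bean
    public ItemWriter<String> writer() {
        return items -> items.forEach(System.out::println);
    }

    // Step: wires reader, processor and writer together with a chunk size of 2.
    @Bean
    public Step loadStep(StepBuilderFactory steps,
                         ItemReader<String> reader,
                         ItemProcessor<String, String> processor,
                         ItemWriter<String> writer) {
        return steps.get("loadStep")
                .<String, String>chunk(2)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}
```

The three small beans could each be swapped for one of the preconfigured readers or writers without changing the step definition.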

You can even make the step fault tolerant by adding skip or retry logic that kicks in when specific exceptions are thrown, so that a single bad item does not fail the whole step. Once the steps are defined, you can combine them into a Job bean. When launching the job with a JobLauncher, you can pass additional JobParameters, which can be read through the Spring Expression Language (SpEL) in @StepScope or @JobScope beans. This is especially handy if the job needs to run multiple times with different properties.
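A sketch of how these pieces might fit together, again assuming Spring Batch 4.x. The class and bean names, the "names" job parameter, the exception types and the limits are illustrative choices, not recommendations.

```java
import java.util.Arrays;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class; assumes @EnableBatchProcessing is active elsewhere.
@Configuration
public class FaultTolerantJobConfiguration {

    // Step-scoped bean: created when the step starts, so the job parameter
    // "names" (an assumed parameter) can be resolved through SpEL.
    @Bean
    @StepScope
    public ItemReader<String> namesReader(@Value("#{jobParameters['names']}") String names) {
        return new ListItemReader<>(Arrays.asList(names.split(",")));
    }

    @Bean
    public Step faultTolerantStep(StepBuilderFactory steps, ItemReader<String> namesReader) {
        return steps.get("faultTolerantStep")
                .<String, String>chunk(10)
                .reader(namesReader)
                .writer(items -> items.forEach(System.out::println))
                .faultTolerant()
                // Skip items that fail with this exception, up to 5 skips in total.
                .skip(IllegalArgumentException.class)
                .skipLimit(5)
                // Retry on this exception, giving up after 3 attempts.
                .retry(IllegalStateException.class)
                .retryLimit(3)
                .build();
    }

    // Job: a single-step job; more steps could be chained with next(...).
    @Bean
    public Job loadJob(JobBuilderFactory jobs, Step faultTolerantStep) {
        return jobs.get("loadJob")
                .start(faultTolerantStep)
                .build();
    }
}
```

Such a job could then be launched with something like jobLauncher.run(loadJob, new JobParametersBuilder().addString("names", "alice,bob").toJobParameters()), passing a different "names" value for each run.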

After creating a job, it still needs to be tested. Luckily, Spring Batch comes with an integrated testing framework to make our lives easier. Compared with a traditional Spring test, all you need to do to test a batch job or step end to end is add @SpringBatchTest to the test class and inject a JobLauncherTestUtils bean with @Autowired. After that, a test can be as short as two lines.
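A sketch of what such a test might look like, assuming JUnit 5 and Spring Batch 4.1 or later. LoadJobConfiguration is a hypothetical configuration class that is assumed to define exactly one Job bean and a DataSource, which the Spring Batch test utilities expect to find in the context.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.batch.test.context.SpringBatchTest;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.junit.jupiter.SpringJUnitConfig;

// LoadJobConfiguration is hypothetical: it must provide one Job and a DataSource.
@SpringBatchTest
@SpringJUnitConfig(LoadJobConfiguration.class)
class LoadJobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    void jobCompletes() throws Exception {
        // The two lines that matter: launch the job, then check how it ended.
        JobExecution execution = jobLauncherTestUtils.launchJob();
        assertEquals(BatchStatus.COMPLETED, execution.getStatus());
    }
}
```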

Contrary to my earlier claim that testing complete jobs or steps in Spring Batch is easy, things can get a little trickier when testing lower-level components that are not run within a job context. For example, I’ve found that Spring Boot’s “@MockBean” does not get along with job-scoped beans, and that getting hold of jobParameters outside of your job context can be frustrating.

To conclude, in my experience working with Spring Batch is very intuitive, and you learn as you go. There are endless use cases, especially in enterprise applications, and jobs can be triggered by a scheduler or even from a message queue. There is much more to discover and do with Spring Batch than I’ve summarized in this article. Almost everything is configurable, and Spring provides excellent documentation on how to do so. Maybe consider using it in your next application ;).

Author: Tom Haeldermans

References
https://docs.spring.io/spring-batch/docs/current/reference/html/index.html