July 2021
After working with Spring Batch for a while and enjoying its simplicity for performing high-volume and high-performance batch jobs, I started being called “mister-data-loader” in the daily standup. After such honors I’d like to share my experience with the framework.
So, what is Spring Batch exactly, and where or when would you use it? It's a lightweight, open source framework built around reusable components, and it's easy to pick up for anyone who has some experience with the general Spring framework. It also benefits from the usual Spring framework advantages, like separation of concerns and ease of configuration. Briefly said: it aims to make repetitive processing of large volumes of data, and all the overhead associated with this, a lot easier than doing it from the ground up. A common use case is periodically syncing data from an external datastore to your own database, so that your application reads from its own database and doesn't lose performance by querying the external source every time.
A batch process usually consists of reading some data from a certain source, processing it into another format and writing it to a destination, a.k.a. a data-loader. Spring Batch does this mostly using "chunk-oriented" processing (Figure 1), which means that items are read and processed one by one in memory until the number of items is equal to a predefined chunk size. Afterwards, the entire chunk is written at once. As I've found, this way of working can require some customization by the developer, for example if you retrieve all your data at once from an API call rather than item by item.
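To make the chunk idea concrete, here is a minimal sketch of that loop in plain Java. It is not Spring Batch code: `ChunkSketch`, `runStep` and `demo` are hypothetical names I made up for illustration, and the reader, processor and writer are ordinary Java functional types standing in for the roles that Spring Batch's `ItemReader`, `ItemProcessor` and `ItemWriter` play.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical, framework-free illustration of chunk-oriented processing.
public class ChunkSketch {

    // Reads items one by one, processes each, and writes them chunk by chunk.
    public static <I, O> void runStep(Iterator<I> reader,
                                      Function<I, O> processor,
                                      Consumer<List<O>> writer,
                                      int chunkSize) {
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            I item = reader.next();            // read one item
            chunk.add(processor.apply(item));  // process it in memory
            if (chunk.size() == chunkSize) {   // chunk is full: write it at once
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {                // write the final, partial chunk
            writer.accept(new ArrayList<>(chunk));
        }
    }

    // Upper-cases every item and collects each written chunk so we can inspect it.
    public static List<List<String>> demo(List<String> source, int chunkSize) {
        List<List<String>> written = new ArrayList<>();
        Function<String, String> processor = s -> s.toUpperCase();
        Consumer<List<String>> writer = chunk -> written.add(chunk);
        runStep(source.iterator(), processor, writer, chunkSize);
        return written;
    }

    public static void main(String[] args) {
        // prints [[A, B], [C, D], [E]]
        System.out.println(demo(List.of("a", "b", "c", "d", "e"), 2));
    }
}
```

With five items and a chunk size of 2, the writer is called three times: twice with a full chunk and once with the leftover item. The sketch also shows where the friction comes from when an API hands you everything in one call: the loop wants a reader that yields one item at a time, so the full response has to be wrapped in something like an iterator first.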