Scaling Spring Batch application with 30 Batch jobs #4700
abhinavsingh-ccm asked this question in Q&A
I’m working with a Spring Batch application that has around 30 jobs. Some jobs depend on others (e.g., Job B only runs after Job A completes), and all jobs run sequentially for roughly 500 accounts. To optimize, we’ve deployed this as a StatefulSet and assigned accounts to specific pods (although the distribution isn’t perfect).
For example:
Pod0 -> 50 accounts
Pod1 -> 50 accounts
...
Each job has to be executed for every account: Job A must run for all 50 accounts on Pod0, and likewise on each of the other pods.
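For concreteness, here is a minimal sketch of the kind of per-account loop described above; the class name, parameter names, and wiring are illustrative, not our actual code:

```java
import java.util.List;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Component;

// Illustrative sketch: each job runs once per account, sequentially,
// with the account id as an identifying job parameter.
@Component
public class AccountJobRunner {

    private final JobLauncher jobLauncher;
    private final Job jobA; // one of the ~30 jobs

    public AccountJobRunner(JobLauncher jobLauncher, @Qualifier("jobA") Job jobA) {
        this.jobLauncher = jobLauncher;
        this.jobA = jobA;
    }

    public void runForAccounts(List<String> accountIds) throws Exception {
        for (String accountId : accountIds) {
            JobParameters params = new JobParametersBuilder()
                    .addString("accountId", accountId)
                    .addLong("runTs", System.currentTimeMillis())
                    .toJobParameters();
            // Synchronous by default: run() blocks until the job finishes,
            // so one account with a huge data set delays all the accounts after it.
            jobLauncher.run(jobA, params);
        }
    }
}
```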
Challenges:
Some accounts have far more data than others, and the long-running jobs for those accounts consume most of a pod's resources, delaying subsequent scheduled jobs. For example, Pod0 has to execute Job-A for its 50 accounts; if one account has a huge amount of data to process, that single execution takes most of the time and delays the remaining 49 accounts.
What would be the best way to optimise this?
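One pod-local mitigation along these lines would be to launch the per-account executions asynchronously through a bounded thread pool, so a single heavy account no longer blocks the rest. A minimal sketch, assuming Spring Batch 5's TaskExecutorJobLauncher (SimpleJobLauncher on 4.x); bean names and pool sizes are illustrative:

```java
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.TaskExecutorJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class AsyncLauncherConfig {

    // Bounded pool: a few accounts in flight per pod, the rest wait in the queue.
    @Bean
    public TaskExecutor accountJobExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(4);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("account-job-");
        executor.initialize();
        return executor;
    }

    // Launching through this bean returns as soon as the job is queued,
    // so the loop over accounts is no longer serialized behind one slow account.
    @Bean
    public JobLauncher asyncJobLauncher(JobRepository jobRepository,
                                        TaskExecutor accountJobExecutor) throws Exception {
        TaskExecutorJobLauncher launcher = new TaskExecutorJobLauncher();
        launcher.setJobRepository(jobRepository);
        launcher.setTaskExecutor(accountJobExecutor);
        launcher.afterPropertiesSet();
        return launcher;
    }
}
```

Dependent jobs (Job B after Job A) would still have to wait per account, but executions for different accounts could overlap.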
We are also considering going stateless, so that any pod can pick up any job, to improve flexibility. But I’m unsure how to set up HPA effectively, especially which metrics to use to scale up or down based on job load: a pod's CPU and memory won't necessarily be high while one account's job takes a long time to process, yet that job still delays the execution of jobs for the other accounts.
I’d love any advice on:
Good metrics for HPA in this setup (one rough idea is sketched after the note below)
Ways to dynamically assign accounts across pods without impacting job dependencies
Note:
We are using an external Postgres database as the Spring Batch job repository (metadata tables).
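On the HPA metric question: since the job repository already lives in Postgres, one idea is to scale on backlog rather than CPU/memory, e.g. by exposing the number of starting/running job executions as a custom metric and feeding it to the HPA through a custom/external metrics adapter (Prometheus Adapter, KEDA, etc.). A minimal sketch of such a gauge, assuming Micrometer and a JdbcTemplate pointed at the metadata schema; the metric name and query are illustrative:

```java
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;

// Illustrative backlog metric read from the Spring Batch metadata tables.
// The idea: scale on "how many per-account executions are active/pending",
// not on CPU/memory, which can stay low while one big account grinds away.
@Component
public class BatchBacklogMetrics {

    public BatchBacklogMetrics(MeterRegistry registry, JdbcTemplate jdbcTemplate) {
        Gauge.builder("batch.jobs.active", () ->
                jdbcTemplate.queryForObject(
                        "SELECT COUNT(*) FROM BATCH_JOB_EXECUTION WHERE STATUS IN ('STARTING', 'STARTED')",
                        Long.class))
             .description("Job executions currently starting or running")
             .register(registry);
    }
}
```

Whether a backlog gauge like this is a sensible HPA signal here, versus something like per-account queue depth, is part of what I’m asking.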