@sbidoul sbidoul commented Jan 2, 2026

In this PR we cleanly separate job acquisition (i.e. verifying the job is in the expected state, marking it started and locking it) from job execution.

We also avoid trying to start a job that is already locked, by using SKIP LOCKED and exiting early. In such situations the job is likely already being handled by another worker, so there is no point trying to start it; we let it be handled either by that worker or by the dead job requeuer.
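To illustrate the acquisition step described above, here is a minimal sketch, not the code from this PR. The table name `queue_job`, the columns `uuid` and `state`, the function `acquire_job` and the `StubCursor` helper are all assumptions made for the example; the actual implementation may lock a different table or row.

```python
# Sketch only: table, column and function names are assumptions,
# not the actual code of this PR.
ACQUIRE_SQL = """
    SELECT uuid
      FROM queue_job
     WHERE uuid = %s
       AND state = 'enqueued'
       FOR UPDATE SKIP LOCKED
"""
MARK_STARTED_SQL = "UPDATE queue_job SET state = 'started' WHERE uuid = %s"


def acquire_job(cr, job_uuid):
    """Return True if this worker locked the job and marked it started."""
    cr.execute(ACQUIRE_SQL, (job_uuid,))
    if cr.fetchone() is None:
        # Either the job is not in the expected state, or its row is
        # locked by another transaction: exit early and leave the job
        # to that worker or to the dead job requeuer.
        return False
    cr.execute(MARK_STARTED_SQL, (job_uuid,))
    return True


class StubCursor:
    """In-memory stand-in for a DB-API cursor, for illustration only."""

    def __init__(self, rows):
        self.rows = rows        # rows the SELECT would return
        self.statements = []    # every statement executed, in order

    def execute(self, sql, params=()):
        self.statements.append((sql, params))

    def fetchone(self):
        return self.rows[0] if self.rows else None


free = StubCursor(rows=[("job-1",)])
busy = StubCursor(rows=[])  # SKIP LOCKED yields no row for a locked job
acquired_free = acquire_job(free, "job-1")
acquired_busy = acquire_job(busy, "job-1")
```

With a real PostgreSQL cursor, the row lock taken by `FOR UPDATE` is held until the surrounding transaction commits or rolls back, and `SKIP LOCKED` makes a competing `SELECT` return no row instead of waiting.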

Following-up on #859 (comment)

maybe fixes #858

@OCA-git-bot
Contributor

Hi @guewen,
some modules you are maintaining are being modified, check this out!

@sbidoul sbidoul force-pushed the 18.0-refactor-job-acquisition-sbi branch 2 times, most recently from 265b06e to 8d0b9c9 Compare January 2, 2026 12:56
@sbidoul sbidoul added this to the 18.0 milestone Jan 2, 2026
@sbidoul sbidoul changed the title [18.0] queue_job refactor job acquisition [18.0] queue_job: refactor job acquisition Jan 2, 2026
@sbidoul sbidoul force-pushed the 18.0-refactor-job-acquisition-sbi branch 2 times, most recently from eff0043 to d6c907a Compare January 2, 2026 17:51
@sbidoul sbidoul marked this pull request as ready for review January 2, 2026 17:55
In this commit we cleanly separate job acquisition (i.e. verifying the job is in the expected state, marking it started and locking it) from job execution.

We also avoid trying to start a job that is already locked, by using SKIP LOCKED
and exiting early. In such situations the job is likely already being handled by another worker, so there is no point trying to start it;
we let it be handled either by that worker or by the dead job requeuer.
@sbidoul sbidoul force-pushed the 18.0-refactor-job-acquisition-sbi branch from d6c907a to 367ab80 Compare January 2, 2026 18:00
" to make this tests work",
)

return job
Member Author

Moved to JobCommonCase

sbidoul commented Jan 2, 2026

This is ready to review.

It can be tested with a small number of workers, a root channel capacity greater than the number of workers, and test jobs of duration >= 10 sec.

If the test graph has enough parallelism, you will see warnings about dead jobs being requeued (the jobs in state enqueued that are not picked up by workers in due time), but each job should execute only once.

Example: odoo --workers=3, root:10 and http://odoo/queue_job/create_test_job?size=20&job_duration=10.
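The "each job should execute only once" expectation can be pictured with a small in-memory analogy. This is a sketch under stated assumptions, not Odoo or queue_job code: the `Job` class, `try_run` and `worker` are hypothetical names, a `threading.Lock` stands in for the database row lock, and a non-blocking acquire plays the role of SKIP LOCKED.

```python
import threading


class Job:
    """Hypothetical in-memory job; the lock stands in for a row lock."""

    def __init__(self, name):
        self.name = name
        self.state = "enqueued"
        self.lock = threading.Lock()
        self.runs = 0


def try_run(job):
    """Run the job only if it can be locked without waiting."""
    # Non-blocking acquire mimics SKIP LOCKED: never wait for a busy job.
    if not job.lock.acquire(blocking=False):
        return False
    try:
        if job.state != "enqueued":
            return False  # already handled by another worker
        job.state = "started"
        job.runs += 1     # stands in for the actual job execution
        job.state = "done"
        return True
    finally:
        job.lock.release()


def worker(jobs):
    for job in jobs:
        try_run(job)


# 3 workers competing for 20 jobs, mirroring the example above.
jobs = [Job(f"job-{i}") for i in range(20)]
threads = [threading.Thread(target=worker, args=(jobs,)) for _ in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
run_counts = [job.runs for job in jobs]
```

Whichever worker locks a job first finds it enqueued and runs it; any other worker either fails to take the lock or finds the state already changed, so every entry of `run_counts` ends up at 1.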

The _acquire_job method is now simple enough that it can be unit tested. Next I may try refactoring the perform job part, but it's a bit scarier :) But first, merging 19.0.

Development

Successfully merging this pull request may close these issues.

After requeueing a job, first run often fails to update date_done with concurrent error