flip_api.fl_services.run_jobs
Functions
_recover_stale_busy_schedulers(db) | Reset all FLScheduler rows stuck in BUSY to AVAILABLE.
run_jobs_core(db) | Core logic to run FL jobs, with stale-BUSY scheduler recovery.
run_jobs_scheduled_task() | Scheduled task to run jobs every minute.
Module Contents
- flip_api.fl_services.run_jobs._recover_stale_busy_schedulers(db: sqlmodel.Session) int
Reset all FLScheduler rows stuck in BUSY to AVAILABLE.
Uses a single atomic UPDATE statement. This avoids read-side races with check_for_queued_jobs (which uses with_for_update) and eliminates the N+1 query pattern of the previous row-by-row approach.
BUSY schedulers with no associated job, or whose job has been deleted, are unrecoverable unless cleaned up here. Resetting them prevents a single crash from permanently starving a net of new training jobs.
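The single-statement reset described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: it uses the standard-library sqlite3 module instead of a sqlmodel.Session, and the table name fl_scheduler, its columns, and the helper name recover_stale_busy are all assumptions made for the example.

```python
import sqlite3


def recover_stale_busy(conn: sqlite3.Connection) -> int:
    """Reset every BUSY row to AVAILABLE in one UPDATE and return the row count."""
    # One statement, so there is no SELECT-then-UPDATE window and no per-row query.
    cur = conn.execute(
        "UPDATE fl_scheduler SET status = 'AVAILABLE' WHERE status = 'BUSY'"
    )
    conn.commit()
    return cur.rowcount


# Hypothetical in-memory table standing in for the FLScheduler table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fl_scheduler (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO fl_scheduler (status) VALUES (?)",
    [("BUSY",), ("AVAILABLE",), ("BUSY",)],
)
conn.commit()

recovered = recover_stale_busy(conn)
print(f"Recovered {recovered} stale scheduler(s)")
```

Returning the affected-row count mirrors the documented int return type and gives the caller something to log when a recovery actually happened.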
- flip_api.fl_services.run_jobs.run_jobs_core(db: sqlmodel.Session) None
Core logic to run FL jobs, with stale-BUSY scheduler recovery.
Resets any FLScheduler rows stuck in BUSY status (e.g. from a crashed previous job run) before attempting to pick an available net.
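The recover-then-pick ordering can be illustrated with a short sketch. It is a simplification under stated assumptions: sqlite3 stands in for the real session, the fl_scheduler table and the function name run_jobs_core_sketch are hypothetical, and the real function's net selection and job start-up logic are not shown in this reference.

```python
import sqlite3
from typing import Optional


def run_jobs_core_sketch(conn: sqlite3.Connection) -> Optional[int]:
    """Recover stale BUSY rows first, then claim one available net (if any)."""
    # Step 1: recovery, so nets left BUSY by a crashed run become selectable again.
    conn.execute("UPDATE fl_scheduler SET status = 'AVAILABLE' WHERE status = 'BUSY'")
    # Step 2: pick the first available net.
    row = conn.execute(
        "SELECT id FROM fl_scheduler WHERE status = 'AVAILABLE' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        conn.commit()
        return None
    # Step 3: mark the chosen net BUSY for the job about to start.
    conn.execute("UPDATE fl_scheduler SET status = 'BUSY' WHERE id = ?", (row[0],))
    conn.commit()
    return row[0]


# Hypothetical state after a crash: every net is stuck in BUSY.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fl_scheduler (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
)
conn.executemany("INSERT INTO fl_scheduler (status) VALUES (?)", [("BUSY",), ("BUSY",)])
conn.commit()

picked = run_jobs_core_sketch(conn)
print(f"Picked net: {picked}")
```

Without the recovery step, the SELECT in this scenario would find no available net and no job could start.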
- flip_api.fl_services.run_jobs.run_jobs_scheduled_task() None
Scheduled task to run jobs every minute. This function is called by the scheduler.
- Raises:
HTTPException – If there is an error while running jobs.