flip_api.fl_services.run_jobs
=============================

.. py:module:: flip_api.fl_services.run_jobs


Functions
---------

.. autoapisummary::

   flip_api.fl_services.run_jobs._recover_stale_busy_schedulers
   flip_api.fl_services.run_jobs.run_jobs_core
   flip_api.fl_services.run_jobs.run_jobs_scheduled_task


Module Contents
---------------

.. py:function:: _recover_stale_busy_schedulers(db: sqlmodel.Session) -> int

   Reset all FLScheduler rows stuck in BUSY to AVAILABLE.

   Uses a single atomic UPDATE statement to avoid read-side races with
   check_for_queued_jobs (which uses with_for_update) and eliminates the N+1
   query pattern of the previous row-by-row approach.

   BUSY schedulers with no associated job, or whose job has been deleted,
   are unrecoverable unless cleaned up here. This prevents a single crash from
   permanently starving a net of new training jobs.


.. py:function:: run_jobs_core(db: sqlmodel.Session) -> None

   Core logic to run FL jobs, with stale-BUSY scheduler recovery.

   Resets any FLScheduler rows stuck in BUSY status (e.g. from a crashed
   previous job run) before attempting to pick an available net.


.. py:function:: run_jobs_scheduled_task() -> None

   Scheduled task to run jobs every minute.
   This function is called by the scheduler.

   :raises HTTPException: If there is an error while running jobs.
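
The single-statement recovery described for ``_recover_stale_busy_schedulers`` can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's implementation: it uses the standard-library ``sqlite3`` module in place of a ``sqlmodel.Session``, and the table name ``fl_scheduler``, the ``status`` column, and the helper names ``recover_stale_busy_schedulers`` and ``demo`` are hypothetical.

.. code-block:: python

   import sqlite3


   def recover_stale_busy_schedulers(conn: sqlite3.Connection) -> int:
       """Reset every scheduler row in BUSY to AVAILABLE with one UPDATE.

       Returns the number of rows that were reset.
       """
       # One statement and no prior SELECT: the database applies the change
       # atomically, and there is no per-row query loop (no N+1 pattern).
       # Table and column names are assumptions made for this sketch.
       cursor = conn.execute(
           "UPDATE fl_scheduler SET status = 'AVAILABLE' WHERE status = 'BUSY'"
       )
       conn.commit()
       return cursor.rowcount


   def demo() -> tuple[int, list[tuple[int, str]]]:
       """Build a throwaway in-memory table, run the recovery, return the result."""
       conn = sqlite3.connect(":memory:")
       conn.execute(
           "CREATE TABLE fl_scheduler (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
       )
       conn.executemany(
           "INSERT INTO fl_scheduler (id, status) VALUES (?, ?)",
           [(1, "BUSY"), (2, "AVAILABLE"), (3, "BUSY")],
       )
       recovered = recover_stale_busy_schedulers(conn)
       rows = conn.execute(
           "SELECT id, status FROM fl_scheduler ORDER BY id"
       ).fetchall()
       conn.close()
       return recovered, rows

Calling ``demo()`` resets the two BUSY rows and leaves the AVAILABLE row unchanged. Returning the affected-row count mirrors the ``int`` return type documented above; whether the real function obtains that count the same way is an assumption.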