flip_api.fl_services.services.fl_scheduler_service
==================================================

.. py:module:: flip_api.fl_services.services.fl_scheduler_service


Functions
---------

.. autoapisummary::

   flip_api.fl_services.services.fl_scheduler_service.remove_job
   flip_api.fl_services.services.fl_scheduler_service.remove_job_from_queue
   flip_api.fl_services.services.fl_scheduler_service.revert_scheduler_pickup
   flip_api.fl_services.services.fl_scheduler_service.get_net_by_model_id
   flip_api.fl_services.services.fl_scheduler_service.get_net_by_name
   flip_api.fl_services.services.fl_scheduler_service.get_slot_names_by_trust_ids
   flip_api.fl_services.services.fl_scheduler_service.get_nets
   flip_api.fl_services.services.fl_scheduler_service.resolve_backend
   flip_api.fl_services.services.fl_scheduler_service.check_for_available_net
   flip_api.fl_services.services.fl_scheduler_service.check_for_queued_jobs
   flip_api.fl_services.services.fl_scheduler_service.prepare_and_start_training
   flip_api.fl_services.services.fl_scheduler_service.get_required_training_details
   flip_api.fl_services.services.fl_scheduler_service.update_fl_scheduler


Module Contents
---------------

.. py:function:: remove_job(job_id: uuid.UUID, session: sqlmodel.Session) -> None

   Sets the job status to DELETED and clears the started timestamp.

   :param job_id: The ID of the job to remove.
   :type job_id: UUID
   :param session: SQLModel session.
   :type session: Session

   :returns: None

   :raises flip_api.utils.exceptions.NotFoundError: If no ``FLJob`` exists with the given ID.
   :raises DatabaseError: If the update fails at the DB layer.


.. py:function:: remove_job_from_queue(model_id: uuid.UUID, session: sqlmodel.Session) -> None

   Sets the job status to DELETED for all jobs associated with the given model ID.

   :param model_id: The model ID whose jobs are to be removed.
   :type model_id: UUID
   :param session: SQLModel session.
   :type session: Session

   :returns: None

   :raises DatabaseError: If the update fails at the DB layer.


.. py:function:: revert_scheduler_pickup(scheduler_id: uuid.UUID, session: sqlmodel.Session) -> None

   Sets the scheduler status to AVAILABLE and clears the job_id.

   :param scheduler_id: The ID of the scheduler to revert.
   :type scheduler_id: UUID
   :param session: SQLModel session.
   :type session: Session

   :returns: None

   :raises flip_api.utils.exceptions.NotFoundError: If no ``FLScheduler`` exists with the given ID.
   :raises DatabaseError: If the update fails at the DB layer.


.. py:function:: get_net_by_model_id(model_id: uuid.UUID, session: sqlmodel.Session) -> flip_api.domain.interfaces.fl.INetDetails

   Get information for a net by model ID.

   :param model_id: The model ID.
   :type model_id: UUID
   :param session: SQLModel session.
   :type session: Session

   :returns: Details of the net.
   :rtype: INetDetails

   :raises flip_api.utils.exceptions.NotFoundError: If no net is associated with the given ``model_id``.
   :raises DatabaseError: If the query fails at the DB layer.


.. py:function:: get_net_by_name(name: str, session: sqlmodel.Session) -> flip_api.domain.interfaces.fl.INetDetails | None

   Get information for a net by name.

   :param name: Name of the net.
   :type name: str
   :param session: SQLModel session.
   :type session: Session

   :returns: Details of the net, or None if not found.
   :rtype: INetDetails | None

   :raises DatabaseError: If the query fails at the DB layer.


.. py:function:: get_slot_names_by_trust_ids(trust_ids: list[uuid.UUID], session: sqlmodel.Session) -> dict[uuid.UUID, str]

   Map each trust id to the ``slot_name`` of its bound FL kit slot.

   The FL protocol identifies participants by the slot identity (the CN baked
   into the kit's cert), which is independent of the trust's display name on
   the hub. Callers that need to talk to or compare against an FL participant
   must look up the slot name rather than using ``Trust.name``: admin-chosen
   display names can change without rotating the kit, and don't carry into the
   FL protocol.

   :param trust_ids: Trust ids to resolve. Empty input → empty mapping.
   :type trust_ids: list[UUID]
   :param session: SQLModel session.
   :type session: Session

   :returns: ``trust_id → slot_name``. Trusts without an assigned slot are
             absent from the result; callers should treat a miss as
             "no FL identity yet".
   :rtype: dict[UUID, str]


.. py:function:: get_nets(session: sqlmodel.Session) -> list[flip_api.domain.interfaces.fl.INetDetails]

   Fetches all nets from the database.

   :param session: The database session.
   :type session: Session

   :returns: A list of all nets.
   :rtype: list[INetDetails]

   :raises flip_api.utils.exceptions.NotFoundError: If no nets are registered in the database.
   :raises DatabaseError: If the query fails at the DB layer.


.. py:function:: resolve_backend(session: sqlmodel.Session, net: flip_api.domain.interfaces.fl.INetDetails | None = None) -> flip_api.domain.schemas.types.FLBackend

   Resolve the active FL backend at runtime from the nets (never from a static env var).

   Every net carries a non-null ``fl_backend`` set at seed time from
   ``FL_BACKEND``. That seeded value is canonical (there is no runtime
   reconciliation), so resolution always reads the DB.

   :param session: SQLModel session (used when no net is given).
   :type session: Session
   :param net: When given, use this net's backend (the job is already pinned
               to it). When ``None`` (e.g. at model creation, before
               scheduling), use any net's backend, since all nets run the same
               backend in single-backend mode.
   :type net: INetDetails | None

   :returns: The resolved backend (``nvflare`` or ``flower``).
   :rtype: FLBackend

   :raises ValueError: If no FL nets are registered at all (empty ``NET_ENDPOINTS`` / misconfig).


.. py:function:: check_for_available_net(session: sqlmodel.Session) -> flip_api.domain.interfaces.fl.ISchedulerResponse | None

   Checks for any available nets and marks one as busy if found.

   :param session: The database session.
   :type session: Session

   :returns: The scheduler response if an available net is found, otherwise None.
   :rtype: ISchedulerResponse | None

   :raises DatabaseError: If the update fails at the DB layer.


.. py:function:: check_for_queued_jobs(scheduler_id: uuid.UUID, session: sqlmodel.Session) -> flip_api.domain.interfaces.fl.IJobResponse | None

   Checks for any queued jobs for a given scheduler.

   :param scheduler_id: The ID of the scheduler to check.
   :type scheduler_id: UUID
   :param session: The database session.
   :type session: Session

   :returns: The job response if a queued job is found, otherwise None.
   :rtype: IJobResponse | None

   :raises flip_api.utils.exceptions.NotFoundError: If the scheduler referenced by ``scheduler_id`` cannot be found.
   :raises DatabaseError: If the query or update fails at the DB layer.
   :raises Exception: If the job references invalid trusts.


.. py:function:: prepare_and_start_training(model_id: uuid.UUID, fl_job_id: uuid.UUID, trust_ids: list[uuid.UUID], session: sqlmodel.Session) -> None

   Prepares and starts the training process for a given model.

   :param model_id: The ID of the model to train.
   :type model_id: UUID
   :param fl_job_id: The ID of the federated learning job.
   :type fl_job_id: UUID
   :param trust_ids: The trust ids participating in the training. Names are
                     looked up from the ``trust`` table here, at the FL backend
                     boundary, which is the only place the FL protocol's
                     name-based addressing matters.
   :type trust_ids: list[UUID]
   :param session: The database session.
   :type session: Session

   :returns: None

   :raises Exception: If the FL backend is unsupported, the net endpoint
                      cannot be resolved, client availability validation
                      fails, or training fails to start. On failure the job is
                      removed, the model is marked as errored, and the
                      original exception is re-raised.


.. py:function:: get_required_training_details(model_id: uuid.UUID, session: sqlmodel.Session) -> flip_api.domain.interfaces.fl.IRequiredTrainingInformation

   Fetches the necessary details for training a model, including the project ID and the latest cohort query.

   :param model_id: The ID of the model to train.
   :type model_id: UUID
   :param session: The database session.
   :type session: Session

   :returns: The required training information.
   :rtype: IRequiredTrainingInformation

   :raises flip_api.utils.exceptions.NotFoundError: If the model or its associated cohort query cannot be found.
   :raises DatabaseError: If the query fails at the DB layer.


.. py:function:: update_fl_scheduler(model_id: uuid.UUID, session: sqlmodel.Session) -> None

   Updates the FL job status to COMPLETED and sets the associated scheduler status to AVAILABLE.

   :param model_id: The ID of the model to update.
   :type model_id: UUID
   :param session: The database session.
   :type session: Session

   :returns: None

   :raises DatabaseError: If the update fails at the DB layer.
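
The contract documented for ``get_slot_names_by_trust_ids`` (trusts without an assigned slot are simply absent from the mapping) means callers have to handle misses themselves. The following is a minimal sketch of one way a caller might do that. It is not taken from the module: ``split_by_fl_identity`` is a hypothetical helper, the slot name ``"site-1"`` is invented, and a plain ``dict`` stands in for the real lookup result so the example runs without a database session.

```python
from uuid import UUID, uuid4


def split_by_fl_identity(
    trust_ids: list[UUID], slot_names: dict[UUID, str]
) -> tuple[dict[UUID, str], list[UUID]]:
    """Partition trust ids into those with a slot name and those without.

    ``slot_names`` stands in for the mapping returned by
    ``get_slot_names_by_trust_ids``; this helper itself is hypothetical.
    """
    resolved: dict[UUID, str] = {}
    missing: list[UUID] = []
    for trust_id in trust_ids:
        slot_name = slot_names.get(trust_id)
        if slot_name is None:
            # A miss means "no FL identity yet", not a failed lookup.
            missing.append(trust_id)
        else:
            resolved[trust_id] = slot_name
    return resolved, missing


# Stand-in data: two trusts, only one of them bound to a kit slot.
trust_a, trust_b = uuid4(), uuid4()
lookup_result = {trust_a: "site-1"}  # hypothetical slot name

resolved, missing = split_by_fl_identity([trust_a, trust_b], lookup_result)
```

Keeping the unresolved ids in a separate list lets the caller decide whether a trust with no FL identity should block the job or just be skipped; the source does not say which policy the scheduler applies.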