Logging Stack
FLIP uses a structured logging stack at each Trust site to collect, store and visualise application logs. The stack consists of three layers:
log_config – a shared Python library that emits structured JSON logs.
Grafana Alloy + Loki – Docker-native log collection and storage with 30-day retention.
Grafana – a web dashboard for querying and visualising logs.
Architecture
┌─────────────┐   ┌──────────────────┐   ┌─────────────┐
│  trust-api  │   │ data-access-api  │   │ imaging-api │
│ (JSON logs  │   │   (JSON logs     │   │ (JSON logs  │
│  to stdout) │   │    to stdout)    │   │  to stdout) │
└──────┬──────┘   └────────┬─────────┘   └──────┬──────┘
       │                   │                    │
       └───────────────────┼────────────────────┘
                           ▼
                   ┌───────────────┐
                   │ Grafana Alloy │  Scrapes Docker logs via socket
                   │ (port 12345)  │  Parses JSON, extracts labels
                   └───────┬───────┘
                           ▼
                   ┌───────────────┐
                   │     Loki      │  Stores logs with labels
                   │  (port 3100)  │  30-day retention
                   └───────┬───────┘
                           ▼
                   ┌───────────────┐
                   │    Grafana    │  Query & visualise
                   │  (port 3000)  │
                   └───────────────┘
Each FLIP API service writes single-line JSON to stdout. Docker captures this output. Grafana Alloy discovers containers via the Docker socket, parses the JSON and forwards the logs to Loki. Grafana queries Loki through a pre-provisioned datasource.
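To make the "single-line JSON to stdout" contract concrete, here is a minimal sketch using only the Python standard library. It is not the actual log_config implementation: the class and function names, and the `timestamp` and `message` keys, are assumptions made for illustration. The `level`, `api`, `event` and `request_id` keys follow the fields this page describes.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object on a single line (hypothetical helper)."""

    def __init__(self, api: str) -> None:
        super().__init__()
        self.api = api

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # "timestamp" and "message" are assumed key names, not taken from log_config.
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "api": self.api,
            # event and request_id are passed through logging's `extra` mechanism.
            "event": getattr(record, "event", None),
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        # json.dumps escapes embedded newlines, so the output stays on one line.
        return json.dumps(payload)


def build_logger(api: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines for the given API name to `stream`."""
    logger = logging.getLogger(api)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter(api))
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("trust-api")
    log.error(
        "Training run aborted",
        extra={"event": "training.failed", "request_id": "example-request-id"},
    )
```

Writing to stdout rather than to a file keeps the service unaware of the collector: Docker captures the stream and Alloy picks it up from there.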
Application Logging
Infrastructure Components
Grafana Alloy
Grafana Alloy discovers containers via the Docker socket and scrapes their
stdout logs. Configuration is at trust/observability/alloy/config.alloy (River syntax).
Alloy replaces the now end-of-life Promtail collector.
Key behaviours:
Discovers containers every 5 seconds via discovery.docker.
Extracts Docker labels as log labels: container, service, project (via discovery.relabel).
Parses JSON log lines and promotes level, api and event to Loki labels for efficient querying (via loki.process with stage.json and stage.labels).
request_id is extracted from the JSON but not promoted to a label.
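The effect of the JSON parsing and label promotion described above can be illustrated in Python. This is only a simulation of the outcome, not Alloy configuration and not how Alloy is implemented; the function name `split_labels` is hypothetical.

```python
import json

# Fields that this page says are promoted to Loki labels.
PROMOTED_FIELDS = ("level", "api", "event")


def split_labels(line: str) -> tuple[dict, dict]:
    """Return (labels, extracted) for one log line, mimicking the pipeline's effect."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        # A non-JSON line yields no labels in this sketch.
        return {}, {}
    if not isinstance(record, dict):
        return {}, {}

    labels = {k: str(record[k]) for k in PROMOTED_FIELDS if record.get(k) is not None}

    # request_id is extracted but deliberately kept out of the label set.
    extracted = {}
    if record.get("request_id") is not None:
        extracted["request_id"] = str(record["request_id"])
    return labels, extracted


if __name__ == "__main__":
    sample = (
        '{"level": "ERROR", "api": "trust-api", '
        '"event": "training.failed", "request_id": "example-request-id"}'
    )
    print(split_labels(sample))
```

Keeping request_id out of the labels is consistent with Loki's general guidance to avoid high-cardinality labels: a value that is unique per request would create a new stream for almost every log line.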
Loki
Loki is the log storage backend. Configuration is at
trust/observability/loki/loki-config.yml.
Key settings:
| Setting | Value |
|---|---|
| Retention period | 720 hours (30 days) |
| Schema version | v13 (TSDB) |
| Storage backend | Local filesystem |
| Index rotation | 24 hours |
| Compaction interval | 10 minutes |
Grafana
Grafana provides the web UI for log exploration. It is pre-provisioned with a Loki datasource and a Trust APIs dashboard so no manual configuration is required on first start.
Provisioning files are located at trust/observability/grafana/provisioning/:
datasources/loki.yml – Loki datasource (uid: loki)
dashboards/dashboards.yml – dashboard provider configuration
dashboards/trust-apis.json – Trust APIs overview dashboard
Default credentials and port:
URL: http://<trust-host>:3000
Admin password: set via the GRAFANA_ADMIN_PASSWORD environment variable
Trust APIs dashboard
The provisioned Trust APIs dashboard (under the Observability folder) provides an overview of all three trust API services. It includes:
Stat panels – request rate, error count, p95 latency, active APIs
Time series – request rate and error rate by API over time
p95 request duration – latency trends by API
Status code distribution – breakdown of HTTP response codes
Slowest requests – table of completed requests sorted by duration
Recent errors – filtered log view of ERROR-level entries
All logs – full log stream with label filtering
An API dropdown at the top of the dashboard allows filtering by
trust-api, data-access-api, imaging-api, or all three.
Configuration
Environment variables
The following environment variables control the logging stack. Set them in the
appropriate .env.* file or pass them directly in the Docker Compose
override.
| Variable | Service | Description |
|---|---|---|
| LOG_LEVEL | All APIs (mapped to …) | Sets the Python log level uniformly across all trust services. The Pydantic … |
| … | Grafana | Host port for the Grafana UI (default …) |
| GRAFANA_ADMIN_PASSWORD | Grafana | Admin password for Grafana |
| … | Loki | Host port for the Loki API (default …) |
Docker Compose services
The logging infrastructure is defined in the trust-level Docker Compose files:
trust/deploy/compose_trust.development.yml – development overrides with configurable ports
trust/deploy/compose_trust.production.yml – production settings with persistent volumes and automatic restart
Three services are added:
loki (grafana/loki:3.4.0) – log storage
alloy (grafana/alloy:v1.9.0) – log collector (depends on loki)
grafana (grafana/grafana:11.5.0) – dashboard (depends on loki)
In production, persistent volumes are mounted at:
/opt/flip/volumes/loki – Loki data
/opt/flip/volumes/grafana – Grafana data and configuration
Operations & Querying Logs
Accessing Grafana
Open http://<trust-host>:3000 in a browser.
Log in with the admin credentials.
Open the Trust APIs dashboard from the Observability folder for an overview of all API services, or navigate to Explore and select the Loki datasource for ad-hoc queries.
Example LogQL queries
All logs from a specific API:
{api="trust-api"}
Errors only:
{level="ERROR"}
Logs for a specific event:
{event="training.failed"}
Full-text search within a service:
{api="data-access-api"} |= "timeout"
Correlate logs by request ID:
{api=~"trust-api|data-access-api|imaging-api"} |= "d4e5f6a7-..."
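The same LogQL expressions can be sent to Loki directly, which may be handy for scripted checks. The sketch below assumes Loki is reachable on port 3100 as described on this page and uses Loki's `/loki/api/v1/query_range` HTTP endpoint; the function names are hypothetical. With no `start` or `end` parameters, Loki's default time range applies.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, logql: str, limit: int = 100) -> str:
    """Build a query_range URL for a LogQL expression."""
    params = urllib.parse.urlencode({"query": logql, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


def extract_lines(body: dict) -> list[str]:
    """Flatten a streams-type query response into a list of raw log lines."""
    lines = []
    for stream in body["data"]["result"]:
        for entry in stream["values"]:
            # Each entry starts with [timestamp_ns, line].
            lines.append(entry[1])
    return lines


def fetch_log_lines(base_url: str, logql: str, limit: int = 100) -> list[str]:
    """Run the query against a live Loki instance and return the matching lines."""
    url = build_query_url(base_url, logql, limit)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_lines(json.load(resp))


if __name__ == "__main__":
    # Assumes Loki is listening on localhost:3100.
    for line in fetch_log_lines("http://localhost:3100", '{api="data-access-api"} |= "timeout"'):
        print(line)
```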
Troubleshooting
Logs not appearing in Grafana
Check that the API containers are running: docker compose ps.
Check Alloy can reach Loki: docker compose logs alloy.
Verify Alloy has access to the Docker socket (/var/run/docker.sock must be mounted).
Check Loki is healthy: curl http://localhost:3100/ready.
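The Loki health check in the last step can also be scripted, for example as part of a deployment smoke test. This is a small sketch using only the standard library; the function name and the injectable `opener` parameter are illustrative choices, and the base URL assumes the default port shown on this page.

```python
import urllib.error
import urllib.request


def loki_is_ready(base_url: str = "http://localhost:3100", opener=urllib.request.urlopen) -> bool:
    """Return True when GET <base_url>/ready answers with HTTP 200."""
    try:
        with opener(f"{base_url.rstrip('/')}/ready", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised by urlopen.
        return False


if __name__ == "__main__":
    print("Loki ready" if loki_is_ready() else "Loki not ready")
```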
High disk usage from Loki
Loki retains logs for 30 days. If disk space is a concern:
Reduce retention_period in trust/observability/loki/loki-config.yml.
Check that the compactor is running (compaction_interval: 10m).
Monitor the /opt/flip/volumes/loki directory size.
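For the monitoring step, a short script can report how much space the Loki volume occupies. The following is a sketch under the assumption that it runs on the Trust host with read access to the production volume path given above; the function name is hypothetical.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file may have been removed (for example by compaction) mid-walk.
                continue
    return total


if __name__ == "__main__":
    size = directory_size_bytes("/opt/flip/volumes/loki")
    print(f"Loki volume: {size / 1024**3:.2f} GiB")
```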
Changing log level at runtime
LOG_LEVEL is read at service startup. To change the level, update the
environment variable and restart the affected container:
docker compose restart trust-api
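The need for a restart follows from the level being resolved once, when the process starts. The sketch below illustrates that pattern with the standard library only; it is not the services' actual settings code (which this page indicates is Pydantic-based), and the `INFO` fallback and the function name are assumptions.

```python
import logging
import os


def resolve_log_level(environ=os.environ, default: str = "INFO") -> int:
    """Translate LOG_LEVEL into a numeric logging level, falling back to INFO."""
    name = environ.get("LOG_LEVEL", default).upper()
    level = logging.getLevelName(name)
    # getLevelName returns a string such as "Level FOO" for unknown names.
    return level if isinstance(level, int) else logging.INFO


# Resolved once at import time: later changes to the environment of a
# running process are never re-read, hence the container restart.
STARTUP_LEVEL = resolve_log_level()
logging.getLogger().setLevel(STARTUP_LEVEL)
```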