Every mutation route that returned an untyped dict now declares
response_model at the decorator. MessageResponse covers the eight
{"message": ...} envelopes (change-password, mutate-decky, mutate-
interval, update-deployment-limit, update-global-mutation-interval,
delete-user, update-user-role, reset-user-password). Purpose-built
models cover the richer shapes (DeployResponse for /deckies/deploy,
PurgeResponse for /config/reinit, ReapReportResponse for /reap-orphans,
UserResponse for /config/users). 204-No-Content and Response/
ORJSONResponse routes stay as-is.
The wire shape for clients is unchanged — the envelopes already only
shipped a message field. What changes is that a handler which
accidentally returns a richer dict (e.g. a full user row including
password_hash) would be silently stripped to the declared fields at
serialization time.
Also flips F4/D "expensive LIKE" to accepted (new DA-09) — the /logs
and /attackers search routes LIKE-scan unbounded columns, but both are
admin-gated, limit-capped, and operator rate-limit scope per DA-04.
FTS5 stays a performance TODO, not a security blocker.
Adds slowapi two-bucket rate limit on /auth/login — 10 attempts per
5 minutes per-IP AND per-username, tripping either → 429. Per-IP
catches botnets hitting one account; per-username catches distributed
credential stuffing against one account. In-memory storage: dashboard
API is single-process, Redis is disproportionate for v1.
X-Forwarded-For is deliberately NOT trusted (spoofable); reverse-proxy
deployments get one shared bucket per proxy IP. Logged in the threat
model as accepted risk DA-08, to be revisited when a verified-proxy
config lands.
Also scaffolds development/THREAT_MODEL.md with STRIDE-per-element
methodology, system-context DFD, and Dashboard↔API as the first fully
worked component (7 sub-flows, ~50 threat entries). F1 Authn ships
with 3 threats mitigated: rate limit (new), uniform 401 (verified
already in place), bcrypt length clamp (verified already in place via
Pydantic max_length=72).
Locust @task(2) hammers /auth/login in steady state on top of the
on_start burst. After caching the uuid-keyed user lookup and every
other read endpoint, login alone accounted for 47% of total
_execute at 500c/u — pure DB queueing on SELECT users WHERE
username=?.
5s TTL, positive hits only (misses bypass so a freshly-created
user can log in immediately). Password verify still runs against
the cached hash, so security is unchanged — the only staleness
window is: a changed password accepts the old password for up to
5s until invalidate_user_cache fires (it's called on every write).
The per-request SELECT users WHERE uuid=? in require_role was the
hidden tax behind every authed endpoint — it kept _execute at ~60%
across the profile even after the page caches landed. Even /health
(with its DB and Docker probes cached) was still 52% _execute from
this one query.
- dependencies.py: 10s TTL cache on get_user_by_uuid, well below JWT
expiry. invalidate_user_cache(uuid) is called on password change,
role change, and user delete.
- api_get_config.py: 5s TTL cache on the admin branch's list_users()
(previously fetched every /config call). Invalidated on user
create/update/delete.
- api_change_pass.py + api_manage_users.py: invalidation hooks on
all user-mutating endpoints.
verify_password / get_password_hash are CPU-bound and take ~250ms each
at rounds=12. Called directly from async endpoints, they stall every
other coroutine for that window — the single biggest single-worker
bottleneck on the login path.
Adds averify_password / ahash_password that wrap the sync versions in
asyncio.to_thread. Sync versions stay put because _ensure_admin_user and
tests still use them.
5 call sites updated: login, change-password, create-user, reset-password.
tests/test_auth_async.py asserts parallel averify runs concurrently (~1x
of a single verify, not 2x).
Extends tracing to every remaining module: all 23 API route handlers,
correlation engine, sniffer (fingerprint/p0f/syslog), prober (jarm/hassh/tcpfp),
profiler behavioral analysis, logging subsystem, engine, and mutator.
Bridges the ingester→SSE trace gap by persisting trace_id/span_id columns on
the logs table and creating OTEL span links in the SSE endpoint. Adds log-trace
correlation via _TraceContextFilter injecting otel_trace_id into Python LogRecords.
Includes development/docs/TRACING.md with full span reference (76 spans),
pipeline propagation architecture, quick start guide, and troubleshooting.