feat(workers): bus-backed Workers panel (registry, control, installed flag)

Ships the backend half of Config → Workers:

* Worker registry aggregates `system.*.health` + `system.bus.health`
  heartbeats into a last-seen dict; OK / STALE / UNKNOWN tiers are
  derived from a 90s window (3× the 30s heartbeat interval).
* `GET /api/v1/workers` returns the snapshot plus `bus_connected`
  (so the UI can explain "all UNKNOWN" when the bus socket is down)
  and a per-row `installed` flag populated from
  `systemctl list-unit-files decnet-*.service` (cached for 30s).
* `POST /api/v1/workers/{name}/stop` publishes a stop intent on
  `system.<name>.control`; workers listen via the shared control
  listener in `bus/publish.py`.
* Heartbeat + control listener wired into the collector / profiler /
  sniffer / prober / mutator worker loops. The API self-heartbeats too,
  so the panel always has one ground-truth row.
* Topic helper `system_control(name)` + tests covering builder
  validation, the control listener shutdown path, and the API surface
  (auth gating, bus-connected field, unknown-name 404).
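
The tiering described in the first bullet can be sketched roughly as follows. This is an illustration only, not the registry shipped in this commit: `LastSeenRegistry`, `record`, and `status` are hypothetical names, and the exact rule (OK within 90s of the last heartbeat, STALE after that, UNKNOWN if never seen) is an assumption read off the commit message.

```python
import time
from typing import Optional

HEARTBEAT_INTERVAL_S = 30.0
# 90s window from the commit message: 3x the heartbeat interval.
STALE_AFTER_S = 3 * HEARTBEAT_INTERVAL_S


class LastSeenRegistry:
    """Hypothetical stand-in for the worker registry's last-seen dict."""

    def __init__(self) -> None:
        self._last_seen: dict[str, float] = {}

    def record(self, worker: str, now: Optional[float] = None) -> None:
        # Called once per received heartbeat; `now` is injectable for tests.
        self._last_seen[worker] = time.monotonic() if now is None else now

    def status(self, worker: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        seen = self._last_seen.get(worker)
        if seen is None:
            return "UNKNOWN"  # assumed meaning: no heartbeat ever observed
        return "OK" if now - seen <= STALE_AFTER_S else "STALE"
```

Under these assumptions, a bus that is down would leave every row UNKNOWN, which is the case the `bus_connected` field lets the UI explain.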

Adds `StartFailure` / `StartAllResponse` models in anticipation of
the upcoming start endpoints (DEBT-034).
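
For orientation, a minimal sketch of what a control-topic builder like `system_control(name)` might look like. The `system.<name>.control` format comes from the commit message; the validation rule (lowercase identifier, no dots) is an assumption, since the actual builder is not part of the diff shown here.

```python
import re

# Assumed naming rule; the real builder's validation is not shown in this commit.
_WORKER_NAME = re.compile(r"[a-z][a-z0-9_]*")


def system_control(name: str) -> str:
    """Build the control topic for a worker, rejecting malformed names."""
    if not _WORKER_NAME.fullmatch(name):
        # A dot in the name would add an extra topic segment, so refuse it.
        raise ValueError(f"invalid worker name: {name!r}")
    return f"system.{name}.control"
```

A stop request for the sniffer would then be published on `system_control("sniffer")`, i.e. `system.sniffer.control`.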
@@ -30,6 +30,7 @@ ingestion_task: Optional[asyncio.Task[Any]] = None
 collector_task: Optional[asyncio.Task[Any]] = None
 attacker_task: Optional[asyncio.Task[Any]] = None
 sniffer_task: Optional[asyncio.Task[Any]] = None
+heartbeat_task: Optional[asyncio.Task[Any]] = None


 def get_background_tasks() -> dict[str, Optional[asyncio.Task[Any]]]:
@@ -45,6 +46,7 @@ def get_background_tasks() -> dict[str, Optional[asyncio.Task[Any]]]:
 @asynccontextmanager
 async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
     global ingestion_task, collector_task, attacker_task, sniffer_task
+    global heartbeat_task

     import resource
     soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
@@ -115,10 +117,33 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
     else:
         log.info("Contract Test Mode: skipping background worker startup")

+    # Worker registry + API self-heartbeat — always on, even under
+    # contract-test mode, so the Workers panel can render the process
+    # without the dev needing to run a full stack. A missing bus turns
+    # both into no-ops inside the helpers.
+    try:
+        from decnet.bus.app import get_app_bus
+        from decnet.bus.publish import run_health_heartbeat
+        from decnet.web.worker_registry import get_registry
+
+        _bus = await get_app_bus()
+        await get_registry().start(_bus)
+        if heartbeat_task is None or heartbeat_task.done():
+            heartbeat_task = asyncio.create_task(
+                run_health_heartbeat(_bus, "api"),
+            )
+    except Exception as exc:  # noqa: BLE001
+        log.warning("worker registry bootstrap failed: %s", exc)
+
     yield

     log.info("API shutdown cancelling background tasks")
-    for task in (ingestion_task, collector_task, attacker_task, sniffer_task):
+    try:
+        from decnet.web.worker_registry import get_registry
+        await get_registry().stop()
+    except Exception as exc:  # noqa: BLE001
+        log.warning("worker registry stop raised: %s", exc)
+    for task in (ingestion_task, collector_task, attacker_task, sniffer_task, heartbeat_task):
         if task and not task.done():
             task.cancel()
             try:
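
The start/cancel lifecycle that the diff applies to `heartbeat_task` can be reproduced in isolation. The sketch below is not `run_health_heartbeat` itself (its body is not in this diff): `heartbeat_loop`, `fake_publish`, and `demo` are hypothetical, the heartbeat payload is omitted, and the topic simply follows the `system.<name>.health` pattern from the commit message.

```python
import asyncio
from typing import Awaitable, Callable


async def heartbeat_loop(
    publish: Callable[[str], Awaitable[None]],
    name: str,
    interval_s: float = 30.0,
) -> None:
    # Publish immediately, then once per interval, until cancelled.
    while True:
        await publish(f"system.{name}.health")
        await asyncio.sleep(interval_s)


async def demo() -> list[str]:
    sent: list[str] = []

    async def fake_publish(topic: str) -> None:
        sent.append(topic)

    # Startup: same create_task pattern as in lifespan().
    task = asyncio.create_task(heartbeat_loop(fake_publish, "api", interval_s=0.01))
    await asyncio.sleep(0.05)

    # Shutdown: cancel, then await so the cancellation is actually delivered.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent
```

Awaiting the task after `cancel()` matters: without it the loop may still be mid-publish when the event loop shuts down.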