feat(agent/collector): topology-label discovery and master-authoritative supersede

Legacy fleet deckies live in decnet-state.json; MazeNET topology
containers don't. Tag them at compose-time with
decnet.topology.service=true and let the collector match on that label.
Spin up the agent's log collector on the first successful /topology/apply
(not in the lifespan — that would break the no-docker-on-boot invariant)
and tear it down with the app. Land log lines in DECNET_AGENT_LOG_FILE,
separate from master-side DECNET_INGEST_LOG_FILE, so a dev box running
both roles can't forward its own ingest back to itself.

When master pushes a topology that differs from whatever is pinned
locally, teardown the predecessor and accept the new one. Refusing with
409 left the agent stranded after partial deploys. record_error now
persists the hydrated blob so a later teardown can still walk the LAN
list — otherwise a half-failed apply strands containers + bridges with
no breadcrumb back to them.
This commit is contained in:
2026-04-21 10:23:10 -04:00
parent 050607e00d
commit 0cdcfe2653
8 changed files with 221 additions and 20 deletions

View File

@@ -84,6 +84,16 @@ DECNET_API_PORT: int = _port("DECNET_API_PORT", 8000)
# the master's JWT secret being present in the environment.
DECNET_INGEST_LOG_FILE: str | None = os.environ.get("DECNET_INGEST_LOG_FILE", "/var/log/decnet/decnet.log")
# Agent-side RFC 5424 sink written by decnet.collector.worker when run on
# a SWARM worker. The forwarder tails this file and ships lines over
# syslog-TLS to the master listener. Kept separate from
# DECNET_INGEST_LOG_FILE so a workstation-dev box (which may run both the
# master and a throwaway agent pointed at itself) can't accidentally
# recurse by forwarding its own ingest file back to itself.
DECNET_AGENT_LOG_FILE: str = os.environ.get(
"DECNET_AGENT_LOG_FILE", "/var/log/decnet/agent.log"
)
# SWARM log pipeline — RFC 5425 syslog-over-TLS between worker forwarders
# and the master listener. Plaintext syslog across hosts is forbidden.
DECNET_SWARM_SYSLOG_PORT: int = _port("DECNET_SWARM_SYSLOG_PORT", 6514)