Commit Graph

2 Commits

Author SHA1 Message Date
12e18b75db feat(swarm): expose needs_resync on TopologySummary + upsert record_error
Two small observability follow-ups to the phase-1 agent/topology wiring:

TopologySummary now carries needs_resync so operators can see the
heartbeat's resync flag via the topology list/detail API without
dropping into the DB.

TopologyStore.record_error becomes an upsert — when a docker/compose
failure fires during the first materialise (put() never reached), we
still land a marker row so GET /topology/state surfaces the error and
the next heartbeat carries an empty applied_version_hash. That empty
hash is what master's heartbeat check relies on to flag the topology
for resync instead of assuming the apply succeeded.
2026-04-21 01:41:30 -04:00
13cb0ff38e feat(agent): topology apply/teardown/state endpoints
New mTLS-protected routes on the agent:

- POST /topology/apply — master pushes {hydrated, version_hash}.
  Validates the hash matches locally (serialisation drift guard),
  runs the topology through the same validator/composer pipeline
  used master-side, then creates bridges + compose up + records the
  apply in topology.db.
- POST /topology/teardown — dismantles compose, removes bridges,
  clears topology.db.  Idempotent.
- GET /topology/state — returns applied row + live docker
  observation for the heartbeat.

Implementation lives in decnet/agent/topology_ops.py; it reuses the
private compose helpers from decnet.engine.deployer so we don't
duplicate compose/project-name plumbing.  The apply path is sync
under the hood (docker SDK + subprocess); we hop to a thread so the
event loop keeps servicing other agent traffic.

v1 is one-topology-per-agent; cross-topology apply returns 409.

Step 4 of the agent <-> topology integration.
2026-04-21 01:25:15 -04:00