feat(swarm): expose needs_resync on TopologySummary + upsert record_error
Two small observability follow-ups to the phase-1 agent/topology wiring:

TopologySummary now carries needs_resync, so operators can see the heartbeat's resync flag via the topology list/detail API without dropping into the DB.

TopologyStore.record_error becomes an upsert. When a docker/compose failure fires during the first materialise (before put() is reached), we still land a marker row, so GET /topology/state surfaces the error and the next heartbeat carries an empty applied_version_hash. That empty hash is what master's heartbeat check relies on to flag the topology for resync instead of assuming the apply succeeded.
@@ -131,10 +131,22 @@ class TopologyStore:
         self._conn.commit()
 
     def record_error(self, topology_id: str, message: str) -> None:
-        """Attach a last-error message to the current row (for debugging)."""
+        """Attach a last-error message for *topology_id*.
+
+        Upserts a marker row when no apply has yet succeeded for this
+        topology — that way a failure *during* the first materialise
+        (put() hasn't been reached) still surfaces via GET
+        /topology/state and the next heartbeat. The marker row uses an
+        empty ``applied_version_hash`` so master's heartbeat check sees
+        the hash mismatch and schedules a resync.
+        """
         self._conn.execute(
-            "UPDATE applied_topology SET last_error=? WHERE topology_id=?",
-            (message, topology_id),
+            "INSERT INTO applied_topology"
+            " (topology_id, applied_version_hash, hydrated_blob_json,"
+            " applied_at, last_error)"
+            " VALUES (?, '', '{}', 0, ?)"
+            " ON CONFLICT(topology_id) DO UPDATE SET last_error=excluded.last_error",
+            (topology_id, message),
         )
         self._conn.commit()
 
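To illustrate what the upsert above changes, here is a minimal, self-contained sketch against an in-memory SQLite database. The column names come from the INSERT statement in the hunk, but the column types, the PRIMARY KEY on topology_id, and the `demo` helper are assumptions made for illustration and are not taken from the repository. SQLite's `ON CONFLICT ... DO UPDATE` needs a unique constraint on the conflict target and SQLite 3.24 or newer.

```python
import sqlite3

# Assumed schema: only the column names are taken from the diff.
SCHEMA = """
CREATE TABLE applied_topology (
    topology_id          TEXT PRIMARY KEY,
    applied_version_hash TEXT NOT NULL,
    hydrated_blob_json   TEXT NOT NULL,
    applied_at           INTEGER NOT NULL,
    last_error           TEXT
)
"""

# Same statement as in the hunk: insert a marker row, or only touch
# last_error when a row for this topology already exists.
UPSERT = (
    "INSERT INTO applied_topology"
    " (topology_id, applied_version_hash, hydrated_blob_json,"
    " applied_at, last_error)"
    " VALUES (?, '', '{}', 0, ?)"
    " ON CONFLICT(topology_id) DO UPDATE SET last_error=excluded.last_error"
)


def record_error(conn: sqlite3.Connection, topology_id: str, message: str) -> None:
    conn.execute(UPSERT, (topology_id, message))
    conn.commit()


def demo() -> list:
    """Hypothetical walk-through of both upsert branches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)

    # Branch 1: no row yet (first materialise failed before put()).
    record_error(conn, "topo-a", "compose up failed")

    # Branch 2: a row from an earlier successful apply already exists.
    conn.execute(
        "INSERT INTO applied_topology VALUES (?, ?, ?, ?, NULL)",
        ("topo-b", "abc123", '{"services": []}', 1700000000),
    )
    record_error(conn, "topo-b", "docker pull failed")

    rows = conn.execute(
        "SELECT topology_id, applied_version_hash, last_error"
        " FROM applied_topology ORDER BY topology_id"
    ).fetchall()
    conn.close()
    return rows
```

In this sketch the first call creates a marker row with an empty hash, while the second call leaves the existing hash and blob untouched and only overwrites last_error. The old UPDATE-only statement would have silently matched zero rows in the first case.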
@@ -687,6 +687,7 @@ class TopologySummary(BaseModel):
     target_host_uuid: Optional[str] = None
     status: str
     version: int
+    needs_resync: bool = False
     created_at: datetime
     status_changed_at: Optional[datetime] = None
 
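The commit message says master's heartbeat check uses the empty applied_version_hash to flag a topology for resync, but the check itself is not part of this diff. The sketch below shows one way such a check could feed the needs_resync field, assuming a plain hash comparison. The `Heartbeat` dataclass and the `compute_needs_resync` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; only applied_version_hash is named in the commit."""

    topology_id: str
    applied_version_hash: str


def compute_needs_resync(expected_hash: str, heartbeat: Heartbeat) -> bool:
    # Assumed rule: any mismatch means the agent is not running the
    # version master expects. The empty marker hash written by
    # record_error never equals a real version hash, so it always
    # falls into this branch.
    return heartbeat.applied_version_hash != expected_hash


# A failed first materialise reports the empty marker hash.
failed = Heartbeat(topology_id="topo-a", applied_version_hash="")
# A healthy agent reports the hash master expects.
healthy = Heartbeat(topology_id="topo-a", applied_version_hash="abc123")

flag_failed = compute_needs_resync("abc123", failed)
flag_healthy = compute_needs_resync("abc123", healthy)
```

Under that assumption, `flag_failed` is the value an operator would see as needs_resync on the topology list/detail response after a failed first apply.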