feat(swarm): agent→master heartbeat with per-host cert pinning

New POST /swarm/heartbeat on the swarm controller. Workers post every
~30s with the output of executor.status(); the master bumps
SwarmHost.last_heartbeat and re-upserts each DeckyShard with a fresh
DeckyConfig snapshot and runtime-derived state (running/degraded).
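
The master-side bookkeeping described above could be sketched as follows. This is an illustration under stated assumptions, not the code in this commit: `apply_heartbeat`, the in-memory `hosts` and `shards` dicts, and the payload shape (shard name mapped to `{"config": ..., "healthy": bool}`) are all hypothetical stand-ins for the real repository and the real output of executor.status().

```python
from datetime import datetime, timezone
from typing import Any, Optional


def apply_heartbeat(
    hosts: dict[str, dict[str, Any]],
    shards: dict[tuple[str, str], dict[str, Any]],
    host_uuid: str,
    status: dict[str, dict[str, Any]],
    now: Optional[datetime] = None,
) -> None:
    """Record one heartbeat: bump the host timestamp, then upsert every shard.

    Hypothetical payload shape: {shard_name: {"config": {...}, "healthy": bool}}.
    """
    host = hosts.get(host_uuid)
    if host is None:
        # Unknown hosts are rejected rather than created implicitly.
        raise KeyError(f"unknown host_uuid: {host_uuid}")

    now = now or datetime.now(timezone.utc)
    host["last_heartbeat"] = now

    for name, entry in status.items():
        # Assumed rule: a shard is "running" only if the worker reports it healthy.
        state = "running" if entry.get("healthy") else "degraded"
        # Upsert keyed by (host, shard name): insert or overwrite with a fresh snapshot.
        shards[(host_uuid, name)] = {
            "config": entry.get("config"),
            "state": state,
            "updated_at": now,
        }
```

Keying the upsert on (host_uuid, shard name) makes repeated heartbeats idempotent: a retried post overwrites the same rows instead of duplicating them.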

Security: CA-signed mTLS alone is not sufficient, because a
decommissioned worker's still-valid cert could resurrect ghost shards.
The endpoint extracts the presented peer cert (primary:
scope["extensions"]["tls"], fallback:
transport.get_extra_info("ssl_object")) and pins its SHA-256
fingerprint to the SwarmHost.client_cert_fingerprint stored for the
claimed host_uuid. Extraction is factored into
_extract_peer_fingerprint so tests can exercise both uvicorn scope
shapes and the fail-closed path (neither source available) without
mocking uvicorn's TLS pipeline.
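
A minimal sketch of the extract-then-pin flow, for illustration only. It assumes the primary source follows the ASGI TLS extension (a `client_cert_chain` of PEM strings, leaf first), that the transport is passed in by the caller, and that the stored fingerprint is a bare hex SHA-256 digest. The function names `extract_peer_fingerprint` and `fingerprint_matches` are hypothetical; the real _extract_peer_fingerprint may differ.

```python
import hashlib
import hmac
import ssl
from typing import Any, Optional


def extract_peer_fingerprint(scope: dict[str, Any], transport: Any = None) -> Optional[str]:
    """Return the SHA-256 hex digest of the peer's DER cert, or None if unavailable."""
    der: Optional[bytes] = None

    # Primary (assumed shape): ASGI TLS extension with PEM-encoded certs, leaf first.
    tls = (scope.get("extensions") or {}).get("tls") or {}
    chain = list(tls.get("client_cert_chain") or [])
    if chain:
        der = ssl.PEM_cert_to_DER_cert(chain[0])
    elif transport is not None:
        # Fallback: ask the underlying SSL object for the DER-encoded peer cert.
        ssl_object = transport.get_extra_info("ssl_object")
        if ssl_object is not None:
            der = ssl_object.getpeercert(binary_form=True)

    if not der:
        return None  # Neither source yielded a cert; the caller must reject.
    return hashlib.sha256(der).hexdigest()


def fingerprint_matches(presented: Optional[str], stored: Optional[str]) -> bool:
    """Fail closed: a missing fingerprint on either side never matches."""
    if not presented or not stored:
        return False
    # Constant-time comparison; case-insensitive because hex casing varies.
    return hmac.compare_digest(presented.lower().encode(), stored.lower().encode())
```

Returning None instead of raising keeps the rejection decision in one place: the endpoint can treat "no cert" and "wrong cert" identically and refuse the heartbeat in both cases.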

Adds get_swarm_host_by_fingerprint to the repo interface (SQLModel
impl reuses the indexed client_cert_fingerprint column).
@@ -782,6 +782,14 @@ class SQLModelRepository(BaseRepository):
             row = result.scalar_one_or_none()
             return row.model_dump(mode="json") if row else None
 
+    async def get_swarm_host_by_fingerprint(self, fingerprint: str) -> Optional[dict[str, Any]]:
+        async with self._session() as session:
+            result = await session.execute(
+                select(SwarmHost).where(SwarmHost.client_cert_fingerprint == fingerprint)
+            )
+            row = result.scalar_one_or_none()
+            return row.model_dump(mode="json") if row else None
+
     async def list_swarm_hosts(self, status: Optional[str] = None) -> list[dict[str, Any]]:
         statement = select(SwarmHost).order_by(asc(SwarmHost.name))
         if status: