Auth (V2.1.1/V3.1.2, V2.1.3, V3.1.1):
- Pin JWT iss/aud/typ at mint and require+verify them at decode; revocation
  (jti denylist + tokens_valid_from) still enforced.
- Change-password now requires min_length=12.
- SSE auth moves off JWT-in-URL to a single-use 60s opaque ticket
  (POST /auth/sse-ticket); raw JWT in query no longer authenticates a stream.
  Removed dead fail-open get_stream_user helper.

Egress (V5.1.1, V9.1.1/V14.1.3):
- Webhook delivery + CRUD reject SSRF destinations (private/loopback/
  link-local/metadata, IPv4-mapped, multi-A-record) via resolved-IP
  validation, pin to the vetted IP, and never auto-follow redirects.
  Opt-out via DECNET_WEBHOOK_ALLOW_PRIVATE.
- UpdaterClient pins the worker leaf cert SHA-256 against the stored per-host
  fingerprint (fail closed on missing/mismatch); DECNET_VERIFY_HOSTNAME now
  defaults True.

Hardening (V13.1.3, V4.1.4, V13.1.2):
- Rate-limit change-password (5/min), enroll-bundle (10/min), webhook-create
  (20/min), host-delete (20/min) via the existing slowapi limiter.
- Correct false 'global auth middleware' comment; document enroll-bundle
  proxy trust.

Correctness (BUG-7..11):
- BUG-7 unbound bus in finally; BUG-8 apply_ceiling clamps to
  min(base,ceiling); BUG-9 commit before emit; BUG-10 multi-actor rearm for
  sub-threshold identities; BUG-11 normalize naive timestamps to UTC.

Already-closed (no change): V14.1.1, V2.1.2/V3.1.3, V5.1.2.

Tests added for every fix; unanimous adversarial review.
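The resolved-IP validation described under Egress can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual implementation: the names `is_public_ip` and `vet_destination` and the injectable `resolve` parameter are hypothetical, and the real code may check additional ranges or honor the DECNET_WEBHOOK_ALLOW_PRIVATE opt-out differently.

```python
import ipaddress
import socket


def is_public_ip(raw: str) -> bool:
    """Return True only if the address looks safe as a webhook destination."""
    ip = ipaddress.ip_address(raw)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # includes the 169.254.169.254 metadata address
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def vet_destination(host: str, port: int, resolve=socket.getaddrinfo) -> str:
    """Resolve `host`, reject it if ANY record is non-public, return the IP to pin.

    `resolve` is a hypothetical injection point so the check can be exercised
    without real DNS; it must behave like socket.getaddrinfo.
    """
    infos = resolve(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise ValueError(f"{host!r} did not resolve to any address")
    for addr in addresses:
        # Checking every record guards against multi-A-record tricks where
        # one public and one internal address are returned together.
        if not is_public_ip(addr):
            raise ValueError(f"{host!r} resolves to non-public address {addr}")
    # The caller should connect to this exact IP rather than re-resolving.
    return addresses[0]
```

Returning the vetted address, instead of a boolean, is what makes pinning possible: the delivery code can connect to that IP directly, so a second DNS lookup cannot swap in an internal target after the check has passed.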
76 lines
2.5 KiB
Python
# SPDX-License-Identifier: AGPL-3.0-or-later
"""DELETE /swarm/hosts/{uuid} — decommission a worker.

Removes the DeckyShard rows bound to the host (portable cascade — MySQL
and SQLite both honor it via the repo layer), deletes the SwarmHost row,
and best-effort-cleans the per-worker bundle directory on the master.

Also asks the worker agent to wipe its own install (keeping logs). A
dead/unreachable worker does not block master-side cleanup.
"""
from __future__ import annotations

import pathlib

from fastapi import APIRouter, Depends, HTTPException, Request, status

from decnet.logging import get_logger
from decnet.swarm.client import AgentClient
from decnet.web.db.repository import BaseRepository
from decnet.web.dependencies import get_repo, require_admin
from decnet.web.limiter import limiter
from decnet.web.router.swarm._mtls import PeerCert, require_operator_cert

log = get_logger("swarm.decommission")
router = APIRouter()


@router.delete(
    "/hosts/{uuid}",
    status_code=status.HTTP_204_NO_CONTENT,
    tags=["Swarm Hosts"],
    responses={
        401: {"description": "Missing or invalid admin JWT"},
        403: {"description": "Authenticated user is not an admin, or operator cert missing"},
        404: {"description": "No host with this UUID is enrolled"},
        429: {"description": "Too many decommission requests — retry after the window resets"},
    },
)
@limiter.limit("20/minute")
async def api_decommission_host(
    uuid: str,
    request: Request,
    repo: BaseRepository = Depends(get_repo),
    _admin: dict = Depends(require_admin),
    _operator: PeerCert = Depends(require_operator_cert),
) -> None:
    row = await repo.get_swarm_host_by_uuid(uuid)
    if row is None:
        raise HTTPException(status_code=404, detail="host not found")

    # Ask the worker to wipe itself first; a failure here must not block
    # the master-side cleanup below.
    try:
        async with AgentClient(host=row) as agent:
            await agent.self_destruct()
    except Exception:
        log.exception(
            "decommission: self-destruct dispatch failed host=%s — "
            "proceeding with master-side cleanup anyway",
            row.get("name"),
        )

    await repo.delete_decky_shards_for_host(uuid)
    await repo.delete_swarm_host(uuid)

    # Best-effort bundle cleanup; if the dir was moved manually, don't fail.
    # Skip empty/missing paths: pathlib.Path("") is Path("."), which would
    # otherwise point the cleanup at the current working directory.
    bundle_path = row.get("cert_bundle_path")
    if bundle_path:
        bundle_dir = pathlib.Path(bundle_path)
        if bundle_dir.is_dir():
            for child in bundle_dir.iterdir():
                try:
                    child.unlink()
                except OSError:
                    pass
            try:
                bundle_dir.rmdir()
            except OSError:
                pass
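The best-effort bundle cleanup at the end of the handler can be exercised in isolation with a sketch like the one below. The helper name `remove_bundle_dir` and its boolean return value are hypothetical and do not appear in the module; the sketch only mirrors the unlink-then-rmdir pattern and assumes a flat directory of files. One detail it makes explicit is that `pathlib.Path("")` evaluates to the current directory, so an empty or missing path is rejected before any filesystem call.

```python
import pathlib
from typing import Optional


def remove_bundle_dir(raw_path: Optional[str]) -> bool:
    """Best-effort removal of a flat bundle directory.

    Returns True only if the directory existed and is gone afterwards.
    Hypothetical helper for illustration; never raises on filesystem errors.
    """
    if not raw_path:
        # pathlib.Path("") is Path("."), so an empty value must be rejected
        # before it can be mistaken for the current working directory.
        return False
    bundle_dir = pathlib.Path(raw_path)
    if not bundle_dir.is_dir():
        return False
    for child in bundle_dir.iterdir():
        try:
            child.unlink()
        except OSError:
            # Subdirectories and permission problems are skipped on purpose.
            pass
    try:
        bundle_dir.rmdir()
    except OSError:
        # Something could not be unlinked, so the directory is not empty.
        return False
    return True
```

Swallowing `OSError` keeps the endpoint's contract intact: by the time cleanup runs, the database rows are already deleted, so a leftover directory is an operational nuisance rather than a reason to fail the request.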