DECNET

Files

anti 86b9decf80 fix(engine): detect wedged buildx + surface recovery hint on deploy

When Docker's buildx leaks bind-mounts from a failed build, it starts
reporting 'read-only file system' on its own activity file, even
though nothing is actually read-only. The user's host had 20+
leaked mounts before we noticed, and each retry compounds the leak.

_compose_with_retry now:
 * Pre-flight counts /var/lib/docker/tmp/buildkit-mount* entries in
   /proc/self/mounts; if >= 10 and the command is a build, refuses
   to start and returns a clean recovery recipe instead of retrying.
 * On mid-build failures that match the wedge signature
   ('failed to update builder last activity time' or the activity-dir
   path in stderr), short-circuits the retry loop with the same
   recipe. This handles the first occurrence, which the pre-flight
   cannot see yet; the pre-flight catches repeat attempts.
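
The pre-flight described above could look roughly like the sketch below.
It is not the code from deployer.py: the function names, the constant
names, and the way a "build" command is recognised are all assumptions
made for illustration. Only the mount path prefix and the threshold of
10 come from this commit message.

```python
from typing import Iterable, List

# Values taken from the commit message; the constant names are hypothetical.
BUILDKIT_MOUNT_PREFIX = "/var/lib/docker/tmp/buildkit-mount"
WEDGE_THRESHOLD = 10


def count_leaked_buildkit_mounts(mount_lines: Iterable[str]) -> int:
    """Count mount-table lines whose mount point is a buildkit temp dir."""
    count = 0
    for line in mount_lines:
        fields = line.split()
        # /proc/self/mounts columns: device, mount point, fstype, options, ...
        if len(fields) >= 2 and fields[1].startswith(BUILDKIT_MOUNT_PREFIX):
            count += 1
    return count


def read_mount_table(path: str = "/proc/self/mounts") -> List[str]:
    """Return the mount table as lines, or an empty list if unreadable."""
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            return fh.readlines()
    except OSError:
        return []


def should_block(compose_args: List[str], leaked: int) -> bool:
    """Block only build-type commands once the leak count hits the threshold.

    Assumption: a command counts as a build if 'build' or '--build'
    appears in its arguments. The real check may differ.
    """
    is_build = "build" in compose_args or "--build" in compose_args
    return is_build and leaked >= WEDGE_THRESHOLD
```

A caller would run count_leaked_buildkit_mounts(read_mount_table())
once before the retry loop and return the recovery recipe when
should_block(...) is true. Keeping the parser separate from the file
read is what lets tests feed it synthetic mount lines.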

Recipe points at 'docker buildx prune -af && sudo systemctl restart
docker', which is what actually clears the leaked mounts.
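
As a rough illustration of the mid-build short-circuit, the sketch
below stops retrying as soon as stderr carries the wedge signature and
raises with the recovery recipe. The names run_with_retry,
BuildxWedgedError, and is_wedge_signature are hypothetical, not the
actual _compose_with_retry internals. The activity-dir path check is
left out because this message does not spell the path out.

```python
import subprocess
from typing import Callable, Optional

# Signature string and recipe are quoted from the commit message.
WEDGE_SIGNATURE = "failed to update builder last activity time"
RECOVERY_HINT = "docker buildx prune -af && sudo systemctl restart docker"


class BuildxWedgedError(RuntimeError):
    """Hypothetical error type carrying the recovery recipe."""


def is_wedge_signature(stderr: Optional[str]) -> bool:
    """True if stderr contains the buildx wedge message."""
    return WEDGE_SIGNATURE in (stderr or "")


def run_with_retry(
    run: Callable[[], subprocess.CompletedProcess],
    retries: int = 3,
) -> Optional[subprocess.CompletedProcess]:
    """Retry transient failures, but bail out immediately on a wedge."""
    last = None
    for _ in range(retries):
        last = run()
        if last.returncode == 0:
            return last
        if is_wedge_signature(last.stderr):
            # Retrying would only leak more mounts, so stop here.
            raise BuildxWedgedError(
                "buildx appears wedged; try: " + RECOVERY_HINT
            )
    return last
```

Passing the command in as a callable keeps the loop testable without
Docker: a test can hand in a function that returns a prepared
subprocess.CompletedProcess and count how often it was called.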

Tests cover all three paths: wedge preflight blocks builds, non-build
commands (down/stop) ignore the preflight, mid-build signature
detection kills the retry loop. A new autouse fixture stubs the
wedge detector to return 0 so leaked mounts on a dev host don't
poison the mocked subprocess tests.

Wiki companion commit adds Troubleshooting → 'Buildx leaked mounts'.

2026-04-24 19:25:45 -04:00

__init__.py

refactor: separate engine, collector, mutator, and fleet into independent subpackages

2026-04-12 00:26:22 -04:00

deployer.py

fix(engine): detect wedged buildx + surface recovery hint on deploy

2026-04-24 19:25:45 -04:00

reaper.py

feat(engine,api): add orphan topology resource reaper

2026-04-21 22:13:44 -04:00