docs(swarm): add buildx 0.17+ prereq alongside compose v2

Second Docker-side prereq, uncovered while running a real deploy on a
fresh Debian trixie VM: images pulled fine, but 'docker compose up --build'
bailed with 'compose build requires buildx 0.17.0 or later'. Debian's
buildx is stuck at 0.13, so the compose-plugin install must be paired
with a buildx plugin install.

- Prereqs list now requires both 'docker compose version' and
  'docker buildx version' to be verified before enrolling.
- Install section renamed to 'Installing Compose v2 and Buildx on a
  worker'; it covers both plugins with arch-aware curl incantations and
  a uname -m → compose-arch / buildx-arch mapping (compose uses x86_64,
  buildx uses amd64, which is a footgun).
- Adds a troubleshooting row for the buildx-too-old case and one for a
  wrong-arch binary ('Invalid Plugins: ... exec format error').

Both were uncovered on the live VM run; docs now match reality.
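
The two version checks this commit adds to the prereqs could be scripted as a preflight on each worker. The following is a minimal sketch and not part of the commit: the function names `version_ge`, `extract_version`, and `preflight` are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```bash
# Hypothetical preflight helpers; names are illustrative, not DECNET APIs.

# version_ge A B: succeed if dotted version A >= B (a leading "v" is tolerated).
version_ge() {
    local a="${1#v}" b="${2#v}"
    # sort -V orders version strings; if B sorts first (or ties), then A >= B.
    [ "$(printf '%s\n%s\n' "$b" "$a" | sort -V | head -n1)" = "$b" ]
}

# extract_version TEXT: print the first x.y.z triple found in TEXT.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1
}

# preflight: check the two prerequisites before enrolling a worker.
preflight() {
    local compose buildx
    compose=$(extract_version "$(docker compose version 2>/dev/null)")
    buildx=$(extract_version "$(docker buildx version 2>/dev/null)")
    version_ge "${compose:-0.0.0}" 2.0.0 || {
        echo "compose v2 plugin missing or too old" >&2; return 1; }
    version_ge "${buildx:-0.0.0}" 0.17.0 || {
        echo "buildx ${buildx:-missing} is older than 0.17.0" >&2; return 1; }
    echo "ok: compose ${compose}, buildx ${buildx}"
}
```

Running `preflight` on a worker returns non-zero when either plugin is absent or below the required version, so it can gate an enrollment script.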
@@ -71,28 +71,36 @@ On the **master**:
 On each **worker**:
 
 - DECNET installed.
-- **Docker Engine + Compose v2 plugin** (the agent shells out to
-`docker compose`, not the legacy `docker-compose`). This is the single
-most common setup trap — verify with `docker compose version` before
-enrolling. See [Installing Compose v2 on a worker](#installing-compose-v2-on-a-worker)
-below if your distro ships the Docker engine but not the plugin
-(Debian trixie's stock repos, for example, only carry v1).
+- **Docker Engine + Compose v2 plugin + Buildx ≥ 0.17** (the agent shells
+out to `docker compose` with `--build`, which in turn invokes buildx
+for image builds). Verify both before enrolling:
+```bash
+docker compose version # expect v2.x.y
+docker buildx version # expect v0.17.0 or newer
+```
+This is the single most common setup trap. Distros vary wildly in what
+they ship — Debian trixie's stock repos have neither the compose v2
+plugin nor a recent-enough buildx, for example. See [Installing
+Compose v2 and Buildx on a worker](#installing-compose-v2-and-buildx-on-a-worker)
+below.
 - `sudo` for the user running `decnet agent` (MACVLAN/IPVLAN needs root).
 `NOPASSWD` is convenient for unattended daemons.
 - Outbound TCP to master:6514 (log forward) and inbound TCP on 8765 from
 the master (deploy/teardown/health RPCs).
 
-### Installing Compose v2 on a worker
+### Installing Compose v2 and Buildx on a worker
 
 If `docker compose version` prints anything other than `Docker Compose
-version v2.x.y`, you need the plugin. Pick the path that matches your
+version v2.x.y`, or `docker buildx version` prints older than `v0.17.0`,
+install the missing plugin(s). Pick the path that matches your
 environment.
 
 **Option A — Docker's official apt repo (recommended when it's available):**
 
 ```bash
 # Debian/Ubuntu. Adds Docker's own package source, then installs the
-# compose plugin alongside whatever docker-ce/docker.io you already have.
+# compose + buildx plugins alongside whatever docker-ce/docker.io you
+# already have.
 sudo apt-get update
 sudo apt-get install -y ca-certificates curl
 sudo install -m 0755 -d /etc/apt/keyrings
@@ -103,42 +111,72 @@ echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.
 https://download.docker.com/linux/debian $(. /etc/os-release && echo $VERSION_CODENAME) stable" \
 | sudo tee /etc/apt/sources.list.d/docker.list
 sudo apt-get update
-sudo apt-get install -y docker-compose-plugin
-docker compose version # expect v2.x.y
+sudo apt-get install -y docker-compose-plugin docker-buildx-plugin
+docker compose version   # expect v2.x.y
+docker buildx version    # expect v0.17.0+
 ```
 
 For Ubuntu, swap `debian` for `ubuntu` in both the keyring URL and the
 sources.list entry.
 
-**Option B — standalone binary (offline or restricted networks):**
+**Option B — standalone binaries (offline or restricted networks):**
 
+Both plugins install the same way: download the binary for your
+architecture and drop it into Docker's CLI plugin directory.
+
 ```bash
-# Drop the v2 binary into Docker's CLI plugin directory. Works on any
-# distro with the Docker engine already installed.
+# Confirm the worker's architecture first — x86_64, aarch64, armv7l.
+ARCH=$(uname -m)
+case "$ARCH" in
+  x86_64) COMPOSE_ARCH=x86_64; BUILDX_ARCH=amd64 ;;
+  aarch64) COMPOSE_ARCH=aarch64; BUILDX_ARCH=arm64 ;;
+  armv7l) COMPOSE_ARCH=armv7; BUILDX_ARCH=arm-v7 ;;
+esac
+
 sudo mkdir -p /usr/local/lib/docker/cli-plugins
+
+# Compose v2
 sudo curl -fsSL \
-  "https://github.com/docker/compose/releases/download/v2.29.7/docker-compose-linux-$(uname -m)" \
+  "https://github.com/docker/compose/releases/download/v2.29.7/docker-compose-linux-${COMPOSE_ARCH}" \
   -o /usr/local/lib/docker/cli-plugins/docker-compose
 sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
+
+# Buildx
+sudo curl -fsSL \
+  "https://github.com/docker/buildx/releases/download/v0.18.0/buildx-v0.18.0.linux-${BUILDX_ARCH}" \
+  -o /usr/local/lib/docker/cli-plugins/docker-buildx
+sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-buildx
+
 docker compose version
+docker buildx version
 ```
 
 If the worker can't reach GitHub directly (closed lab network, air-gapped
-VM, etc.), download the binary on a box that *can* reach it and `scp` it
-to the worker's `/usr/local/lib/docker/cli-plugins/docker-compose` —
-that's the entire install.
+VM, etc.), download the binaries on a box that *can* reach it and `scp`
+them to the worker's `/usr/local/lib/docker/cli-plugins/` — that's the
+entire install.
 
+**Watch the architecture.** Downloading `linux-x86_64` onto an `aarch64`
+worker (or vice versa) gets you `exec format error: failed to fetch
+metadata` from the `docker` CLI and the plugin is listed under "Invalid
+Plugins" in `docker info`. `uname -m` is your friend.
+
 **Do not** install the legacy `docker-compose` (v1, the Python one) and
 call it a day. The DECNET deployer invokes `docker compose ...` as a
 subcommand, not `docker-compose ...` as a binary — they are different
 programs with different code paths, and v1 is end-of-life.
 
-**Symptom if you get this wrong.** `decnet deploy --mode swarm` returns a
-500 from the worker with
-`CalledProcessError: Command '['docker', 'compose', ...]' returned
-non-zero exit status 125`. The worker's agent log will show the
-`docker` CLI's own help text dumped into stderr because `docker` treats
-`compose` as an unknown positional when the plugin isn't installed.
+**Symptoms if you get this wrong.**
+
+- No compose plugin at all: `CalledProcessError: Command '['docker',
+'compose', ...]' returned non-zero exit status 125`, agent log shows
+the `docker` CLI's help text (because `compose` is an unknown
+subcommand).
+- Compose plugin OK but buildx too old: `compose build requires buildx
+0.17.0 or later` in the agent log, followed by `up --build` exit
+status 1. Images pull fine, the build step is what fails.
+- Wrong-arch binary: `Invalid Plugins: compose failed to fetch metadata:
+fork/exec ...: exec format error` in `docker info`.
 
 Time sync is a hard requirement — mTLS cert validation fails if worker and
 master clocks differ by more than a few minutes. Run `chronyd`/`systemd-timesyncd`.
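
The `uname -m` mapping in the Option B hunk above has no fallback branch, so an unlisted machine type would leave both variables empty and the download URLs would end in a bare `linux-`. Below is a sketch of the same mapping with an explicit failure path. The helper name `plugin_arch` is hypothetical and not part of the commit; the three mappings are the ones from the hunk.

```bash
# plugin_arch MACHINE: print "<compose-arch> <buildx-arch>" for a `uname -m`
# value, or fail loudly for anything the docs do not list.
plugin_arch() {
    case "$1" in
        x86_64)  echo "x86_64 amd64" ;;
        aarch64) echo "aarch64 arm64" ;;
        armv7l)  echo "armv7 arm-v7" ;;
        *)       echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example use: resolve both suffixes before attempting any download.
if pair=$(plugin_arch "$(uname -m)"); then
    COMPOSE_ARCH=${pair% *}   # text before the space
    BUILDX_ARCH=${pair#* }    # text after the space
    echo "compose suffix: ${COMPOSE_ARCH}, buildx suffix: ${BUILDX_ARCH}"
fi
```

Keeping the mapping in one function makes the compose/buildx naming mismatch visible in a single place and stops the script before `curl` runs with an empty suffix.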
@@ -565,7 +603,9 @@ decnet swarm decommission --name <each-worker> --yes
 | Lines appear in `master.log` but not the dashboard | Ingester not running, or pointed at the wrong JSON path | `systemctl status decnet-ingester`, confirm `DECNET_INGEST_LOG_FILE` matches `listener --json-path` |
 | `deploy --mode swarm` fails with `No enrolled workers` | Exactly what it says | `swarm enroll` at least one worker first |
 | Worker returns 500 on `/deploy` with `ip addr show <nic>` error | The worker's agent is re-detecting its own NIC (this is the relocalize step) and can't find a usable interface | Run `ip route show default` on the worker — if empty, the default route is missing; fix the worker's networking before deploying |
-| Worker returns 500 on `/deploy` with `docker compose ... exit status 125` and docker help text in the log | Compose v2 plugin is not installed on the worker; the stock `docker` binary is treating `compose` as an unknown subcommand | `docker compose version` on the worker. If it doesn't print v2.x.y, see [Installing Compose v2 on a worker](#installing-compose-v2-on-a-worker) |
+| Worker returns 500 on `/deploy` with `docker compose ... exit status 125` and docker help text in the log | Compose v2 plugin is not installed on the worker; the stock `docker` binary is treating `compose` as an unknown subcommand | `docker compose version` on the worker. If it doesn't print v2.x.y, see [Installing Compose v2 and Buildx on a worker](#installing-compose-v2-and-buildx-on-a-worker) |
+| Worker returns 500 on `/deploy` with `compose build requires buildx 0.17.0 or later` | Buildx plugin missing or too old on the worker; images pull but the build step fails | `docker buildx version` on the worker. If it's below v0.17.0, see [Installing Compose v2 and Buildx on a worker](#installing-compose-v2-and-buildx-on-a-worker) |
+| `docker info` lists a CLI plugin under "Invalid Plugins: ... exec format error" | Wrong-architecture binary installed — e.g. x86_64 binary dropped onto an aarch64 host | Re-download the plugin binary matching `uname -m` and overwrite the file in `/usr/local/lib/docker/cli-plugins/` |
 | Agent rejects master with `BAD_CERTIFICATE` | Master's own client cert (`~/.decnet/master/`) isn't in the worker's trust chain | Never happens if both sides were issued from the same CA. Check you didn't re-init the CA between `swarmctl` starts |
 
 If things are really broken and you want a clean slate on the master:
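
Returning to the three Docker-side failure signatures in the troubleshooting rows of this hunk: they can be told apart by substring. This is a rough sketch, assuming a hypothetical `diagnose` helper fed one line from the agent log or from `docker info`. The match strings are the ones quoted in this diff; the output labels are illustrative only.

```bash
# diagnose TEXT: map a log or `docker info` line to one of the three
# Docker-side failure modes described in the troubleshooting table.
diagnose() {
    case "$1" in
        *"compose build requires buildx"*) echo "buildx-too-old" ;;
        *"exec format error"*)             echo "wrong-arch-plugin" ;;
        *"exit status 125"*)               echo "compose-plugin-missing" ;;
        *)                                 echo "unknown" ;;
    esac
}

# Example: classify the message seen on the fresh Debian trixie VM.
diagnose "compose build requires buildx 0.17.0 or later"
```

The buildx check comes first because that message is the most specific of the three; anything unmatched falls through to `unknown` rather than being guessed at.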