feat(swarm): per-host microservices as systemd units, mutator off agents

Previously `decnet status` on an agent showed every microservice as DOWN
because deploy's auto-spawn was unihost-scoped and the agent CLI gate
hid the per-host commands. Now:

- collect, probe, profiler, sniffer drop out of MASTER_ONLY_COMMANDS
  (they run per-host; master-side work stays master-gated).
- mutate stays master-only (it orchestrates swarm-wide respawns).
- decnet/mutator/ excluded from agent tarballs (never invoked there).
- decnet/web exclusion tightened: ship db/ + auth.py + dependencies.py
  (profiler needs the repo singleton), drop api.py, swarm_api.py,
  ingester.py, router/, templates/.
- Four new systemd unit templates (decnet-collector/prober/profiler/
  sniffer) shipped in every enrollment tarball.
- enroll_bootstrap.sh enables + starts all four alongside agent and
  forwarder at install time.
- updater restarts the aux units on code push so they pick up the new
  release (best-effort: legacy enrollments without the units won't
  fail the update).
- status table hides Mutator + API rows in agent mode.
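
The CLI gate described in the first two bullets might look roughly like the sketch below. Only the name MASTER_ONLY_COMMANDS comes from this commit; the set's full contents, the helper `is_command_visible`, and the `agent_mode` flag are hypothetical and stand in for whatever the real CLI uses.

```python
# Hypothetical sketch of the agent-mode CLI gate. Only the name
# MASTER_ONLY_COMMANDS appears in the commit; everything else is assumed.

# After this commit "mutate" stays master-only, while collect, probe,
# profiler and sniffer are no longer listed. The real set likely holds
# other master-side commands that are not shown here.
MASTER_ONLY_COMMANDS = frozenset({"mutate"})


def is_command_visible(command: str, agent_mode: bool) -> bool:
    """Return True if `command` should be exposed by the CLI in this mode."""
    if agent_mode and command in MASTER_ONLY_COMMANDS:
        return False
    return True
```

Under these assumptions, `is_command_visible("collect", agent_mode=True)` is true while `is_command_visible("mutate", agent_mode=True)` is false, which matches the behaviour the message describes.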
@@ -237,6 +237,14 @@ def _run_pip(
 AGENT_SYSTEMD_UNIT = "decnet-agent.service"
 FORWARDER_SYSTEMD_UNIT = "decnet-forwarder.service"
 UPDATER_SYSTEMD_UNIT = "decnet-updater.service"
+# Per-host microservices that run out of the same /opt/decnet tree. An
+# update replaces their code, so we must cycle them alongside the agent or
+# they keep serving the pre-update image. Best-effort: legacy enrollments
+# without these units installed shouldn't abort the update.
+AUXILIARY_SYSTEMD_UNITS = (
+    "decnet-collector.service", "decnet-prober.service",
+    "decnet-profiler.service", "decnet-sniffer.service",
+)
 
 
 def _systemd_available() -> bool:
@@ -286,6 +294,13 @@ def _spawn_agent_via_systemd(install_dir: pathlib.Path) -> int:
     )
     if fwd.returncode != 0:
         log.warning("forwarder restart failed (ignored): %s", fwd.stderr.strip())
+    for unit in AUXILIARY_SYSTEMD_UNITS:
+        aux = subprocess.run(  # nosec B603 B607
+            ["systemctl", "restart", unit],
+            check=False, capture_output=True, text=True,
+        )
+        if aux.returncode != 0:
+            log.warning("%s restart failed (ignored): %s", unit, aux.stderr.strip())
     pid_out = subprocess.run(  # nosec B603 B607
         ["systemctl", "show", "--property=MainPID", "--value", AGENT_SYSTEMD_UNIT],
         check=True, capture_output=True, text=True,
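
The best-effort restart loop added in the second hunk can be exercised without a running systemd by injecting the command runner. The following is a minimal sketch under that assumption: `restart_aux_units`, the `run` parameter, and the returned list of failed units are illustrative additions and are not part of this commit. The unit names and the `systemctl restart` invocation are taken from the diff.

```python
import logging
import subprocess
from typing import Callable, List, Sequence

log = logging.getLogger("decnet.updater.sketch")  # hypothetical logger name

AUXILIARY_SYSTEMD_UNITS = (
    "decnet-collector.service", "decnet-prober.service",
    "decnet-profiler.service", "decnet-sniffer.service",
)

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _systemctl(argv: Sequence[str]) -> subprocess.CompletedProcess:
    # check=False: a non-zero exit is reported via returncode, not raised.
    return subprocess.run(argv, check=False, capture_output=True, text=True)


def restart_aux_units(
    units: Sequence[str] = AUXILIARY_SYSTEMD_UNITS,
    run: Runner = _systemctl,  # injectable so tests need no systemd (assumption)
) -> List[str]:
    """Restart each unit; log and collect failures instead of aborting."""
    failed: List[str] = []
    for unit in units:
        result = run(["systemctl", "restart", unit])
        if result.returncode != 0:
            # Legacy enrollments may lack the unit; warn and keep going.
            log.warning("%s restart failed (ignored): %s",
                        unit, (result.stderr or "").strip())
            failed.append(unit)
    return failed
```

With a fake runner that returns a non-zero `CompletedProcess` for one unit, the function still attempts the remaining units and returns only the unit that failed. That is the property the commit relies on for legacy enrollments.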