DECNET

Author	SHA1	Message	Date
anti	2bef3edb72	feat(swarm): unbundle master-only code from agent tarball + sync systemd units on update Agents now ship with collector/prober/sniffer as systemd services; mutator, profiler, web, and API stay master-only (profiler rebuilds attacker profiles against the master DB — no per-host DB exists). Expand _EXCLUDES to drop the full decnet/web, decnet/mutator, decnet/profiler, and decnet_web trees from the enrollment bundle. Updater now calls _heal_path_symlink + _sync_systemd_units after rotation so fleets pick up new unit files and /usr/local/bin/decnet tracks the shared venv without a manual reinstall. daemon-reload runs once per update when any unit changed. Fix _service_registry matchers to accept systemd-style /usr/local/bin/decnet cmdlines (psutil returns a list — join to string before substring-checking) so agent-mode `decnet status` reports collector/prober/sniffer correctly.	2026-04-19 19:19:17 -04:00
anti	d2cf1e8b3a	feat(updater): sync systemd unit files and daemon-reload on update The bootstrap installer copies etc/systemd/system/*.service into /etc/systemd/system at enrollment time, but the updater was skipping that step — a code push could not ship a new unit (e.g. the four per-host microservices added this session) or change ExecStart on an existing one. systemctl alone doesn't re-read unit files; daemon-reload is required. run_update / run_update_self now call _sync_systemd_units after rotation: diff each .service file against the live copy, atomically replace changed ones, then issue a single `systemctl daemon-reload`. No-op on legacy tarballs that don't ship etc/systemd/system/.	2026-04-19 19:07:24 -04:00
anti	6d7877c679	feat(swarm): per-host microservices as systemd units, mutator off agents Previously `decnet status` on an agent showed every microservice as DOWN because deploy's auto-spawn was unihost-scoped and the agent CLI gate hid the per-host commands. Now: - collect, probe, profiler, sniffer drop out of MASTER_ONLY_COMMANDS (they run per-host; master-side work stays master-gated). - mutate stays master-only (it orchestrates swarm-wide respawns). - decnet/mutator/ excluded from agent tarballs; never invoked there. - decnet/web exclusion tightened: ship db/ + auth.py + dependencies.py (profiler needs the repo singleton), drop api.py, swarm_api.py, ingester.py, router/, templates/. - Four new systemd unit templates (decnet-collector/prober/profiler/sniffer) shipped in every enrollment tarball. - enroll_bootstrap.sh enables + starts all four alongside agent and forwarder at install time. - updater restarts the aux units on code push so they pick up the new release (best-effort: legacy enrollments without the units won't fail the update). - status table hides Mutator + API rows in agent mode.	2026-04-19 18:58:48 -04:00
anti	dad29249de	fix(updater): align bootstrap layout with updater; log update phases The bootstrap was installing into /opt/decnet/.venv with an editable `pip install -e .`, and /usr/local/bin/decnet pointed there. The updater writes releases to /opt/decnet/releases/active/ with a shared venv at /opt/decnet/venv — a parallel tree nothing on the box actually runs. Result: updates appeared to succeed (release dir rotated, SHA changed) but systemd kept executing the untouched bootstrap code. Changes: - Bootstrap now installs directly into /opt/decnet/releases/active with the shared venv at /opt/decnet/venv and /opt/decnet/current symlinked. Same layout the updater rotates in and out of. - /usr/local/bin/decnet -> /opt/decnet/venv/bin/decnet. - run_update / run_update_self heal /usr/local/bin/decnet on every push so already-enrolled hosts recover on the next update instead of needing a re-enroll. - run_update / run_update_self now log each phase (receive, extract, pip install, rotate, restart, probe) so the updater log actually shows what happened.	2026-04-19 18:39:11 -04:00
anti	43b92c7bd6	fix(updater): restart agent+forwarder+self via systemd on push Three holes in the systemd integration: 1. _spawn_agent_via_systemd only restarted decnet-agent.service, leaving decnet-forwarder.service running the pre-update code (same /opt/decnet tree, stale import cache). 2. run_update_self used os.execv regardless of environment; the re-execed process kept the updater's existing cgroup/capability inheritance but systemd would notice MainPID change and mark the unit degraded. 3. No path to surface a failed forwarder restart (legacy enrollments have no forwarder unit). Now: agent restart first, forwarder restart as best-effort (logged but non-fatal so legacy workers still update), MainPID still read from the agent unit. For update-self under systemd, spawn a detached sleep + systemctl restart so the HTTP response flushes before the unit cycles.	2026-04-19 18:23:10 -04:00
anti	f5a5fec607	feat(deploy): systemd units w/ capability-based hardening; updater restarts agent via systemctl Add deploy/ unit files for every DECNET daemon (agent, updater, api, web, swarmctl, listener, forwarder). All run as User=decnet with NoNewPrivileges, ProtectSystem, PrivateTmp, LockPersonality; AmbientCapabilities=CAP_NET_ADMIN CAP_NET_RAW only on the agent (MACVLAN/scapy). Existing api/web units migrated to /opt/decnet layout and the same hardening stanza. Make the updater's _spawn_agent systemd-aware: under systemd (detected via INVOCATION_ID + systemctl on PATH), `systemctl restart decnet-agent.service` replaces the Popen path so the new agent inherits the unit's ambient caps instead of the updater's empty set. _stop_agent becomes a no-op in that mode to avoid racing systemctl's own stop phase. Tests cover the dispatcher branch selection, MainPID parsing, and the systemd no-op stop.	2026-04-19 00:44:06 -04:00
anti	40d3e86e55	fix(updater): bootstrap fresh venv with deps; rebuild self-update argv from env - _run_pip: on first venv use, install decnet with its full dep tree so the bootstrapped environment actually has typer/fastapi/uvicorn. Subsequent updates keep --no-deps for a near-no-op refresh. - run_update_self: do not reuse sys.argv to re-exec the updater. Inside the live process, sys.argv is the uvicorn subprocess invocation (--ssl-keyfile etc.), which 'decnet updater' CLI rejects. Reconstruct the operator-visible command from env vars set by updater.server.run.	2026-04-18 23:51:41 -04:00
anti	ebeaf08a49	fix(updater): fall back to /proc scan when agent.pid is missing If the agent was started outside the updater (manually, during dev, or from a prior systemd unit), there is no agent.pid for _stop_agent to target, so a successful code install leaves the old in-memory agent process still serving requests. Scan /proc for any decnet agent command and SIGTERM all matches so restart is reliable regardless of how the agent was originally launched.	2026-04-18 23:42:26 -04:00
anti	7765b36c50	feat(updater): remote self-update daemon with auto-rollback Adds a separate `decnet updater` daemon on each worker that owns the agent's release directory and installs tarball pushes from the master over mTLS. A normal `/update` never touches the updater itself, so the updater is always a known-good rescuer if a bad agent push breaks /health: the rotation is reversed and the agent restarted against the previous release. `POST /update-self` handles updater upgrades explicitly (no auto-rollback). - decnet/updater/: executor, FastAPI app, uvicorn launcher - decnet/swarm/updater_client.py, tar_tree.py: master-side push - cli: `decnet updater`, `decnet swarm update [--host|--all] [--include-self] [--dry-run]`, `--updater` on `swarm enroll` - enrollment API issues a second cert (CN=updater@<host>) signed by the same CA; SwarmHost records updater_cert_fingerprint - tests: executor, app, CLI, tar tree, enroll-with-updater (37 new) - wiki: Remote-Updates page + sidebar + SWARM-Mode cross-link	2026-04-18 21:40:21 -04:00
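
Commit d2cf1e8b3a describes the unit sync as: diff each .service file against the live copy, atomically replace changed ones, then issue a single daemon-reload. The following is a minimal sketch of that flow, not the project's actual `_sync_systemd_units`. The function name `sync_unit_files`, its parameters, the 0o644 file mode, and the injected `reload_fn` callback are all assumptions made for illustration.

```python
import filecmp
import os
import tempfile
from pathlib import Path
from typing import Callable, List


def sync_unit_files(
    src_dir: Path, dst_dir: Path, reload_fn: Callable[[], None]
) -> List[str]:
    """Hypothetical helper: copy changed *.service files, reload at most once.

    Returns the names of the unit files that were installed or replaced.
    """
    if not src_dir.is_dir():
        # Legacy tarball that ships no unit files: nothing to do.
        return []

    changed: List[str] = []
    for src in sorted(src_dir.glob("*.service")):
        dst = dst_dir / src.name
        # Byte-for-byte comparison; skip units that are already current.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        # Write to a temp file in the destination directory, then rename.
        # os.replace is atomic when source and target share a filesystem.
        fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=src.name + ".")
        try:
            with os.fdopen(fd, "wb") as fh:
                fh.write(src.read_bytes())
            os.chmod(tmp, 0o644)  # assumed mode for unit files
            os.replace(tmp, dst)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
        changed.append(src.name)

    if changed:
        # One reload per update, and only if something actually changed.
        reload_fn()
    return changed
```

On a real host the callback would presumably be something like `lambda: subprocess.run(["systemctl", "daemon-reload"], check=True)`. Passing it in as a parameter keeps the sketch runnable without root or systemd.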

9 Commits
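
Commit 2bef3edb72 notes that psutil returns the command line as a list, so it has to be joined into a string before substring-checking. The sketch below illustrates why. The function name `cmdline_matches` and the exact match pattern are assumptions and not the project's `_service_registry` code; only the `collect` and `sniffer` subcommand names and the `/usr/local/bin/decnet` path come from the log above.

```python
from typing import Sequence


def cmdline_matches(cmdline: Sequence[str], subcommand: str) -> bool:
    """Hypothetical matcher: True if the argv looks like `decnet <subcommand>`.

    `cmdline` is the list form that psutil's Process.cmdline() returns.
    """
    # `"decnet collect" in cmdline` on the raw list tests element equality,
    # so it never matches ["/usr/local/bin/decnet", "collect"].
    # Joining first turns it into a real substring check.
    joined = " ".join(cmdline)
    return f"decnet {subcommand}" in joined


# systemd-style invocation through the /usr/local/bin/decnet entry point.
# The interpreter path here is illustrative.
argv = ["/opt/decnet/venv/bin/python", "/usr/local/bin/decnet", "collect"]
naive = "decnet collect" in argv          # list membership: misses the match
fixed = cmdline_matches(argv, "collect")  # joined string: finds it
```

With a live process, the list would come from `psutil.Process(pid).cmdline()`.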