Commit Graph

10 Commits

Author SHA1 Message Date
d80e6aa6d1 fix(security): close MEDIUM ASVS findings — JWT pinning, SSE tickets, SSRF, mTLS pin, rate limits + correctness bugs
Auth (V2.1.1/V3.1.2, V2.1.3, V3.1.1):
- Pin JWT iss/aud/typ at mint and require+verify them at decode; revocation
  (jti denylist + tokens_valid_from) still enforced.
- Change-password now requires min_length=12.
- SSE auth moves off JWT-in-URL to a single-use 60s opaque ticket
  (POST /auth/sse-ticket); raw JWT in query no longer authenticates a stream.
  Removed dead fail-open get_stream_user helper.

Egress (V5.1.1, V9.1.1/V14.1.3):
- Webhook delivery + CRUD reject SSRF destinations (private/loopback/link-local/
  metadata, IPv4-mapped, multi-A-record) via resolved-IP validation, pin to the
  vetted IP, and never auto-follow redirects. Opt-out via DECNET_WEBHOOK_ALLOW_PRIVATE.
- UpdaterClient pins the worker leaf cert SHA-256 against the stored per-host
  fingerprint (fail closed on missing/mismatch); DECNET_VERIFY_HOSTNAME now
  defaults True.

Hardening (V13.1.3, V4.1.4, V13.1.2):
- Rate-limit change-password (5/min), enroll-bundle (10/min), webhook-create
  (20/min), host-delete (20/min) via the existing slowapi limiter.
- Correct false 'global auth middleware' comment; document enroll-bundle proxy
  trust.

Correctness (BUG-7..11):
- BUG-7 unbound bus in finally; BUG-8 apply_ceiling clamps to min(base,ceiling);
  BUG-9 commit before emit; BUG-10 multi-actor rearm for sub-threshold identities;
  BUG-11 normalize naive timestamps to UTC.

Already-closed (no change): V14.1.1, V2.1.2/V3.1.3, V5.1.2. Tests added for
every fix; unanimous adversarial review.
2026-06-10 12:32:15 -04:00
a4193d7022 fix(updater): enforce master CN gate and mandatory tarball checksum
The worker-side updater extracted + pip-installed + re-exec'd any tarball
from any caller holding a CA-signed cert; the documented updater@* CN
gating was never implemented. Now:

- require_master_cert gates /update, /update-self, /rollback, /releases:
  the client cert CN must be decnet-master (the identity UpdaterClient
  presents). A worker/agent cert can no longer push code to a peer.
- sha256 is mandatory on /update and /update-self (400 otherwise), so the
  integrity check always runs before extract/install. UpdaterClient
  already sends it; this just hardens the contract.

The transport peer-identity primitives move to decnet/web/_mtls.py (a
light namespace module) so the minimal updater reuses them without
importing the API router tree; router/swarm/_mtls.py re-exports them and
keeps the operator gate. Closes the updater-RCE critical.
2026-05-30 17:22:12 -04:00
f2b3393669 chore: relicense to AGPL-3.0-or-later and add SPDX headers
Replaces LICENSE (GPLv3 -> AGPLv3) and prepends
`SPDX-License-Identifier: AGPL-3.0-or-later` to every source file
across decnet/, decnet_web/, tests/, scripts/, and tools/.

Rationale: closes the GPLv3 ASP loophole so any party operating a
modified DECNET as a network service must offer their modified
source. Personal copyright (Samuel Paschuan) + inbound=outbound
contributions make a future unilateral relicense infeasible.

- LICENSE: full AGPL-3.0 text (gnu.org/licenses/agpl-3.0.txt)
- COPYRIGHT: project copyright notice
- tools/add_spdx_headers.py: idempotent header injector
  (shebang- and PEP 263-aware)

Touches 1565 source files (.py, .ts, .tsx, .js, .jsx, .css, .sh).
No behavior change; comments only.
2026-05-22 21:04:16 -04:00
28e2a93355 sec(updater): harden tarball extraction and verify sha256 before extract
Reject symlinks, hardlinks, device nodes and FIFOs in update tarballs;
validate each member's resolved path stays under dest after symlink
resolution; cap uncompressed size at 256 MiB to bound gzip-bomb damage;
strip setuid/setgid bits from extracted modes.

Add an optional sha256 form field to /update and /update-self; the
master client computes and sends it on every push, the executor
refuses to extract on mismatch. mTLS already authenticates the
master, so this is defence-in-depth against in-transit corruption
and gives operators a way to pin "exactly these bytes" for vetted
releases.
2026-04-27 21:14:48 -04:00
d2cf1e8b3a feat(updater): sync systemd unit files and daemon-reload on update
The bootstrap installer copies etc/systemd/system/*.service into
/etc/systemd/system at enrollment time, but the updater was skipping
that step — a code push could not ship a new unit (e.g. the four
per-host microservices added this session) or change ExecStart on an
existing one. systemctl alone doesn't re-read unit files; daemon-reload
is required.

run_update / run_update_self now call _sync_systemd_units after
rotation: diff each .service file against the live copy, atomically
replace changed ones, then issue a single `systemctl daemon-reload`.
No-op on legacy tarballs that don't ship etc/systemd/system/.
2026-04-19 19:07:24 -04:00
6d7877c679 feat(swarm): per-host microservices as systemd units, mutator off agents
Previously `decnet status` on an agent showed every microservice as DOWN
because deploy's auto-spawn was unihost-scoped and the agent CLI gate
hid the per-host commands. Now:

  - collect, probe, profiler, sniffer drop out of MASTER_ONLY_COMMANDS
    (they run per-host; master-side work stays master-gated).
  - mutate stays master-only (it orchestrates swarm-wide respawns).
  - decnet/mutator/ excluded from agent tarballs — never invoked there.
  - decnet/web exclusion tightened: ship db/ + auth.py + dependencies.py
    (profiler needs the repo singleton), drop api.py, swarm_api.py,
    ingester.py, router/, templates/.
  - Four new systemd unit templates (decnet-collector/prober/profiler/
    sniffer) shipped in every enrollment tarball.
  - enroll_bootstrap.sh enables + starts all four alongside agent and
    forwarder at install time.
  - updater restarts the aux units on code push so they pick up the new
    release (best-effort — legacy enrollments without the units won't
    fail the update).
  - status table hides Mutator + API rows in agent mode.
2026-04-19 18:58:48 -04:00
43b92c7bd6 fix(updater): restart agent+forwarder+self via systemd on push
Three holes in the systemd integration:
1. _spawn_agent_via_systemd only restarted decnet-agent.service, leaving
   decnet-forwarder.service running the pre-update code (same /opt/decnet
   tree, stale import cache).
2. run_update_self used os.execv regardless of environment — the re-execed
   process kept the updater's existing cgroup/capability inheritance but
   systemd would notice MainPID change and mark the unit degraded.
3. No path to surface a failed forwarder restart (legacy enrollments have
   no forwarder unit).

Now: agent restart first, forwarder restart as best-effort (logged but
non-fatal so legacy workers still update), MainPID still read from the
agent unit. For update-self under systemd, spawn a detached sleep+
systemctl restart so the HTTP response flushes before the unit cycles.
2026-04-19 18:23:10 -04:00
f5a5fec607 feat(deploy): systemd units w/ capability-based hardening; updater restarts agent via systemctl
Add deploy/ unit files for every DECNET daemon (agent, updater, api, web,
swarmctl, listener, forwarder). All run as User=decnet with NoNewPrivileges,
ProtectSystem, PrivateTmp, LockPersonality; AmbientCapabilities=CAP_NET_ADMIN
CAP_NET_RAW only on the agent (MACVLAN/scapy). Existing api/web units migrated
to /opt/decnet layout and the same hardening stanza.

Make the updater's _spawn_agent systemd-aware: under systemd (detected via
INVOCATION_ID + systemctl on PATH), `systemctl restart decnet-agent.service`
replaces the Popen path so the new agent inherits the unit's ambient caps
instead of the updater's empty set. _stop_agent becomes a no-op in that mode
to avoid racing systemctl's own stop phase.

Tests cover the dispatcher branch selection, MainPID parsing, and the
systemd no-op stop.
2026-04-19 00:44:06 -04:00
ebeaf08a49 fix(updater): fall back to /proc scan when agent.pid is missing
If the agent was started outside the updater (manually, during dev,
or from a prior systemd unit), there is no agent.pid for _stop_agent
to target, so a successful code install leaves the old in-memory
agent process still serving requests. Scan /proc for any decnet agent
command and SIGTERM all matches so restart is reliable regardless of
how the agent was originally launched.
2026-04-18 23:42:26 -04:00
7765b36c50 feat(updater): remote self-update daemon with auto-rollback
Adds a separate `decnet updater` daemon on each worker that owns the
agent's release directory and installs tarball pushes from the master
over mTLS. A normal `/update` never touches the updater itself, so the
updater is always a known-good rescuer if a bad agent push breaks
/health — the rotation is reversed and the agent restarted against the
previous release. `POST /update-self` handles updater upgrades
explicitly (no auto-rollback).

- decnet/updater/: executor, FastAPI app, uvicorn launcher
- decnet/swarm/updater_client.py, tar_tree.py: master-side push
- cli: `decnet updater`, `decnet swarm update [--host|--all]
  [--include-self] [--dry-run]`, `--updater` on `swarm enroll`
- enrollment API issues a second cert (CN=updater@<host>) signed by the
  same CA; SwarmHost records updater_cert_fingerprint
- tests: executor, app, CLI, tar tree, enroll-with-updater (37 new)
- wiki: Remote-Updates page + sidebar + SWARM-Mode cross-link
2026-04-18 21:40:21 -04:00