Commit Graph

19 Commits

Author SHA1 Message Date
4c37ece39e feat(orchestrator): MVP synthetic life-injection worker (SSH only)
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.

MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
  to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
  system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
  to TCP-connect dst:22 and read the SSH banner — real on-the-wire
  SSH-protocol traffic without shipping creds. File action drops or
  refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
  are running). Diurnal shaping, role-aware pairing, and session-aware
  backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
  worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
  end-to-end one-tick integration with FakeBus + SQLite.
2026-04-26 19:43:20 -04:00
7fafdd66de feat(deploy): systemd units for identity + campaign clusterers
decnet-clusterer.service.j2 ships the identity clusterer that
landed last session (was overlooked) — bus-woken on attacker.>,
publishes identity.> events.

decnet-campaign-clusterer.service.j2 ships the campaign clusterer
from this session — bus-woken on identity.>, publishes campaign.>
events plus the cross-family identity.campaign.assigned. After=
decnet-clusterer.service so the identity layer is up before the
campaign layer reads its rows.

decnet.target Wants both new units. Both follow the same security
hardening profile as enrich + reuse-correlator.
2026-04-26 09:22:02 -04:00
8a6d632ab0 feat(deploy): systemd unit for decnet-enrich + register in worker panel
Mirrors decnet-reuse-correlator.service.j2: same hardening posture
(NoNewPrivileges, ProtectSystem=full, etc.), same restart policy, same
log file convention. The decnet init renderer picks it up automatically
via the decnet-*.service.j2 glob.

Also reconciles a naming inconsistency I shipped earlier: the heartbeat
name was 'intel' (the package) but the CLI command and unit are 'enrich'
(the action). Renamed the heartbeat to 'enrich' so the workers panel
displays the same string the operator types and the same string in the
systemd unit file. Convention across the project: heartbeat name =
registry key = unit basename = CLI command name.

Registers 'enrich' in worker_registry.KNOWN_WORKERS and in the
start-all preferred order. The decnet.target Wants= list also picks
up the new unit so 'systemctl start decnet.target' brings everything
up together.
2026-04-26 05:20:54 -04:00
a455248dd9 feat(deploy): systemd unit for decnet-reuse-correlator
Adds the systemd template for the credential-reuse correlator daemon
and wires it into decnet.target so `decnet init` installs it
automatically (the unit installer globs decnet-*.service.j2). Mirrors
the mutator template: bus-woken Type=simple service with the standard
hardening + on-failure restart.

Also registers `reuse-correlator` in the in-process worker registry
(so the dashboard panel surfaces its heartbeat instead of dropping it
as unknown) and slots it into the start-all preferred order between
mutator and webhook.
2026-04-26 04:29:10 -04:00
f8ef0a5cf1 fix(deploy): redirect DOCKER_CONFIG out of $HOME so ProtectHome doesn't kill builds
The api unit's ProtectHome=read-only made the user's HOME read-only
inside the unit's namespace. docker compose --build then tried to
write ~/.docker/buildx/activity/* and got EROFS — which we'd been
misdiagnosing as a buildx wedge for the last few iterations.

Real fix: set DOCKER_CONFIG and BUILDX_CONFIG in the unit's
Environment= to a path inside ReadWritePaths. Hardening stays on,
docker CLI writes to install_dir/.docker instead of /home/<user>/.docker.

The wedge classifier now detects this case (count==0 + /home/ in
the stderr path) and emits a recipe pointing at the env-var fix
instead of the driver-rebuild path. Test added.

Wiki gets the new branch first since it's the most common cause
on systemd-managed installs.
2026-04-24 22:07:13 -04:00
e6127a81a1 feat(webhook): worker + CLI + systemd unit
Introduces the `decnet webhook` long-running worker that consumes the
internal bus and POSTs matching events to configured subscriptions.

Design: one task per (subscription, pattern) pair. Each task opens
its own bus subscription, iterates events, and dispatches via the
shared deliver() client. No intermediate queue, no in-memory filter
matching — the bus's own pattern matcher is the filter. Reloads on
`system.webhook.subscriptions_changed` signals from the CRUD router,
with a 60s fallback timer in case a signal is lost.

Shutdown propagates via CancelledError on the outer task; all inner
subscription tasks are cancelled and awaited in a finally block.
Bus unavailable → worker stays up in idle mode per the DEBT-031
pattern, logging one warning.

Registered as a master-only CLI command (agents don't configure
webhooks — the subscription store lives on master). systemd unit
mirrors the profiler template; added to decnet.target Wants= list so
`systemctl start decnet.target` brings it up alongside everything
else. `decnet init` auto-picks up the new .service.j2 via its
existing `glob("decnet-*.service.j2")` sweep.
2026-04-24 15:46:11 -04:00
215251a122 fix(deploy): template --group on the bus ExecStart too
decnet-bus.service.j2 ran with User={{ user }} / Group={{ group }}
but the actual bus CLI invocation hardcoded --group decnet. The bus
chowns /run/decnet/bus.sock to that group at 0660 — so when an
operator ran `decnet init --group anti`, the socket ended up
owned by decnet:decnet while every worker (agent, api, collector,
forwarder, prober, updater) ran as anti and got EACCES on connect().

Each worker's bus-wiring catches the error, logs a warning, sets
bus=None, and carries on — which is correct for the data-plane but
silently kills Workers-panel heartbeats (run_health_heartbeat(None,
...) no-ops). So half the worker grid showed UNKNOWN even though
systemctl confirmed the processes were alive.

Swap the hardcoded --group decnet for --group {{ group }} so the
socket is owned by the same group the workers run under.
2026-04-24 01:09:55 -04:00
e4ccf30133 fix(init): template the polkit rule on --group too
polkit rule 50-decnet-workers.rules hardcoded isInGroup("decnet"),
so when 'decnet init --group anti' installed systemd units as
User=anti / Group=anti, the API (running as anti) could no longer
systemctl start/stop decnet-*.service — polkit fell back to
'interactive authentication required', which in a daemon context is
a hard fail:

  START FAILED · COLLECTOR — Failed to start decnet-collector.service:
  Access denied as the requested operation requires interactive
  authentication.

Rename the rule to .j2, parameterise the group on {{ group }}, and
route _install_polkit through _render_template /
_write_rendered_if_changed. Now the polkit rule matches whatever
group was passed to 'decnet init'.

Test fixture updated to seed the .j2 variant.
2026-04-24 01:07:16 -04:00
08436433ef fix(deploy): relocate StandardOutput/StandardError below multi-line ExecStart
Four templates use backslash line-continuation on ExecStart
(decnet-bus, decnet-forwarder, decnet-listener, decnet-updater). My
earlier sed inserted StandardOutput= and StandardError= right after
the first ExecStart= line, which split the command and systemd fed
those two lines back to the binary as extra positional arguments —
the bus in particular crashed with:

  Got unexpected extra argument
  (StandardOutput=append:/var/log/decnet/decnet.bus.log)

Walk the ExecStart block (follow \-continuation lines) and insert
the two Standard* directives AFTER the last continuation line. The
nine single-line ExecStart templates are unaffected in shape but
re-written through the same path to keep the whole set uniform.
2026-04-24 01:03:58 -04:00
d4b714dc39 fix(deploy): wire per-unit log files on master systemd services
The agent-side enroll-bundle templates (decnet/web/templates/*) always
set DECNET_SYSTEM_LOGS + StandardOutput/StandardError to a per-unit
file under /var/log/decnet. The master-side init templates (deploy/*)
never did, so every 'decnet init'-installed service:

- inherited the default DECNET_SYSTEM_LOGS=decnet.system.log — a
  relative path, landing in the unit's WorkingDirectory. All 13 units
  shared the same cwd and fought for the same file, or more often
  just failed to write it under ProtectSystem=full,
- emitted stdout/stderr to the journal by default, which is fine for
  uvicorn's INFO banter but makes per-service grepping a pain when
  you're chasing a single worker's trace.

Mirror the agent-side wiring on all 13 master templates:
- Environment=DECNET_SYSTEM_LOGS=/var/log/decnet/decnet.<name>.log
- StandardOutput=append:/var/log/decnet/decnet.<name>.log
- StandardError=append:/var/log/decnet/decnet.<name>.log

/var/log/decnet is already in ReadWritePaths so ProtectSystem=full
stays compatible. Operators now get a dedicated
/var/log/decnet/decnet.<unit>.log per service, both from the app's
structured logger and from any stray stderr — journalctl still
works too, but no longer the only option.
2026-04-24 00:57:23 -04:00
38832d87d5 fix(init): thread --user / --group through systemd unit templates
Every decnet-*.service.j2 hardcoded User=decnet / Group=decnet. The
init CLI accepted --user / --group and used them for useradd,
chown, /etc/decnet ownership and ReadWritePaths — but the Jinja
context omitted them entirely, so

  sudo decnet init --install-dir $PWD --user anti --group anti

rendered

  User=decnet
  Group=decnet

into every unit, which at best ran the workers as a user that didn't
match the files (fails to read the venv / config), and at worst spun
a parallel system user the operator never asked for.

Swap the hardcoded lines to {{ user }} / {{ group }} across all 13
templates and add both to the Jinja context in _install_units.
2026-04-24 00:36:23 -04:00
51012eaa67 feat(init): decouple venv from install_dir; fail loud if no venv exists
The systemd unit templates hardcoded {{ install_dir }}/venv/bin/decnet.
On production hosts enroll_bootstrap.sh creates exactly that path so it
worked. On dev boxes where the operator runs `sudo decnet init` against
a source checkout with a differently-named venv (.venv, .311, .312),
every decnet-*.service looped forever in auto-restart with:

  Failed at step EXEC spawning .../venv/bin/decnet: No such file or
  directory

Templates now use {{ venv_dir }} as an independent Jinja2 var. `decnet
init` adds --venv-dir (explicit override), otherwise autodetects:

  1. $VIRTUAL_ENV (only when inside --install-dir, so a user-home venv
     never gets baked into a root-owned unit),
  2. {install_dir}/venv (production default; what enroll_bootstrap
     creates),
  3. {install_dir}/{.venv,.311,.312,.313} (common dev conventions).

Init aborts before any file writes if nothing resolves — an
operator-friendly error beats journalctl spam on every unit restart.

python3-venv doesn't set a persistent system variable — $VIRTUAL_ENV
lives in the activated shell only — so this has to be decided + baked
in at init time; there's no way for systemd to "inherit the current
venv" at unit start.

Test mode (--prefix) skips venv validation so the existing test suite
doesn't need to stub up a venv tree per case.
2026-04-24 00:29:49 -04:00
1753eca198 feat(deploy): templatize systemd services on install_dir via Jinja2
Distros reserve /opt for different things (some package managers own it
outright), and a DECNET install that wants to live at /srv/decnet or
/usr/local/decnet had to hand-edit 13 service files post-install.

Converts every deploy/decnet-*.service to a .j2 template keyed on
{{ install_dir }}, rendered by `decnet init` at install time. All other
paths (log_dir, state_dir, runtime_dir, user, group) stay standard —
only install_dir varies.

Changes:
- deploy/decnet-*.service → deploy/decnet-*.service.j2 (13 files).
- decnet init gains --install-dir (default /opt/decnet, preserves
  existing behaviour byte-for-byte). Validates absolute-path at the
  CLI boundary. Threads through useradd --home-dir and the dir-creation
  list so the filesystem layout matches the rendered templates.
- _install_units renders via Jinja2 with StrictUndefined (typo → loud
  error, not a silent broken unit). SHA over rendered output so
  operators with a custom install_dir get idempotent re-runs.
- decnet.target, tmpfiles.d, polkit rule stay static — they don't
  reference install paths.
- 4 new tests: custom install_dir renders into units, default remains
  /opt/decnet, relative paths rejected, second run with same custom
  dir is idempotent.
2026-04-23 18:08:26 -04:00
3dae44c652 feat(cli): add decnet init one-shot master-host bootstrap
Creates the decnet system user/group, installs every unit file from
deploy/ into /etc/systemd/system, drops the polkit rule, seeds
/opt/decnet + /var/{lib,log}/decnet + /etc/decnet + /run/decnet,
writes a placeholder /etc/decnet/config.ini, applies the new
tmpfiles.d entry so /run/decnet survives reboots, daemon-reloads,
and `systemctl enable --now decnet.target`.

Idempotent (re-runs print [SKIP] on already-configured items),
--dry-run previews the plan without touching anything, --no-start
defers the target start, --force overwrites even matching unit
files. Master-only (added to MASTER_ONLY_COMMANDS).

9 orchestration tests cover the non-root gate, dry-run, useradd/
groupadd argv, SKIP on present user/group, unit-file idempotency,
--force overwrite, --no-start suppression, happy path, and the
"deploy/ not found" error message.
2026-04-22 14:28:11 -04:00
a41ef52249 chore(polkit): allow decnet group to manage decnet-*.service without password
Scoped rule — matches only `decnet-<name>.service` and `decnet.target`.
Any unit outside that regex falls through to the default polkit policy.
Required so the API (running as the `decnet` user) can invoke
`systemctl start decnet-<name>.service` non-interactively.
2026-04-22 14:07:17 -04:00
f21453afdc chore(systemd): add units for collector/profiler/sniffer/prober/mutator + decnet.target
Adds the five missing worker units plus a grouping target so
`systemctl start decnet.target` brings the whole fleet up in order.
Sniffer gets CAP_NET_RAW for scapy; collector and mutator join the
docker supplementary group for docker.sock access. Repoints
Documentation= across all existing units to the canonical
git.resacachile.cl wiki.
2026-04-22 14:06:42 -04:00
fbf289ff63 feat(bus): host-local UNIX-socket pub/sub worker (DEBT-029)
Land the `decnet bus` worker and `get_bus()` factory. Transport is a
host-local UNIX-domain socket (0660, group=decnet); authz is the file
mode. Wire framing is a tiny verb-line + 4-byte-BE length + orjson body.
NATS-style wildcard topics (`*`, `>`). At-most-once, fire-and-forget —
DB stays the source of truth. `FakeBus` / `NullBus` for tests and the
disabled path. Cross-host federation is deferred to a future
`--bridge-tcp` mode; DEBT-030 is master-only and unblocked.
2026-04-21 13:49:02 -04:00
f5a5fec607 feat(deploy): systemd units w/ capability-based hardening; updater restarts agent via systemctl
Add deploy/ unit files for every DECNET daemon (agent, updater, api, web,
swarmctl, listener, forwarder). All run as User=decnet with NoNewPrivileges,
ProtectSystem, PrivateTmp, LockPersonality; AmbientCapabilities=CAP_NET_ADMIN
CAP_NET_RAW only on the agent (MACVLAN/scapy). Existing api/web units migrated
to /opt/decnet layout and the same hardening stanza.

Make the updater's _spawn_agent systemd-aware: under systemd (detected via
INVOCATION_ID + systemctl on PATH), `systemctl restart decnet-agent.service`
replaces the Popen path so the new agent inherits the unit's ambient caps
instead of the updater's empty set. _stop_agent becomes a no-op in that mode
to avoid racing systemctl's own stop phase.

Tests cover the dispatcher branch selection, MainPID parsing, and the
systemd no-op stop.
2026-04-19 00:44:06 -04:00
fc99375c62 feat: add systemd service templates for API and Web Dashboard
Some checks failed
CI / Lint (ruff) (push) Successful in 15s
CI / Test (pytest) (3.11) (push) Failing after 21s
CI / Test (pytest) (3.12) (push) Failing after 22s
CI / SAST (bandit) (push) Failing after 13s
CI / Dependency audit (pip-audit) (push) Successful in 19s
CI / Open PR to main (push) Has been skipped
2026-04-08 01:48:05 -04:00