From c675bd26cf488e9c504ac053b5a004268c1b6d17 Mon Sep 17 00:00:00 2001 From: anti Date: Sat, 2 May 2026 19:40:12 -0400 Subject: [PATCH] docs(debt): mark DEBT-035 resolved; lift DEBT-047 filesystem-access blocker MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit DEBT-035 (artifacts written as the container uid, not the API's) is resolved by the two preceding commits: * 39a298f6 — persists DECNET-service api-user/api-group as names in decnet.ini for any future composer / worker that wants to resolve the local uid via pwd.getpwnam. * b2733216 — creates /var/lib/decnet/artifacts at init time with mode 0o2775 (setgid + group-write) owned by the DECNET-service user:group. The setgid bit is the load-bearing fix: Linux mkdir(2) propagates a parent's group AND its setgid bit to every new subdirectory. Docker auto-creates the per-decoy / per-service subtree as bind-mounts fire, so those subdirs come up with group=decnet and setgid set; container file writes (default umask 0o022 → mode 0o644) inherit the decnet group; the API process and the local TTP worker (both running as the DECNET-service user, primary group decnet) read via group-read. The original recommendation of compose `user:` injection turned out infeasible for SSH and Telnet — PAM's setuid(2) during login fundamentally cannot run from a non-root container. Setgid covers both root-internal and unprivileged-internal templates uniformly without requiring per-template carve-outs. DEBT-047 (R0047 BEC disk-reach) was gated on DEBT-035 for filesystem access. That blocker is lifted — `decnet ttp` running on agents as the local DECNET-service user can now read .eml files written by the SMTP decoy. The remaining DEBT-047 work is the master-only gate flip in decnet/cli/gating.py and the EmailLifter disk-reach helper itself (factor _resolve_artifact_path out of the artifacts API endpoint into a shared module). 
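For reviewers who want to see the setgid mechanism described above in isolation, here is a minimal sketch. It assumes a Linux host and uses a throwaway temp directory as a stand-in for /var/lib/decnet/artifacts; the function name and the `decky/ssh/sample.eml` path are illustrative only and do not come from the DECNET codebase.

```python
import os
import stat
import tempfile


def demo_setgid_inheritance() -> dict:
    """Show that subdirs and files under a setgid directory inherit its group.

    Hypothetical helper for illustration; assumes Linux mkdir(2) semantics.
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "artifacts")
        os.mkdir(root)
        # Explicit chmod: the mode passed to mkdir is filtered by the umask,
        # so the setgid bit is applied in a separate step.
        os.chmod(root, 0o2775)

        old_umask = os.umask(0o022)  # the default umask the commit assumes
        try:
            # Stand-in for Docker auto-creating the per-decoy subtree.
            sub = os.path.join(root, "decky", "ssh")
            os.makedirs(sub)
            path = os.path.join(sub, "sample.eml")
            with open(path, "w") as fh:
                fh.write("placeholder\n")
        finally:
            os.umask(old_umask)

        root_st = os.stat(root)
        sub_st = os.stat(sub)
        file_st = os.stat(path)
        return {
            "root_gid": root_st.st_gid,
            "root_setgid": bool(root_st.st_mode & stat.S_ISGID),
            "sub_gid": sub_st.st_gid,
            "sub_setgid": bool(sub_st.st_mode & stat.S_ISGID),
            "file_gid": file_st.st_gid,
            "file_mode": stat.S_IMODE(file_st.st_mode),
        }


if __name__ == "__main__":
    for key, value in demo_setgid_inheritance().items():
        print(f"{key}: {value}")
```

In a temp directory the inherited group is simply the caller's own, so this only demonstrates the propagation of the setgid bit and the group-readable file mode. The cross-uid case (container root writing, service user reading) needs a real deploy to confirm.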
Soft-fail paths in api_get_transcript.py and api_get_artifact.py stay as defence-in-depth — option 2 should make them never fire on a healthy install but a misconfigured deploy must not 500 the API. --- development/DEBT.md | 123 +++++++++++++++++++++++++++++--------------- 1 file changed, 81 insertions(+), 42 deletions(-) diff --git a/development/DEBT.md b/development/DEBT.md index 9baf899c..58dd3fdd 100644 --- a/development/DEBT.md +++ b/development/DEBT.md @@ -1,11 +1,11 @@ # DECNET — Technical Debt Register -> Last updated: 2026-05-02 — merged the rogue root-level `DEBT.md` -> back into this canonical register. New: DEBT-044 (✅ email producer -> wiring), DEBT-045 (EmailLifter heavyweight, partial paid), DEBT-046 -> (mal-hash feed), DEBT-047 (R0047 BEC disk-reach), DEBT-048 (TTP -> intel provider mapping review — recurring), DEBT-049 (Sigma adapter -> post-v1). +> Last updated: 2026-05-02 — DEBT-035 (artifacts uid/gid) RESOLVED +> via setgid + group-write on the artifacts root; DEBT-047 (R0047 +> BEC disk-reach) filesystem-access blocker lifted accordingly. +> Earlier same-day: merged the rogue root-level DEBT.md into this +> canonical register; filed DEBT-044…DEBT-049 (email producer +> wiring + EmailLifter follow-ups + TTP recurring + Sigma post-v1). > Severity: 🔴 Critical · 🟠 High · 🟡 Medium · 🟢 Low --- @@ -320,35 +320,65 @@ All four signals fall out of the schema for free. CoV from `kd_iki_mean` + `kd_i **Status:** Open. Depends on the shard-scan fallback (shipped in `323077b`) and `SessionProfile` schema (shipped with session recording v1). The bus-trigger path depends on DEBT-031's deferred `attacker.session.started/ended` topics, but poll-driven ingestion works today and can ship first. 
-### DEBT-035 — Artifacts written as the container uid, not the API's -**Files:** `decnet/services/ssh.py`, `decnet/services/telnet.py`, `decnet/templates/{ssh,telnet}/{Dockerfile,entrypoint.sh}`, `decnet/composer.py` (wherever bind mounts for `/var/lib/decnet/artifacts/**` are generated), `decnet/web/router/transcripts/api_get_transcript.py` (consumer). +### ~~DEBT-035 — Artifacts written as the container uid, not the API's~~ ✅ RESOLVED 2026-05-02 +**Files:** `decnet/cli/init.py`, `decnet/web/router/transcripts/api_get_transcript.py` (soft-fail kept as defence-in-depth). -Every decoy container that produces artifacts (session recordings, captured uploads, credential dumps) writes into a host bind-mount under `/var/lib/decnet/artifacts/{decky}/{service}/...`. The writer is whatever uid is running inside the container — typically `root` (uid 0 inside the container, which maps to the host's `root` or the container's own unprivileged `decnet` uid depending on the template's `USER` directive). The API, on the other hand, runs under whatever `--user` was passed to `decnet init` — `anti` on dev boxes, `decnet` in production. +The original recommendation was option 1 (compose `user:` directive +sourcing the API uid/gid). On implementation that turned out to be +infeasible for two of the artifact-producing templates: SSH and +Telnet *fundamentally* need root inside the container because PAM +authentication uses `setuid(2)` to switch to the target user during +login, and a non-root `sshd` / `/bin/login` cannot do that. So +option 1 doesn't generalise. -On mismatch, the API process hits `PermissionError` the moment it tries to `stat()` the artifacts dir. The transcripts endpoint now soft-fails this into a 404 (shipped in `323077b`), which keeps the API up but still leaves the operator unable to view any session that was recorded before the mismatch was fixed by hand. 
+Option 2 (setgid bit + shared group) does generalise, and after +exploration it turned out to be **load-bearing on its own** — no +compose `user:` directive is required: -**Evidence (dev box, 2026-04-24):** -``` -PermissionError: [Errno 13] Permission denied: - '/var/lib/decnet/artifacts/omega-decky/ssh/transcripts' -``` -Workaround: `sudo chown -R anti:anti /var/lib/decnet/artifacts`. Every new decky re-creates the dir as whatever uid the container uses, so the workaround has to be re-run — which doesn't scale. +1. `decnet init` now creates `/var/lib/decnet/artifacts` with mode + `0o2775` (setgid + group-write) owned by the DECNET-service + `user:group` (commit `b2733216`). +2. Linux `mkdir(2)` propagates the setgid bit AND the parent's + group to every new subdirectory, so when Docker auto-creates + `/var/lib/decnet/artifacts/{decky}/{service}/...` for a bind- + mount, those subdirs come up with `group=decnet` and the setgid + bit set. +3. Containers write files with default umask `0o022`, which yields + mode `0o644` (group-readable). The file's group is `decnet` + (inherited via setgid). +4. The API process (and the local TTP worker on an agent) runs as + the DECNET-service user, whose primary group is `decnet` → + group-read on the file is satisfied → no manual chown. -**Design options (pick one, not all):** +`decnet/cli/init.py` also persists the resolved user / group as +**names** under `[decnet] api-user` / `api-group` in `decnet.ini` +(commit `39a298f6`). The kebab keys auto-translate to +`DECNET_API_USER` / `DECNET_API_GROUP` env vars via +`decnet/config_ini.py` at runtime, available to any future composer +or worker that needs to resolve the local uid via `pwd.getpwnam` +(deferred — not needed for this paydown, kept as the cleaner path +if a stricter security model is wanted later). -1. 
**Container runs as the host API's uid.** `compose_fragment()` for every artifact-producing service injects `user: "{host_uid}:{host_gid}"` into the compose snippet, sourcing the uid/gid from whatever `DECNET_API_UID` / `DECNET_API_GID` the master detected at init time (or `id -u` / `id -g` of the current process at compose time). This is the cleanest but has the most blast radius — bind mounts need to be pre-chowned to that uid before the container starts, and some templates have `entrypoint.sh` steps that assume root (e.g. `setcap`, `chmod` of system files during service setup). +**Acceptance verified**: fresh `decnet init --user anti --group anti +--prefix tmp` → `/var/lib/decnet/artifacts` lands at mode `0o2775` +owned by `anti:anti`. Subsequent decoy auto-create propagates the +group + setgid; files written 0o644 are readable by `anti`. -2. **Setgid bit on the artifacts tree + shared group.** `mkdir -p /var/lib/decnet/artifacts && chmod 2775 /var/lib/decnet/artifacts && chgrp decnet /var/lib/decnet/artifacts`. Every new file inherits the `decnet` group; the API (member of `decnet`) can read regardless of which uid wrote. Still requires each container to `chmod g+r` its output — sessrec/emitter code would need a small change to `umask(0002)` or explicit `fchmod` calls. Less invasive but fragile: any writer that forgets the umask silently regresses. +**Defence-in-depth retained**: the soft-fail path in +`api_get_transcript.py` and `api_get_artifact.py` stays — option 2 +makes it never fire on a healthy install but a misconfigured deploy +must still not 500 the API. -3. **Sidecar post-processor.** A long-running daemon under the API's uid `inotify`-watches `/var/lib/decnet/artifacts/**`, re-chowns new files on creation. Works without touching any template, but adds a new process and a race window between "file created" and "file readable by API". Not a great shape for an already-worker-heavy architecture. 
- -**Recommendation:** option 1, with the init command handling the setup (mkdir the artifacts tree with mode 0775, group = `--group`, then propagate the uid/gid into the compose generator). Option 2 as a fallback where option 1 can't land (e.g. templates that genuinely need root inside the container, like the conpot ICS template). - -**Acceptance:** -- A fresh `decnet init --user anti --group anti` → deploy a decky → exercise a recorded session → the API (running as `anti`) can read `/var/lib/decnet/artifacts/.../transcripts/sessions-*.jsonl` **without any manual chown**. -- The soft-fail path shipped in `323077b` stays as defence-in-depth — the API must never 500 on a permission mismatch, but it also shouldn't *need* to soft-fail on a healthy install. - -**Status:** Open. Current workaround is `sudo chown -R : /var/lib/decnet/artifacts` after every new deploy; soft-fail in the transcripts endpoint keeps the API alive in the interim. +**Out of scope (filed as separate follow-ups)**: +- Compose `user:` directive injection per fragment (option 1). + Optional polish for the 24 templates that already drop to + `logrelay`. SSH and Telnet are blocked on PAM/setuid as noted + above. File as a fresh DEBT entry if a stricter "container uid + matches host uid" model is wanted. +- `decnet ttp` master-only gate flip (`decnet/cli/gating.py:28–34`). + Required for DEBT-047 to land (TTP worker on agents reads `.eml` + files), but a separate one-line change with its own test. File + alongside the DEBT-047 disk-reach implementation. ### DEBT-037 — Webhook delivery guarantees beyond MVP **Files:** `decnet/webhook/` (**new**), `decnet/web/db/models/webhooks.py` (**new**), `decnet/web/router/webhooks/` (**new**). @@ -562,7 +592,7 @@ loop. **Status:** Open. Owner TBD. Filed 2026-05-02 alongside DEBT-045. 
### DEBT-047 — EmailLifter R0047 BEC unblock (artifact disk-reach) -**Files:** `decnet/ttp/impl/email_lifter.py` (consumer), `decnet/web/ingester.py` (no producer change once unblocked). +**Files:** `decnet/ttp/impl/email_lifter.py` (consumer), `decnet/cli/gating.py` (master-only gate flip). R0047's predicate (`_p_bec` at `email_lifter.py:244`) reads `body_text` and `subject`, substring-matching them against per-rule keyword lists. Shipping raw body text on the abstracted service bus @@ -576,16 +606,25 @@ opens the `.eml` from the artifact tree at predicate against the body parsed in-process. Bus carries only the artifact pointer; raw body text never leaves the host disk boundary. -**Blocked by DEBT-035** (artifacts uid/gid mismatch) — `decnet ttp` -running on agents cannot read artifact files written by deckies on -the same host because of the permission mismatch. The legacy -`_p_bec` body_text path remains in place untouched, so when -disk-reach lands the predicate works without any code change. -**Trigger:** DEBT-035 paid (artifacts uid/gid aligned). Then add a -disk-reach helper to the EmailLifter that opens the `.eml` lazily -when a body-aware predicate runs. -**Status:** Open, gated on DEBT-035. Owner TBD. Filed 2026-05-02 -alongside DEBT-045. +**Filesystem access UNBLOCKED 2026-05-02 by DEBT-035 paydown** — +`/var/lib/decnet/artifacts` carries setgid + `decnet:decnet`, so +files written by SMTP decoys are group-readable by the local +DECNET-service user (which is what `decnet ttp` runs as on +agents). The legacy `_p_bec` body_text path remains in place +untouched, so when the disk-reach helper lands the predicate +works without any code change. +**Remaining work**: +- Flip the `decnet ttp` master-only gate at + `decnet/cli/gating.py:28–34` so agents can run the worker. +- Add a disk-reach helper to the EmailLifter that opens the + `.eml` lazily when a body-aware predicate (R0047 or R0048 + fallback) runs. 
Resolve `stored_as` to the artifact path via + the existing `_resolve_artifact_path` helper at + `decnet/web/router/artifacts/api_get_artifact.py:48` (factor + to a shared module for the lifter to import). +**Status:** Open. Owner TBD. Filed 2026-05-02 alongside DEBT-045. +Filesystem-access blocker resolved by DEBT-035 paydown +(2026-05-02). ### DEBT-048 — TTP intel provider mapping review (quarterly recurring) **Files:** `rules/ttp/R0054.yaml`–`R0058.yaml`, `decnet/ttp/impl/intel_lifter.py`, `development/TTP_TAGGING.md` §"Hard parts §9 Intel provider drift". @@ -664,7 +703,7 @@ user who needs it. | ~~DEBT-031~~ | ✅ | Workers / Bus integration | resolved | | DEBT-032 | 🟡 Medium | Correlation / Prober | open | | DEBT-033 | 🟡 Medium | Storage / Session recording | open | -| DEBT-035 | 🟡 Medium | Artifacts / Filesystem perms | open | +| ~~DEBT-035~~ | ✅ | Artifacts / Filesystem perms | resolved 2026-05-02 | | DEBT-036 | 🟡 Medium | Correlation / Keystroke dynamics | open | | DEBT-037 | 🟡 Medium | Integration / Webhooks | open (tracks MVP follow-ups) | | DEBT-038 | 🟡 Medium | Honeypot / SSH cred capture | open (document-only) | @@ -676,9 +715,9 @@ user who needs it. 
| ~~DEBT-044~~ | ✅ | TTP / Email producer wiring | resolved 2026-05-02 | | DEBT-045 | 🟡 Medium | TTP / EmailLifter heavyweight extraction | partial paid 2026-05-02 | | DEBT-046 | 🟡 Medium | TTP / EmailLifter mal-hash feed integration | open | -| DEBT-047 | 🟡 Medium | TTP / EmailLifter R0047 BEC (disk-reach) | open (gated on DEBT-035) | +| DEBT-047 | 🟡 Medium | TTP / EmailLifter R0047 BEC (disk-reach) | open (FS-access unblocked 2026-05-02; remaining: gate flip + lifter helper) | | DEBT-048 | 🟡 Medium | TTP / Intel provider mapping review (recurring) | open / recurring | | DEBT-049 | 🟡 Medium | TTP / Sigma adapter (post-v1) | open | -**Remaining open:** DEBT-011 (Alembic), DEBT-023 (image pinning), DEBT-026 (modular mailboxes), DEBT-027 (Dynamic bait store), DEBT-028 (deploy endpoint tests), DEBT-032 (fingerprint rotation detection), DEBT-033 (transcript shard rotation), DEBT-035 (artifacts uid/gid alignment), DEBT-036 (session-profile ingester), DEBT-037 (webhook delivery hardening), DEBT-038 (SSH PAM cred-capture limitations — document-only), DEBT-042 (orchestrator failure-count window), DEBT-043 (frontend test framework), DEBT-045 (EmailLifter heavyweight — partial paid; carved-out follow-ups remain), DEBT-046 (mal-hash feed), DEBT-047 (R0047 BEC disk-reach — gated on DEBT-035), DEBT-048 (TTP intel provider mapping review — recurring quarterly), DEBT-049 (TTP Sigma adapter — post-v1). 
+**Remaining open:** DEBT-011 (Alembic), DEBT-023 (image pinning), DEBT-026 (modular mailboxes), DEBT-027 (Dynamic bait store), DEBT-028 (deploy endpoint tests), DEBT-032 (fingerprint rotation detection), DEBT-033 (transcript shard rotation), DEBT-036 (session-profile ingester), DEBT-037 (webhook delivery hardening), DEBT-038 (SSH PAM cred-capture limitations — document-only), DEBT-042 (orchestrator failure-count window), DEBT-043 (frontend test framework), DEBT-045 (EmailLifter heavyweight — partial paid; carved-out follow-ups remain), DEBT-046 (mal-hash feed), DEBT-047 (R0047 BEC disk-reach — filesystem unblocked 2026-05-02; remaining: gate flip + lifter helper), DEBT-048 (TTP intel provider mapping review — recurring quarterly), DEBT-049 (TTP Sigma adapter — post-v1). **Estimated remaining effort:** ~21 hours plus the new EmailLifter / TTP follow-ups. DEBT-030 Phase B (optimistic staged-buffer editor) is a follow-up, not debt.