485 Commits

Author SHA1 Message Date
4e436da569 feat(realism): LLM enrichment for user-class file bodies
Stage 6 of the realism migration. User-class file bodies (note,
todo, draft, script) optionally get LLM-authored content; system
classes (cron / daemon logs, /tmp caches) stay template-only because
formulaic *is* the right look for them.

New surface:

- realism.llm.circuit.LLMCircuitBreaker — process-local sliding-window
  breaker. 3 consecutive failures trip open; 60s cooldown to half-open;
  half-open success closes, failure re-opens. Protects the orchestrator
  tick from sustained Ollama wedges (per-call timeout already covers
  one-shot hangs).
- realism.prompts._style — em-dash suppression lifted from the
  email prompt. Persona.uses_llms_heavily opts out per the
  feedback_em_dash_llm_tell.md memory. Includes strip_em_dashes
  belt-and-braces sub for output that slipped past the prompt rule.
- realism.prompts.filebody — class-conditioned prompts (note / todo
  / draft / script) with persona context, language pinning, output
  shape rule.
- realism.bodies.make_body_with_llm — async wrapper around make_body
  that calls the LLM when one is provided AND the breaker allows.
  Falls back to template on timeout / error / empty / system-class.
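
The breaker behaviour described above (3 consecutive failures trip open, 60s cooldown to half-open, half-open success closes, failure re-opens) can be sketched roughly as follows. This is a minimal illustration, not the project's LLMCircuitBreaker: the class name, method names, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class CircuitBreakerSketch:
    """Hypothetical stand-in for LLMCircuitBreaker; names are assumptions."""

    def __init__(self, threshold: int = 3, cooldown: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0                        # consecutive failures so far
        self.opened_at: Optional[float] = None   # None means closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown:
            return "half-open"
        return "open"

    def allow(self) -> bool:
        # Closed and half-open both let a call through; open does not.
        return self.state != "open"

    def record_success(self) -> None:
        # Any success, including the half-open probe, closes the breaker.
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        if self.state == "half-open":
            # Failed probe: re-open and restart the cooldown.
            self.opened_at = self.clock()
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

A caller would check allow() before the LLM call and fall back to the template body when it returns False.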

Wiring:

- scheduler.pick_file accepts optional llm + llm_breaker + llm_timeout.
  When the planner picks a create action and the content_class is a
  user-class, the body_hint is replaced with the LLM-authored body
  (or falls back to the deterministic body_hint).
- orchestrator.worker constructs get_llm() at startup gated by
  DECNET_REALISM_LLM env var (any non-empty value enables; empty /
  "off" / "none" / "0" disables). Passes llm + breaker through every
  tick.
- decnet orchestrate gains --llm/--no-llm flag overriding the env var.
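
A rough sketch of the gating rule above, assuming the CLI flag is passed in as an optional boolean. The function name, the case-insensitive comparison, and the whitespace stripping are assumptions; only the env var name and the disabling values come from the commit message.

```python
import os
from typing import Mapping, Optional

# Values that disable LLM enrichment, per the commit message.
_DISABLED = {"", "off", "none", "0"}


def llm_enabled(env: Optional[Mapping[str, str]] = None,
                cli_flag: Optional[bool] = None) -> bool:
    """Hypothetical helper: --llm/--no-llm wins, otherwise read the env var."""
    if cli_flag is not None:
        return cli_flag
    source = os.environ if env is None else env
    value = source.get("DECNET_REALISM_LLM", "")
    # Assumption: comparison ignores case and surrounding whitespace.
    return value.strip().lower() not in _DISABLED
```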
2026-04-27 16:42:58 -04:00
b321e29002 feat(realism): EditAction read-modify-write of planted files
Stage 3b of the realism migration. A TODO.md planted on Monday gets a
checkbox flipped on Tuesday; a notes file grows a follow-up line; a
cron log gets a fresh entry tacked on. The synthetic_files row's
edit_count, last_modified, and content_hash advance.

New surface:

- EditAction dataclass (peer of FileAction in scheduler.py): carries
  decky, path, persona, content_class, previous_body, mtime, and
  synthetic_file_uuid for the worker's update path.
- realism.bodies.next_iteration(cls, persona, prev, rng): per-class
  deterministic mutators. TODO flips an unchecked box and/or appends;
  notes/drafts/scripts append; logs are append-only (mirroring real
  log behaviour). Canary, cache_tmp, email raise KeyError —
  unsupported.
- realism.planner.pick gains an edit branch: 60% create, 30% edit
  (when an edit_candidate is supplied), 10% leave-alone. Returns
  None on leave-alone — quiet ticks are realism too.
- scheduler.pick_file pre-fetches a single edit candidate via
  repo.pick_random_synthetic_file_for_edit ~50% of ticks; the
  planner decides whether to use it.
- SSHDriver._run_edit: turns next_iteration output into a
  plant_file call (mtime-bumped, mode 0o644). Stashes new_body in
  result.payload so the worker can hash it for synthetic_files.
- worker._bump_synthetic_file_after_edit: patches edit_count + 1,
  last_modified=now, content_hash, last_body for the row UUID.
  No-op when the row was pruned mid-flight.
- events.to_row / topic_for / event_type_for now recognise
  EditAction (kind="file", action="file:edit").
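
To illustrate the TODO mutator idea (flip an unchecked box, otherwise append), here is a small sketch. It is not realism.bodies.next_iteration: the function name, the appended line, and the choice to flip exactly one box are assumptions for the example.

```python
import random


def next_todo_iteration(prev: str, rng: random.Random) -> str:
    """Hypothetical TODO mutator: flip one open checkbox, else append a line."""
    lines = prev.splitlines()
    open_idx = [i for i, ln in enumerate(lines)
                if ln.lstrip().startswith("- [ ]")]
    if open_idx:
        # Deterministic for a given rng seed and previous body.
        i = rng.choice(open_idx)
        lines[i] = lines[i].replace("- [ ]", "- [x]", 1)
    else:
        # Nothing left to tick off, so the file grows instead.
        lines.append("- [ ] follow up")  # placeholder text, an assumption
    return "\n".join(lines) + "\n"
```

Passing the rng in keeps the mutation reproducible, which matches the "deterministic mutators" wording above.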
2026-04-27 16:38:17 -04:00
32eeb0c813 refactor(orchestrator): collapse decnet-emailgen.service into orchestrator
Stage 5 of the realism migration. Email generation is no longer a
separate worker / systemd unit / CLI subcommand — the orchestrator's
single tick loop covers SSH traffic, file plants, and email drops.
Going from 21 services to 20.

Worker:
- _one_tick rolls between traffic / file / email (45/45/10 weights).
  The 10% email weight at a 60s orchestrator interval produces ~one
  email per 10 minutes, close to the pre-collapse 5-minute cadence.
- get_driver_for(action) (stage 4) handles SSH vs Email dispatch.
- Quiet branches fall through so a (decky-set, persona-pool,
  mail-decky) shape that silences one branch doesn't waste the tick.
- Periodic prune covers both orchestrator_events and
  orchestrator_emails tables.
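
The cadence arithmetic above can be checked with a short sketch. The function names are assumptions, and the calculation ignores the quiet-branch fall-through, which could shift the real rates.

```python
import random

BRANCHES = ("traffic", "file", "email")
WEIGHTS = (45, 45, 10)  # from the commit message


def roll_branch(rng: random.Random) -> str:
    """Pick one branch per tick according to the weights."""
    return rng.choices(BRANCHES, weights=WEIGHTS, k=1)[0]


def expected_seconds_between(branch: str, tick_interval_s: float = 60.0) -> float:
    """Mean gap between picks of `branch`: interval divided by probability."""
    weight = WEIGHTS[BRANCHES.index(branch)]
    return tick_interval_s * sum(WEIGHTS) / weight
```

With a 60s tick, expected_seconds_between("email") works out to 600 seconds, which is the "one email per 10 minutes" figure.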

Deletions:
- deploy/decnet-emailgen.service.j2
- decnet/orchestrator/emailgen/worker.py
- decnet/cli/emailgen.py
- tests/orchestrator/emailgen/test_worker_integration.py

Renames (history-preserving):
- decnet/web/router/emailgen/ -> decnet/web/router/realism/
- tests/api/emailgen/        -> tests/api/realism/
- tests/cli/test_emailgen_*  -> tests/cli/test_realism_*

Public surface changes (clean break, pre-v1):
- API URL /api/v1/emailgen/personas -> /api/v1/realism/personas
- CLI `decnet emailgen import-personas` -> `decnet realism
  import-personas`. `decnet emailgen run` is gone — the orchestrator
  covers it.
- gating.py: emailgen master-only group replaced by realism.
- decnet-orchestrator.service.j2: DECNET_REALISM_* env block added.
- decnet.target: decnet-emailgen.service entry removed.
- frontend: PersonaGeneration.tsx fetches /realism/personas.
2026-04-27 16:33:04 -04:00
cb1872c52f feat(realism): synthetic_files table + planner wiring + scheduler swap
Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's
hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed
filenames like notes-1777315854.txt with identical bodies per
template) with a persona-driven realism engine.

New surface:

- SyntheticFile SQLModel (synthetic_files table, UNIQUE on
  decky_uuid+path) — per-(decky, path) state for the future
  edit-in-place flow. Pre-v1, no _migrate_* helper.
- BaseRepository methods: record_synthetic_file,
  update_synthetic_file, list_synthetic_files,
  pick_random_synthetic_file_for_edit (used by stage 3b).
- realism/naming.py: per-content-class filename templates,
  persona-conditioned. /var/log/cron.log + logrotate skeleton for
  system-class; /home/<persona>/TODO.md, scratch.md, etc. for
  user-class. Anti-regression test pins "no 8+ digit decimals in
  basenames" (the realism failure today).
- realism/bodies.py: deterministic body templates per content_class.
  TODO body uses checkbox markdown, script body has a shebang, cron
  body matches syslog cron shape ("CRON[PID]: (user) CMD (...)").
- realism/planner.py: pick(deckies, now, rng) returns a Plan.
  Diurnal-gated, weighted user/system content split (70/30 user
  bias). Create-only in stage 3; edit branch lands in stage 3b.
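
The "no 8+ digit decimals in basenames" rule could be expressed along these lines. This is a sketch of the check only; the helper name is an assumption and the project's actual test may be shaped differently.

```python
import posixpath
import re

# A run of eight or more digits looks like an epoch suffix.
_LONG_DIGIT_RUN = re.compile(r"\d{8,}")


def looks_synthetic(path: str) -> bool:
    """True when the basename carries an epoch-like digit run."""
    return bool(_LONG_DIGIT_RUN.search(posixpath.basename(path)))
```

Only the basename is inspected, so digits in a parent directory do not trip the check.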

Scheduler split:

- scheduler.pick is now traffic-only (sync).
- scheduler.pick_file is async, takes a repo, resolves personas
  (Topology.email_personas for topology-source deckies; global
  realism.personas_pool otherwise), and maps Plan -> FileAction.
- FileAction gains persona/content_class/mtime fields.

Worker:

- _one_tick rolls 50/50 between traffic and file each tick. After a
  successful FileAction plant, _record_synthetic_file persists or
  patches the synthetic_files row (catching the unique-constraint
  collision on re-plant of the same path).
- SSHDriver._run_file passes action.mtime through to plant_file so
  files don't all stamp at wall-clock-now.
2026-04-27 16:22:07 -04:00
636c057cc5 refactor(orchestrator): extract ActivityDriver ABC + driver factory
Stage 4 of the realism migration. Lifts the driver Protocol into a
proper ABC with default plant_file/read_file methods (raise
NotImplementedError), and adds get_driver_for(action) so the
orchestrator worker can dispatch by action shape without isinstance
chains.

SSHDriver now inherits ActivityDriver and implements:
- plant_file: streams base64 via stdin (ARG_MAX-safe, mirrors
  decnet.canary.planter; commit c17b9e0). Honours mtime via touch -d
  so realism-planned files don't all stamp at wall-clock-now.
- read_file: docker exec cat with FileNotFoundError on rc=1, used by
  the upcoming EditAction (stage 3b).

EmailDriver inherits ActivityDriver. Driver alias kept for back-compat
during the migration; removed once realism stages 5-7 land.
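
A minimal sketch of the shape described above: an ABC whose default plant_file/read_file raise NotImplementedError, plus a factory that dispatches on the action type through a lookup table. Everything suffixed Sketch, the in-memory file store, and the registry approach are assumptions for illustration; the real get_driver_for may dispatch differently.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Dict, Optional


class ActivityDriver(ABC):
    """Base class: drivers override only what they support."""

    def plant_file(self, path: str, body: bytes,
                   mtime: Optional[float] = None) -> None:
        raise NotImplementedError(f"{type(self).__name__} cannot plant files")

    def read_file(self, path: str) -> bytes:
        raise NotImplementedError(f"{type(self).__name__} cannot read files")


class SSHDriverSketch(ActivityDriver):
    """Stand-in that keeps files in memory instead of running docker exec."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def plant_file(self, path, body, mtime=None):
        self.files[path] = body

    def read_file(self, path):
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


class EmailDriverSketch(ActivityDriver):
    """Inherits the defaults, so file operations are unsupported."""


@dataclass(frozen=True)
class FileActionSketch:
    path: str


@dataclass(frozen=True)
class EmailActionSketch:
    subject: str


# Lookup table instead of an isinstance chain.
_DRIVERS = {FileActionSketch: SSHDriverSketch,
            EmailActionSketch: EmailDriverSketch}


def get_driver_for(action) -> ActivityDriver:
    try:
        return _DRIVERS[type(action)]()
    except KeyError:
        raise ValueError(f"no driver for {type(action).__name__}") from None
```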
2026-04-27 16:09:46 -04:00
0b9873982d refactor(realism): move emailgen LLM/personas/prompt into shared library
Lift the format-agnostic pieces from decnet/orchestrator/emailgen/
into the new decnet/realism/ library so file-class content generation
(stage 3 of the realism migration) can reuse them. Email-specific
delivery (RFC 2822 EML, IMAP/POP3 spool, thread chains) stays in
orchestrator/.

Renames (history-preserving git mv):
  emailgen/personas.py     -> realism/personas.py
  emailgen/prompt.py       -> realism/prompts/email.py
  emailgen/global_pool.py  -> realism/personas_pool.py
  emailgen/llm/            -> realism/llm/

Env-var clean break (pre-v1, no aliases):
  DECNET_EMAILGEN_LLM      -> DECNET_REALISM_LLM
  DECNET_EMAILGEN_MODEL    -> DECNET_REALISM_MODEL
  DECNET_EMAILGEN_TIMEOUT  -> DECNET_REALISM_TIMEOUT
  DECNET_EMAILGEN_PERSONAS -> DECNET_REALISM_PERSONAS
  DECNET_EMAILGEN_FAKE_OUTPUT -> DECNET_REALISM_FAKE_OUTPUT

Importers rewritten in: orchestrator/emailgen/scheduler.py,
orchestrator/drivers/email.py, web/router/{emailgen,topology}/
api_personas.py, cli/emailgen.py. Tests for moved modules relocated
to tests/realism/; tests for stay-put modules updated in place.

API URL `/api/v1/emailgen/personas` and CLI `decnet emailgen
import-personas` keep their public names until the service-collapse
commit (stage 5).
2026-04-27 16:05:43 -04:00
f57c621117 feat(realism): scaffold decnet/realism/ library
Empty subpackage skeleton for the realism migration: ContentClass enum
(file/email/canary content categories), Plan dataclass (frozen, with
edit-action invariant), in_work_hours window check (wrap-around
supported, fail-open on parse error), and sample_mtime for backdated
file timestamps that snap into a persona's active hours.

Stage 1 of the orchestrator+canary realism unification — no
production caller wired yet; planner.pick is a stub returning None
until stage 3.
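
A wrap-around, fail-open window check might look like the sketch below. The "HH:MM-HH:MM" window string and the function signature are assumptions; the commit message only states the wrap-around and fail-open behaviour.

```python
from datetime import datetime, time


def in_work_hours(window: str, now: datetime) -> bool:
    """Hypothetical check for a window such as "09:00-17:00" or "22:00-06:00"."""
    try:
        start_s, end_s = window.split("-")
        start = time.fromisoformat(start_s.strip())
        end = time.fromisoformat(end_s.strip())
    except ValueError:
        # Fail open: an unparseable window never silences a persona.
        return True
    t = now.time()
    if start <= end:
        return start <= t < end
    # Wrap-around window that crosses midnight.
    return t >= start or t < end
```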
2026-04-27 15:55:21 -04:00
6376523923 feat(canary): mysql_dump generator with phone-home replica payload
Mirrors the Canarytokens.org trick: a base64-wrapped CHANGE REPLICATION
SOURCE TO + START REPLICA block in the dump trailer. Importing the
file into MySQL resolves <slug>.<dns_zone> (DNS trip) and opens a 3306
replica handshake whose SOURCE_USER smuggles @@hostname and
@@lc_time_names of the victim DB.

DNS lookup alone is sufficient for detection via the existing canary
dns_server; capturing the smuggled metadata via a 3306 handshake
responder is a follow-up.
2026-04-27 13:52:55 -04:00
5ac8e0f91a feat(canary): honeydoc_docx + honeydoc_pdf generators
honeydoc previously emitted HTML only — operators picking 'Document'
out of the dropdown got a .html file dropped at /Documents/
quarterly_report.docx, which any attacker would clock the moment they
ran 'file' on it.

Two new generators that emit the real artifact format:

- honeydoc_docx: stdlib zipfile only. Builds a minimal but valid
  Office Open XML zip with the same Q3 review body as the HTML
  flavor and an external-image relationship pointing at the
  callback URL — same trick the operator-upload DOCX instrumenter
  uses, fetched on document open by Word and LibreOffice. Reuses
  _drawing() and _next_rid() from instrumenters/docx.py to keep
  the body/relationships shape identical between synthesised and
  instrumented files.

- honeydoc_pdf: pikepdf-backed. One-page PDF in the 14 base fonts
  (Helvetica, no font embedding), realistic body, /OpenAction /URI
  on the catalog so most viewers fire the callback on document
  open. Falls back to a clear error if pikepdf is missing so the
  operator can switch to honeydoc / honeydoc_docx.

Default placement paths now reflect each generator's true extension
(.html / .docx / .pdf) so the UI suggests something sensible. Both
generators surfaced in the New Token modal's generator dropdown.
2026-04-27 13:44:20 -04:00
c17b9e01c8 fix(canary): stream base64 payload via stdin to avoid ARG_MAX
Real-world plant() crashed with OSError [Errno 7] Argument list too
long when an artifact (honeydoc HTML / DOCX / PDF) base64-encoded
into the sh -c script body exceeded the kernel's argv limit (typically
128KB-2MB depending on the host).

Fix: keep the script trivial ('mkdir -p ... && base64 -d > path && ...')
and stream the encoded bytes through 'docker exec -i ... sh -c'
stdin instead. _run() grew an optional stdin_bytes parameter that's
piped into proc.communicate(input=...). The stdin path covers
arbitrarily large artifacts.
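
The idea of keeping the script small and moving the payload to stdin can be sketched as below. The helper name and return shape are assumptions; the example only builds the argv and the stdin bytes and does not run docker.

```python
import base64
import posixpath
import shlex
from typing import List, Tuple


def build_plant_command(container: str, path: str,
                        payload: bytes) -> Tuple[List[str], bytes]:
    """Return (argv, stdin_bytes); the payload never appears in argv."""
    script = (
        f"mkdir -p {shlex.quote(posixpath.dirname(path))} && "
        f"base64 -d > {shlex.quote(path)}"
    )
    # -i keeps stdin attached so the encoded bytes can be piped in.
    argv = ["docker", "exec", "-i", container, "sh", "-c", script]
    return argv, base64.b64encode(payload)
```

A caller could then run something like subprocess.run(argv, input=stdin_bytes, check=True), so the argv size stays constant regardless of artifact size.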

Tests updated:
- test_plant_argv_and_base64_round_trip now asserts the docker -i
  flag is present and the base64 payload reaches stdin (and notably
  is NOT in the script body).
- _FakeProc.communicate accepts input=None across the board so the
  patched fast path no longer trips on the new kwarg.
2026-04-27 13:37:19 -04:00
34c85346a6 feat(deploy): seed canary baseline at deploy time + tests
Hooks decnet.canary.planter.seed_baseline into the deploy() flow's
fleet-mirror step. After upserting a FleetDecky as 'running' we seed
the configured baseline canary set on the freshly-deployed decky.

Persona detection: read d.nmap_os (Windows -> windows path-mapping,
otherwise linux). Failures are logged and surface as state=failed
rows in the UI; the deploy itself MUST NOT abort (resilience
principle in CLAUDE.md).

Tests confirm:
- seed_baseline produces one row per configured generator per decky;
- the deployer source wires seed_baseline inside a try/except so a
  failure can't abort the deploy.
2026-04-27 13:19:08 -04:00
6c4ea706f8 feat(api): canary token CRUD router (/api/v1/canary) + tests
Two sub-routers under /api/v1/canary:

blobs (operator-uploaded artifacts, deduped by sha256):
- POST   /blobs          (multipart upload; admin)
- GET    /blobs          (list with token_count; admin)
- DELETE /blobs/{uuid}   (refcount-aware; 409 when referenced; admin)

tokens (per-decky planted artifacts):
- POST   /tokens                          (generate or instrument + plant; admin)
- GET    /tokens?decky_name=&kind=&state= (filter; viewer)
- GET    /tokens/{uuid}                   (detail; viewer)
- GET    /tokens/{uuid}/preview           (instrumented bytes; admin)
- GET    /tokens/{uuid}/triggers          (paged callback log; viewer)
- DELETE /tokens/{uuid}                   (revoke + bus event; admin)

XOR validation: exactly one of blob_uuid / generator must be set.
Path validation rejects relative/NUL/newlines/.. segments. Every
body-bearing route documents 400 plus 401/403/404 as applicable.
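
The two validation rules above can be sketched as plain functions. The names are assumptions, and the path check here covers POSIX-style paths only; how the real router handles Windows-shaped paths is not stated in the commit message.

```python
from typing import Optional


def exactly_one_source(blob_uuid: Optional[str],
                       generator: Optional[str]) -> bool:
    """XOR rule: exactly one of blob_uuid / generator must be set."""
    return (blob_uuid is None) != (generator is None)


def validate_placement_path(path: str) -> str:
    """Reject relative paths, NUL, newlines, and '..' segments."""
    if not path.startswith("/"):
        raise ValueError("placement_path must be absolute")
    if any(ch in path for ch in ("\x00", "\n", "\r")):
        raise ValueError("placement_path contains a control character")
    if ".." in path.split("/"):
        raise ValueError("placement_path must not contain '..' segments")
    return path
```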

Stdlib MIME sniffer (no python-magic dep) covers PNG/JPEG/GIF/PDF/
HTML/XML/DOCX/XLSX/JSON/YAML/TOML/text/plain; everything else falls
through to passthrough.

Tests run end-to-end through the live FastAPI app (planter docker
exec is patched); 17 cases covering dedup, refcount, lifecycle,
XOR validation, path validation, and 404 paths.
2026-04-27 13:18:00 -04:00
f9513bb7dd feat(cli): register decnet canary subcommand + tests
decnet canary launches the HTTP + DNS callback receiver via
decnet.canary.worker.run. Mirrors the shape of decnet webhook
(typer command with --daemon flag, asyncio.run in the foreground).

Deliberately NOT added to MASTER_ONLY_COMMANDS — every host that
hosts deckies runs its own canary worker, and the bus events stay
local to that host (per-host webhook fanout handles SIEM egress).
2026-04-27 13:13:23 -04:00
fae3e0caa3 feat(canary): worker (HTTP + stdlib DNS callback receivers) + tests
decnet canary worker hosts both callback surfaces in one process:

- HTTP: a tiny FastAPI app on its own port (default 8088). The only
  meaningful route is GET /c/{slug} which looks up the slug, persists
  a CanaryTrigger, publishes canary.<id>.triggered, and returns a 1x1
  transparent GIF. Unknown slugs return the same response (stealth);
  no decnet strings leak in headers/banners; docs/openapi/redoc are
  disabled. X-Forwarded-For is honored.

- DNS: an authoritative UDP server for *.<canary_zone> using
  asyncio.DatagramProtocol with stdlib-only DNS wire-format parsing
  (no dnslib dep). Same lookup -> persist -> publish flow, plus a
  sinkhole A record (192.0.2.1) so the attacker's resolver doesn't
  loop on NXDOMAIN. Single-label slugs only; multi-label probes
  return NXDOMAIN. Pointer loops in malformed queries are caught
  (10-hop cap) so an adversarial packet can't wedge the parser.
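
The hop cap on compression pointers can be illustrated with a small name reader. This is a sketch of the technique under the DNS wire format, not the worker's parser; the function name and error messages are assumptions.

```python
from typing import List, Tuple


def read_name(packet: bytes, offset: int, max_hops: int = 10) -> Tuple[str, int]:
    """Return (dotted name, offset just past the name in the original stream)."""
    labels: List[str] = []
    hops = 0
    end = None  # set when the first pointer is followed
    while True:
        if offset >= len(packet):
            raise ValueError("truncated name")
        length = packet[offset]
        if length & 0xC0 == 0xC0:  # compression pointer (two bytes)
            if offset + 1 >= len(packet):
                raise ValueError("truncated pointer")
            hops += 1
            if hops > max_hops:
                raise ValueError("too many compression pointers")
            if end is None:
                end = offset + 2
            offset = ((length & 0x3F) << 8) | packet[offset + 1]
            continue
        if length & 0xC0:
            raise ValueError("reserved label type")
        offset += 1
        if length == 0:  # root label terminates the name
            break
        if offset + length > len(packet):
            raise ValueError("truncated label")
        labels.append(packet[offset:offset + length].decode("ascii", "replace"))
        offset += length
    return ".".join(labels), (end if end is not None else offset)
```

A pointer that targets itself would otherwise loop forever; with the cap it raises ValueError after ten hops.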

Tests cover both surfaces without privileged sockets:
- HTTP via Starlette TestClient: known/unknown slug, headers, XFF,
  stealth-string assertions.
- DNS via direct DatagramProtocol drive: known slug -> ANSWER,
  unknown -> NXDOMAIN, pointer-loop -> ValueError, malformed
  packet -> silent drop.
2026-04-27 13:12:05 -04:00
8fb9bc5545 feat(canary): planter (docker exec injector) + tests
Plant / revoke / seed_baseline using the same docker-exec-with-sh-c
pattern proven by decnet/orchestrator/drivers/ssh.py:_run_file.

Each plant call composes a single sh script:
  mkdir -p <dirname> && printf %s <base64> | base64 -d > <path> &&
  chmod <mode> <path> && touch -d @<mtime> <path>

Base64-on-the-host / decode-in-the-container keeps binary artifacts
(DOCX/PDF/PNG) safe across the argv boundary; the placement_path,
mode, and mtime are shlex-quoted.

State transitions hit the repo: planted -> failed on docker error
with stderr captured into last_error. Bus events fire on success
(canary.<id>.placed) and on revoke (canary.<id>.revoked) — wrapped
in try/except so a downed bus never blocks a placement.

seed_baseline(decky_name, repo) is the deploy-hook entry point —
reads DECNET_CANARY_BASELINE (default git_config,env_file,honeydoc,
aws_creds), persists one row per generator, plants each. Failed
placements are logged but do NOT abort; the deployer hook treats
the return list as informational.
2026-04-27 13:08:18 -04:00
19ceff4417 feat(canary): operator-upload instrumenters + tests
Seven instrumenters that mutate operator-supplied artifacts to
embed the callback URL:

- passthrough — bytes unchanged; only DNS-callback tokens trip
  detection, with the slug embedded in the placement path
- plain      — substitutes {{CANARY_URL}}/{{CANARY_HOST}} placeholders;
  falls back to appending a comment line whose prefix adapts to the
  apparent file syntax (#, //, ;)
- html       — injects a 1x1 tracking pixel before </body>, appends
  if the close tag is missing
- docx       — direct zipfile manipulation (no python-docx dep):
  inserts an external-image Relationship into word/_rels/document.xml.rels
  and a matching <w:drawing> element before </w:body>
- xlsx       — sibling of docx; injects an external-image relationship
  into xl/_rels/workbook.xml.rels (orphan rels are still fetched on
  open by most viewers)
- pdf        — uses pikepdf to install /OpenAction /URI on the catalog;
  rejects with a clear message when pikepdf isn't installed
- image      — uses Pillow to embed slug + URL in PNG tEXt / JPEG
  comment; rejects with a clear message when Pillow isn't installed
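
As an illustration of the html instrumenter's rule (pixel before the closing body tag, appended when the tag is missing), here is a small sketch. The function name and the exact img attributes are assumptions.

```python
import re
from html import escape

_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_pixel(doc: str, callback_url: str) -> str:
    """Insert a 1x1 tracking pixel before the last </body>, else append it."""
    pixel = (f'<img src="{escape(callback_url, quote=True)}" '
             'width="1" height="1" alt="">')
    matches = list(_BODY_CLOSE.finditer(doc))
    if not matches:
        return doc + pixel
    cut = matches[-1].start()
    return doc[:cut] + pixel + doc[cut:]
```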

DOCX and XLSX share the rId allocator + relationship injector via
the docx module; both work on stdlib zipfile only.

Tests synthesise minimal real DOCX/XLSX fixtures inline, round-trip
each instrumenter, and assert the callback URL ends up in the
mutated bytes while the file still parses.
2026-04-27 13:03:42 -04:00
c7658ea65e feat(canary): synthesised-artifact generators + tests
Five built-in generators that produce deterministic fake artifacts
keyed by the token slug:

- aws_creds  — passive [default]/[prod] credentials block, no
               callback wiring (AWS-key tokens require an external
               trap, which is post-v1)
- git_config — .git/config with origin url = http_base/c/<slug>/repo.git
- env_file   — .env with API_BASE_URL + WEBHOOK_NOTIFY_URL embedding
               the callback URL plus inert realism filler
- ssh_key    — PEM-shaped fake private key whose host comment carries
               <slug>.<dns_zone> when DNS is deployed, else the
               http_base host
- honeydoc   — minimal HTML report with a 1x1 tracking-pixel <img>
               whose src is the callback URL; fallback for the
               deploy-time baseline before the operator uploads a
               real DOCX/PDF

Tests assert byte-stability (same ctx -> same bytes), slug presence
in the embedded fields, that aws_creds is intentionally URL-free,
and that every artifact carries operator-facing notes for the
preview endpoint.
2026-04-27 12:59:19 -04:00
8f19adecfe feat(canary): package scaffolding (base/factory/paths/storage) + tests
Mirrors the decnet.intel layout (base + factory + lazy concrete
imports). Defines:

- CanaryArtifact / CanaryContext dataclasses + the generator and
  instrumenter ABCs they share
- factory dispatch for generators (git_config/env_file/ssh_key/
  aws_creds/honeydoc) and instrumenters (docx/xlsx/pdf/html/image/
  plain/passthrough), plus pick_instrumenter_for_mime() for MIME-driven
  dispatch on operator uploads
- persona-aware default placement paths (Linux vs. Windows-shaped)
  and absolute-path validation that the API will use to validate
  operator-supplied placement_path values
- on-disk blob store: sha256-keyed two-level fan-out, idempotent
  writes, refcount-aware unlink (the DB row is the source of truth)
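
One way a sha256-keyed two-level fan-out with idempotent writes could look is sketched below. The directory layout (first two hex characters, then the next two) and the helper names are assumptions; the commit message does not spell out the exact layout.

```python
import hashlib
from pathlib import Path


def blob_path(root: Path, data: bytes) -> Path:
    """Map content to root/<aa>/<bb>/<full sha256 hex> (assumed layout)."""
    digest = hashlib.sha256(data).hexdigest()
    return root / digest[:2] / digest[2:4] / digest


def store_blob(root: Path, data: bytes) -> Path:
    """Idempotent write: storing the same bytes twice is a no-op."""
    path = blob_path(root, data)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)  # rename into place once fully written
    return path
```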

Also covers prior commits' tests (bus topics, models, repo CRUD)
under tests/canary/. 79 tests, all pass.
2026-04-27 12:56:01 -04:00
6a0d140e91 feat(db): canary token repository CRUD
Adds the abstract surface on BaseRepository and the SQLModel-backed
implementation (shared by SQLite and MySQL) for:

- canary blobs (upsert-by-sha256, list-with-refcount, refcount-aware delete)
- canary tokens (create, slug lookup, list with filters, state update)
- canary triggers (record+bump-counters atomically, list, attribute)

The triggers path is a single session that inserts the row and bumps the
parent token's counters together, so a subscriber that reads the token
right after the bus event sees the updated count. Blob delete refuses
while any token (including revoked) still references the blob; pre-v1
revoked tokens stick around for forensic value.
2026-04-27 12:48:24 -04:00
813f14bf2a feat(db): canary token tables (blob/token/trigger)
Three new tables for the canary tokens feature:

- canary_blobs       — operator-uploaded source artifacts, deduped by sha256
- canary_tokens      — one planted artifact in one decky; carries the
                       callback slug, generator/instrumenter, and lifecycle
- canary_triggers    — append-only log of every callback hit; attacker_id
                       back-filled by the correlator

Pydantic request/response shapes live in the same file per the
single-source-of-truth convention. No migrations file — pre-v1
SQLModel.metadata.create_all() covers it.
2026-04-27 12:45:41 -04:00
914c911984 feat(bus): canary token bus topics (placed/triggered/revoked)
Reserved topic family for the upcoming canary-tokens feature so the
correlator and webhook fanout can subscribe to canary.> from day one.
No producers yet; planter, decnet canary worker, and API will publish
in subsequent commits.
2026-04-27 12:43:23 -04:00
828165783e feat(templates): standalone NTLMSSP Type 3 parser + decnet-init wrapper
* decnet/templates/{rdp,smb}/ntlmssp.py — minimal Type 3 (Authenticate)
  parser shared between the SMB and RDP-NLA templates.  Lands NTLM
  creds in the universal Credential table with secret_kind=ntlmssp_v1
  / ntlmssp_v2 and secret_b64 = base64 of the NtChallengeResponse so
  the bounty pipeline can feed the right hashcat mode.
* scripts/decnet-init.sh — convenience wrapper around `sudo decnet init
  --force` that targets the current working directory; saves operators
  retyping the install paths during dev iterations.
2026-04-27 10:12:30 -04:00
f046634d6e feat(web): Persona Generation page under AUTOMATION
New dashboard surface for editing the global emailgen persona pool —
the JSON file fleet (MACVLAN/IPVLAN) and SWARM-shard mail deckies pull
from.  MazeNET topology personas are out of scope here; they're
configured per-topology in the topology editor.

Backend:
* GET/PUT /api/v1/emailgen/personas — admin-write, viewer-read.  PUT
  validates with the same Pydantic schema the worker uses
  (parse_personas), drops invalid entries with a warning, returns 400
  only when the entire payload fails.  Path is operator-discoverable
  on every response so a CLI-driven backup workflow stays visible.

Frontend:
* PersonaGeneration.tsx + .css — table + add/edit modal with the full
  EmailPersona schema (name, email, role, tone, mannerisms list,
  language, signature, active hours, reply latency, uses_llms_heavily).
  Local edits are batched; explicit "SAVE CHANGES" writes back, with a
  dirty-indicator pill and a "DISCARD" reset.  Email uniqueness is
  enforced client-side so the scheduler never picks the same persona
  as both sender + recipient.
* Sidebar AUTOMATION group gains a "Persona Generation" entry next to
  Orchestrator; route registered at /persona-generation.

The worker reads the same on-disk file the API writes — see
decnet.orchestrator.emailgen.global_pool.  The API resets the
in-process cache on every read/write so the worker picks up dashboard
edits within its next tick rather than waiting on mtime.
2026-04-27 09:55:42 -04:00
818aebadfc feat(web): emailgen events in Orchestrator page
The SSE pipe at /orchestrator/events/stream was already streaming
'orchestrator.email.{decky_uuid}' events (the subscription is for the
'orchestrator.>' wildcard), but the consumer side dropped them on the
floor.  Three fixes to close the loop:

* useOrchestratorStream.ts now registers an 'email' SSE listener — the
  EventSource silently ignores frames whose event name has no listener,
  so missing this entry meant every email frame was dropped before
  reaching the page's onEvent handler.

* /api/v1/orchestrator/events accepts kind=email and dispatches to
  list_orchestrator_emails, adapting rows to the existing wire shape:
  subject -> action, sender_email -> src_decky_uuid, recipient_email
  -> dst_decky_uuid, plus email-specific extras (thread_id, language,
  mail_decky_uuid, message_id, in_reply_to) ride along as top-level
  keys.

* Orchestrator.tsx gains an 'email' tab in the kind filter and a
  branch in the row renderer / inspector that:
   - shows full sender / recipient (no UUID truncation),
   - chips the language code next to the subject,
   - relabels ACTION as SUBJECT in the inspector and surfaces
     thread / in-reply-to / mail-decky details.

The 'all' tab continues to show traffic+file only (today's behavior);
operators see emails by switching to the email tab.  A union view at
the API layer is the obvious follow-up but not necessary for now.
2026-04-26 22:56:48 -04:00
73692b52f0 feat(emailgen): gate as master-only
Two-layer gating per CLAUDE.md:
- registration-time: emailgen added to MASTER_ONLY_GROUPS so agents
  don't see the sub-app in 'decnet --help' at all.
- body-guard: _require_master_mode('emailgen ...') at the top of every
  sub-command body so a direct callable import (third-party tooling)
  still bails on agent hosts.

Matches the convention used for 'swarm', 'topology', 'geoip'.  SWARM
agents push their generated mail through the master's emailgen worker
(or none at all); cross-agent emailgen federation stays out of scope.
2026-04-26 22:45:59 -04:00
6d520eaa6f refactor(emailgen): pluggable LLM backend (base/factory/impl)
Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:

  decnet/orchestrator/emailgen/llm/
    base.py        — LLMBackend Protocol + LLMResult + LLMTimeout
    factory.py     — get_llm() reads DECNET_EMAILGEN_LLM
    impl/ollama.py — current 'ollama run' subprocess path
    impl/fake.py   — canned-output backend used by tests

Driver now takes an LLMBackend on construction (or inherits the
factory default).  Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster.  Swapping
Ollama for the Anthropic API / vLLM / llama.cpp is now a third branch
in factory.py; no driver rewrite needed.

Mirrors the convention used by decnet.web.db.factory + decnet.bus.factory
per the provider-subpackages-from-day-one rule in memory.
2026-04-26 22:43:36 -04:00
4badc75fb2 feat(emailgen): global persona pool + Date-stamped EML mtimes
Two changes that unwind earlier MazeNET-only assumptions and fix a
realism tell:

1. Persona resolution is now per-decky-source, not topology-only.  The
   scheduler walks the union view (list_running_deckies, including
   fleet MACVLAN/IPVLAN + SWARM shards) and picks the right persona
   list for each source:
     * topology decky -> Topology.email_personas (per-topology richness
       preserved)
     * fleet / shard  -> a single host-wide pool loaded from disk
       (DECNET_EMAILGEN_PERSONAS, /etc/decnet/email_personas.json, or
       ~/.decnet/email_personas.json)
   Operators install the global pool via 'decnet emailgen
   import-personas <file>' which validates with the same Pydantic
   schema the worker uses.

2. The driver now runs 'touch -d <Date>' inside the docker exec right
   after the EML write so file mtime matches the email's RFC 2822
   Date: header.  Without this an attacker 'ls -lt'ing the spool sees
   every email clustered inside the worker's tick window — the
   cluster itself was a stylometric tell.
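
Deriving the touch argument from the Date: header could be done roughly like this. The helper name is an assumption, and the sketch uses the @epoch form of touch -d rather than passing the header text through; the driver's actual invocation may differ.

```python
import shlex
from email.utils import parsedate_to_datetime


def touch_command_for(date_header: str, path: str) -> str:
    """Build a touch command that stamps `path` with the email's Date."""
    epoch = int(parsedate_to_datetime(date_header).timestamp())
    return f"touch -d @{epoch} {shlex.quote(path)}"
```

The header's UTC offset is honoured, so the resulting mtime is the same instant the email claims to have been sent.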

CLI now exposes 'decnet emailgen' as a sub-app with 'run' (default,
backwards-compatible with bare 'decnet emailgen') and 'import-personas'.
list_running_deckies carries topology_id through so consumers can resolve
the parent topology without a second round-trip.
2026-04-26 22:39:16 -04:00
2979997442 feat(templates): IMAP/POP3 servers read EML spool from emailgen
When IMAP_EMAIL_SEED / POP3_EMAIL_SEED points at a directory of .eml
files (the orchestrator emailgen worker's drop path,
/var/spool/decnet-emails/ by convention), the bait mailbox is replaced
with those LLM-generated, persona-driven, threaded messages.  Empty /
missing dir keeps the hardcoded fallback so a fresh deployment is never
silent.  Cached with mtime invalidation + a short TTL so a hot mailbox
doesn't pay the parse cost on every IMAP/POP3 command.

Replaces the DEBT-026 stub on both templates that named the env var but
never wired it through.
2026-04-26 22:21:01 -04:00
3ee55ec341 feat(emailgen): Ollama-driven fake email worker for IMAP/POP3 deckies
Second orchestrator worker (decnet emailgen) that drips persona-driven,
threaded, multi-language fake emails into running mail deckies.  Personas
live on Topology.email_personas; topology-wide language_default falls
through to any persona that doesn't pin its own.  Em-dashes are
suppressed at the prompt layer by default and only lifted for personas
explicitly marked uses_llms_heavily — em-dashes are an LLM tell and a
flat corpus of em-dashed mail is a giveaway.

EML delivery writes into /var/spool/decnet-emails/<thread>/<msg>.eml on
the mail decky via docker exec; wiring the IMAP/POP3 templates to read
from that spool (replacing the hardcoded _BAIT_EMAILS) is the next step.
2026-04-26 22:16:19 -04:00
9650366d34 fix(orchestrator): drop topology_deckies FK on event src/dst columns
Once the orchestrator started seeing fleet + SWARM shard sources via
list_running_deckies (a844148), every event row landing on a fleet decky
broke the FK to topology_deckies — the column now carries opaque ids
("local:omega-decky" for fleet, "host_uuid:decky_name" for shards) that
will never match topology_deckies.uuid.

Symptom on the operator's mothership:
  IntegrityError 1452 — orchestrator_events_ibfk_2 FK violated on every
  tick once the reconciler populated fleet_deckies.

Index on dst_decky_uuid is preserved (the dashboard reads
"events for this decky" frequently); only the FK is removed.  Keeps
data integrity loose by design — events are append-only history that
should outlive the deckies they reference.

Existing MySQL deployments need the FK dropped manually:
  ALTER TABLE orchestrator_events
    DROP FOREIGN KEY orchestrator_events_ibfk_2,
    DROP FOREIGN KEY orchestrator_events_ibfk_1;

SQLite users are unaffected — SQLite doesn't enforce FKs by default.
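
The SQLite behaviour can be reproduced with the standard library. This sketch uses a cut-down two-table schema (an assumption, not the real models) and sets the foreign_keys pragma explicitly so the result does not depend on how the local SQLite was compiled.

```python
import sqlite3


def orphan_insert_succeeds(enforce: bool) -> bool:
    """Insert an event whose dst id has no matching topology row."""
    conn = sqlite3.connect(":memory:")
    try:
        # SQLite only checks foreign keys when this pragma is ON.
        conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
        conn.execute("CREATE TABLE topology_deckies (uuid TEXT PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE orchestrator_events ("
            " id INTEGER PRIMARY KEY,"
            " dst_decky_uuid TEXT REFERENCES topology_deckies(uuid))"
        )
        try:
            conn.execute(
                "INSERT INTO orchestrator_events (dst_decky_uuid) VALUES (?)",
                ("local:omega-decky",),  # opaque fleet id, never a topology uuid
            )
            return True
        except sqlite3.IntegrityError:
            return False
    finally:
        conn.close()
```

With enforcement off the opaque id is accepted; with it on, the insert fails in the same way the MySQL constraint did.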
2026-04-26 21:40:06 -04:00
c3518e3159 feat(workers): surface clusterer, campaign-clusterer, reconciler in panel
The Workers panel (Config → Workers tab) hardcodes its row list in
KNOWN_WORKERS — by design, so a rogue publisher can't inject UI rows.
Three heartbeat-emitting workers were missing:

  * clusterer            — behavioral clustering (decnet/clustering/)
  * campaign-clusterer   — campaign assembly  (decnet/clustering/campaign/)
  * reconciler           — host-local fleet convergence (added in 430262e)

Each already publishes on system.<name>.health via run_health_heartbeat,
so they show up live the moment they're added to the registry — no
frontend or subscriber wiring needed (Config.tsx renders whatever
/workers returns).

Also added to _PREFERRED_ORDER in start-all so START ALL WORKERS brings
them up in dependency-friendly order: data-plane → reconciler → intel
→ clustering → output → orchestrator.

Three deployable units (listener, web, swarmctl) intentionally remain
absent from KNOWN_WORKERS — they don't emit heartbeats (CLI / static
server / one-shot tooling), so they'd permanently render as UNKNOWN
and confuse operators.  Adding them is a separate decision that needs
a "synthesize installed-but-silent rows" pass on the registry.
2026-04-26 21:31:34 -04:00
430262e01a feat(fleet): systemd unit + bus signal for fleet reconciler
Two pieces, one PR because they share a deployment surface:

1. systemd. decnet-reconciler.service.j2 mirrors the orchestrator unit
   shape (docker group, hardened sandbox, append-logs).  Read-only
   /var/lib/decnet so it can read decnet-state.json without write
   access.  Auto-discovered by `decnet init` via the existing
   decnet-*.service.j2 glob — no init.py change needed.  Added to
   decnet.target so `systemctl start decnet.target` brings it up
   alongside collector / sniffer / mutator / etc.  Also added to the
   agent reaper script so self-destruct cleans it up on workers.

2. Bus signal. reconcile_once now publishes
   `decky.<host_uuid:name>.state` on every insert / delete /
   state-changed transition.  Reuses the existing DECKY_STATE topic
   family (no bus/topics.py change → no wiki update needed per the
   bus-signals doc rule).  Composite host_uuid:name segment keeps
   fleet rows distinguishable from MazeNET TopologyDecky rows whose
   ids are bare UUIDs.  Quiet ticks publish nothing — convergence
   means silence.

Bus is plumbed through the worker, defaults to None for unit-test
callers.  publish_safely keeps the source-of-truth contract: DB write
is authoritative, the publish is best-effort notification.

Captures previous_state into a local before update_fleet_decky_state
runs — a fake repo that mutates rows in-place would otherwise see the
post-update state and report previous == current.  Real repos don't
have this concern but the fix is cheap and makes the function less
order-dependent.
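
The ordering issue can be shown with a small fake. Row, InPlaceFakeRepo, and the two transition functions are hypothetical names for illustration and are not the reconciler's real code.

```python
from dataclasses import dataclass


@dataclass
class Row:
    name: str
    state: str


class InPlaceFakeRepo:
    """Test double that hands out and mutates the same Row objects."""

    def __init__(self, rows):
        self.rows = {r.name: r for r in rows}

    def get(self, name):
        return self.rows[name]

    def update_state(self, name, state):
        self.rows[name].state = state  # mutates the object the caller holds


def transition_buggy(repo, name, new_state):
    row = repo.get(name)
    repo.update_state(name, new_state)
    return row.state, new_state        # row.state is already the new state


def transition_fixed(repo, name, new_state):
    row = repo.get(name)
    previous_state = row.state         # capture before the update runs
    repo.update_state(name, new_state)
    return previous_state, new_state
```

Against this fake, the buggy variant reports ("failed", "failed") for a running row that fails, while the fixed variant reports ("running", "failed").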
2026-04-26 21:21:36 -04:00
a8441481b5 fix(orchestrator): see fleet + shard deckies, not just topology rows
Switches _one_tick from list_running_topology_deckies to
list_running_deckies (the union view added in 095500a). Resolves the
permanent "no actionable deckies (running+ssh count=0)" log on hosts
running only unihost MACVLAN / IPVLAN decoys — the orchestrator now
sees fleet_deckies rows alongside MazeNET topology rows and SWARM
DeckyShard rows.

Also fixes the misleading log message: the old "running+ssh count=N"
reported the *pre-filter* total (count of all running deckies, not
the SSH-eligible subset that scheduler.pick actually evaluates). New
line breaks down running, ssh_eligible, and per-source counts so
debugging "why isn't it picking?" no longer requires reading
scheduler internals.

Regression test: orchestrator integration suite now seeds fleet_deckies
rows (not just topology_deckies) and verifies a tick picks them and
records an event with dst="local:fleet-*" — proving the original bug
on the operator's mothership is fixed.
2026-04-26 21:16:22 -04:00
f775223a83 feat(fleet): reconciler converges JSON ↔ DB ↔ docker
Adds decnet.fleet.reconciler — a pure async function plus a long-lived
worker — that periodically reconciles the three sources of truth on a
DECNET host:

  1. decnet-state.json (CLI-canonical fleet record)
  2. fleet_deckies table (DB mirror, written by engine.deployer)
  3. docker inspect (actual per-container runtime state)

Drift handling:
  * JSON has X, DB doesn't       → INSERT (deploy ran with DB offline)
  * DB has X (this host), JSON doesn't → DELETE (teardown ran with DB offline)
  * Both have X, docker disagrees → flip state to running/failed/degraded
  * Docker socket unreachable    → leave existing state alone (don't
                                    torch every row to torn_down)

Cross-host safety: deletions are scoped to host_uuid for the local host;
a master that runs both a local fleet and swarm workers will never
clobber a peer's slice.
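
The drift rules and the host_uuid scoping can be sketched as a pure planning function. The name plan_reconcile, the argument shapes, and the return tuple are assumptions for illustration; the real reconcile_once works against the repository and docker directly.

```python
from typing import Dict, List, Optional, Set, Tuple


def plan_reconcile(
    json_names: Set[str],
    db_rows: Dict[Tuple[str, str], str],      # (host_uuid, name) -> state
    docker_states: Optional[Dict[str, str]],  # None means socket unreachable
    host_uuid: str = "local",
) -> Tuple[List[str], List[str], Dict[str, str]]:
    # Only rows owned by this host are candidates for any change.
    local = {name for (host, name) in db_rows if host == host_uuid}

    inserts = sorted(json_names - local)   # JSON has it, DB does not
    deletes = sorted(local - json_names)   # DB has it, JSON does not

    updates: Dict[str, str] = {}
    if docker_states is not None:          # unreachable docker: touch nothing
        for name in sorted(json_names & local):
            observed = docker_states.get(name)
            if observed is not None and observed != db_rows[(host_uuid, name)]:
                updates[name] = observed
    return inserts, deletes, updates
```

Because deletions are computed from the local slice only, a row keyed by a peer's host_uuid never appears in the delete list.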

CLI:
  decnet reconcile --once            # one-shot, prints counts
  decnet reconcile [--interval N]    # long-lived worker, mirrors
                                     # orchestrator's lifecycle (control
                                     # listener + heartbeat + tick loop)

Promotes decnet/fleet.py → decnet/fleet/ package so the reconciler can
live alongside it without name collision (build_deckies_from_ini and
all_service_names re-exported unchanged via __init__.py).

14 new tests cover state aggregation rules, all four drift directions,
host_uuid scoping, docker-unreachable safety, and worker shutdown via
the bus control event.
2026-04-26 21:14:48 -04:00
8814902999 docs(api): clarify fleet_deckies + JSON dual-write happens in engine.deployer
The unihost API path delegates to engine.deployer.deploy(), which now
writes both decnet-state.json (existing) and the fleet_deckies DB
table (added in 646aeec).  Comment makes the single-sink design
explicit so future maintainers don't add a parallel save_state /
upsert_fleet_decky call here.

No behavioral change — every fleet-creation path on every host (CLI
deploy, this unihost API path, and per-worker SWARM agent deploys)
already routes through the engine.deployer single sink.
2026-04-26 21:08:44 -04:00
646aeeca40 feat(deployer): mirror fleet deploy/teardown into fleet_deckies table
CLI deploy now writes both surfaces: decnet-state.json (existing,
canonical for offline / no-API hosts) and the new fleet_deckies DB
table (visible to orchestrator, web dashboard, REST API).

Best-effort: a DB outage logs a warning and returns. The JSON file
remains the source of truth for `decnet status`, `decnet teardown`,
sniffer, and collector — operators on a CLI-only host keep working.

_run_async helper bridges sync deploy() into the async repository.
Always uses a fresh thread because the API handler at
web.router.fleet.api_deploy_deckies invokes deploy() from inside a
FastAPI event loop, where a direct asyncio.run would raise RuntimeError.
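
A minimal sketch of the fresh-thread bridge, assuming only the standard library. The function name run_async is a stand-in for the _run_async helper, whose real body is not shown in this log.

```python
import asyncio
import threading


def run_async(coro):
    """Run a coroutine to completion from sync code, even under a running loop."""
    outcome = {}

    def runner():
        try:
            # The new thread has no running loop, so asyncio.run is legal here.
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    thread.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

Note that join() blocks the caller, so when this runs inside an event loop that loop is paused until the coroutine finishes. That is acceptable for a short best-effort DB write but would be a poor fit for long-running work.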

Verified end-to-end against MySQL: deploy mirror inserts rows, union
view (list_running_deckies) returns them with source="fleet",
teardown mirror removes them. Works from both sync (CLI) and async
(API handler) call sites.
2026-04-26 21:05:50 -04:00
095500ae9a feat(db): FleetDecky table mirrors decnet-state.json into the DB
Adds a fleet_deckies table so DB-only consumers (orchestrator, web
dashboard, REST API) can see unihost / MACVLAN / IPVLAN deckies
without reading the JSON state file. Mirrors DeckyShard field-for-field.

Composite PK (host_uuid, name) future-proofs for a mothership that
runs both a local fleet and acts as a swarm master. host_uuid defaults
to the "local" sentinel — no FK to swarm_hosts because the local
mothership isn't enrolled as a worker.

Repo additions: upsert_fleet_decky, delete_fleet_decky,
list_fleet_deckies, list_running_fleet_deckies,
update_fleet_decky_state, plus list_running_deckies which unions
topology + fleet + shard sources for the orchestrator.

Smoke-tested round-trip against MySQL: upsert, list_running, union
view (source="fleet"), delete.
2026-04-26 21:00:01 -04:00
c595d039bd feat(sniffer): ISN sequence classifier (reuses seq_class helper)
Mirrors the IP-ID classifier for TCP ISN values: per-source-IP rolling
deque (maxlen=8) populated from each inbound SYN's tcp.seq, classified
on every emission. A 'random' verdict is the modern norm; 'incremental',
'zero', or 'constant' indicates legacy stacks or hand-rolled raw-socket
tooling — a strong fingerprint signal.

Active prober now also captures server_isn (single sample, not classified
in-flight; downstream consumers correlating multi-probe results can apply
seq_class.classify_sequence themselves).

Profiler rollup carries the latest non-'unknown' label into
attacker.tcp_fingerprint. Dedup key already covers isn_class from
the previous commit, so transitions emit cleanly.

UI surfaces ISN class as a colour-coded tag with a ⚠ glyph for
non-random verdicts, since they're the genuinely interesting case.
2026-04-26 20:30:24 -04:00
0e40cc8ae1 feat(sniffer): IP-ID sequence classifier (random/incremental/zero/constant)
Adds a per-source-IP rolling sample buffer (deque, maxlen=8) for IP-ID
values seen on attacker SYNs and a stdlib-only classifier in
decnet/sniffer/seq_class.py. Each new SYN appends ip.id and re-classifies
the buffer; the result is logged on tcp_syn_fingerprint events alongside
sample count.
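
A stdlib-only sketch of how such a classifier could look. The four labels and the maxlen=8 buffer come from this commit; the min_samples and max_step thresholds and the exact decision order are assumptions, not the contents of seq_class.py.

```python
from collections import deque
from typing import Iterable


def classify_sequence(samples: Iterable[int], min_samples: int = 3,
                      max_step: int = 1024, modulus: int = 1 << 16) -> str:
    values = list(samples)
    if len(values) < min_samples:
        return "unknown"              # not enough evidence yet
    if all(v == 0 for v in values):
        return "zero"
    if len(set(values)) == 1:
        return "constant"
    # Modular deltas so a counter wrapping 65535 -> 0 still looks incremental.
    deltas = [(b - a) % modulus for a, b in zip(values, values[1:])]
    if all(0 < d <= max_step for d in deltas):
        return "incremental"
    return "random"


# Per-source rolling buffer: only the last 8 samples are ever classified.
buf = deque(maxlen=8)
for ip_id in range(100, 112):
    buf.append(ip_id)
label = classify_sequence(buf)
```

The same function would cover 32-bit sequence numbers by passing modulus=1 << 32 and a wider max_step.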

The dedup key now folds in ipid_class so a transition from 'unknown' to
a definitive verdict emits exactly one fresh event instead of being
suppressed by the old (os|options) key. Profiler rollup carries the
latest non-'unknown' label into attacker.tcp_fingerprint.

UI surfaces it as a colour-coded tag in the TCP STACK panel: random
neutral, incremental amber, zero/constant green (the strong signal).
2026-04-26 20:28:32 -04:00
b0b08754d0 feat(fingerprint): ToS/DSCP/ECN extraction in active + passive TCP fingerprint
Active prober now reads ip.tos from the SYN-ACK and emits tos/dscp/ecn
alongside the existing TTL/window/options fields. dscp is folded into the
fingerprint hash so different DSCP markings produce distinct signatures.

Passive sniffer logs the same three fields on tcp_syn_fingerprint events;
profiler rollup carries them into the attacker tcp_fingerprint snapshot;
AttackerDetail's TCP STACK panel now surfaces DSCP and ECN cells.
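
For reference, the split of the ToS byte follows the standard layout: DSCP is the upper six bits and ECN the lower two. The helper name split_tos is hypothetical.

```python
def split_tos(tos: int) -> dict:
    """Split the IPv4 ToS / traffic-class byte into DSCP and ECN."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    return {
        "tos": tos,
        "dscp": tos >> 2,     # upper 6 bits
        "ecn": tos & 0x03,    # lower 2 bits
    }
```

For example, a ToS of 0xB8 yields DSCP 46 (Expedited Forwarding) with ECN 0.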
2026-04-26 20:25:37 -04:00
3de19eb102 feat(orchestrator): periodic prune of orchestrator_events
Every 100 ticks, trim per-dst_decky_uuid history down to 10000 rows
(oldest first). Keeps the events table bounded on long-running fleets
without paying the cost on every write.
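
An in-memory sketch of the pruning policy, assuming events are dicts with ts and dst_decky_uuid keys. The real prune runs in the repository against the events table; whether tick 0 counts as a prune tick is also an assumption here.

```python
from collections import defaultdict


def should_prune(tick: int, every: int = 100) -> bool:
    """Prune on every Nth tick, skipping the very first one (assumed)."""
    return tick > 0 and tick % every == 0


def prune_events(events, per_dst_cap: int = 10000):
    """Keep only the newest per_dst_cap events for each destination."""
    by_dst = defaultdict(list)
    for event in events:
        by_dst[event["dst_decky_uuid"]].append(event)

    kept = []
    for rows in by_dst.values():
        rows.sort(key=lambda e: e["ts"])
        # Oldest rows fall off the front; max() keeps a cap of 0 well-defined.
        kept.extend(rows[max(0, len(rows) - per_dst_cap):])
    kept.sort(key=lambda e: e["ts"])
    return kept
```

Gating on the tick counter is what keeps the trimming cost off the write path: inserts stay cheap and the table is only scanned once per hundred ticks.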
2026-04-26 19:58:43 -04:00
5b5ff54fa2 feat(web): orchestrator events read API + SSE stream
GET /api/v1/orchestrator/events — paginated list with optional
kind=traffic|file filter. GET /api/v1/orchestrator/events/stream —
SSE: snapshot on connect, live forward of orchestrator.> bus events
mapped to 'traffic' / 'file' SSE event names.

Repo gains list_orchestrator_events(limit, offset, kind?, since_ts?),
count_orchestrator_events(kind?), and prune_orchestrator_events
(per_dst_cap=10000) for periodic worker-side trimming.
2026-04-26 19:58:12 -04:00
900c0c3ef5 refactor(bus): rename ORCHESTRATOR_ACTIVITY → ORCHESTRATOR_TRAFFIC
Aligns the bus token with the DB column value; OrchestratorEvent.kind
is 'traffic'/'file' but the topic was 'activity'/'file'. The asymmetry
made consumer code (UI filter, SSE event names) need a translation
layer. No external subscribers existed yet.
2026-04-26 19:53:40 -04:00
4c37ece39e feat(orchestrator): MVP synthetic life-injection worker (SSH only)
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.

MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
  to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
  system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
  to TCP-connect dst:22 and read the SSH banner — real on-the-wire
  SSH-protocol traffic without shipping creds. File action drops or
  refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
  are running). Diurnal shaping, role-aware pairing, and session-aware
  backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
  worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
  end-to-end one-tick integration with FakeBus + SQLite.
2026-04-26 19:43:20 -04:00
d531cea536 feat(web): read-only campaigns API + SSE + frontend
API: /api/v1/campaigns (paginated list), /api/v1/campaigns/{uuid}
(soft-merge chain follow), /api/v1/campaigns/{uuid}/identities
(member identities), and /api/v1/campaigns/events (SSE under
campaign.> + JWT-via-?token=, snapshot-on-connect). Mirror of the
identity router; same auth, same shape, same OpenAPI tags pattern.

Frontend: CampaignDetail.tsx page (same visual vocabulary as
IdentityDetail), useCampaignStream hook (mirror of
useIdentityStream), /campaigns/:id route, IdentityDetail's
CAMPAIGN badge becomes clickable and navigates to the campaign.
useIdentityStream now listens for identity.campaign.assigned so
the badge appears live without a manual refresh.
2026-04-26 09:20:17 -04:00
75af00c9c8 test(clustering): full-bound passes through production campaign clusterer
Runs the chained identity + campaign clustering pipeline against all
seven fixtures via from_synthetic / from_synthetic_identity adapters
and ratchets every YAML floor to 1.0: the production clusterer and
the reference clusterers used in the per-fixture tests all score
perfectly across ARI / homogeneity / completeness / singleton_recall
on each fixture.

Three substrate fixes surfaced by the ratchet:

- Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved
  into cohort_weight to prevent fleet-scarcity false-merges (F1's
  shared_wordlist failure mode). Tier weight raised to 1.0 so
  shared payload+C2 alone crosses threshold (F5's intended pass).
- Adapter: from_synthetic_identity now reads SyntheticSession
  started_at + duration_s for session_windows and per-decky
  timestamps (the production-row adapter still uses start_ts/end_ts
  when available).
- Fixture data: paused_campaign.yaml's JA3 collided exactly with
  vpn_hopping.yaml's (same TLS extension list). The collision
  fused two unrelated campaigns under the chained identity layer
  in the noise_floor composite. Made paused's JA3 distinct.
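
The Jaccard comparison named in the Tuning item is the usual set-overlap ratio. This sketch treats payload and C2 indicators as plain string sets; the empty-set convention (no evidence scores 0.0) is an assumption.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """|A ∩ B| / |A ∪ B|, with two empty sets scoring 0.0."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # assumed: absence of indicators is not similarity
    return len(set_a & set_b) / len(set_a | set_b)
```

Leaving decky_set out of the compared sets means two actors who merely touched the same few decoys on a small fleet contribute nothing to this score.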

Also adds Campaign / CampaignsResponse to __all__ in models/__init__.py,
which the schema commit missed.
2026-04-26 09:13:59 -04:00
6936a1426c feat(clustering): campaign-clusterer worker + bus topics + CLI
The campaign clusterer worker mirrors the identity-side worker shell
(bus connect, heartbeat, control listener, slow-tick fallback) but
wakes on identity.> instead of attacker.> — campaign-level work is
gated on identity-layer changes, not raw observations.

The connected-components implementation reads identities via
list_identities_for_clustering, projects them with from_identity_row,
runs union-find over combined_campaign_weight, writes campaigns rows,
sets attacker_identities.campaign_id, and runs the same revocable-
merge pass as the identity layer (a merged-out campaign whose
identities no longer co-cluster with the winner gets revoked).
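
A minimal sketch of connected components via union-find over a pairwise weight, assuming a symmetric weight(a, b) callable and a fixed threshold. The function name and signature are illustrative; combined_campaign_weight itself is not reproduced here.

```python
from typing import Callable, Dict, List, Sequence


def connected_components(ids: Sequence[str],
                         weight: Callable[[str, str], float],
                         threshold: float) -> List[List[str]]:
    parent: Dict[str, str] = {i: i for i in ids}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Deterministic root choice keeps results stable across runs.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Link every pair whose combined weight crosses the threshold.
    for index, a in enumerate(ids):
        for b in ids[index + 1:]:
            if weight(a, b) >= threshold:
                union(a, b)

    groups: Dict[str, List[str]] = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(group) for group in groups.values())
```

Membership is transitive in this scheme: if A links to B and B links to C, all three land in one campaign even when A and C score below the threshold against each other.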

Bus: adds campaign.> family (formed / identity.assigned / merged /
unmerged) plus the cross-family identity.campaign.assigned so
existing identity-stream subscribers see the badge update without
having to subscribe to campaign.>. Wiki Service-Bus.md updated in
wiki-checkout in the same wave per the project's bus-signals
discipline.

CLI: decnet campaign-clusterer registered as master-only via
MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity
clusterer command surface.
2026-04-26 09:04:00 -04:00
0946bab424 feat(clustering): campaign-level similarity primitives
The signal taxonomy for the campaign clusterer (next commit). Mirror
of the identity-layer module but with edge families that don't
translate 1:1: phase-handoff (load-bearing for F5 multi_operator —
the signal the identity-side fingerprint-disagreement veto deliberately
isn't), shared-infra (vetoed at identity level, primary positive
signal here), temporal-overlap (pairwise-relative — F7 invariance
preserved), cohort (weak supporting weight only).

Tier weights tuned so phase-handoff alone crosses threshold (F5),
shared-infra + temporal-overlap together cross (canonical co-op
pattern), and shared-infra + cohort together do NOT (F1
shared_wordlist's failure mode). The F7 time-shift invariant is
explicitly tested on every time-bearing edge and on the combined
weight.
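
The time-shift property can be illustrated with a stand-in overlap measure. The interval-ratio formula below is an assumption chosen for the example, not the module's actual temporal-overlap edge; the point is only that a score built from pairwise-relative quantities is unchanged when both windows move by the same offset.

```python
from typing import Tuple

Window = Tuple[int, int]  # (start, end) in seconds


def temporal_overlap(a: Window, b: Window) -> float:
    """Shared duration divided by total span; depends only on relative times."""
    (a_start, a_end), (b_start, b_end) = a, b
    shared = max(0, min(a_end, b_end) - max(a_start, b_start))
    span = max(a_end, b_end) - min(a_start, b_start)
    return shared / span if span > 0 else 0.0


def shift(window: Window, offset: int) -> Window:
    return (window[0] + offset, window[1] + offset)
```

A test in this style checks that temporal_overlap(a, b) equals temporal_overlap(shift(a, k), shift(b, k)) for any offset k.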
2026-04-26 08:57:46 -04:00
0a1cf65ddb feat(db): Campaign SQLModel + repo write/read methods
Adds the campaigns table and the BaseRepository / SQLModelRepository
methods that the campaign-clusterer worker (next commit) needs to
populate it. Mirrors the AttackerIdentity layer: schema_version from
day one for federation gossip, soft-merge via merged_into_uuid with a
chain-walking get_campaign_by_uuid, list_campaigns excluding merged-
out rows while list_all_campaigns returns the unfiltered set for the
revoke pass. attacker_identities.campaign_id gets a real FK now that
the target table exists.
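
A sketch of the chain-walking lookup over soft-merged rows, using a plain dict in place of the table. The function name, the row shape, and the hop limit are assumptions; only the merged_into_uuid pointer comes from the commit.

```python
from typing import Dict, Optional


def resolve_campaign(rows: Dict[str, dict], uuid: str,
                     max_hops: int = 32) -> Optional[dict]:
    """Follow merged_into_uuid until reaching a row that is not merged out."""
    seen = set()
    current = rows.get(uuid)
    while current is not None and current.get("merged_into_uuid"):
        # Guard against corrupt data: a loop or an implausibly long chain.
        if current["uuid"] in seen or len(seen) >= max_hops:
            raise ValueError("merge chain loops or is too deep")
        seen.add(current["uuid"])
        current = rows.get(current["merged_into_uuid"])
    return current
```

Callers holding a stale uuid therefore still land on the surviving campaign, which is what lets merges stay revocable without rewriting every reference.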
2026-04-26 08:54:28 -04:00
97aa57faed feat(api): SSE stream for identity events at /api/v1/identities/events
Mirrors GET /api/v1/topologies/{id}/events: subscribes to identity.>
on the bus for the duration of the request and forwards each event as
a named SSE frame (formed / observation.linked / merged / unmerged).

The endpoint is broadly scoped (every identity event, not per-uuid)
because both AttackerDetail and IdentityDetail need the same
firehose: AttackerDetail watches for an identity.formed that finally
binds its identity_id; IdentityDetail watches for
observation.linked / merged / unmerged against its current row. A
per-uuid filter would force the client to know its identity before
subscribing, which it doesn't always.

JWT via ?token= (EventSource can't set headers), require_stream_viewer
gate, sse_connection_slot per-user cap, snapshot-on-connect with
the first 50 identities so the client buffer renders without a
separate REST call.

Bus-disabled / unreachable path keeps the connection alive on
keepalives so the client doesn't reconnect-storm; it can re-poll
the REST API on its own timer.
2026-04-26 08:36:17 -04:00