DECNET

Author	SHA1	Message	Date
anti	cc6abf7256	fix(tests/stress): eliminate 0-request flakes in locust runs Three independent issues conspired to make stress tests record 0 requests: 1. Every virtual user did /auth/login in on_start. With 1000 users in a spike window, bcrypt-bound logins never finished and on_start failed for all users — aggregated requests stayed at 0. Pre-fetch a single admin token in the fixture (cached per-host) and pass it via DECNET_STRESS_TOKEN so locust users skip the login storm. 2. Locust exits non-zero on any request failure by default, causing run_locust to throw away an otherwise valid stats CSV. Pass --exit-code-on-error 0 so per-test assertions are the only fail gate. 3. test_stress_sustained ran two locust subprocesses against the same uvicorn. Phase 1's keep-alive connections wedged phase 2 into 0 recorded requests ~2/3 of the time. Refactored stress_server into a start_stress_server() context manager and gave each phase its own uvicorn. Stable 3/3 on full suite, 3/3 on test_stress_sustained alone.	2026-04-28 13:01:11 -04:00
anti	681931d9bb	docs(roadmap): tick certificate details and three sibling roadmap items	2026-04-28 11:41:17 -04:00
anti	72cc928ebf	feat(prober-cert): roll up fingerprints onto AttackerIdentity Brings the federation-gossip columns on AttackerIdentity to life — ja3_hashes, hassh_hashes, and the new tls_cert_sha256 — by projecting the union of every member observation's fingerprints JSON onto the identity at clusterer create / link / merge time. - decnet/profiler/identity_rollup.py: pure extract_fp_summaries() reads the production bounty shape (payload.fingerprint_type + payload.{ja3,hash,cert_sha256}) and returns deduped+sorted JSON list[str] per family, or None when a family has no signal so the column stays NULL instead of '[]'. - BaseRepository.update_identity_fingerprints + SQLModel impl: one idempotent write that overwrites the three summary columns and bumps updated_at. - ConnectedComponentsClusterer: after every per-component reconciliation (fresh-create OR existing-merge+link), recomputes and writes the rollup for the target identity. Wrapped in a best-effort helper so a write failure logs but never breaks the tick. - Tests: extract_fp_summaries unit (dedup, sort determinism, unknown types ignored, malformed JSON, nested-stringified payloads, non-string values); end-to-end clusterer ticks populate the columns on create + on later observation links; no-fingerprint clusters keep the columns NULL.	2026-04-28 11:28:54 -04:00
anti	9ab43b4ea4	feat(prober-cert): UI for active TLS certificate captures - FpCertificate renders the new cert_sha256 field (truncated, with full hash on hover) and a FROM line carrying the prober-side target_ip/port so the source is visible. - tls_certificate payloads split on target_ip presence: prober certs land under ACTIVE PROBES, sniffer certs under PASSIVE FINGERPRINTS. Two synthetic fpType keys (tls_certificate_active / tls_certificate_passive) drive the bucketing without disturbing the on-the-wire fingerprint_type.	2026-04-28 11:23:34 -04:00
anti	5f8149daee	feat(prober-cert): capture leaf TLS cert after successful JARM JARM probes are crafted ClientHellos with weird ciphers — they never complete a real handshake, so the peer cert isn't reachable from those sockets. After a non-empty JARM hash proves the port speaks TLS, do a separate ssl.wrap_socket() against the same (ip, port) to fetch and parse the leaf cert. - decnet/prober/tlscert.py: fetch + parse via cryptography lib; swallows all connect/handshake/parse failures (returns None). - decnet/prober/worker.py::_capture_tls_cert: emits a tls_certificate event with subject_cn / issuer / SANs / validity / SHA-256 + publishes on the bus. Wired from _jarm_phase only when JARM succeeds, so non-TLS ports never trigger a second connect. - Tests cover happy path, cert-fetch failure, defense-in-depth crash, empty-JARM skip, publish_fn, and parser edge cases (garbage DER, empty bytes, missing SAN extension, non-self-signed).	2026-04-28 11:14:44 -04:00
anti	4749c972e5	feat(prober-cert): schema for active TLS cert capture Adds storage for TLS certificate details collected from attacker-run servers by the active prober (sibling to the existing JARM probe). - AttackerIdentity.tls_cert_sha256 / Campaign.tls_cert_sha256: JSON list[str] columns mirroring ja3_hashes / hassh_hashes for federation gossip. - ingester clause 9b: emits a 'tls_certificate' fingerprint bounty when a prober event carries subject_cn (disjoint from the existing sniffer-gated clause). - Prober-side capture (ssl.wrap_socket follow-up after JARM) and profiler rollup land in sibling commits.	2026-04-28 11:09:25 -04:00
anti	e986e81421	fix(test-schemathesis): drop unsupported_method check The check expects 405 for any HTTP method not declared on a path. DECNET's topology router has a static `/topologies/services` (GET only) sibling to a parameterized `/topologies/{topology_id}` (DELETE), so a DELETE on the static path falls through to the parameterized route and hits auth, which returns 401 — by design. Leaking 405-vs-401 would let unauthenticated callers enumerate valid topology UUIDs. The same shape applies to other static/dynamic sibling pairs across the API. The check is fundamentally incompatible with that routing strategy; document the omission inline.	2026-04-28 10:20:43 -04:00
anti	ccc8619387	fix(test-schemathesis): disable rate limiter in fuzz subprocess Schemathesis fires up to 3000 examples per endpoint. POST /auth/login caps at 10/5min per IP, so the second example onward returns 429 and the positive_data_acceptance check flags it as RejectedPositiveData (its allowed-status list is hardcoded in schemathesis to 2xx/401/403/404/409/5xx, so OpenAPI tweaks can't fix it). DECNET_LIMITER_ENABLED=false exists for exactly this case (see limiter.py docstring on stress/load testing). Reverts the custom_openapi shim from `5d88346` / `9b1168c` — the endpoint already declares 429 in its responses= map (api_login.py:38), and the shim turned out to address a problem that wasn't there. Drop the companion test along with it.	2026-04-28 09:51:49 -04:00
anti	9b1168ce0b	fix(api): scope 429 OpenAPI injection to rate-limited routes Previous commit advertised 429 on every operation. Only routes decorated with @limiter.limit can actually return slowapi's 429 — currently just POST /api/v1/auth/login. Documenting it elsewhere is dishonest and would mislead clients into expecting a response the server cannot produce. Walk slowapi's _route_limits / _dynamic_route_limits registries to identify decorated endpoints, match them to FastAPI routes by {module}.{name}, and only inject 429 on those. Existing per-route 429 declarations (e.g. SSE connection-cap on events streams via sse_limits) are untouched.	2026-04-28 01:00:34 -04:00
anti	5d883466a2	fix(api): advertise 429 on every operation in OpenAPI SlowAPI middleware can short-circuit any request with 429 once a per-route or per-IP rate limit fires (e.g. POST /api/v1/auth/login is capped at 10/5min). The OpenAPI spec did not declare 429 on any operation, so schemathesis flagged legitimate rate-limit responses as RejectedPositiveData / status-code-nonconformance failures. Override app.openapi to inject a generic 429 response object on every HTTP operation in the generated schema. Add a contract test that fails if any operation drops the 429 advertisement.	2026-04-28 00:58:37 -04:00
anti	6b407e8c9c	fix(tests): align stale tests with current behavior - swarm/test_swarm_api, swarm/test_heartbeat: replace deprecated asyncio.get_event_loop().run_until_complete() with asyncio.run(); the former raises in 3.11 once another test has set+closed a loop on the main thread. - prober/test_prober_bus, prober/test_prober_worker: extend tcp_fingerprint mocks with tos/dscp/ecn/server_isn so the worker doesn't KeyError into the prober_error branch. - services/test_service_isolation: collector now retries on event-stream errors instead of exiting; assert it stays running and cancel cleanly. - live/test_imap_live, live/test_pop3_live: log format emits outcome="failure", not "failed". - live/test_service_isolation_live: is_service_container accepts label OR state-name; rewrite the empty-state test against a synthetic unlabeled container instead of the host's real fleet.	2026-04-28 00:44:40 -04:00
anti	8344b539c8	fix(ssh-template): drop sshd/pam_unix native chatter at rsyslog OpenSSH's native syslog ("Failed password", "Connection from", "Connection closed by …") and the pam_unix lines emitted from sshd's PAM stack add no signal beyond what auth-helper already captures as structured login_attempt events. They cluttered the dashboard and arrived without an SD wrapper, forcing prose-IP heuristics in the collector. Add a `:programname, isequal, "sshd" stop` rule above the forwarding actions in /etc/rsyslog.d/50-journal-forward.conf. pam_unix lines from sshd inherit programname=sshd so the same rule covers both. sudo / login / su pam_unix lines keep flowing (different programname), so post-login privilege escalation telemetry is preserved.	2026-04-27 23:26:53 -04:00
anti	9350ce195a	fix(collector,correlation): extract attacker IP from sshd/pam free-form prose Native sshd and pam_unix lines route through rsyslog without the relay@55555 SD wrapper and without key=value pairs, so attacker_ip fell through to "Unknown". Add a prose-IP fallback to both parsers: anchored patterns (from/rhost/client/src) win first so we never pick the local listener in "Connection from X port Y on Z port 22", with a bare-IPv4 scan as the last resort.	2026-04-27 23:16:42 -04:00
anti	3c571cce5a	fix(correlation): prober events no longer count as attacker traversal The prober writes events with hostname=decnet-prober and target_ip= <the attacker being fingerprinted>. The parser pulls target_ip into attacker_ip (it's one of _IP_FIELDS), which is correct for indexing fingerprints under the attacker — but it had a side effect: every fingerprinted attacker had two distinct deckies on file (the real decoy they touched + decnet-prober) and the correlation engine's traversals() classified that as lateral movement. Live dashboard showed bogus "dmz-gateway -> decnet-prober" paths and TRAVERSAL badges on attackers who'd done nothing but knock on the front door. The prober is internal infrastructure, not a hop. Filter the "decnet-" namespace out of distinct-decky counts and hop paths in the engine. Fingerprints stay attached to the attacker profile via the existing per-IP event index — just no longer as traversal.	2026-04-27 23:02:23 -04:00
anti	e03a6d10a0	fix(collector): retry on event-stream errors and add periodic reconciler Hit live on first VPS deploy: a window between the initial client.containers.list() snapshot and the client.events() start-event stream let topology service containers slip through, requiring an operator restart for them to be picked up. Two fixes: * `_watch_events` now wraps the events() call in a retry loop with exponential backoff (1s -> 30s cap). A docker.errors.APIError, daemon reload, or SDK stream-decode hiccup used to make the executor task return cleanly, leaving the collector "running" with no event subscription. Future container starts were silently dropped until the unit was restarted. * New `_reconcile_loop` async task ticks every DECNET_COLLECTOR_RECONCILE_S (default 30s), re-scans client.containers.list(), and calls _spawn for any service container not already in `active`. Belt to the event watcher's suspenders: even if a start event is dropped during a reconnect window, the reconciler picks it up within one cycle. Also prunes finished futures from `active` so the dict's bounded by current container count rather than agent lifetime churn.	2026-04-27 22:56:13 -04:00
anti	c5db1d7ba2	fix(config-ini): strip inline # and ; comments from values The module docstring teaches inline comments — `mode = master # or "agent"` is the canonical example for the [decnet] section. Python's configparser ignores those by default unless inline_comment_prefixes is set explicitly, so the comment became part of the value and downstream validators rejected it ("mode must be 'agent' or 'master', got 'master # or \"agent\"'"). Hit live on first VPS deploy: every CLI invocation crashed at import time with a stack trace that didn't make it obvious the docstring's example was the trigger. Now the parser does what the docs promise.	2026-04-27 22:55:58 -04:00
anti	0b1a17b4eb	fix(agent): pass --always-recreate-deps so service netns shares stay fresh Decky service containers join their base via `network_mode: container:<base>` and Docker binds that share at service start time. If `docker compose up` recreates a base (e.g. ports: changes after a forwards_l3 toggle) but decides services are unchanged, services keep a stale FD into the destroyed namespace and end up with only `lo` — so external traffic hits a closed port on the live base and gets RST. Hit live on the first VPS deploy: external SSH to the dmz-gateway was refused while sshd was listening, because base and service netns inodes had drifted apart. `--always-recreate-deps` makes compose rebuild every dependent whenever its base is recreated, removing the race entirely.	2026-04-27 22:55:48 -04:00
anti	0a525ebd37	fix(web): proxy follows DECNET_API_HOST instead of hardcoding 127.0.0.1 The dashboard's /api/* proxy hardcoded 127.0.0.1 as the target host. That works when the API binds to a wildcard or to loopback, but breaks the moment an operator binds the API to a specific address — e.g. a Tailscale IP for tailnet-only deploys: the API stops listening on loopback entirely and the proxy gets ECONNREFUSED on every request. The web command now reads DECNET_API_HOST and falls back to loopback only when the API is on a wildcard (0.0.0.0 / :: / unset). A new --api-host flag overrides at the CLI level.	2026-04-27 22:55:25 -04:00
anti	673bc5b819	ops(init): ship logrotate config so /var/log/decnet can't fill the disk Without rotation, the syslog listener and per-host collector grow /var/log/decnet/ without bound — a noisy attacker (or an active probe storm) fills the disk in hours on a small VPS. New deploy/logrotate.d/decnet caps at 7 daily rotations or 100 MiB, whichever comes first, and uses copytruncate because the ingester and forwarder hold the files open via Python and won't reopen on a rename rotation. Wire install / remove into `decnet init` and `decnet init --deinit` alongside the existing tmpfiles.d / polkit handling.	2026-04-27 21:26:13 -04:00
anti	5415e98458	sec(api): mode-gate and eager-load JWT secret in lifespan Refuse to start decnet.web.api when DECNET_MODE=agent (unless the operator explicitly opts into dual-role with DECNET_DISALLOW_MASTER= false). The Typer CLI already hides master-only commands on agents, but a misconfigured systemd unit or a direct uvicorn invocation would bypass that — now the lifespan itself refuses, before any worker, DB or bus comes up. Resolve DECNET_JWT_SECRET eagerly at startup so a missing or known- bad value fails at boot rather than on the first auth-gated request. The lazy-load shape stays useful for non-master CLIs.	2026-04-27 21:26:03 -04:00
anti	1a7da33375	sec(env): refuse to start master API with footgun public-binding config Add validate_public_binding() called from the master API lifespan: when DECNET_API_HOST is non-loopback, refuse to start if DECNET_CORS_ORIGINS still contains a loopback origin (catches the "operator flipped to 0.0.0.0 to make it work and forgot to update CORS" footgun) or if DECNET_CANARY_HTTP_BASE is plaintext http:// to a non-loopback host. Log CRITICAL when DECNET_LIMITER_ENABLED=false on a public binding. The validator no-ops under pytest so unrelated suites don't trip on it. Add DECNET_VERIFY_HOSTNAME env knob; AgentClient and UpdaterClient consult it when verify_hostname is None, giving production deploys TLS hostname verification on top of the existing CA + fingerprint pin. Default off so dev enrollments with mismatched SANs keep working.	2026-04-27 21:15:15 -04:00
anti	28e2a93355	sec(updater): harden tarball extraction and verify sha256 before extract Reject symlinks, hardlinks, device nodes and FIFOs in update tarballs; validate each member's resolved path stays under dest after symlink resolution; cap uncompressed size at 256 MiB to bound gzip-bomb damage; strip setuid/setgid bits from extracted modes. Add an optional sha256 form field to /update and /update-self; the master client computes and sends it on every push, the executor refuses to extract on mismatch. mTLS already authenticates the master, so this is defence-in-depth against in-transit corruption and gives operators a way to pin "exactly these bytes" for vetted releases.	2026-04-27 21:14:48 -04:00
anti	1de4136ed9	style(realism-ui): adopt the persona-page design language Both pages now layer on DeckyFleet.css + PersonaGeneration.css and use the project's house vocabulary — fleet-root shell, page-header with title-group + actions, btn / btn.violet / btn.ghost, info-banner with the violet left rule, and the dim/matrix/alert text accents. RealismConfig: inputs are flush-styled weight-input fields with a violet focus ring; section heads carry a TOTAL badge; canary rows get the project's amber accent; canary probability lives in a panel-bordered slider row. SyntheticFiles: the inline-styled table is now a styled .files-table with the standard hover affordance, the filter-row uses tweak-group label+select pairs, the drawer carries .drawer-eyebrow / .drawer-title / .meta-grid in the same style as the canary token drawer, and pager buttons share the .btn.ghost.small treatment. No behavioural change.	2026-04-27 18:08:58 -04:00
anti	2950fc216e	feat(realism-ui): human-readable content_class labels Single source of truth in decnet_web/src/realism/labels.ts: maps each ContentClass enum value to a friendly display name ("Note", "Cron Log", "Canary · AWS Credentials", …). Used by RealismConfig (weight tables + class filter dropdown) and SyntheticFiles (table row + drawer detail). Canary classes get a subtle amber accent so the dashboard's read of "this row is callback-bearing" doesn't depend on prefix-spotting in mono text. Raw enum value still appears in dim mono next to the label so an operator copy/pasting from logs or grepping the codebase still finds it. No backend change: the wire shape is still the snake_case enum; the beautification is render-time only.	2026-04-27 18:04:33 -04:00
anti	56a88d7bd4	feat(realism-ui): operator panel for planner weights + canary probability New /realism-config page sits next to Persona Generation and Synthetic Files under the Automation nav. Editable weight tables for user / system / canary content classes (with live percent share), plus a slider for canary_probability. Wires GET/PUT /api/v1/realism/config — viewer can read; admin required to save. Validation errors from the API are surfaced inline rather than swallowed; the SAVE button refreshes from the server's canonical snapshot so the operator sees exactly what landed (matters because cross-list entries are silently dropped server-side).	2026-04-27 18:01:35 -04:00
anti	2cc60bd677	feat(realism): operator-tunable planner weights via realism_config New realism_config table (uuid PK + unique key) + two repo methods (get/set) backs an admin-only GET/PUT /api/v1/realism/config surface. The planner now exposes apply_payload(payload) / current_payload() / reset_to_defaults() and reads its weights through mutable module globals; pick() resolves the live values each call. Validation catches negative weights, zero totals, out-of-range canary_probability, unknown content_class names, and silently drops cross-list entries (canary class on the user list, etc). The orchestrator worker calls _refresh_realism_config(repo) on startup and every 5 ticks (~5min at 60s interval). Operator changes land within one refresh window with no bus signal — the simpler path for a knob whose latency tolerance is minutes.	2026-04-27 18:00:08 -04:00
anti	da3c35c6a4	fix(realism): synthetic_files path fits MySQL utf8mb4 index cap The (decky_uuid VARCHAR(64), path VARCHAR(1024)) UNIQUE constraint generated a 4352-byte composite key under utf8mb4 (4 bytes/char), busting MySQL's 3072-byte cap and crashing decnet api on init with: Specified key was too long; max key length is 3072 bytes Tighten path to VARCHAR(512) — (64+512)*4 = 2304 bytes, well under the cap. Real realism + canary placement paths are short (/home/<persona>/Documents/<file>, ~70 chars); 512 keeps headroom without the index hassle. Pre-v1, no migration helper. Adds a regression test pinning the (decky_uuid + path) byte budget so a future widening fails loudly in CI rather than at MySQL deploy time.	2026-04-27 17:55:35 -04:00
anti	397a1a111e	feat(realism): LLM/breaker status on orchestrator heartbeat Surfaces realism subsystem state on the existing worker heartbeat extra hook (system.orchestrator.health) — no new bus topic. Payload carries {llm_enabled, llm_backend, llm_model, llm_breaker_state}, so the dashboard's worker panel renders a live LLM badge with a colored breaker-state dot: closed (green) — LLM healthy half_open (amber) — cooldown elapsed; next call is a probe open (red) — short-circuiting to deterministic templates Heartbeat is the canonical worker self-report channel; piggybacking on extra(...) avoids a new topic family while keeping the snapshot recomputed each beat (30s).	2026-04-27 17:51:00 -04:00
anti	55e86f606c	feat(realism-ui): synthetic files browser New /synthetic-files page sits next to Persona Generation and Canary Tokens under the Automation nav group. Operators get a paginated inventory of files realism has grown across the fleet (decky, path, persona, content_class, last_modified, edit_count, hash) with filters on decky / persona / content_class. Decky filter is a dropdown sourced from /deckies — never free text. Row click opens a drawer with the body preview; the drawer surfaces a TRUNCATED chip when the stored body is at the 64KB cap.	2026-04-27 17:48:05 -04:00
anti	87cb61c8b2	feat(realism): synthetic-files browser API Adds GET /api/v1/realism/synthetic-files (paginated list, filters by decky_uuid, persona, content_class) and GET /api/v1/realism/synthetic-files/{uuid} (single row with last_body and a truncated:bool flag set when the stored body is at the 64KB cap). Repo gains count_synthetic_files() and get_synthetic_file(uuid). The list view drops last_body to keep the wire payload bounded; the detail endpoint is the only path that returns it. Read-only — orchestrator remains the sole writer.	2026-04-27 17:44:53 -04:00
anti	2eeec15f9c	feat(orchestrator-ui): mark file-edit events with an EDIT badge FileAction and EditAction both write kind="file" — the discriminator is action="file:create" vs "file:edit". The dashboard timeline used to render both identically; now an EDIT sub-chip surfaces edits without widening the kind enum (which doubles as the bus topic family). No schema or API change. Polish only.	2026-04-27 17:42:21 -04:00
anti	147f52467f	feat(canary): kind reflects trip surface per generator decnet/canary/cultivator wrote kind="http" for every cultivated token, even DNS-trip ones (ssh_key, mysql_dump) and passive bait (aws_creds). The canary worker uses kind to route attacker callbacks to the right token; a misaligned kind means a real DNS resolution of ssh_key or mysql_dump never attributes to the planted slug. Add _GENERATOR_TO_KIND aligned with CanaryKind in models/canary.py and look it up at create_canary_token time.	2026-04-27 17:40:37 -04:00
anti	49da15823f	refactor(realism): single source of truth for persona→login decnet/realism/naming._home and decnet/canary/cultivator._persona_login both normalised "John Smith"→"johnsmith" with identical logic. Lift to decnet.realism.personas.login_for(persona) and have both consumers import it. Drift between the two would have left canary placement and realism path naming using different login derivations.	2026-04-27 17:39:04 -04:00
anti	7e9bc6d49a	refactor(realism): enforce synthetic_files 64KB cap at the repo The orchestrator worker clipped last_body at write time, but the repo didn't enforce. A future caller that forgot the clip would write the full body. Move the clip to record_synthetic_file and update_synthetic_file via SYNTHETIC_FILE_BODY_LIMIT in decnet/web/db/models/realism.py. Worker now passes the full body and trusts the repo. Tests retargeted to assert repo enforcement.	2026-04-27 17:37:36 -04:00
anti	b86129e35e	tests: realism migration regression coverage Four gaps from the realism migration plan, plus one flaky test fixed. Added: - tests/deploy/test_orchestrator_unit.py — replaces the dead test_emailgen_unit.py. Asserts: * decnet-orchestrator.service.j2 carries the DECNET_REALISM_* env block (LLM, MODEL, TIMEOUT, PERSONAS) so per-host tuning works without editing the .j2. * Legacy DECNET_EMAILGEN_* vars are NOT referenced — clean break contract from stage 5. * decnet.target wants orchestrator + canary, does NOT want decnet-emailgen.service. Anti-regression for service-collapse. * deploy/decnet-emailgen.service.j2 stays deleted. - tests/orchestrator/test_worker_integration.py — new test_one_tick_email_branch_records_orchestrator_email. Pins the action-roll to email, seeds a topology with an IMAP mail decky + two personas, stubs LLM + docker-exec write paths, verifies an orchestrator_emails row + bus event land. Restores end-to-end email coverage that was lost when the pre-collapse test_worker_integration.py was deleted. - tests/realism/test_synthetic_files_truncation.py — pins the 64KB last_body cap on create + edit, and documents the consequence: edit candidates carry a truncated snapshot of files that exceeded the cap. If a future change lifts the cap, _LIMIT in the test must lift with it. Fixed flaky: - tests/orchestrator/test_scheduler.py — two pick_file tests pinned to random.Random(1). Without a seed, the 3% canary gate (stage 7) and 10% leave-alone roll occasionally flaked the assertions because the _FakeRepo doesn't carry a create_canary_token method. Note: the existing test_realism_subprocess_import_personas_rejects_in_agent_mode already covers agent-mode rejection of decnet realism import-personas; no new gating test needed.	2026-04-27 17:29:25 -04:00
anti	a07fb3fe08	feat(realism): canary cultivator on the realism contract Stage 7 — final stage of the realism migration. Canary plants are now scheduled by the same realism planner that handles inert content, keeping the orchestrator as the single decision point and avoiding duplicate diurnal / persona / rate-limit logic in the canary subsystem. New surface: - decnet/canary/cultivator.py: cultivate(plan, repo) builds a CanaryContext, calls the right generator (canary_aws_creds -> aws_creds, canary_mysql_dump -> mysql_dump, …), persists the canary_tokens row before plant so the canary worker can attribute callbacks even on plant-time previews. Resolves canary placements to credible operator paths (~/.aws/credentials, ~/.ssh/id_rsa, /var/backups/db_backup.sql). - realism/planner.py adds 8 canary content_classes uniformly weighted inside a 3% probability gate. Hard-capped: each tick at most one canary; create branch falls through to inert otherwise. - scheduler.pick_file dispatches canary content_class to the cultivator; FileAction grows an optional content_bytes field so binary canary artifacts (DOCX/PDF/honeydoc) survive the wire intact instead of being utf-8 round-tripped. - SSHDriver._run_file uses content_bytes when set, falls back to encoding the str content otherwise. Stealth (per feedback_stealth.md): cultivator does not introduce any DECNET literal; the underlying generators are already stealth-clean and the test suite asserts the contract holds. Tests cover round-tripping every canary class through the cultivator, verifying placement-path conventions, persona-login normalisation ("John Smith" -> /home/johnsmith/.aws/credentials), and the no-DECNET-leak invariant.	2026-04-27 16:47:59 -04:00
anti	4e436da569	feat(realism): LLM enrichment for user-class file bodies Stage 6 of the realism migration. User-class file bodies (note, todo, draft, script) optionally get LLM-authored content; system classes (cron / daemon logs, /tmp caches) stay template-only because formulaic is the right look for them. New surface: - realism.llm.circuit.LLMCircuitBreaker — process-local sliding-window breaker. 3 consecutive failures trip open; 60s cooldown to half-open; half-open success closes, failure re-opens. Protects the orchestrator tick from sustained Ollama wedges (per-call timeout already covers one-shot hangs). - realism.prompts._style — em-dash suppression lifted from the email prompt. Persona.uses_llms_heavily opts out per the feedback_em_dash_llm_tell.md memory. Includes strip_em_dashes belt-and-braces sub for output that slipped past the prompt rule. - realism.prompts.filebody — class-conditioned prompts (note / todo / draft / script) with persona context, language pinning, output shape rule. - realism.bodies.make_body_with_llm — async wrapper around make_body that calls the LLM when one is provided AND the breaker allows. Falls back to template on timeout / error / empty / system-class. Wiring: - scheduler.pick_file accepts optional llm + llm_breaker + llm_timeout. When the planner picks a create action and the content_class is a user-class, the body_hint is replaced with the LLM-authored body (or falls back to the deterministic body_hint). - orchestrator.worker constructs get_llm() at startup gated by DECNET_REALISM_LLM env var (any non-empty value enables; empty / "off" / "none" / "0" disables). Passes llm + breaker through every tick. - decnet orchestrate gains --llm/--no-llm flag overriding the env var.	2026-04-27 16:42:58 -04:00
anti	b321e29002	feat(realism): EditAction read-modify-write of planted files Stage 3b of the realism migration. A TODO.md planted on Monday gets a checkbox flipped on Tuesday; a notes file grows a follow-up line; a cron log gets a fresh entry tacked on. The synthetic_files row's edit_count, last_modified, and content_hash advance. New surface: - EditAction dataclass (peer of FileAction in scheduler.py): carries decky, path, persona, content_class, previous_body, mtime, and synthetic_file_uuid for the worker's update path. - realism.bodies.next_iteration(cls, persona, prev, rng): per-class deterministic mutators. TODO flips an unchecked box and/or appends; notes/drafts/scripts append; logs are append-only (mirroring real log behaviour). Canary, cache_tmp, email raise KeyError — unsupported. - realism.planner.pick gains an edit branch: 60% create, 30% edit (when an edit_candidate is supplied), 10% leave-alone. Returns None on leave-alone — quiet ticks are realism too. - scheduler.pick_file pre-fetches a single edit candidate via repo.pick_random_synthetic_file_for_edit ~50% of ticks; the planner decides whether to use it. - SSHDriver._run_edit: turns next_iteration output into a plant_file call (mtime-bumped, mode 0o644). Stashes new_body in result.payload so the worker can hash it for synthetic_files. - worker._bump_synthetic_file_after_edit: patches edit_count + 1, last_modified=now, content_hash, last_body for the row UUID. No-op when the row was pruned mid-flight. - events.to_row / topic_for / event_type_for now recognise EditAction (kind="file", action="file:edit").	2026-04-27 16:38:17 -04:00
anti	32eeb0c813	refactor(orchestrator): collapse decnet-emailgen.service into orchestrator Stage 5 of the realism migration. Email generation is no longer a separate worker / systemd unit / CLI subcommand — the orchestrator's single tick loop covers SSH traffic, file plants, and email drops. Going from 21 services to 20. Worker: - _one_tick rolls between traffic / file / email (45/45/10 weights). The 10% email weight at a 60s orchestrator interval produces ~one email per 10 minutes, close to the pre-collapse 5-minute cadence. - get_driver_for(action) (stage 4) handles SSH vs Email dispatch. - Quiet branches fall through so a (decky-set, persona-pool, mail-decky) shape that silences one branch doesn't waste the tick. - Periodic prune covers both orchestrator_events and orchestrator_emails tables. Deletions: - deploy/decnet-emailgen.service.j2 - decnet/orchestrator/emailgen/worker.py - decnet/cli/emailgen.py - tests/orchestrator/emailgen/test_worker_integration.py Renames (history-preserving): - decnet/web/router/emailgen/ -> decnet/web/router/realism/ - tests/api/emailgen/ -> tests/api/realism/ - tests/cli/test_emailgen_* -> tests/cli/test_realism_* Public surface changes (clean break, pre-v1): - API URL /api/v1/emailgen/personas -> /api/v1/realism/personas - CLI `decnet emailgen import-personas` -> `decnet realism import-personas`. `decnet emailgen run` is gone — the orchestrator covers it. - gating.py: emailgen master-only group replaced by realism. - decnet-orchestrator.service.j2: DECNET_REALISM_* env block added. - decnet.target: decnet-emailgen.service entry removed. - frontend: PersonaGeneration.tsx fetches /realism/personas.	2026-04-27 16:33:04 -04:00
anti	cb1872c52f	feat(realism): synthetic_files table + planner wiring + scheduler swap Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed filenames like notes-1777315854.txt with identical bodies per template) with a persona-driven realism engine. New surface: - SyntheticFile SQLModel (synthetic_files table, UNIQUE on decky_uuid+path) — per-(decky, path) state for the future edit-in-place flow. Pre-v1, no _migrate_* helper. - BaseRepository methods: record_synthetic_file, update_synthetic_file, list_synthetic_files, pick_random_synthetic_file_for_edit (used by stage 3b). - realism/naming.py: per-content-class filename templates, persona-conditioned. /var/log/cron.log + logrotate skeleton for system-class; /home/<persona>/TODO.md, scratch.md, etc. for user-class. Anti-regression test pins "no 8+ digit decimals in basenames" (the realism failure today). - realism/bodies.py: deterministic body templates per content_class. TODO body uses checkbox markdown, script body has a shebang, cron body matches syslog cron shape ("CRON[PID]: (user) CMD (...)"). - realism/planner.py: pick(deckies, now, rng) returns a Plan. Diurnal-gated, weighted user/system content split (70/30 user bias). Create-only in stage 3; edit branch lands in stage 3b. Scheduler split: - scheduler.pick is now traffic-only (sync). - scheduler.pick_file is async, takes a repo, resolves personas (Topology.email_personas for topology-source deckies; global realism.personas_pool otherwise), and maps Plan -> FileAction. - FileAction gains persona/content_class/mtime fields. Worker: - _one_tick rolls 50/50 between traffic and file each tick. After a successful FileAction plant, _record_synthetic_file persists or patches the synthetic_files row (catching the unique-constraint collision on re-plant of the same path). - SSHDriver._run_file passes action.mtime through to plant_file so files don't all stamp at wall-clock-now.	2026-04-27 16:22:07 -04:00
anti	636c057cc5	refactor(orchestrator): extract ActivityDriver ABC + driver factory Stage 4 of the realism migration. Lifts the driver Protocol into a proper ABC with default plant_file/read_file methods (raise NotImplementedError), and adds get_driver_for(action) so the orchestrator worker can dispatch by action shape without isinstance chains. SSHDriver now inherits ActivityDriver and implements: - plant_file: streams base64 via stdin (ARG_MAX-safe, mirrors decnet.canary.planter; commit `c17b9e0`). Honours mtime via touch -d so realism-planned files don't all stamp at wall-clock-now. - read_file: docker exec cat with FileNotFoundError on rc=1, used by the upcoming EditAction (stage 3b). EmailDriver inherits ActivityDriver. Driver alias kept for back-compat during the migration; removed once realism stages 5-7 land.	2026-04-27 16:09:46 -04:00
anti	0b9873982d	refactor(realism): move emailgen LLM/personas/prompt into shared library Lift the format-agnostic pieces from decnet/orchestrator/emailgen/ into the new decnet/realism/ library so file-class content generation (stage 3 of the realism migration) can reuse them. Email-specific delivery (RFC 2822 EML, IMAP/POP3 spool, thread chains) stays in orchestrator/. Renames (history-preserving git mv): emailgen/personas.py -> realism/personas.py emailgen/prompt.py -> realism/prompts/email.py emailgen/global_pool.py -> realism/personas_pool.py emailgen/llm/ -> realism/llm/ Env-var clean break (pre-v1, no aliases): DECNET_EMAILGEN_LLM -> DECNET_REALISM_LLM DECNET_EMAILGEN_MODEL -> DECNET_REALISM_MODEL DECNET_EMAILGEN_TIMEOUT -> DECNET_REALISM_TIMEOUT DECNET_EMAILGEN_PERSONAS -> DECNET_REALISM_PERSONAS DECNET_EMAILGEN_FAKE_OUTPUT -> DECNET_REALISM_FAKE_OUTPUT Importers rewritten in: orchestrator/emailgen/scheduler.py, orchestrator/drivers/email.py, web/router/{emailgen,topology}/ api_personas.py, cli/emailgen.py. Tests for moved modules relocated to tests/realism/; tests for stay-put modules updated in place. API URL `/api/v1/emailgen/personas` and CLI `decnet emailgen import-personas` keep their public names until the service-collapse commit (stage 5).	2026-04-27 16:05:43 -04:00
anti	f57c621117	feat(realism): scaffold decnet/realism/ library Empty subpackage skeleton for the realism migration: ContentClass enum (file/email/canary content categories), Plan dataclass (frozen, with edit-action invariant), in_work_hours window check (wrap-around supported, fail-open on parse error), and sample_mtime for backdated file timestamps that snap into a persona's active hours. Stage 1 of the orchestrator+canary realism unification — no production caller wired yet; planner.pick is a stub returning None until stage 3.	2026-04-27 15:55:21 -04:00
anti	6376523923	feat(canary): mysql_dump generator with phone-home replica payload Mirrors the Canarytokens.org trick: a base64-wrapped CHANGE REPLICATION SOURCE TO + START REPLICA block in the dump trailer. Importing the file into MySQL resolves <slug>.<dns_zone> (DNS trip) and opens a 3306 replica handshake whose SOURCE_USER smuggles @@hostname and @@lc_time_names of the victim DB. DNS lookup alone is sufficient for detection via the existing canary dns_server; capturing the smuggled metadata via a 3306 handshake responder is a follow-up.	2026-04-27 13:52:55 -04:00
anti	5ac8e0f91a	feat(canary): honeydoc_docx + honeydoc_pdf generators honeydoc previously emitted HTML only — operators picking 'Document' out of the dropdown got a .html file dropped at /Documents/quarterly_report.docx, which any attacker would clock the moment they ran 'file' on it. Two new generators that emit the real artifact format: - honeydoc_docx: stdlib zipfile only. Builds a minimal but valid Office Open XML zip with the same Q3 review body as the HTML flavor and an external-image relationship pointing at the callback URL — same trick the operator-upload DOCX instrumenter uses, fetched on document open by Word and LibreOffice. Reuses _drawing() and _next_rid() from instrumenters/docx.py to keep the body/relationships shape identical between synthesised and instrumented files. - honeydoc_pdf: pikepdf-backed. One-page PDF in the 14 base fonts (Helvetica, no font embedding), realistic body, /OpenAction /URI on the catalog so most viewers fire the callback on document open. Falls back to a clear error if pikepdf is missing so the operator can switch to honeydoc / honeydoc_docx. Default placement paths now reflect each generator's true extension (.html / .docx / .pdf) so the UI suggests something sensible. Both generators surfaced in the New Token modal's generator dropdown.	2026-04-27 13:44:20 -04:00
anti	c17b9e01c8	fix(canary): stream base64 payload via stdin to avoid ARG_MAX Real-world plant() crashed with OSError [Errno 7] Argument list too long when an artifact (honeydoc HTML / DOCX / PDF) base64-encoded into the sh -c script body exceeded the kernel's argv limit (typically 128KB-2MB depending on the host). Fix: keep the script trivial ('mkdir -p ... && base64 -d > path && ...') and stream the encoded bytes through 'docker exec -i ... sh -c' stdin instead. _run() grew an optional stdin_bytes parameter that's piped into proc.communicate(input=...). The stdin path covers arbitrarily large artifacts. Tests updated: - test_plant_argv_and_base64_round_trip now asserts the docker -i flag is present and the base64 payload reaches stdin (and notably is NOT in the script body). - _FakeProc.communicate accepts input=None across the board so the patched fast path no longer trips on the new kwarg.	2026-04-27 13:37:19 -04:00
anti	af15e68a3d	fix(web): pick decky from a select instead of a free-text input Fetches GET /deckies on page load and feeds the running fleet into the create modal as a <select>. Falls back to an empty-state hint ('No deckies running. Deploy a fleet first.') when the list is empty so the operator isn't staring at an unusable form. Default selection is the first decky returned.	2026-04-27 13:32:51 -04:00
anti	fcdb32908d	fix(web): canary header matches PersonaGeneration / DeckyFleet Switches the page header to the standard .fleet-root .page-header / .page-title-group / h1 / .page-sub / .actions pattern used by every other top-level page. Drops the redundant AUTOMATION supertitle (the sidebar group already labels that) and the inline Target icon next to the title. Action buttons use the project's btn / btn violet classes for visual parity with ADD PERSONA / BULK UPLOAD.	2026-04-27 13:29:37 -04:00
anti	11b0a99914	fix(web): type-only import for CanaryTokenRow verbatimModuleSyntax is enabled in tsconfig — types must use the 'import type' form. Caught by 'npm run build' (tsc -b).	2026-04-27 13:28:03 -04:00
anti	e2c8b77546	feat(web): canary tokens page (under AUTOMATION) New /canary-tokens route, lazy-loaded and gated behind the existing auth flow. Wired into the AUTOMATION NavGroup beside Orchestrator and Persona Generation, using the Target icon. Two components: - CanaryTokens.tsx: list + filter (text + state), stats summary, Tokens / Blobs tab switcher, inline CreateModal + UploadModal. Alt+C opens the create modal (per feedback_linux_meta_key). Drag-drop blob upload, server-sniffed MIME drives the instrumenter. - CanaryTokenDrawer.tsx: per-token detail panel matching the MailDrawer.tsx visual format (right-side drawer, --bg/--border/--dim/--text CSS vars, X close, focus trap + escape key, monospace metadata table, paginated callback history). Backdrop close uses target===currentTarget instead of stopPropagation on the panel (per feedback_react_stop_propagation_native_delegation). Preview button downloads the deterministically re-derived instrumented bytes; revoke button hits DELETE with a confirm prompt. Type-checks clean (npx tsc --noEmit).	2026-04-27 13:27:14 -04:00
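
Note on commit b321e29002: the edit branch it describes (60% create, 30% edit when an edit candidate is supplied, 10% leave-alone, returning None on leave-alone) can be sketched roughly as below. This is an illustration only, not the code in realism/planner.py. The function name `pick_file_action`, the tuple return shape, and the fall-back-to-create behaviour when no candidate is available are all assumptions; the commit message does not say what happens in that last case.

```python
from __future__ import annotations

import random
from typing import Optional, Tuple

# Hypothetical return shape: (action_kind, path_to_edit_or_None).
FileChoice = Tuple[str, Optional[str]]


def pick_file_action(
    rng: random.Random,
    edit_candidate: Optional[str] = None,
) -> Optional[FileChoice]:
    """Roll once and choose create / edit / leave-alone (60/30/10)."""
    roll = rng.random()  # uniform in [0.0, 1.0)

    if roll < 0.60:
        return ("create", None)

    if roll < 0.90:
        if edit_candidate is not None:
            return ("edit", edit_candidate)
        # ASSUMPTION: with nothing to edit, fall back to creating a file.
        return ("create", None)

    # Leave-alone: a quiet tick, signalled by None.
    return None
```

Passing the random source in as a parameter keeps the choice reproducible under a seeded `random.Random`, which matches the commit's emphasis on deterministic behaviour in tests.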

847 Commits