New realism_config table (uuid PK + unique key) + two repo methods
(get/set) backs an admin-only GET/PUT /api/v1/realism/config surface.
The planner now exposes apply_payload(payload) / current_payload() /
reset_to_defaults() and reads its weights through mutable module
globals; pick() resolves the live values each call. Validation
catches negative weights, zero totals, out-of-range canary_probability,
unknown content_class names, and silently drops cross-list entries
(canary class on the user list, etc).
The orchestrator worker calls _refresh_realism_config(repo) on
startup and every 5 ticks (~5min at 60s interval). Operator changes
land within one refresh window with no bus signal — the simpler path
for a knob whose latency tolerance is minutes.
Surfaces realism subsystem state on the existing worker heartbeat
extra hook (system.orchestrator.health) — no new bus topic. Payload
carries {llm_enabled, llm_backend, llm_model, llm_breaker_state}, so
the dashboard's worker panel renders a live LLM badge with a colored
breaker-state dot:
closed (green) — LLM healthy
half_open (amber) — cooldown elapsed; next call is a probe
open (red) — short-circuiting to deterministic templates
Heartbeat is the canonical worker self-report channel; piggybacking on
extra(...) avoids a new topic family while keeping the snapshot
recomputed each beat (30s).
The orchestrator worker clipped last_body at write time, but the repo
didn't enforce. A future caller that forgot the clip would write the
full body. Move the clip to record_synthetic_file and
update_synthetic_file via SYNTHETIC_FILE_BODY_LIMIT in
decnet/web/db/models/realism.py. Worker now passes the full body and
trusts the repo. Tests retargeted to assert repo enforcement.
Stage 7 — final stage of the realism migration. Canary plants are
now scheduled by the same realism planner that handles inert content,
keeping the orchestrator as the single decision point and avoiding
duplicate diurnal / persona / rate-limit logic in the canary
subsystem.
New surface:
- decnet/canary/cultivator.py: cultivate(plan, repo) builds a
CanaryContext, calls the right generator (canary_aws_creds ->
aws_creds, canary_mysql_dump -> mysql_dump, …), persists the
canary_tokens row before plant so the canary worker can attribute
callbacks even on plant-time previews. Resolves canary placements
to credible operator paths (~/.aws/credentials, ~/.ssh/id_rsa,
/var/backups/db_backup.sql).
- realism/planner.py adds 8 canary content_classes uniformly weighted
inside a 3% probability gate. Hard-capped: each tick at most one
canary; create branch falls through to inert otherwise.
- scheduler.pick_file dispatches canary content_class to the
cultivator; FileAction grows an optional content_bytes field so
binary canary artifacts (DOCX/PDF/honeydoc) survive the wire
intact instead of being utf-8 round-tripped.
- SSHDriver._run_file uses content_bytes when set, falls back to
encoding the str content otherwise.
Stealth (per feedback_stealth.md): cultivator does not introduce
any DECNET literal; the underlying generators are already
stealth-clean and the test suite asserts the contract holds.
Tests cover round-tripping every canary class through the cultivator,
verifying placement-path conventions, persona-login normalisation
("John Smith" -> /home/johnsmith/.aws/credentials), and the
no-DECNET-leak invariant.
Stage 6 of the realism migration. User-class file bodies (note,
todo, draft, script) optionally get LLM-authored content; system
classes (cron / daemon logs, /tmp caches) stay template-only because
formulaic *is* the right look for them.
New surface:
- realism.llm.circuit.LLMCircuitBreaker — process-local sliding-window
breaker. 3 consecutive failures trip open; 60s cooldown to half-open;
half-open success closes, failure re-opens. Protects the orchestrator
tick from sustained Ollama wedges (per-call timeout already covers
one-shot hangs).
- realism.prompts._style — em-dash suppression lifted from the
email prompt. Persona.uses_llms_heavily opts out per the
feedback_em_dash_llm_tell.md memory. Includes strip_em_dashes
belt-and-braces sub for output that slipped past the prompt rule.
- realism.prompts.filebody — class-conditioned prompts (note / todo
/ draft / script) with persona context, language pinning, output
shape rule.
- realism.bodies.make_body_with_llm — async wrapper around make_body
that calls the LLM when one is provided AND the breaker allows.
Falls back to template on timeout / error / empty / system-class.
Wiring:
- scheduler.pick_file accepts optional llm + llm_breaker + llm_timeout.
When the planner picks a create action and the content_class is a
user-class, the body_hint is replaced with the LLM-authored body
(or falls back to the deterministic body_hint).
- orchestrator.worker constructs get_llm() at startup gated by
DECNET_REALISM_LLM env var (any non-empty value enables; empty /
"off" / "none" / "0" disables). Passes llm + breaker through every
tick.
- decnet orchestrate gains --llm/--no-llm flag overriding the env var.
Stage 3b of the realism migration. A TODO.md planted on Monday gets a
checkbox flipped on Tuesday; a notes file grows a follow-up line; a
cron log gets a fresh entry tacked on. The synthetic_files row's
edit_count, last_modified, and content_hash advance.
New surface:
- EditAction dataclass (peer of FileAction in scheduler.py): carries
decky, path, persona, content_class, previous_body, mtime, and
synthetic_file_uuid for the worker's update path.
- realism.bodies.next_iteration(cls, persona, prev, rng): per-class
deterministic mutators. TODO flips an unchecked box and/or appends;
notes/drafts/scripts append; logs are append-only (mirroring real
log behaviour). Canary, cache_tmp, email raise KeyError —
unsupported.
- realism.planner.pick gains an edit branch: 60% create, 30% edit
(when an edit_candidate is supplied), 10% leave-alone. Returns
None on leave-alone — quiet ticks are realism too.
- scheduler.pick_file pre-fetches a single edit candidate via
repo.pick_random_synthetic_file_for_edit ~50% of ticks; the
planner decides whether to use it.
- SSHDriver._run_edit: turns next_iteration output into a
plant_file call (mtime-bumped, mode 0o644). Stashes new_body in
result.payload so the worker can hash it for synthetic_files.
- worker._bump_synthetic_file_after_edit: patches edit_count + 1,
last_modified=now, content_hash, last_body for the row UUID.
No-op when the row was pruned mid-flight.
- events.to_row / topic_for / event_type_for now recognise
EditAction (kind="file", action="file:edit").
Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's
hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed
filenames like notes-1777315854.txt with identical bodies per
template) with a persona-driven realism engine.
New surface:
- SyntheticFile SQLModel (synthetic_files table, UNIQUE on
decky_uuid+path) — per-(decky, path) state for the future
edit-in-place flow. Pre-v1, no _migrate_* helper.
- BaseRepository methods: record_synthetic_file,
update_synthetic_file, list_synthetic_files,
pick_random_synthetic_file_for_edit (used by stage 3b).
- realism/naming.py: per-content-class filename templates,
persona-conditioned. /var/log/cron.log + logrotate skeleton for
system-class; /home/<persona>/TODO.md, scratch.md, etc. for
user-class. Anti-regression test pins "no 8+ digit decimals in
basenames" (the realism failure today).
- realism/bodies.py: deterministic body templates per content_class.
TODO body uses checkbox markdown, script body has a shebang, cron
body matches syslog cron shape ("CRON[PID]: (user) CMD (...)").
- realism/planner.py: pick(deckies, now, rng) returns a Plan.
Diurnal-gated, weighted user/system content split (70/30 user
bias). Create-only in stage 3; edit branch lands in stage 3b.
Scheduler split:
- scheduler.pick is now traffic-only (sync).
- scheduler.pick_file is async, takes a repo, resolves personas
(Topology.email_personas for topology-source deckies; global
realism.personas_pool otherwise), and maps Plan -> FileAction.
- FileAction gains persona/content_class/mtime fields.
Worker:
- _one_tick rolls 50/50 between traffic and file each tick. After a
successful FileAction plant, _record_synthetic_file persists or
patches the synthetic_files row (catching the unique-constraint
collision on re-plant of the same path).
- SSHDriver._run_file passes action.mtime through to plant_file so
files don't all stamp at wall-clock-now.
Stage 4 of the realism migration. Lifts the driver Protocol into a
proper ABC with default plant_file/read_file methods (raise
NotImplementedError), and adds get_driver_for(action) so the
orchestrator worker can dispatch by action shape without isinstance
chains.
SSHDriver now inherits ActivityDriver and implements:
- plant_file: streams base64 via stdin (ARG_MAX-safe, mirrors
decnet.canary.planter; commit c17b9e0). Honours mtime via touch -d
so realism-planned files don't all stamp at wall-clock-now.
- read_file: docker exec cat with FileNotFoundError on rc=1, used by
the upcoming EditAction (stage 3b).
EmailDriver inherits ActivityDriver. Driver alias kept for back-compat
during the migration; removed once realism stages 5-7 land.
Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:
decnet/orchestrator/emailgen/llm/
base.py — LLMBackend Protocol + LLMResult + LLMTimeout
factory.py — get_llm() reads DECNET_EMAILGEN_LLM
impl/ollama.py — current 'ollama run' subprocess path
impl/fake.py — canned-output backend used by tests
Driver now takes an LLMBackend on construction (or inherits the
factory default). Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster. Swapping
Ollama for the Anthropic API / vLLM / llama.cpp is now a third branch
in factory.py; no driver rewrite needed.
Mirrors the convention used by decnet.web.db.factory + decnet.bus.factory
per the provider-subpackages-from-day-one rule in memory.
Two changes that unwind earlier MazeNET-only assumptions and fix a
realism tell:
1. Persona resolution is now per-decky-source, not topology-only. The
scheduler walks the union view (list_running_deckies, including
fleet MACVLAN/IPVLAN + SWARM shards) and picks the right persona
list for each source:
* topology decky -> Topology.email_personas (per-topology richness
preserved)
* fleet / shard -> a single host-wide pool loaded from disk
(DECNET_EMAILGEN_PERSONAS, /etc/decnet/email_personas.json, or
~/.decnet/email_personas.json)
Operators install the global pool via 'decnet emailgen
import-personas <file>' which validates with the same Pydantic
schema the worker uses.
2. The driver now runs 'touch -d <Date>' inside the docker exec right
after the EML write so file mtime matches the email's RFC 2822
Date: header. Without this an attacker 'ls -lt'ing the spool sees
every email clustered inside the worker's tick window — the
cluster itself was a stylometric tell.
CLI now exposes 'decnet emailgen' as a sub-app with 'run' (default,
backwards-compatible with bare 'decnet emailgen') and 'import-personas'.
list_running_deckies carries topology_id through so consumers can resolve
the parent topology without a second round-trip.
Second orchestrator worker (decnet emailgen) that drips persona-driven,
threaded, multi-language fake emails into running mail deckies. Personas
live on Topology.email_personas; topology-wide language_default falls
through to any persona that doesn't pin its own. Em-dashes are
suppressed at the prompt layer by default and only lifted for personas
explicitly marked uses_llms_heavily — em-dashes are an LLM tell and a
flat corpus of em-dashed mail is a giveaway.
EML delivery writes into /var/spool/decnet-emails/<thread>/<msg>.eml on
the mail decky via docker exec; wiring the IMAP/POP3 templates to read
from that spool (replacing the hardcoded _BAIT_EMAILS) is the next step.
Switches _one_tick from list_running_topology_deckies to
list_running_deckies (the union view added in 095500a). Resolves the
permanent "no actionable deckies (running+ssh count=0)" log on hosts
running only unihost MACVLAN / IPVLAN decoys — the orchestrator now
sees fleet_deckies rows alongside MazeNET topology rows and SWARM
DeckyShard rows.
Also fixes the misleading log message: the old "running+ssh count=N"
reported the *pre-filter* total (count of all running deckies, not
the SSH-eligible subset that scheduler.pick actually evaluates). New
line breaks down running, ssh_eligible, and per-source counts so
debugging "why isn't it picking?" no longer requires reading
scheduler internals.
Regression test: orchestrator integration suite now seeds fleet_deckies
rows (not just topology_deckies) and verifies a tick picks them and
records an event with dst="local:fleet-*" — proving the original bug
on the operator's mothership is fixed.
Every 100 ticks, trim per-dst_decky_uuid history down to 10000 rows
(oldest first). Keeps the events table bounded on long-running fleets
without paying the cost on every write.
Aligns the bus token with the DB column value; OrchestratorEvent.kind
is 'traffic'/'file' but the topic was 'activity'/'file'. The asymmetry
made consumer code (UI filter, SSE event names) need a translation
layer. No external subscribers existed yet.
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.
MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
to TCP-connect dst:22 and read the SSH banner — real on-the-wire
SSH-protocol traffic without shipping creds. File action drops or
refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
are running). Diurnal shaping, role-aware pairing, and session-aware
backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
end-to-end one-tick integration with FakeBus + SQLite.