New realism_config table (uuid PK + unique key) and two repo methods
(get/set) back an admin-only GET/PUT /api/v1/realism/config surface.
The planner now exposes apply_payload(payload) / current_payload() /
reset_to_defaults() and reads its weights through mutable module
globals; pick() resolves the live values each call. Validation
rejects negative weights, zero totals, out-of-range canary_probability,
and unknown content_class names; cross-list entries (a canary class on
the user list, etc.) are silently dropped.
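A minimal sketch of the validation rules described above. The function names, the example class names, and the use of ValueError are assumptions for illustration, not the planner's actual code.

```python
from __future__ import annotations

from typing import Mapping

# Hypothetical class names, for illustration only.
USER_CLASSES = {"note", "todo", "draft", "script"}
CANARY_CLASSES = {"canary_aws_creds", "canary_mysql_dump"}


def validate_weights(
    weights: Mapping[str, float], allowed: set[str], known: set[str]
) -> dict[str, float]:
    """Return cleaned weights for one list, or raise ValueError."""
    cleaned: dict[str, float] = {}
    for name, weight in weights.items():
        if name not in known:
            raise ValueError(f"unknown content_class: {name}")
        if weight < 0:
            raise ValueError(f"negative weight for {name}")
        if name not in allowed:
            continue  # cross-list entry: dropped silently
        cleaned[name] = float(weight)
    if sum(cleaned.values()) <= 0:
        raise ValueError("weights sum to zero")
    return cleaned


def validate_canary_probability(p: float) -> float:
    """Accept only probabilities in the closed interval [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"canary_probability out of range: {p}")
    return p
```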
The orchestrator worker calls _refresh_realism_config(repo) on
startup and every 5 ticks (~5min at 60s interval). Operator changes
land within one refresh window with no bus signal — the simpler path
for a knob whose latency tolerance is minutes.
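The refresh cadence can be sketched as a simple tick counter. The helper names below are hypothetical; only the "startup plus every 5 ticks" rule and the 60s interval come from the text above.

```python
REFRESH_EVERY_TICKS = 5
TICK_INTERVAL_SECONDS = 60


def should_refresh(tick: int) -> bool:
    """True on startup (tick 0) and on every fifth tick after that."""
    return tick % REFRESH_EVERY_TICKS == 0


def worst_case_latency_seconds() -> int:
    """Upper bound on how long an operator change waits to be picked up."""
    return REFRESH_EVERY_TICKS * TICK_INTERVAL_SECONDS
```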
The (decky_uuid VARCHAR(64), path VARCHAR(1024)) UNIQUE constraint
generated a 4352-byte composite key under utf8mb4 (4 bytes/char),
busting MySQL's 3072-byte cap and crashing decnet api on init with:
    Specified key was too long; max key length is 3072 bytes
Tighten path to VARCHAR(512) — (64+512)*4 = 2304 bytes, well under
the cap. Real realism + canary placement paths are short
(/home/<persona>/Documents/<file>, ~70 chars); 512 keeps headroom
without the index hassle. Pre-v1, no migration helper.
Adds a regression test pinning the (decky_uuid + path) byte budget so
a future widening fails loudly in CI rather than at MySQL deploy
time.
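The byte-budget arithmetic behind the fix, as a sketch of what such a regression check could look like. The helper name and constants are illustrative; the real test presumably reads the column lengths from the model.

```python
MYSQL_MAX_KEY_BYTES = 3072
UTF8MB4_BYTES_PER_CHAR = 4


def composite_key_bytes(*varchar_lengths: int) -> int:
    """Worst-case index size for a composite key of VARCHAR columns."""
    return sum(varchar_lengths) * UTF8MB4_BYTES_PER_CHAR


# Old definition: decky_uuid VARCHAR(64) + path VARCHAR(1024)
old_budget = composite_key_bytes(64, 1024)
# New definition: decky_uuid VARCHAR(64) + path VARCHAR(512)
new_budget = composite_key_bytes(64, 512)
```

Here old_budget exceeds the cap and new_budget fits under it, which is the property the regression test pins.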
Adds GET /api/v1/realism/synthetic-files (paginated list, filters by
decky_uuid, persona, content_class) and
GET /api/v1/realism/synthetic-files/{uuid} (single row with last_body
and a truncated:bool flag set when the stored body is at the 64KB cap).
Repo gains count_synthetic_files() and get_synthetic_file(uuid). The
list view drops last_body to keep the wire payload bounded; the detail
endpoint is the only path that returns it. Read-only — orchestrator
remains the sole writer.
decnet/realism/naming._home and decnet/canary/cultivator._persona_login
both normalised "John Smith"→"johnsmith" with identical logic. Lift
to decnet.realism.personas.login_for(persona) and have both consumers
import it. Drift between the two would have left canary placement and
realism path naming using different login derivations.
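A minimal sketch of the shared helper, assuming it takes the persona's display name and that normalisation means lowercasing and dropping everything except ASCII letters and digits. Only the "John Smith" to "johnsmith" mapping is confirmed by the text above.

```python
import re


def login_for(persona_name: str) -> str:
    """Derive a login from a display name (assumed rule: lowercase, alnum only)."""
    return re.sub(r"[^a-z0-9]", "", persona_name.lower())


def home_for(persona_name: str) -> str:
    """Hypothetical consumer: build the home directory from the shared login."""
    return f"/home/{login_for(persona_name)}"
```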
The orchestrator worker clipped last_body at write time, but the repo
did not enforce the limit itself. A future caller that forgot the clip
would write the full body. Move the clip to record_synthetic_file and
update_synthetic_file via SYNTHETIC_FILE_BODY_LIMIT in
decnet/web/db/models/realism.py. Worker now passes the full body and
trusts the repo. Tests retargeted to assert repo enforcement.
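A sketch of repo-side enforcement. It assumes the limit counts characters of a str body; whether the real constant counts bytes or characters is not stated here, and both helper names are hypothetical.

```python
SYNTHETIC_FILE_BODY_LIMIT = 64 * 1024  # assumption: measured in characters


def clip_body(body: str) -> str:
    """Clip once, at the repo boundary, so callers cannot forget."""
    return body[:SYNTHETIC_FILE_BODY_LIMIT]


def is_truncated(stored_body: str) -> bool:
    """A stored body sitting at the cap is reported as truncated."""
    return len(stored_body) >= SYNTHETIC_FILE_BODY_LIMIT
```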
Four gaps from the realism migration plan, plus one flaky test
fixed.
Added:
- tests/deploy/test_orchestrator_unit.py — replaces the dead
test_emailgen_unit.py. Asserts:
* decnet-orchestrator.service.j2 carries the DECNET_REALISM_*
env block (LLM, MODEL, TIMEOUT, PERSONAS) so per-host tuning
works without editing the .j2.
* Legacy DECNET_EMAILGEN_* vars are NOT referenced — clean break
contract from stage 5.
* decnet.target wants orchestrator + canary, does NOT want
decnet-emailgen.service. Anti-regression for service-collapse.
* deploy/decnet-emailgen.service.j2 stays deleted.
- tests/orchestrator/test_worker_integration.py — new
test_one_tick_email_branch_records_orchestrator_email. Pins the
action-roll to email, seeds a topology with an IMAP mail decky +
two personas, stubs LLM + docker-exec write paths, verifies an
orchestrator_emails row + bus event land. Restores end-to-end
email coverage that was lost when the pre-collapse
test_worker_integration.py was deleted.
- tests/realism/test_synthetic_files_truncation.py — pins the 64KB
last_body cap on create + edit, and documents the consequence:
edit candidates carry a truncated snapshot of files that exceeded
the cap. If a future change lifts the cap, _LIMIT in the test
must lift with it.
Fixed flaky:
- tests/orchestrator/test_scheduler.py — two pick_file tests
pinned to random.Random(1). Without a seed, the 3% canary gate
(stage 7) and 10% leave-alone roll occasionally flaked the
assertions because the _FakeRepo doesn't carry a
create_canary_token method.
Note: the existing
test_realism_subprocess_import_personas_rejects_in_agent_mode
already covers agent-mode rejection of decnet realism
import-personas; no new gating test needed.
Stage 7 — final stage of the realism migration. Canary plants are
now scheduled by the same realism planner that handles inert content,
keeping the orchestrator as the single decision point and avoiding
duplicate diurnal / persona / rate-limit logic in the canary
subsystem.
New surface:
- decnet/canary/cultivator.py: cultivate(plan, repo) builds a
CanaryContext, calls the right generator (canary_aws_creds ->
aws_creds, canary_mysql_dump -> mysql_dump, …), persists the
canary_tokens row before plant so the canary worker can attribute
callbacks even on plant-time previews. Resolves canary placements
to credible operator paths (~/.aws/credentials, ~/.ssh/id_rsa,
/var/backups/db_backup.sql).
- realism/planner.py adds 8 canary content_classes uniformly weighted
inside a 3% probability gate. Hard-capped at one canary per tick; the
create branch falls through to inert content otherwise.
- scheduler.pick_file dispatches canary content_class to the
cultivator; FileAction grows an optional content_bytes field so
binary canary artifacts (DOCX/PDF/honeydoc) survive the wire
intact instead of being utf-8 round-tripped.
- SSHDriver._run_file uses content_bytes when set, falls back to
encoding the str content otherwise.
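The content_bytes fallback can be sketched as below. FileActionSketch is a simplified stand-in with hypothetical field defaults, not the real FileAction, and payload_for is an illustrative name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileActionSketch:
    path: str
    content: str = ""
    content_bytes: Optional[bytes] = None  # set only for binary artifacts


def payload_for(action: FileActionSketch) -> bytes:
    """Prefer raw bytes when set; otherwise encode the text content."""
    if action.content_bytes is not None:
        return action.content_bytes
    return action.content.encode("utf-8")
```

Bytes such as 0xFF are not valid UTF-8, so a binary artifact forced through a str field would be altered or rejected; carrying it as bytes avoids that.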
Stealth (per feedback_stealth.md): cultivator does not introduce
any DECNET literal; the underlying generators are already
stealth-clean and the test suite asserts the contract holds.
Tests cover round-tripping every canary class through the cultivator,
verifying placement-path conventions, persona-login normalisation
("John Smith" -> /home/johnsmith/.aws/credentials), and the
no-DECNET-leak invariant.
Stage 6 of the realism migration. User-class file bodies (note,
todo, draft, script) optionally get LLM-authored content; system
classes (cron / daemon logs, /tmp caches) stay template-only because
formulaic *is* the right look for them.
New surface:
- realism.llm.circuit.LLMCircuitBreaker — process-local sliding-window
breaker. 3 consecutive failures trip open; 60s cooldown to half-open;
half-open success closes, failure re-opens. Protects the orchestrator
tick from sustained Ollama wedges (per-call timeout already covers
one-shot hangs).
- realism.prompts._style — em-dash suppression lifted from the
email prompt. Persona.uses_llms_heavily opts out per the
feedback_em_dash_llm_tell.md memory. Includes strip_em_dashes
belt-and-braces sub for output that slipped past the prompt rule.
- realism.prompts.filebody — class-conditioned prompts (note / todo
/ draft / script) with persona context, language pinning, output
shape rule.
- realism.bodies.make_body_with_llm — async wrapper around make_body
that calls the LLM when one is provided AND the breaker allows.
Falls back to template on timeout / error / empty / system-class.
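The breaker state machine described above, as a minimal sketch. The class name, method names, and injectable clock are assumptions; only the thresholds (3 consecutive failures, 60s cooldown) and the half-open behaviour come from the text.

```python
import time
from typing import Callable, Optional


class BreakerSketch:
    def __init__(
        self,
        threshold: int = 3,
        cooldown: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = threshold
        self._cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Closed: always allow. Open: allow a probe once the cooldown has passed."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self._cooldown

    def record_success(self) -> None:
        """Any success (including a half-open probe) closes the breaker."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        if self._opened_at is not None:
            # Failed half-open probe: re-open and restart the cooldown.
            self._opened_at = self._clock()
            return
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()
```

A caller would check allow() before invoking the LLM and fall back to the template body when it returns False.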
Wiring:
- scheduler.pick_file accepts optional llm + llm_breaker + llm_timeout.
When the planner picks a create action and the content_class is a
user-class, the body_hint is replaced with the LLM-authored body
(or falls back to the deterministic body_hint).
- orchestrator.worker constructs get_llm() at startup gated by
DECNET_REALISM_LLM env var (empty / "off" / "none" / "0" disables;
any other value enables). Passes llm + breaker through every
tick.
- decnet orchestrate gains --llm/--no-llm flag overriding the env var.
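A sketch of how the env gate and the CLI override could combine. The function name and the cli_override parameter are hypothetical, and case-insensitive, whitespace-tolerant matching is an assumption.

```python
import os
from typing import Mapping, Optional

_DISABLED_VALUES = {"", "off", "none", "0"}


def llm_enabled(
    env: Optional[Mapping[str, str]] = None,
    cli_override: Optional[bool] = None,
) -> bool:
    """CLI flag wins when given; otherwise read DECNET_REALISM_LLM."""
    if cli_override is not None:
        return cli_override
    source = os.environ if env is None else env
    value = source.get("DECNET_REALISM_LLM", "")
    # Assumption: comparison ignores case and surrounding whitespace.
    return value.strip().lower() not in _DISABLED_VALUES
```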
Stage 3b of the realism migration. A TODO.md planted on Monday gets a
checkbox flipped on Tuesday; a notes file grows a follow-up line; a
cron log gets a fresh entry tacked on. The synthetic_files row's
edit_count, last_modified, and content_hash advance.
New surface:
- EditAction dataclass (peer of FileAction in scheduler.py): carries
decky, path, persona, content_class, previous_body, mtime, and
synthetic_file_uuid for the worker's update path.
- realism.bodies.next_iteration(cls, persona, prev, rng): per-class
deterministic mutators. TODO flips an unchecked box and/or appends;
notes/drafts/scripts append; logs are append-only (mirroring real
log behaviour). Canary, cache_tmp, email raise KeyError —
unsupported.
- realism.planner.pick gains an edit branch: 60% create, 30% edit
(when an edit_candidate is supplied), 10% leave-alone. Returns
None on leave-alone — quiet ticks are realism too.
- scheduler.pick_file pre-fetches a single edit candidate via
repo.pick_random_synthetic_file_for_edit ~50% of ticks; the
planner decides whether to use it.
- SSHDriver._run_edit: turns next_iteration output into a
plant_file call (mtime-bumped, mode 0o644). Stashes new_body in
result.payload so the worker can hash it for synthetic_files.
- worker._bump_synthetic_file_after_edit: patches edit_count + 1,
last_modified=now, content_hash, last_body for the row UUID.
No-op when the row was pruned mid-flight.
- events.to_row / topic_for / event_type_for now recognise
EditAction (kind="file", action="file:edit").
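The create / edit / leave-alone split can be sketched as a single roll. The function name is hypothetical, and the behaviour when the roll lands in the edit band without a candidate (falling back to create here) is an assumption the text above does not settle.

```python
from typing import Optional


def choose_action(roll: float, has_edit_candidate: bool) -> Optional[str]:
    """Map a roll in [0, 1) to 'create', 'edit', or None (leave-alone)."""
    if roll < 0.60:
        return "create"
    if roll < 0.90:
        # Assumption: with no candidate, the edit band falls back to create.
        return "edit" if has_edit_candidate else "create"
    return None  # quiet tick
```

In use, the roll would come from the rng passed to the planner, for example choose_action(rng.random(), candidate is not None).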
Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's
hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed
filenames like notes-1777315854.txt with identical bodies per
template) with a persona-driven realism engine.
New surface:
- SyntheticFile SQLModel (synthetic_files table, UNIQUE on
decky_uuid+path) — per-(decky, path) state for the future
edit-in-place flow. Pre-v1, no _migrate_* helper.
- BaseRepository methods: record_synthetic_file,
update_synthetic_file, list_synthetic_files,
pick_random_synthetic_file_for_edit (used by stage 3b).
- realism/naming.py: per-content-class filename templates,
persona-conditioned. /var/log/cron.log + logrotate skeleton for
system-class; /home/<persona>/TODO.md, scratch.md, etc. for
user-class. Anti-regression test pins "no 8+ digit decimals in
basenames" (today's realism failure: epoch-suffixed filenames).
- realism/bodies.py: deterministic body templates per content_class.
TODO body uses checkbox markdown, script body has a shebang, cron
body matches syslog cron shape ("CRON[PID]: (user) CMD (...)").
- realism/planner.py: pick(deckies, now, rng) returns a Plan.
Diurnal-gated, weighted user/system content split (70/30 user
bias). Create-only in stage 3; edit branch lands in stage 3b.
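The "no 8+ digit decimals in basenames" rule from the naming bullet can be checked with a short regex. This is a sketch of the idea with a hypothetical helper name, not the test itself.

```python
import posixpath
import re

_EPOCH_LIKE = re.compile(r"\d{8,}")


def has_epoch_like_basename(path: str) -> bool:
    """True when the file's basename contains a run of 8 or more digits."""
    return bool(_EPOCH_LIKE.search(posixpath.basename(path)))
```

Only the basename is inspected, so digit runs in parent directories do not trigger the check.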
Scheduler split:
- scheduler.pick is now traffic-only (sync).
- scheduler.pick_file is async, takes a repo, resolves personas
(Topology.email_personas for topology-source deckies; global
realism.personas_pool otherwise), and maps Plan -> FileAction.
- FileAction gains persona/content_class/mtime fields.
Worker:
- _one_tick rolls 50/50 between traffic and file each tick. After a
successful FileAction plant, _record_synthetic_file persists or
patches the synthetic_files row (catching the unique-constraint
collision on re-plant of the same path).
- SSHDriver._run_file passes action.mtime through to plant_file so
files don't all stamp at wall-clock-now.
Empty subpackage skeleton for the realism migration: ContentClass enum
(file/email/canary content categories), Plan dataclass (frozen, with
edit-action invariant), in_work_hours window check (wrap-around
supported, fail-open on parse error), and sample_mtime for backdated
file timestamps that snap into a persona's active hours.
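A sketch of a wrap-around window check that fails open. The "HH:MM-HH:MM" window format and the exact signature are assumptions; the real in_work_hours may take different arguments.

```python
from datetime import datetime, time


def in_work_hours(now: datetime, window: str) -> bool:
    """Check whether now falls inside a window like '09:00-17:00'."""
    try:
        start_s, end_s = window.split("-")
        start = time.fromisoformat(start_s.strip())
        end = time.fromisoformat(end_s.strip())
    except ValueError:
        return True  # fail open: an unparseable window never blocks activity
    current = now.time()
    if start <= end:
        return start <= current < end
    # Wrap-around window (e.g. 22:00-06:00) spans midnight.
    return current >= start or current < end
```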
Stage 1 of the orchestrator+canary realism unification — no
production caller wired yet; planner.pick is a stub returning None
until stage 3.