Realism
The realism content engine is what makes DECNET deckies look lived-in. Without it, a deployed honeypot has a frozen filesystem, mailboxes that never grow, and timestamps clustered at deploy time. Attackers notice. The realism library — decnet/realism/ — drives the orchestrator's per-tick file plants and email drops so each decky grows files at plausible hours, with persona-conditioned names and bodies, occasionally edited in place, and very rarely seeded with callback-bearing canaries.
This is the operator-facing guide. For the underlying module surface see Module Reference — Workers § Orchestrator.
Why this exists
Pre-realism, the orchestrator's file plants looked like this on a deployed decky:
$ ls /home/admin/
notes-1777254307.txt notes-1777260507.txt notes-1777266693.txt notes-1777274923.txt
$ cat notes-1777254307.txt
todo: rotate keys; check on backup task
Two tells:
- Filenames are unix epochs. No real user names a file `notes-1777315854.txt`. They write `notes.txt`, `TODO.md`, `keys.txt`.
- Identical bodies. Every `notes-*.txt` had the same one-line content because the generator was three hardcoded templates.
The realism engine fixes both — and adds edit-in-place, diurnal pacing, optional LLM enrichment, and canary cultivation on the same pacing.
Architecture in one paragraph
The orchestrator ticks every 60 s and rolls a weighted action kind: 45 % SSH traffic, 45 % file plant or edit, 10 % email. The file branch asks the realism planner for a Plan (decky, persona, content_class, action, mtime, body hint). The planner enforces a diurnal gate (only personas in their active_hours window are considered), weights content classes (user > system > canary), and decides create / edit / leave-alone. The plan flows through the SSH driver, which writes the bytes via base64-on-stdin docker exec with a backdated mtime via touch -d. After a successful plant or edit the worker persists or patches a synthetic_files row so the next tick can edit it again. When LLM enrichment is enabled, user-class bodies get one Ollama round-trip each; on timeout / error / breaker-trip the deterministic template is the fallback.
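The weighted roll at the top of each tick can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the weights come from the paragraph above, while `ACTION_WEIGHTS` and `roll_action_kind` are hypothetical names.

```python
import random

# Weights from the paragraph above: 45 % traffic, 45 % file, 10 % email.
# The dict and function names are illustrative, not DECNET's own.
ACTION_WEIGHTS = {"traffic": 45, "file": 45, "email": 10}


def roll_action_kind(rng: random.Random) -> str:
    """Pick one action kind for this tick, proportionally to its weight."""
    kinds = list(ACTION_WEIGHTS)
    weights = list(ACTION_WEIGHTS.values())
    return rng.choices(kinds, weights=weights, k=1)[0]
```

Passing the `random.Random` instance in explicitly keeps the roll reproducible in tests; a long-running worker would typically share one unseeded instance.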
Content classes
Every planted artifact maps to exactly one ContentClass member (defined in decnet/realism/taxonomy.py).
| Class | Category | LLM-eligible | Examples |
|---|---|---|---|
| `note` | user | yes | `~/notes.txt`, `~/scratch.md`, `~/keys.txt` |
| `todo` | user | yes | `~/TODO.md`, `~/todo.txt`, `~/things.md` |
| `draft` | user | yes | `~/Q3-budget-DRAFT.md`, `~/proposal.md` |
| `script` | user | yes | `~/backup.sh`, `~/cleanup.sh`, `~/fix.py` |
| `log_cron` | system | no | `/var/log/cron.log`, `/var/log/cron.log.1`, `/var/log/cron.log.2.gz` |
| `log_daemon` | system | no | `/var/log/daemon.log`, `/var/log/syslog`, `/var/log/auth.log` |
| `cache_tmp` | system | no | `/tmp/.cache-XXXXXX` (mkstemp shape) |
| `email` | | yes | mail-decky maildir contents |
| `canary_aws_creds` | canary | no | `~/.aws/credentials` (passive) |
| `canary_env_file` | canary | no | `~/app/.env` (HTTP callback) |
| `canary_git_config` | canary | no | `~/.git/config` (HTTP callback) |
| `canary_ssh_key` | canary | no | `~/.ssh/id_rsa` (DNS callback in comment) |
| `canary_honeydoc` | canary | no | `~/Documents/notes.html` (HTTP callback) |
| `canary_honeydoc_docx` | canary | no | `~/Documents/Q3-Operations-Review.docx` (DOCX with remote 1×1 image) |
| `canary_honeydoc_pdf` | canary | no | same as docx, PDF flavour |
| `canary_mysql_dump` | canary | no | `/var/backups/db_backup.sql` (replica-handshake DNS phone-home) |
System-class content is deliberately template-only. Real cron logs are formulaic — an LLM-authored cron log is more suspicious than a templated one. Canary classes are also template-only because their generators are deterministic by design (re-seeding from the same callback token must produce the same bytes for planter idempotency).
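One way to picture the taxonomy is an enum plus a category lookup. The sketch below covers only a subset of the file classes from the table; the real `ContentClass` in `decnet/realism/taxonomy.py` may be shaped differently, and `category_of` and `is_llm_eligible` are hypothetical helpers.

```python
from enum import Enum


class ContentClass(str, Enum):
    # Subset of the classes listed in the table above.
    NOTE = "note"
    TODO = "todo"
    DRAFT = "draft"
    SCRIPT = "script"
    LOG_CRON = "log_cron"
    LOG_DAEMON = "log_daemon"
    CACHE_TMP = "cache_tmp"
    CANARY_AWS_CREDS = "canary_aws_creds"
    CANARY_SSH_KEY = "canary_ssh_key"


USER_CLASSES = frozenset(
    {ContentClass.NOTE, ContentClass.TODO, ContentClass.DRAFT, ContentClass.SCRIPT}
)
SYSTEM_CLASSES = frozenset(
    {ContentClass.LOG_CRON, ContentClass.LOG_DAEMON, ContentClass.CACHE_TMP}
)


def category_of(content_class: ContentClass) -> str:
    """Return 'user', 'system', or 'canary' for a file content class."""
    if content_class in USER_CLASSES:
        return "user"
    if content_class in SYSTEM_CLASSES:
        return "system"
    return "canary"


def is_llm_eligible(content_class: ContentClass) -> bool:
    """Among file classes, only user-class bodies are ever sent to the LLM."""
    return content_class in USER_CLASSES
```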
Personas
Personas are fictional employees the realism engine writes as. Each persona carries:
- `name`, `email`, `role`: basic identity.
- `tone`: `formal` / `direct` / `casual` / `technical` / `custom`; drives the LLM voice.
- `mannerisms`: short list of stylistic ticks; 1–2 are randomly picked into each prompt.
- `language`: ISO 639-1; the LLM is instructed not to code-switch.
- `active_hours`: `"HH:MM-HH:MM"`, supports wrap-around (`"22:00-06:00"`). The planner skips a persona outside its window.
- `signature`: optional verbatim block for emails.
- `uses_llms_heavily`: opt-out for the em-dash suppression (see below).
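As a rough picture of the record, the fields above could be modelled like this. The field names follow the list; the defaults, the `Persona` dataclass itself, and the `pick_mannerisms` helper are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Persona:
    name: str
    email: str
    role: str
    tone: str = "direct"               # formal / direct / casual / technical / custom
    mannerisms: Tuple[str, ...] = ()
    language: str = "en"               # ISO 639-1
    active_hours: str = "09:00-18:00"  # "HH:MM-HH:MM", may wrap past midnight
    signature: Optional[str] = None
    uses_llms_heavily: bool = False    # True opts out of em-dash suppression


def pick_mannerisms(persona: Persona, rng: random.Random) -> List[str]:
    """Pick one or two mannerisms to mention in a prompt (none if the list is empty)."""
    if not persona.mannerisms:
        return []
    k = min(len(persona.mannerisms), rng.randint(1, 2))
    return rng.sample(list(persona.mannerisms), k)
```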
Two pools
- Topology pool: `Topology.email_personas`, edited per topology via the dashboard's Persona Generation page (`/topologies/:id/personas`). MazeNET-topology deckies use this.
- Global pool: a JSON file on disk, edited via `/realism/personas` on the dashboard or `decnet realism import-personas <file>` on the CLI. Fleet (MACVLAN/IPVLAN) and SWARM-shard deckies use this. Path resolution: `$DECNET_REALISM_PERSONAS` → `/etc/decnet/email_personas.json` → `~/.decnet/email_personas.json`.
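The global-pool path resolution can be sketched as a first-match lookup. The order is the one stated above; whether the real resolver checks for file existence, and what it does when nothing matches, is an assumption here, and `resolve_personas_path` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_personas_path(
    env: Optional[Mapping[str, str]] = None,
    exists: Optional[Callable[[Path], bool]] = None,
) -> Optional[Path]:
    """Resolve the global persona pool: env override, then /etc, then ~/.decnet."""
    env = os.environ if env is None else env
    exists = Path.exists if exists is None else exists

    # Assumption: an explicit override always wins, even if the file is missing.
    override = env.get("DECNET_REALISM_PERSONAS", "").strip()
    if override:
        return Path(override)

    candidates = (
        Path("/etc/decnet/email_personas.json"),
        Path.home() / ".decnet" / "email_personas.json",
    )
    for candidate in candidates:
        if exists(candidate):
            return candidate
    return None  # assumption: no pool configured
```

The `env` and `exists` parameters exist only so the lookup can be exercised without touching the real environment or filesystem.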
Files vary by user (admin vs ubuntu vs service), so a single decky can host files from multiple personas — the planner samples per tick, persists the picked persona on the synthetic_files row, and never binds one decky to a single fictional employee.
Em-dash suppression
Em-dashes (—) are a strong stylometric tell for LLM-authored prose. By default the prompt builder instructs the model to avoid them, and a belt-and-braces strip_em_dashes substitutes any that slip through. Personas with uses_llms_heavily=true opt out — they're meant to look like the kind of person who really does write that way.
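A post-processing pass of this kind might look like the sketch below. It is a hypothetical reimplementation: the real `strip_em_dashes` may take different arguments, and replacing the dash with a comma is an assumption (the source only says any that slip through are substituted).

```python
import re

# U+2014 EM DASH, together with any whitespace around it.
_EM_DASH_RE = re.compile(r"\s*\u2014\s*")


def strip_em_dashes(text: str, uses_llms_heavily: bool = False) -> str:
    """Replace em-dashes with a comma unless the persona opted out."""
    if uses_llms_heavily:
        return text  # this persona is meant to write that way
    return _EM_DASH_RE.sub(", ", text)
```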
Diurnal gating
Two helpers in decnet/realism/diurnal.py:
- `in_work_hours(window, now)`: gates the planner so a persona's files only appear inside the persona's window. Wrap-around is supported. Malformed windows fail open (a typo never silences the whole fleet).
- `sample_mtime(window, now, *, backdate_min_hours=0.5, backdate_max_days=14.0)`: returns a backdated `datetime` whose hour-of-day falls inside the window. Drivers pass this to `touch -d` after every plant. The hour-snap is skipped when the candidate already lands in window; when it has to snap, the result is shifted back at least one day so it stays in the past.
Net effect: a ~/TODO.md planted during admin's 09:00–18:00 window will report mtimes inside that window, biased toward "edited recently" but never wall-clock-now.
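The gate's wrap-around and fail-open behaviour can be illustrated with a short sketch. It mirrors the `in_work_hours(window, now)` signature described above but is not the library's code; the half-open boundary (start inclusive, end exclusive) is an assumption.

```python
from datetime import datetime, time
from typing import Tuple


def _parse_window(window: str) -> Tuple[time, time]:
    """Parse "HH:MM-HH:MM" into (start, end); raises ValueError when malformed."""
    start_s, end_s = window.split("-")
    return time.fromisoformat(start_s.strip()), time.fromisoformat(end_s.strip())


def in_work_hours(window: str, now: datetime) -> bool:
    """True when `now` falls inside the window; malformed windows fail open."""
    try:
        start, end = _parse_window(window)
    except (ValueError, AttributeError):
        return True  # fail open: a typo must not silence the persona
    current = now.time()
    if start <= end:
        return start <= current < end
    # Wrap-around window such as 22:00-06:00: late evening or early morning.
    return current >= start or current < end
```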
Edit-in-place
When the planner picks action="edit", the orchestrator reads the previous body from the synthetic_files row, asks realism.bodies.next_iteration for a plausible mutation, writes it back with a fresh in-window mtime, and increments edit_count by one. The mutation depends on content_class:
- TODO: flip an unchecked box to `[x]`, append a new item, or both.
- Note / draft / script: append a new line / paragraph / comment.
- Log_cron / log_daemon: append a new syslog line (logs are append-only).
Canary classes, cache_tmp, and email don't support edits — the planner filters them out at candidate-selection time.
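For the TODO case, a mutation of this kind could be as small as the sketch below. `next_todo_iteration` is a hypothetical stand-in for the TODO branch of `realism.bodies.next_iteration`, whose real signature is not shown here; the Markdown checkbox shape (`- [ ]`) is also an assumption.

```python
from typing import Optional


def next_todo_iteration(body: str, new_item: Optional[str] = None) -> str:
    """Tick the first unchecked box and optionally append a new unchecked item."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if "[ ]" in line:
            lines[i] = line.replace("[ ]", "[x]", 1)
            break  # one flip per edit keeps the change small and plausible
    if new_item:
        lines.append(f"- [ ] {new_item}")
    return "\n".join(lines) + "\n"
```

Keeping each edit to a single flip or append means successive versions of the file differ by one small, human-looking change.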
LLM enrichment
Optional. When DECNET_REALISM_LLM is set to an enabling value such as ollama or fake (unset, off, none, 0, false, and disabled all turn enrichment off), the orchestrator builds an LLMBackend at startup and passes it through every tick. For user-class file bodies (note / todo / draft / script) the worker:
- Builds a class-conditioned prompt (`decnet/realism/prompts/filebody.py`).
- Calls `await asyncio.wait_for(llm.generate(prompt), timeout=DECNET_REALISM_TIMEOUT)`.
- Falls back to the deterministic template on `LLMTimeout`, error, empty output, or non-success.
- Strips em-dashes (unless the persona opted out of suppression) on the way out.
System-class content (logs, /tmp caches) and canary classes never invoke the LLM — those are template-only by design.
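The timeout-and-fallback step can be sketched with plain `asyncio`. This is an illustration under assumptions: `generate_body` is a hypothetical helper, `llm_generate` stands in for the backend's `generate` coroutine, and the project's own `LLMTimeout` exception is replaced here by whatever `asyncio.wait_for` raises.

```python
import asyncio
from typing import Awaitable, Callable


async def generate_body(
    llm_generate: Callable[[str], Awaitable[str]],
    prompt: str,
    template_body: str,
    timeout: float,
) -> str:
    """Try one LLM round-trip; fall back to the template on any failure."""
    try:
        text = await asyncio.wait_for(llm_generate(prompt), timeout=timeout)
    except Exception:
        # Covers the wait_for timeout as well as backend errors.
        return template_body
    text = (text or "").strip()
    return text or template_body  # empty output also falls back
```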
Circuit breaker
The per-call timeout protects one tick from one wedged Ollama; the breaker (decnet/realism/llm/circuit.py) protects the worker from a sustained problem. After 3 consecutive failures it flips open and short-circuits subsequent calls to the template fallback for 60 s, then half-opens to probe — success closes, failure re-opens with a fresh cooldown. State is process-local. Counters reset on any single success.
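The state machine described above fits in a few lines. The sketch below uses the thresholds from the paragraph (3 failures, 60 s cooldown); the class and method names are hypothetical and need not match `decnet/realism/llm/circuit.py`. As a simplification, the half-open state lets every call through until a result is recorded.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Process-local breaker: closed, open, or half-open after the cooldown."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._cooldown = cooldown_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """False while open; True when closed or once the cooldown has elapsed."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self._cooldown

    def record_success(self) -> None:
        # Any single success closes the breaker and resets the counter.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe, or hitting the threshold, (re)opens with a fresh cooldown.
        if self._opened_at is not None or self._failures >= self._threshold:
            self._opened_at = self._clock()
```

A caller would check `allow()` before each LLM call, go straight to the template when it returns `False`, and report the outcome with `record_success()` or `record_failure()` otherwise.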
Canary cultivation
Roughly 3 % of file ticks land on a canary class. The cultivator (decnet/canary/cultivator.py):
- Maps the `canary_*` content_class to a generator name (`canary_aws_creds` → `aws_creds`, `canary_mysql_dump` → `mysql_dump`, …).
- Mints a fresh `callback_token` (16 url-safe bytes).
- Builds a `CanaryContext` from `$DECNET_CANARY_HTTP_BASE` and `$DECNET_CANARY_DNS_ZONE`.
- Calls the generator for the bytes.
- Persists a `canary_tokens` row before plant so the canary worker can attribute callbacks even on plant-time previews.
- Returns a `CanaryArtifact` with the placement path resolved per-class (`~/.aws/credentials`, `~/.ssh/id_rsa`, `/var/backups/db_backup.sql`, …).
Required env: at least DECNET_CANARY_HTTP_BASE for HTTP-callback generators, DECNET_CANARY_DNS_ZONE for DNS-callback ones (ssh_key, mysql_dump). Without them the cultivator raises and the orchestrator falls through to a non-canary plan — the tick isn't wasted.
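The first three cultivator steps and the env check can be sketched as below. Several things here are assumptions: the `CanaryContext` dataclass is a stand-in for the real one, generator names other than `aws_creds` and `mysql_dump` are guessed by stripping the `canary_` prefix, and `secrets.token_urlsafe(16)` is one plausible reading of "16 url-safe bytes".

```python
import secrets
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed generator names (content class minus "canary_"); the source confirms
# this mapping only for aws_creds and mysql_dump.
DNS_GENERATORS = frozenset({"ssh_key", "mysql_dump"})
HTTP_GENERATORS = frozenset(
    {"env_file", "git_config", "honeydoc", "honeydoc_docx", "honeydoc_pdf"}
)


@dataclass(frozen=True)
class CanaryContext:
    http_base: Optional[str]
    dns_zone: Optional[str]


def generator_name(content_class: str) -> str:
    """Map e.g. 'canary_aws_creds' to 'aws_creds'."""
    prefix = "canary_"
    if not content_class.startswith(prefix):
        raise ValueError(f"not a canary class: {content_class!r}")
    return content_class[len(prefix):]


def mint_callback_token() -> str:
    """16 random bytes, url-safe base64 encoded (22 characters)."""
    return secrets.token_urlsafe(16)


def build_context(generator: str, env: Mapping[str, str]) -> CanaryContext:
    """Build the context, raising when the generator's callback channel is unset."""
    http_base = env.get("DECNET_CANARY_HTTP_BASE") or None
    dns_zone = env.get("DECNET_CANARY_DNS_ZONE") or None
    if generator in HTTP_GENERATORS and http_base is None:
        raise RuntimeError("DECNET_CANARY_HTTP_BASE is required for " + generator)
    if generator in DNS_GENERATORS and dns_zone is None:
        raise RuntimeError("DECNET_CANARY_DNS_ZONE is required for " + generator)
    return CanaryContext(http_base=http_base, dns_zone=dns_zone)
```

In the flow described above, the orchestrator would treat the raised error as a signal to fall through to a non-canary plan for that tick.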
Stealth: the cultivator never adds the DECNET literal to artifact bytes. The underlying generators are already stealth-clean. A test asserts the contract holds (tests/canary/test_cultivator.py::test_cultivate_artifact_does_not_leak_decnet_string).
Volume and rate
Canary tokens are real: each carries a real DNS subdomain, a real HTTP slug, a real canary_tokens row, and (when tripped) a real alert. The 3 % gate is deliberately conservative, because flooding the fleet makes the dashboard noisy and explodes the alert surface. To change the rate, raise or lower _CANARY_PROBABILITY in decnet/realism/planner.py. The planner enforces no per-decky daily cap today, but the UNIQUE constraint on (decky_uuid, path) in synthetic_files provides natural deduplication: each decky keeps at most one row per path.
Storage
Two tables back this:
- `synthetic_files`: per-`(decky_uuid, path)` row. Carries `persona`, `content_class`, `created_at`, `last_modified`, `edit_count`, `content_hash`, `last_body` (capped at 64 KB). Schema in `decnet/web/db/models/realism.py`.
- `canary_tokens`: existing canary-subsystem table; the cultivator writes one row per canary plant.
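To make the `(decky_uuid, path)` uniqueness concrete, here is a stand-alone SQLite sketch of a table with the columns listed above and an upsert that either creates the row or records an edit. Column types, the SHA-256 hash, the 65536-byte reading of "64 KB", and `record_plant` itself are assumptions; the real schema lives in `decnet/web/db/models/realism.py`.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

MAX_BODY_BYTES = 64 * 1024  # assumed reading of the 64 KB cap

SCHEMA = """
CREATE TABLE synthetic_files (
    decky_uuid    TEXT NOT NULL,
    path          TEXT NOT NULL,
    persona       TEXT,
    content_class TEXT NOT NULL,
    created_at    TEXT NOT NULL,
    last_modified TEXT NOT NULL,
    edit_count    INTEGER NOT NULL DEFAULT 0,
    content_hash  TEXT NOT NULL,
    last_body     TEXT,
    UNIQUE (decky_uuid, path)
)
"""


def record_plant(conn, decky_uuid, path, persona, content_class, body):
    """Insert a row for a new plant, or patch the existing row on an edit."""
    now = datetime.now(timezone.utc).isoformat()
    raw = body.encode("utf-8")
    capped = raw[:MAX_BODY_BYTES].decode("utf-8", errors="ignore")
    digest = hashlib.sha256(raw).hexdigest()
    conn.execute(
        """
        INSERT INTO synthetic_files
            (decky_uuid, path, persona, content_class,
             created_at, last_modified, content_hash, last_body)
        VALUES (?, ?, ?, ?, ?, ?, ?, ?)
        ON CONFLICT (decky_uuid, path) DO UPDATE SET
            last_modified = excluded.last_modified,
            edit_count    = synthetic_files.edit_count + 1,
            content_hash  = excluded.content_hash,
            last_body     = excluded.last_body
        """,
        (decky_uuid, path, persona, content_class, now, now, digest, capped),
    )
```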
Two tables already in production receive the orchestrator's per-tick events:
- `orchestrator_events`: `kind ∈ {"traffic", "file"}`. Includes `EditAction` rows under `kind="file"`, `action="file:edit"`.
- `orchestrator_emails`: `EmailAction` rows.
Configuration
| Env var | Default | Effect |
|---|---|---|
| `DECNET_REALISM_LLM` | unset | Backend selector (`ollama` / `fake` / `off`). Unset / `off` / `none` / `0` / `false` / `disabled` disables enrichment; any other value enables it. |
| `DECNET_REALISM_MODEL` | `llama3.1` | Ollama model name. |
| `DECNET_REALISM_TIMEOUT` | `60` | Per-call wall-clock cap (seconds). |
| `DECNET_REALISM_PERSONAS` | `/etc/decnet/email_personas.json` | Global pool path override. |
| `DECNET_CANARY_HTTP_BASE` | unset | HTTP callback base (`https://canary.example.test`). |
| `DECNET_CANARY_DNS_ZONE` | unset | DNS zone (`canary.example.test`). |
Per-host overrides go in the orchestrator unit's EnvironmentFile ({install_dir}/.env.local), see Systemd-Setup.
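The enable/disable rule for `DECNET_REALISM_LLM` can be read as the following sketch. The disabling values and the defaults come from the table above; the helper names are hypothetical, and treating the value case-insensitively is an assumption.

```python
from typing import Mapping, Optional

# Values that disable enrichment, per the configuration table ("" covers unset).
_DISABLED = frozenset({"", "off", "none", "0", "false", "disabled"})


def llm_backend_name(env: Mapping[str, str]) -> Optional[str]:
    """Return the selected backend name, or None when enrichment is disabled."""
    raw = env.get("DECNET_REALISM_LLM", "").strip().lower()
    return None if raw in _DISABLED else raw


def llm_timeout_seconds(env: Mapping[str, str]) -> float:
    """Per-call wall-clock cap, defaulting to 60 seconds."""
    return float(env.get("DECNET_REALISM_TIMEOUT", "60"))


def llm_model(env: Mapping[str, str]) -> str:
    """Ollama model name, defaulting to llama3.1."""
    return env.get("DECNET_REALISM_MODEL", "llama3.1")
```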
CLI surface
- `decnet orchestrate [--llm/--no-llm]`: the long-running worker. See CLI Reference § decnet orchestrate.
- `decnet realism import-personas <PATH>`: validate and install the global persona pool. See CLI Reference § decnet realism import-personas.
Dashboard
The dashboard's Persona Generation page edits both pools (per-topology and global). A synthetic-files browser ("files this decky has grown") and an LLM-status panel are open follow-ups; the data is already persisted, just not yet rendered.
Migration history
The realism library was extracted from the original decnet/orchestrator/emailgen/ worker in eight stages. Stage notes live in commit messages on dev; the highlights:
- Stage 2: `emailgen/personas`, `emailgen/prompt`, `emailgen/global_pool`, `emailgen/llm/` moved into `decnet/realism/`. Env-var rename `DECNET_EMAILGEN_*` → `DECNET_REALISM_*` (clean break, pre-v1).
- Stage 4: `ActivityDriver` ABC + `get_driver_for(action)` factory; `SSHDriver.plant_file` streams base64 via stdin (ARG_MAX-safe), honours `mtime`.
- Stage 5 (service collapse): `decnet-emailgen.service` deleted, `decnet emailgen run` deleted, `EmailAction` joined `TrafficAction` / `FileAction` in the orchestrator's tick. API URL `/api/v1/emailgen/personas` → `/api/v1/realism/personas`. CLI `decnet emailgen import-personas` → `decnet realism import-personas`.
For the full story, run `git log --oneline | grep realism` on dev.
See also
- Module Reference — Workers § Orchestrator
- Service-Bus: `orchestrator.{traffic,file,email}.{decky_id}` topics
- CLI Reference
- Environment Variables
- Security and Stealth — em-dash policy, no-DECNET-literal contract
DECNET — honeypot deception-network framework. Pre-1.0, active development — use with caution. See Sponsors to support the project. Contact: samuel@securejump.cl