4 Commits

Author SHA1 Message Date
b86129e35e tests: realism migration regression coverage
Four gaps from the realism migration plan, plus one flaky test
fixed.

Added:

- tests/deploy/test_orchestrator_unit.py — replaces the dead
  test_emailgen_unit.py. Asserts:
  * decnet-orchestrator.service.j2 carries the DECNET_REALISM_*
    env block (LLM, MODEL, TIMEOUT, PERSONAS) so per-host tuning
    works without editing the .j2.
  * Legacy DECNET_EMAILGEN_* vars are NOT referenced — clean break
    contract from stage 5.
  * decnet.target wants orchestrator + canary, does NOT want
    decnet-emailgen.service. Anti-regression for service-collapse.
  * deploy/decnet-emailgen.service.j2 stays deleted.

- tests/orchestrator/test_worker_integration.py — new
  test_one_tick_email_branch_records_orchestrator_email. Pins the
  action-roll to email, seeds a topology with an IMAP mail decky +
  two personas, stubs LLM + docker-exec write paths, verifies an
  orchestrator_emails row + bus event land. Restores end-to-end
  email coverage that was lost when the pre-collapse
  test_worker_integration.py was deleted.

- tests/realism/test_synthetic_files_truncation.py — pins the 64KB
  last_body cap on create + edit, and documents the consequence:
  edit candidates carry a truncated snapshot of files that exceeded
  the cap. If a future change lifts the cap, _LIMIT in the test
  must lift with it.

Fixed flaky:

- tests/orchestrator/test_scheduler.py — two pick_file tests
  pinned to random.Random(1). Without a seed, the 3% canary gate
  (stage 7) and 10% leave-alone roll occasionally flaked the
  assertions because the _FakeRepo doesn't carry a
  create_canary_token method.

Note: the existing
test_realism_subprocess_import_personas_rejects_in_agent_mode
already covers agent-mode rejection of decnet realism
import-personas; no new gating test needed.
2026-04-27 17:29:25 -04:00
32eeb0c813 refactor(orchestrator): collapse decnet-emailgen.service into orchestrator
Stage 5 of the realism migration. Email generation is no longer a
separate worker / systemd unit / CLI subcommand — the orchestrator's
single tick loop covers SSH traffic, file plants, and email drops.
Going from 21 services to 20.

Worker:
- _one_tick rolls between traffic / file / email (45/45/10 weights).
  The 10% email weight at a 60s orchestrator interval produces ~one
  email per 10 minutes, close to the pre-collapse 5-minute cadence.
- get_driver_for(action) (stage 4) handles SSH vs Email dispatch.
- Quiet branches fall through so a (decky-set, persona-pool,
  mail-decky) shape that silences one branch doesn't waste the tick.
- Periodic prune covers both orchestrator_events and
  orchestrator_emails tables.

Deletions:
- deploy/decnet-emailgen.service.j2
- decnet/orchestrator/emailgen/worker.py
- decnet/cli/emailgen.py
- tests/orchestrator/emailgen/test_worker_integration.py

Renames (history-preserving):
- decnet/web/router/emailgen/ -> decnet/web/router/realism/
- tests/api/emailgen/        -> tests/api/realism/
- tests/cli/test_emailgen_*  -> tests/cli/test_realism_*

Public surface changes (clean break, pre-v1):
- API URL /api/v1/emailgen/personas -> /api/v1/realism/personas
- CLI `decnet emailgen import-personas` -> `decnet realism
  import-personas`. `decnet emailgen run` is gone — the orchestrator
  covers it.
- gating.py: emailgen master-only group replaced by realism.
- decnet-orchestrator.service.j2: DECNET_REALISM_* env block added.
- decnet.target: decnet-emailgen.service entry removed.
- frontend: PersonaGeneration.tsx fetches /realism/personas.
2026-04-27 16:33:04 -04:00
a8441481b5 fix(orchestrator): see fleet + shard deckies, not just topology rows
Switches _one_tick from list_running_topology_deckies to
list_running_deckies (the union view added in 095500a). Resolves the
permanent "no actionable deckies (running+ssh count=0)" log on hosts
running only unihost MACVLAN / IPVLAN decoys — the orchestrator now
sees fleet_deckies rows alongside MazeNET topology rows and SWARM
DeckyShard rows.

Also fixes the misleading log message: the old "running+ssh count=N"
reported the *pre-filter* total (count of all running deckies, not
the SSH-eligible subset that scheduler.pick actually evaluates). New
line breaks down running, ssh_eligible, and per-source counts so
debugging "why isn't it picking?" no longer requires reading
scheduler internals.

Regression test: orchestrator integration suite now seeds fleet_deckies
rows (not just topology_deckies) and verifies a tick picks them and
records an event with dst="local:fleet-*" — proving the original bug
on the operator's mothership is fixed.
2026-04-26 21:16:22 -04:00
4c37ece39e feat(orchestrator): MVP synthetic life-injection worker (SSH only)
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.

MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
  to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
  system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
  to TCP-connect dst:22 and read the SSH banner — real on-the-wire
  SSH-protocol traffic without shipping creds. File action drops or
  refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
  are running). Diurnal shaping, role-aware pairing, and session-aware
  backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
  worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
  end-to-end one-tick integration with FakeBus + SQLite.
2026-04-26 19:43:20 -04:00