Commit Graph

486 Commits

Author SHA1 Message Date
a07fb3fe08 feat(realism): canary cultivator on the realism contract
Stage 7 — final stage of the realism migration. Canary plants are
now scheduled by the same realism planner that handles inert content,
keeping the orchestrator as the single decision point and avoiding
duplicate diurnal / persona / rate-limit logic in the canary
subsystem.

New surface:

- decnet/canary/cultivator.py: cultivate(plan, repo) builds a
  CanaryContext, calls the right generator (canary_aws_creds ->
  aws_creds, canary_mysql_dump -> mysql_dump, …), persists the
  canary_tokens row before plant so the canary worker can attribute
  callbacks even on plant-time previews. Resolves canary placements
  to credible operator paths (~/.aws/credentials, ~/.ssh/id_rsa,
  /var/backups/db_backup.sql).
- realism/planner.py adds 8 canary content_classes uniformly weighted
  inside a 3% probability gate. Hard-capped: each tick at most one
  canary; create branch falls through to inert otherwise.
- scheduler.pick_file dispatches canary content_class to the
  cultivator; FileAction grows an optional content_bytes field so
  binary canary artifacts (DOCX/PDF/honeydoc) survive the wire
  intact instead of being utf-8 round-tripped.
- SSHDriver._run_file uses content_bytes when set, falls back to
  encoding the str content otherwise.

Stealth (per feedback_stealth.md): cultivator does not introduce
any DECNET literal; the underlying generators are already
stealth-clean and the test suite asserts the contract holds.

Tests cover round-tripping every canary class through the cultivator,
verifying placement-path conventions, persona-login normalisation
("John Smith" -> /home/johnsmith/.aws/credentials), and the
no-DECNET-leak invariant.
2026-04-27 16:47:59 -04:00
4e436da569 feat(realism): LLM enrichment for user-class file bodies
Stage 6 of the realism migration. User-class file bodies (note,
todo, draft, script) optionally get LLM-authored content; system
classes (cron / daemon logs, /tmp caches) stay template-only because
formulaic *is* the right look for them.

New surface:

- realism.llm.circuit.LLMCircuitBreaker — process-local sliding-window
  breaker. 3 consecutive failures trip open; 60s cooldown to half-open;
  half-open success closes, failure re-opens. Protects the orchestrator
  tick from sustained Ollama wedges (per-call timeout already covers
  one-shot hangs).
- realism.prompts._style — em-dash suppression lifted from the
  email prompt. Persona.uses_llms_heavily opts out per the
  feedback_em_dash_llm_tell.md memory. Includes strip_em_dashes
  belt-and-braces sub for output that slipped past the prompt rule.
- realism.prompts.filebody — class-conditioned prompts (note / todo
  / draft / script) with persona context, language pinning, output
  shape rule.
- realism.bodies.make_body_with_llm — async wrapper around make_body
  that calls the LLM when one is provided AND the breaker allows.
  Falls back to template on timeout / error / empty / system-class.

Wiring:

- scheduler.pick_file accepts optional llm + llm_breaker + llm_timeout.
  When the planner picks a create action and the content_class is a
  user-class, the body_hint is replaced with the LLM-authored body
  (or falls back to the deterministic body_hint).
- orchestrator.worker constructs get_llm() at startup gated by
  DECNET_REALISM_LLM env var (any non-empty value enables; empty /
  "off" / "none" / "0" disables). Passes llm + breaker through every
  tick.
- decnet orchestrate gains --llm/--no-llm flag overriding the env var.
2026-04-27 16:42:58 -04:00
b321e29002 feat(realism): EditAction read-modify-write of planted files
Stage 3b of the realism migration. A TODO.md planted on Monday gets a
checkbox flipped on Tuesday; a notes file grows a follow-up line; a
cron log gets a fresh entry tacked on. The synthetic_files row's
edit_count, last_modified, and content_hash advance.

New surface:

- EditAction dataclass (peer of FileAction in scheduler.py): carries
  decky, path, persona, content_class, previous_body, mtime, and
  synthetic_file_uuid for the worker's update path.
- realism.bodies.next_iteration(cls, persona, prev, rng): per-class
  deterministic mutators. TODO flips an unchecked box and/or appends;
  notes/drafts/scripts append; logs are append-only (mirroring real
  log behaviour). Canary, cache_tmp, email raise KeyError —
  unsupported.
- realism.planner.pick gains an edit branch: 60% create, 30% edit
  (when an edit_candidate is supplied), 10% leave-alone. Returns
  None on leave-alone — quiet ticks are realism too.
- scheduler.pick_file pre-fetches a single edit candidate via
  repo.pick_random_synthetic_file_for_edit ~50% of ticks; the
  planner decides whether to use it.
- SSHDriver._run_edit: turns next_iteration output into a
  plant_file call (mtime-bumped, mode 0o644). Stashes new_body in
  result.payload so the worker can hash it for synthetic_files.
- worker._bump_synthetic_file_after_edit: patches edit_count + 1,
  last_modified=now, content_hash, last_body for the row UUID.
  No-op when the row was pruned mid-flight.
- events.to_row / topic_for / event_type_for now recognise
  EditAction (kind="file", action="file:edit").
2026-04-27 16:38:17 -04:00
32eeb0c813 refactor(orchestrator): collapse decnet-emailgen.service into orchestrator
Stage 5 of the realism migration. Email generation is no longer a
separate worker / systemd unit / CLI subcommand — the orchestrator's
single tick loop covers SSH traffic, file plants, and email drops.
Going from 21 services to 20.

Worker:
- _one_tick rolls between traffic / file / email (45/45/10 weights).
  The 10% email weight at a 60s orchestrator interval produces ~one
  email per 10 minutes, close to the pre-collapse 5-minute cadence.
- get_driver_for(action) (stage 4) handles SSH vs Email dispatch.
- Quiet branches fall through so a (decky-set, persona-pool,
  mail-decky) shape that silences one branch doesn't waste the tick.
- Periodic prune covers both orchestrator_events and
  orchestrator_emails tables.

Deletions:
- deploy/decnet-emailgen.service.j2
- decnet/orchestrator/emailgen/worker.py
- decnet/cli/emailgen.py
- tests/orchestrator/emailgen/test_worker_integration.py

Renames (history-preserving):
- decnet/web/router/emailgen/ -> decnet/web/router/realism/
- tests/api/emailgen/        -> tests/api/realism/
- tests/cli/test_emailgen_*  -> tests/cli/test_realism_*

Public surface changes (clean break, pre-v1):
- API URL /api/v1/emailgen/personas -> /api/v1/realism/personas
- CLI `decnet emailgen import-personas` -> `decnet realism
  import-personas`. `decnet emailgen run` is gone — the orchestrator
  covers it.
- gating.py: emailgen master-only group replaced by realism.
- decnet-orchestrator.service.j2: DECNET_REALISM_* env block added.
- decnet.target: decnet-emailgen.service entry removed.
- frontend: PersonaGeneration.tsx fetches /realism/personas.
2026-04-27 16:33:04 -04:00
cb1872c52f feat(realism): synthetic_files table + planner wiring + scheduler swap
Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's
hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed
filenames like notes-1777315854.txt with identical bodies per
template) with a persona-driven realism engine.

New surface:

- SyntheticFile SQLModel (synthetic_files table, UNIQUE on
  decky_uuid+path) — per-(decky, path) state for the future
  edit-in-place flow. Pre-v1, no _migrate_* helper.
- BaseRepository methods: record_synthetic_file,
  update_synthetic_file, list_synthetic_files,
  pick_random_synthetic_file_for_edit (used by stage 3b).
- realism/naming.py: per-content-class filename templates,
  persona-conditioned. /var/log/cron.log + logrotate skeleton for
  system-class; /home/<persona>/TODO.md, scratch.md, etc. for
  user-class. Anti-regression test pins "no 8+ digit decimals in
  basenames" (the realism failure today).
- realism/bodies.py: deterministic body templates per content_class.
  TODO body uses checkbox markdown, script body has a shebang, cron
  body matches syslog cron shape ("CRON[PID]: (user) CMD (...)").
- realism/planner.py: pick(deckies, now, rng) returns a Plan.
  Diurnal-gated, weighted user/system content split (70/30 user
  bias). Create-only in stage 3; edit branch lands in stage 3b.

Scheduler split:

- scheduler.pick is now traffic-only (sync).
- scheduler.pick_file is async, takes a repo, resolves personas
  (Topology.email_personas for topology-source deckies; global
  realism.personas_pool otherwise), and maps Plan -> FileAction.
- FileAction gains persona/content_class/mtime fields.

Worker:

- _one_tick rolls 50/50 between traffic and file each tick. After a
  successful FileAction plant, _record_synthetic_file persists or
  patches the synthetic_files row (catching the unique-constraint
  collision on re-plant of the same path).
- SSHDriver._run_file passes action.mtime through to plant_file so
  files don't all stamp at wall-clock-now.
2026-04-27 16:22:07 -04:00
636c057cc5 refactor(orchestrator): extract ActivityDriver ABC + driver factory
Stage 4 of the realism migration. Lifts the driver Protocol into a
proper ABC with default plant_file/read_file methods (raise
NotImplementedError), and adds get_driver_for(action) so the
orchestrator worker can dispatch by action shape without isinstance
chains.

SSHDriver now inherits ActivityDriver and implements:
- plant_file: streams base64 via stdin (ARG_MAX-safe, mirrors
  decnet.canary.planter; commit c17b9e0). Honours mtime via touch -d
  so realism-planned files don't all stamp at wall-clock-now.
- read_file: docker exec cat with FileNotFoundError on rc=1, used by
  the upcoming EditAction (stage 3b).

EmailDriver inherits ActivityDriver. Driver alias kept for back-compat
during the migration; removed once realism stages 5-7 land.
2026-04-27 16:09:46 -04:00
0b9873982d refactor(realism): move emailgen LLM/personas/prompt into shared library
Lift the format-agnostic pieces from decnet/orchestrator/emailgen/
into the new decnet/realism/ library so file-class content generation
(stage 3 of the realism migration) can reuse them. Email-specific
delivery (RFC 2822 EML, IMAP/POP3 spool, thread chains) stays in
orchestrator/.

Renames (history-preserving git mv):
  emailgen/personas.py     -> realism/personas.py
  emailgen/prompt.py       -> realism/prompts/email.py
  emailgen/global_pool.py  -> realism/personas_pool.py
  emailgen/llm/            -> realism/llm/

Env-var clean break (pre-v1, no aliases):
  DECNET_EMAILGEN_LLM      -> DECNET_REALISM_LLM
  DECNET_EMAILGEN_MODEL    -> DECNET_REALISM_MODEL
  DECNET_EMAILGEN_TIMEOUT  -> DECNET_REALISM_TIMEOUT
  DECNET_EMAILGEN_PERSONAS -> DECNET_REALISM_PERSONAS
  DECNET_EMAILGEN_FAKE_OUTPUT -> DECNET_REALISM_FAKE_OUTPUT

Importers rewritten in: orchestrator/emailgen/scheduler.py,
orchestrator/drivers/email.py, web/router/{emailgen,topology}/
api_personas.py, cli/emailgen.py. Tests for moved modules relocated
to tests/realism/; tests for stay-put modules updated in place.

API URL `/api/v1/emailgen/personas` and CLI `decnet emailgen
import-personas` keep their public names until the service-collapse
commit (stage 5).
2026-04-27 16:05:43 -04:00
f57c621117 feat(realism): scaffold decnet/realism/ library
Empty subpackage skeleton for the realism migration: ContentClass enum
(file/email/canary content categories), Plan dataclass (frozen, with
edit-action invariant), in_work_hours window check (wrap-around
supported, fail-open on parse error), and sample_mtime for backdated
file timestamps that snap into a persona's active hours.

Stage 1 of the orchestrator+canary realism unification — no
production caller wired yet; planner.pick is a stub returning None
until stage 3.
2026-04-27 15:55:21 -04:00
6376523923 feat(canary): mysql_dump generator with phone-home replica payload
Mirrors the Canarytokens.org trick: a base64-wrapped CHANGE REPLICATION
SOURCE TO + START REPLICA block in the dump trailer. Importing the
file into MySQL resolves <slug>.<dns_zone> (DNS trip) and opens a 3306
replica handshake whose SOURCE_USER smuggles @@hostname and
@@lc_time_names of the victim DB.

DNS lookup alone is sufficient for detection via the existing canary
dns_server; capturing the smuggled metadata via a 3306 handshake
responder is a follow-up.
2026-04-27 13:52:55 -04:00
5ac8e0f91a feat(canary): honeydoc_docx + honeydoc_pdf generators
honeydoc previously emitted HTML only — operators picking 'Document'
out of the dropdown got a .html file dropped at /Documents/
quarterly_report.docx, which any attacker would clock the moment they
ran 'file' on it.

Two new generators that emit the real artifact format:

- honeydoc_docx: stdlib zipfile only. Builds a minimal but valid
  Office Open XML zip with the same Q3 review body as the HTML
  flavor and an external-image relationship pointing at the
  callback URL — same trick the operator-upload DOCX instrumenter
  uses, fetched on document open by Word and LibreOffice. Reuses
  _drawing() and _next_rid() from instrumenters/docx.py to keep
  the body/relationships shape identical between synthesised and
  instrumented files.

- honeydoc_pdf: pikepdf-backed. One-page PDF in the 14 base fonts
  (Helvetica, no font embedding), realistic body, /OpenAction /URI
  on the catalog so most viewers fire the callback on document
  open. Falls back to a clear error if pikepdf is missing so the
  operator can switch to honeydoc / honeydoc_docx.

Default placement paths now reflect each generator's true extension
(.html / .docx / .pdf) so the UI suggests something sensible. Both
generators surfaced in the New Token modal's generator dropdown.
2026-04-27 13:44:20 -04:00
c17b9e01c8 fix(canary): stream base64 payload via stdin to avoid ARG_MAX
Real-world plant() crashed with OSError [Errno 7] Argument list too
long when an artifact (honeydoc HTML / DOCX / PDF) base64-encoded
into the sh -c script body exceeded the kernel's argv limit (typically
128KB-2MB depending on the host).

Fix: keep the script trivial ('mkdir -p ... && base64 -d > path && ...')
and stream the encoded bytes through 'docker exec -i ... sh -c'
stdin instead. _run() grew an optional stdin_bytes parameter that's
piped into proc.communicate(input=...). The stdin path covers
arbitrarily large artifacts.

Tests updated:
- test_plant_argv_and_base64_round_trip now asserts the docker -i
  flag is present and the base64 payload reaches stdin (and notably
  is NOT in the script body).
- _FakeProc.communicate accepts input=None across the board so the
  patched fast path no longer trips on the new kwarg.
2026-04-27 13:37:19 -04:00
34c85346a6 feat(deploy): seed canary baseline at deploy time + tests
Hooks decnet.canary.planter.seed_baseline into the deploy() flow's
fleet-mirror step. After upserting a FleetDecky as 'running' we seed
the configured baseline canary set on the freshly-deployed decky.

Persona detection: read d.nmap_os (Windows -> windows path-mapping,
otherwise linux). Failures are logged and surface as state=failed
rows in the UI; the deploy itself MUST NOT abort (resilience
principle in CLAUDE.md).

Tests confirm:
- seed_baseline produces one row per configured generator per decky;
- the deployer source wires seed_baseline inside a try/except so a
  failure can't abort the deploy.
2026-04-27 13:19:08 -04:00
6c4ea706f8 feat(api): canary token CRUD router (/api/v1/canary) + tests
Two sub-routers under /api/v1/canary:

blobs (operator-uploaded artifacts, deduped by sha256):
- POST   /blobs          (multipart upload; admin)
- GET    /blobs          (list with token_count; admin)
- DELETE /blobs/{uuid}   (refcount-aware; 409 when referenced; admin)

tokens (per-decky planted artifacts):
- POST   /tokens                          (generate or instrument + plant; admin)
- GET    /tokens?decky_name=&kind=&state= (filter; viewer)
- GET    /tokens/{uuid}                   (detail; viewer)
- GET    /tokens/{uuid}/preview           (instrumented bytes; admin)
- GET    /tokens/{uuid}/triggers          (paged callback log; viewer)
- DELETE /tokens/{uuid}                   (revoke + bus event; admin)

XOR validation: exactly one of blob_uuid / generator must be set.
Path validation rejects relative/NUL/newlines/.. segments. Every
body-bearing route documents 400 plus 401/403/404 as applicable.

Stdlib MIME sniffer (no python-magic dep) covers PNG/JPEG/GIF/PDF/
HTML/XML/DOCX/XLSX/JSON/YAML/TOML/text/plain; everything else falls
through to passthrough.

Tests run end-to-end through the live FastAPI app (planter docker
exec is patched); 17 cases covering dedup, refcount, lifecycle,
XOR validation, path validation, and 404 paths.
2026-04-27 13:18:00 -04:00
f9513bb7dd feat(cli): register decnet canary subcommand + tests
decnet canary launches the HTTP + DNS callback receiver via
decnet.canary.worker.run. Mirrors the shape of decnet webhook
(typer command with --daemon flag, asyncio.run in the foreground).

Deliberately NOT added to MASTER_ONLY_COMMANDS — every host that
hosts deckies runs its own canary worker, and the bus events stay
local to that host (per-host webhook fanout handles SIEM egress).
2026-04-27 13:13:23 -04:00
fae3e0caa3 feat(canary): worker (HTTP + stdlib DNS callback receivers) + tests
decnet canary worker hosts both callback surfaces in one process:

- HTTP: a tiny FastAPI app on its own port (default 8088). The only
  meaningful route is GET /c/{slug} which looks up the slug, persists
  a CanaryTrigger, publishes canary.<id>.triggered, and returns a 1x1
  transparent GIF. Unknown slugs return the same response (stealth);
  no decnet strings leak in headers/banners; docs/openapi/redoc are
  disabled. X-Forwarded-For is honored.

- DNS: an authoritative UDP server for *.<canary_zone> using
  asyncio.DatagramProtocol with stdlib-only DNS wire-format parsing
  (no dnslib dep). Same lookup -> persist -> publish flow, plus a
  sinkhole A record (192.0.2.1) so the attacker's resolver doesn't
  loop on NXDOMAIN. Single-label slugs only; multi-label probes
  return NXDOMAIN. Pointer loops in malformed queries are caught
  (10-hop cap) so an adversarial packet can't wedge the parser.

Tests cover both surfaces without privileged sockets:
- HTTP via Starlette TestClient: known/unknown slug, headers, XFF,
  stealth-string assertions.
- DNS via direct DatagramProtocol drive: known slug -> ANSWER,
  unknown -> NXDOMAIN, pointer-loop -> ValueError, malformed
  packet -> silent drop.
2026-04-27 13:12:05 -04:00
8fb9bc5545 feat(canary): planter (docker exec injector) + tests
Plant / revoke / seed_baseline using the same docker-exec-with-sh-c
pattern proven by decnet/orchestrator/drivers/ssh.py:_run_file.

Each plant call composes a single sh script:
  mkdir -p <dirname> && printf %s <base64> | base64 -d > <path> &&
  chmod <mode> <path> && touch -d @<mtime> <path>

Base64-on-the-host / decode-in-the-container keeps binary artifacts
(DOCX/PDF/PNG) safe across the argv boundary; the placement_path,
mode, and mtime are shlex-quoted.

State transitions hit the repo: planted -> failed on docker error
with stderr captured into last_error. Bus events fire on success
(canary.<id>.placed) and on revoke (canary.<id>.revoked) — wrapped
in try/except so a downed bus never blocks a placement.

seed_baseline(decky_name, repo) is the deploy-hook entry point —
reads DECNET_CANARY_BASELINE (default git_config,env_file,honeydoc,
aws_creds), persists one row per generator, plants each. Failed
placements are logged but do NOT abort; the deployer hook treats
the return list as informational.
2026-04-27 13:08:18 -04:00
19ceff4417 feat(canary): operator-upload instrumenters + tests
Seven instrumenters that mutate operator-supplied artifacts to
embed the callback URL:

- passthrough — bytes unchanged; only DNS-callback tokens trip
  detection, with the slug embedded in the placement path
- plain      — substitutes {{CANARY_URL}}/{{CANARY_HOST}} placeholders;
  falls back to appending a comment line whose prefix adapts to the
  apparent file syntax (#, //, ;)
- html       — injects a 1x1 tracking pixel before </body>, appends
  if the close tag is missing
- docx       — direct zipfile manipulation (no python-docx dep):
  inserts an external-image Relationship into word/_rels/document.xml.rels
  and a matching <w:drawing> element before </w:body>
- xlsx       — sibling of docx; injects an external-image relationship
  into xl/_rels/workbook.xml.rels (orphan rels are still fetched on
  open by most viewers)
- pdf        — uses pikepdf to install /OpenAction /URI on the catalog;
  rejects with a clear message when pikepdf isn't installed
- image      — uses Pillow to embed slug + URL in PNG tEXt / JPEG
  comment; rejects with a clear message when Pillow isn't installed

DOCX and XLSX share the rId allocator + relationship injector via
the docx module; both work on stdlib zipfile only.

Tests synthesise minimal real DOCX/XLSX fixtures inline, round-trip
each instrumenter, and assert the callback URL ends up in the
mutated bytes while the file still parses.
2026-04-27 13:03:42 -04:00
c7658ea65e feat(canary): synthesised-artifact generators + tests
Five built-in generators that produce deterministic fake artifacts
keyed by the token slug:

- aws_creds  — passive [default]/[prod] credentials block, no
               callback wiring (AWS-key tokens require an external
               trap, which is post-v1)
- git_config — .git/config with origin url = http_base/c/<slug>/repo.git
- env_file   — .env with API_BASE_URL + WEBHOOK_NOTIFY_URL embedding
               the callback URL plus inert realism filler
- ssh_key    — PEM-shaped fake private key whose host comment carries
               <slug>.<dns_zone> when DNS is deployed, else the
               http_base host
- honeydoc   — minimal HTML report with a 1x1 tracking-pixel <img>
               whose src is the callback URL; fallback for the
               deploy-time baseline before the operator uploads a
               real DOCX/PDF

Tests assert byte-stability (same ctx -> same bytes), slug presence
in the embedded fields, that aws_creds is intentionally URL-free,
and that every artifact carries operator-facing notes for the
preview endpoint.
2026-04-27 12:59:19 -04:00
8f19adecfe feat(canary): package scaffolding (base/factory/paths/storage) + tests
Mirrors the decnet.intel layout (base + factory + lazy concrete
imports). Defines:

- CanaryArtifact / CanaryContext dataclasses + the generator and
  instrumenter ABCs they share
- factory dispatch for generators (git_config/env_file/ssh_key/
  aws_creds/honeydoc) and instrumenters (docx/xlsx/pdf/html/image/
  plain/passthrough), plus pick_instrumenter_for_mime() for MIME-driven
  dispatch on operator uploads
- persona-aware default placement paths (Linux vs. Windows-shaped)
  and absolute-path validation that the API will use to validate
  operator-supplied placement_path values
- on-disk blob store: sha256-keyed two-level fan-out, idempotent
  writes, refcount-aware unlink (the DB row is the source of truth)

Also covers prior commits' tests (bus topics, models, repo CRUD)
under tests/canary/. 79 tests, all pass.
2026-04-27 12:56:01 -04:00
6a0d140e91 feat(db): canary token repository CRUD
Adds the abstract surface on BaseRepository and the SQLModel-backed
implementation (shared by SQLite and MySQL) for:

- canary blobs (upsert-by-sha256, list-with-refcount, refcount-aware delete)
- canary tokens (create, slug lookup, list with filters, state update)
- canary triggers (record+bump-counters atomically, list, attribute)

The triggers path is a single session that inserts the row and bumps the
parent token's counters together, so a subscriber that reads the token
right after the bus event sees the updated count. Blob delete refuses
while any token (including revoked) still references the blob; pre-v1
revoked tokens stick around for forensic value.
2026-04-27 12:48:24 -04:00
813f14bf2a feat(db): canary token tables (blob/token/trigger)
Three new tables for the canary tokens feature:

- canary_blobs       — operator-uploaded source artifacts, deduped by sha256
- canary_tokens      — one planted artifact in one decky; carries the
                       callback slug, generator/instrumenter, and lifecycle
- canary_triggers    — append-only log of every callback hit; attacker_id
                       back-filled by the correlator

Pydantic request/response shapes live in the same file per the
single-source-of-truth convention. No migrations file — pre-v1
SQLModel.metadata.create_all() covers it.
2026-04-27 12:45:41 -04:00
914c911984 feat(bus): canary token bus topics (placed/triggered/revoked)
Reserved topic family for the upcoming canary-tokens feature so the
correlator and webhook fanout can subscribe to canary.> from day one.
No producers yet; planter, decnet canary worker, and API will publish
in subsequent commits.
2026-04-27 12:43:23 -04:00
828165783e feat(templates): standalone NTLMSSP Type 3 parser + decnet-init wrapper
* decnet/templates/{rdp,smb}/ntlmssp.py — minimal Type 3 (Authenticate)
  parser shared between the SMB and RDP-NLA templates.  Lands NTLM
  creds in the universal Credential table with secret_kind=ntlmssp_v1
  / ntlmssp_v2 and secret_b64 = base64 of the NtChallengeResponse so
  the bounty pipeline can feed the right hashcat mode.
* scripts/decnet-init.sh — convenience wrapper around `sudo decnet init
  --force` that targets the current working directory; saves operators
  retyping the install paths during dev iterations.
2026-04-27 10:12:30 -04:00
f046634d6e feat(web): Persona Generation page under AUTOMATION
New dashboard surface for editing the global emailgen persona pool —
the JSON file fleet (MACVLAN/IPVLAN) and SWARM-shard mail deckies pull
from.  MazeNET topology personas are out of scope here; they're
configured per-topology in the topology editor.

Backend:
* GET/PUT /api/v1/emailgen/personas — admin-write, viewer-read.  PUT
  validates with the same Pydantic schema the worker uses
  (parse_personas), drops invalid entries with a warning, returns 400
  only when the entire payload fails.  Path is operator-discoverable
  on every response so a CLI-driven backup workflow stays visible.

Frontend:
* PersonaGeneration.tsx + .css — table + add/edit modal with the full
  EmailPersona schema (name, email, role, tone, mannerisms list,
  language, signature, active hours, reply latency, uses_llms_heavily).
  Local edits are batched; explicit "SAVE CHANGES" writes back, with a
  dirty-indicator pill and a "DISCARD" reset.  Email uniqueness is
  enforced client-side so the scheduler never picks the same persona
  as both sender + recipient.
* Sidebar AUTOMATION group gains a "Persona Generation" entry next to
  Orchestrator; route registered at /persona-generation.

The worker reads the same on-disk file the API writes — see
decnet.orchestrator.emailgen.global_pool.  The API resets the
in-process cache on every read/write so the worker picks up dashboard
edits within its next tick rather than waiting on mtime.
2026-04-27 09:55:42 -04:00
818aebadfc feat(web): emailgen events in Orchestrator page
The SSE pipe at /orchestrator/events/stream was already streaming
'orchestrator.email.{decky_uuid}' events (the subscription is for the
'orchestrator.>' wildcard), but the consumer side dropped them on the
floor.  Three fixes to close the loop:

* useOrchestratorStream.ts now registers an 'email' SSE listener — the
  EventSource silently ignores frames whose event name has no listener,
  so missing this entry meant every email frame was dropped before
  reaching the page's onEvent handler.

* /api/v1/orchestrator/events accepts kind=email and dispatches to
  list_orchestrator_emails, adapting rows to the existing wire shape:
  subject -> action, sender_email -> src_decky_uuid, recipient_email
  -> dst_decky_uuid, plus email-specific extras (thread_id, language,
  mail_decky_uuid, message_id, in_reply_to) ride along as top-level
  keys.

* Orchestrator.tsx gains an 'email' tab in the kind filter and a
  branch in the row renderer / inspector that:
   - shows full sender / recipient (no UUID truncation),
   - chips the language code next to the subject,
   - relabels ACTION as SUBJECT in the inspector and surfaces
     thread / in-reply-to / mail-decky details.

The 'all' tab continues to show traffic+file only (today's behavior);
operators see emails by switching to the email tab.  A union view at
the API layer is the obvious follow-up but not necessary for now.
2026-04-26 22:56:48 -04:00
73692b52f0 feat(emailgen): gate as master-only
Two-layer gating per CLAUDE.md:
- registration-time: emailgen added to MASTER_ONLY_GROUPS so agents
  don't see the sub-app in 'decnet --help' at all.
- body-guard: _require_master_mode('emailgen ...') at the top of every
  sub-command body so a direct callable import (third-party tooling)
  still bails on agent hosts.

Matches the convention used for 'swarm', 'topology', 'geoip'.  SWARM
agents push their generated mail through the master's emailgen worker
(or none at all); cross-agent emailgen federation stays out of scope.
2026-04-26 22:45:59 -04:00
6d520eaa6f refactor(emailgen): pluggable LLM backend (base/factory/impl)
Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:

  decnet/orchestrator/emailgen/llm/
    base.py        — LLMBackend Protocol + LLMResult + LLMTimeout
    factory.py     — get_llm() reads DECNET_EMAILGEN_LLM
    impl/ollama.py — current 'ollama run' subprocess path
    impl/fake.py   — canned-output backend used by tests

Driver now takes an LLMBackend on construction (or inherits the
factory default).  Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster.  Swapping
Ollama for the Anthropic API / vLLM / llama.cpp is now a third branch
in factory.py; no driver rewrite needed.

Mirrors the convention used by decnet.web.db.factory + decnet.bus.factory
per the provider-subpackages-from-day-one rule in memory.
2026-04-26 22:43:36 -04:00
4badc75fb2 feat(emailgen): global persona pool + Date-stamped EML mtimes
Two changes that unwind earlier MazeNET-only assumptions and fix a
realism tell:

1. Persona resolution is now per-decky-source, not topology-only.  The
   scheduler walks the union view (list_running_deckies, including
   fleet MACVLAN/IPVLAN + SWARM shards) and picks the right persona
   list for each source:
     * topology decky -> Topology.email_personas (per-topology richness
       preserved)
     * fleet / shard  -> a single host-wide pool loaded from disk
       (DECNET_EMAILGEN_PERSONAS, /etc/decnet/email_personas.json, or
       ~/.decnet/email_personas.json)
   Operators install the global pool via 'decnet emailgen
   import-personas <file>' which validates with the same Pydantic
   schema the worker uses.

2. The driver now runs 'touch -d <Date>' inside the docker exec right
   after the EML write so file mtime matches the email's RFC 2822
   Date: header.  Without this an attacker 'ls -lt'ing the spool sees
   every email clustered inside the worker's tick window — the
   cluster itself was a stylometric tell.

CLI now exposes 'decnet emailgen' as a sub-app with 'run' (default,
backwards-compatible with bare 'decnet emailgen') and 'import-personas'.
list_running_deckies carries topology_id through so consumers can resolve
the parent topology without a second round-trip.
2026-04-26 22:39:16 -04:00
2979997442 feat(templates): IMAP/POP3 servers read EML spool from emailgen
When IMAP_EMAIL_SEED / POP3_EMAIL_SEED points at a directory of .eml
files (the orchestrator emailgen worker's drop path,
/var/spool/decnet-emails/ by convention), the bait mailbox is replaced
with those LLM-generated, persona-driven, threaded messages.  Empty /
missing dir keeps the hardcoded fallback so a fresh deployment is never
silent.  Cached with mtime invalidation + a short TTL so a hot mailbox
doesn't pay the parse cost on every IMAP/POP3 command.

Replaces the DEBT-026 stub on both templates that named the env var but
never wired it through.
2026-04-26 22:21:01 -04:00
3ee55ec341 feat(emailgen): Ollama-driven fake email worker for IMAP/POP3 deckies
Second orchestrator worker (decnet emailgen) that drips persona-driven,
threaded, multi-language fake emails into running mail deckies.  Personas
live on Topology.email_personas; topology-wide language_default falls
through to any persona that doesn't pin its own.  Em-dashes are
suppressed at the prompt layer by default and only lifted for personas
explicitly marked uses_llms_heavily — em-dashes are an LLM tell and a
flat corpus of em-dashed mail is a giveaway.

EML delivery writes into /var/spool/decnet-emails/<thread>/<msg>.eml on
the mail decky via docker exec; wiring the IMAP/POP3 templates to read
from that spool (replacing the hardcoded _BAIT_EMAILS) is the next step.
2026-04-26 22:16:19 -04:00
9650366d34 fix(orchestrator): drop topology_deckies FK on event src/dst columns
Once the orchestrator started seeing fleet + SWARM shard sources via
list_running_deckies (a844148), every event row landing on a fleet decky
broke the FK to topology_deckies — the column now carries opaque ids
("local:omega-decky" for fleet, "host_uuid:decky_name" for shards) that
will never match topology_deckies.uuid.

Symptom on the operator's mothership:
  IntegrityError 1452 — orchestrator_events_ibfk_2 FK violated on every
  tick once the reconciler populated fleet_deckies.

Index on dst_decky_uuid is preserved (the dashboard reads
"events for this decky" frequently); only the FK is removed.  Keeps
data integrity loose by design — events are append-only history that
should outlive the deckies they reference.

Existing MySQL deployments need the FK dropped manually:
  ALTER TABLE orchestrator_events
    DROP FOREIGN KEY orchestrator_events_ibfk_2,
    DROP FOREIGN KEY orchestrator_events_ibfk_1;

SQLite users are unaffected — SQLite doesn't enforce FKs by default.
2026-04-26 21:40:06 -04:00
c3518e3159 feat(workers): surface clusterer, campaign-clusterer, reconciler in panel
The Workers panel (Config → Workers tab) hardcodes its row list in
KNOWN_WORKERS — by design, so a rogue publisher can't inject UI rows.
Three heartbeat-emitting workers were missing:

  * clusterer            — behavioral clustering (decnet/clustering/)
  * campaign-clusterer   — campaign assembly  (decnet/clustering/campaign/)
  * reconciler           — host-local fleet convergence (added in 430262e)

Each already publishes on system.<name>.health via run_health_heartbeat,
so they show up live the moment they're added to the registry — no
frontend or subscriber wiring needed (Config.tsx renders whatever
/workers returns).

Also added to _PREFERRED_ORDER in start-all so START ALL WORKERS brings
them up in dependency-friendly order: data-plane → reconciler → intel
→ clustering → output → orchestrator.

Three deployable units (listener, web, swarmctl) intentionally remain
absent from KNOWN_WORKERS — they don't emit heartbeats (CLI / static
server / one-shot tooling), so they'd permanently render as UNKNOWN
and confuse operators.  Adding them is a separate decision that needs
a "synthesize installed-but-silent rows" pass on the registry.
2026-04-26 21:31:34 -04:00
430262e01a feat(fleet): systemd unit + bus signal for fleet reconciler
Two pieces, one PR because they share a deployment surface:

1. systemd. decnet-reconciler.service.j2 mirrors the orchestrator unit
   shape (docker group, hardened sandbox, append-logs).  Read-only
   /var/lib/decnet so it can read decnet-state.json without write
   access.  Auto-discovered by `decnet init` via the existing
   decnet-*.service.j2 glob — no init.py change needed.  Added to
   decnet.target so `systemctl start decnet.target` brings it up
   alongside collector / sniffer / mutator / etc.  Also added to the
   agent reaper script so self-destruct cleans it up on workers.

2. Bus signal. reconcile_once now publishes
   `decky.<host_uuid:name>.state` on every insert / delete /
   state-changed transition.  Reuses the existing DECKY_STATE topic
   family (no bus/topics.py change → no wiki update needed per the
   bus-signals doc rule).  Composite host_uuid:name segment keeps
   fleet rows distinguishable from MazeNET TopologyDecky rows whose
   ids are bare UUIDs.  Quiet ticks publish nothing — convergence
   means silence.

Bus is plumbed through the worker, defaults to None for unit-test
callers.  publish_safely keeps the source-of-truth contract: DB write
is authoritative, the publish is best-effort notification.

Captures previous_state into a local before update_fleet_decky_state
runs — a fake repo that mutates rows in-place would otherwise see the
post-update state and report previous == current.  Real repos don't
have this concern but the fix is cheap and makes the function less
order-dependent.
2026-04-26 21:21:36 -04:00
a8441481b5 fix(orchestrator): see fleet + shard deckies, not just topology rows
Switches _one_tick from list_running_topology_deckies to
list_running_deckies (the union view added in 095500a). Resolves the
permanent "no actionable deckies (running+ssh count=0)" log on hosts
running only unihost MACVLAN / IPVLAN decoys — the orchestrator now
sees fleet_deckies rows alongside MazeNET topology rows and SWARM
DeckyShard rows.

Also fixes the misleading log message: the old "running+ssh count=N"
reported the *pre-filter* total (count of all running deckies, not
the SSH-eligible subset that scheduler.pick actually evaluates). New
line breaks down running, ssh_eligible, and per-source counts so
debugging "why isn't it picking?" no longer requires reading
scheduler internals.

Regression test: orchestrator integration suite now seeds fleet_deckies
rows (not just topology_deckies) and verifies a tick picks them and
records an event with dst="local:fleet-*" — proving the original bug
on the operator's mothership is fixed.
2026-04-26 21:16:22 -04:00
f775223a83 feat(fleet): reconciler converges JSON ↔ DB ↔ docker
Adds decnet.fleet.reconciler — a pure async function plus a long-lived
worker — that periodically reconciles the three sources of truth on a
DECNET host:

  1. decnet-state.json (CLI-canonical fleet record)
  2. fleet_deckies table (DB mirror, written by engine.deployer)
  3. docker inspect (actual per-container runtime state)

Drift handling:
  * JSON has X, DB doesn't       → INSERT (deploy ran with DB offline)
  * DB has X (this host), JSON doesn't → DELETE (teardown ran with DB offline)
  * Both have X, docker disagrees → flip state to running/failed/degraded
  * Docker socket unreachable    → leave existing state alone (don't
                                    torch every row to torn_down)

Cross-host safety: deletions are scoped to host_uuid for the local host;
a master that runs both a local fleet and swarm workers will never
clobber a peer's slice.

CLI:
  decnet reconcile --once            # one-shot, prints counts
  decnet reconcile [--interval N]    # long-lived worker, mirrors
                                     # orchestrator's lifecycle (control
                                     # listener + heartbeat + tick loop)

Promotes decnet/fleet.py → decnet/fleet/ package so the reconciler can
live alongside it without name collision (build_deckies_from_ini and
all_service_names re-exported unchanged via __init__.py).

14 new tests cover state aggregation rules, all four drift directions,
host_uuid scoping, docker-unreachable safety, and worker shutdown via
the bus control event.
2026-04-26 21:14:48 -04:00
8814902999 docs(api): clarify fleet_deckies + JSON dual-write happens in engine.deployer
The unihost API path delegates to engine.deployer.deploy(), which now
writes both decnet-state.json (existing) and the fleet_deckies DB
table (added in 646aeec).  Comment makes the single-sink design
explicit so future maintainers don't add a parallel save_state /
upsert_fleet_decky call here.

No behavioral change — every fleet-creation path on every host (CLI
deploy, this unihost API path, and per-worker SWARM agent deploys)
already routes through the engine.deployer single sink.
2026-04-26 21:08:44 -04:00
646aeeca40 feat(deployer): mirror fleet deploy/teardown into fleet_deckies table
CLI deploy now writes both surfaces: decnet-state.json (existing,
canonical for offline / no-API hosts) and the new fleet_deckies DB
table (visible to orchestrator, web dashboard, REST API).

Best-effort: a DB outage logs a warning and returns. The JSON file
remains the source of truth for `decnet status`, `decnet teardown`,
sniffer, and collector — operators on a CLI-only host keep working.

_run_async helper bridges sync deploy() into the async repository.
Always uses a fresh thread because the API handler at
web.router.fleet.api_deploy_deckies invokes deploy() from inside a
FastAPI event loop, which would otherwise break asyncio.run.

Verified end-to-end against MySQL: deploy mirror inserts rows, union
view (list_running_deckies) returns them with source="fleet",
teardown mirror removes them. Works from both sync (CLI) and async
(API handler) call sites.
2026-04-26 21:05:50 -04:00
095500ae9a feat(db): FleetDecky table mirrors decnet-state.json into the DB
Adds a fleet_deckies table so DB-only consumers (orchestrator, web
dashboard, REST API) can see unihost / MACVLAN / IPVLAN deckies
without reading the JSON state file. Mirrors DeckyShard field-for-field.

Composite PK (host_uuid, name) future-proofs for a mothership that
runs both a local fleet and acts as a swarm master. host_uuid defaults
to the "local" sentinel — no FK to swarm_hosts because the local
mothership isn't enrolled as a worker.

Repo additions: upsert_fleet_decky, delete_fleet_decky,
list_fleet_deckies, list_running_fleet_deckies,
update_fleet_decky_state, plus list_running_deckies which unions
topology + fleet + shard sources for the orchestrator.

Smoke-tested round-trip against MySQL: upsert, list_running, union
view (source="fleet"), delete.
2026-04-26 21:00:01 -04:00
c595d039bd feat(sniffer): ISN sequence classifier (reuses seq_class helper)
Mirrors the IP-ID classifier for TCP ISN values: per-source-IP rolling
deque (maxlen=8) populated from each inbound SYN's tcp.seq, classified
on every emission. A 'random' verdict is the modern norm; 'incremental',
'zero', or 'constant' indicates legacy stacks or hand-rolled raw-socket
tooling — a strong fingerprint signal.

Active prober now also captures server_isn (single sample, not classified
in-flight; downstream consumers correlating multi-probe results can apply
seq_class.classify_sequence themselves).

Profiler rollup carries the latest non-'unknown' label into
attacker.tcp_fingerprint. Dedup key already covers isn_class from
the previous commit, so transitions emit cleanly.

UI surfaces ISN class as a colour-coded tag with a ⚠ glyph for
non-random verdicts, since they're the genuinely interesting case.
2026-04-26 20:30:24 -04:00
0e40cc8ae1 feat(sniffer): IP-ID sequence classifier (random/incremental/zero/constant)
Adds a per-source-IP rolling sample buffer (deque, maxlen=8) for IP-ID
values seen on attacker SYNs and a stdlib-only classifier in
decnet/sniffer/seq_class.py. Each new SYN appends ip.id and re-classifies
the buffer; the result is logged on tcp_syn_fingerprint events alongside
sample count.

The dedup key now folds in ipid_class so a transition from 'unknown' to
a definitive verdict emits exactly one fresh event instead of being
suppressed by the old (os|options) key. Profiler rollup carries the
latest non-'unknown' label into attacker.tcp_fingerprint.

UI surfaces it as a colour-coded tag in the TCP STACK panel: random
neutral, incremental amber, zero/constant green (the strong signal).
2026-04-26 20:28:32 -04:00
b0b08754d0 feat(fingerprint): ToS/DSCP/ECN extraction in active + passive TCP fingerprint
Active prober now reads ip.tos from the SYN-ACK and emits tos/dscp/ecn
alongside the existing TTL/window/options fields. dscp is folded into the
fingerprint hash so different DSCP markings produce distinct signatures.

Passive sniffer logs the same three fields on tcp_syn_fingerprint events;
profiler rollup carries them into the attacker tcp_fingerprint snapshot;
AttackerDetail's TCP STACK panel now surfaces DSCP and ECN cells.
2026-04-26 20:25:37 -04:00
3de19eb102 feat(orchestrator): periodic prune of orchestrator_events
Every 100 ticks, trim per-dst_decky_uuid history down to 10000 rows
(oldest first). Keeps the events table bounded on long-running fleets
without paying the cost on every write.
2026-04-26 19:58:43 -04:00
5b5ff54fa2 feat(web): orchestrator events read API + SSE stream
GET /api/v1/orchestrator/events — paginated list with optional
kind=traffic|file filter. GET /api/v1/orchestrator/events/stream —
SSE: snapshot on connect, live forward of orchestrator.> bus events
mapped to 'traffic' / 'file' SSE event names.

Repo gains list_orchestrator_events(limit, offset, kind?, since_ts?),
count_orchestrator_events(kind?), and prune_orchestrator_events
(per_dst_cap=10000) for periodic worker-side trimming.
2026-04-26 19:58:12 -04:00
900c0c3ef5 refactor(bus): rename ORCHESTRATOR_ACTIVITY → ORCHESTRATOR_TRAFFIC
Aligns the bus token with the DB column value; OrchestratorEvent.kind
is 'traffic'/'file' but the topic was 'activity'/'file'. The asymmetry
made consumer code (UI filter, SSE event names) need a translation
layer. No external subscribers existed yet.
2026-04-26 19:53:40 -04:00
4c37ece39e feat(orchestrator): MVP synthetic life-injection worker (SSH only)
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.

MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
  to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
  system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
  to TCP-connect dst:22 and read the SSH banner — real on-the-wire
  SSH-protocol traffic without shipping creds. File action drops or
  refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
  are running). Diurnal shaping, role-aware pairing, and session-aware
  backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
  worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
  end-to-end one-tick integration with FakeBus + SQLite.
2026-04-26 19:43:20 -04:00
d531cea536 feat(web): read-only campaigns API + SSE + frontend
API: /api/v1/campaigns (paginated list), /api/v1/campaigns/{uuid}
(soft-merge chain follow), /api/v1/campaigns/{uuid}/identities
(member identities), and /api/v1/campaigns/events (SSE under
campaign.> + JWT-via-?token=, snapshot-on-connect). Mirror of the
identity router; same auth, same shape, same OpenAPI tags pattern.

Frontend: CampaignDetail.tsx page (same visual vocabulary as
IdentityDetail), useCampaignStream hook (mirror of
useIdentityStream), /campaigns/:id route, IdentityDetail's
CAMPAIGN badge becomes clickable and navigates to the campaign.
useIdentityStream now listens for identity.campaign.assigned so
the badge appears live without a manual refresh.
2026-04-26 09:20:17 -04:00
75af00c9c8 test(clustering): full-bound passes through production campaign clusterer
Runs the chained identity + campaign clustering pipeline against all
seven fixtures via from_synthetic / from_synthetic_identity adapters
and ratchets every YAML floor to 1.0 — the production clusterer
(and the reference clusterers used in the per-fixture tests) all
score perfectly across ARI / homogeneity / completeness /
singleton_recall on each fixture.

Three substrate fixes surfaced by the ratchet:

- Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved
  into cohort_weight to prevent fleet-scarcity false-merges (F1's
  shared_wordlist failure mode). Tier weight raised to 1.0 so
  shared payload+C2 alone crosses threshold (F5's intended pass).
- Adapter: from_synthetic_identity now reads SyntheticSession
  started_at + duration_s for session_windows and per-decky
  timestamps (the production-row adapter still uses start_ts/end_ts
  when available).
- Fixture data: paused_campaign.yaml's JA3 collided exactly with
  vpn_hopping.yaml's (same TLS extension list). The collision
  fused two unrelated campaigns under the chained identity layer
  in the noise_floor composite. Made paused's JA3 distinct.

Also wires Campaign / CampaignsResponse into models/__init__.py's
__all__ that was missed in the schema commit.
2026-04-26 09:13:59 -04:00
6936a1426c feat(clustering): campaign-clusterer worker + bus topics + CLI
The campaign clusterer worker mirrors the identity-side worker shell
(bus connect, heartbeat, control listener, slow-tick fallback) but
wakes on identity.> instead of attacker.> — campaign-level work is
gated on identity-layer changes, not raw observations.

The connected-components implementation reads identities via
list_identities_for_clustering, projects them with from_identity_row,
runs union-find over combined_campaign_weight, writes campaigns rows,
sets attacker_identities.campaign_id, and runs the same revocable-
merge pass as the identity layer (a merged-out campaign whose
identities no longer co-cluster with the winner gets revoked).

Bus: adds campaign.> family (formed / identity.assigned / merged /
unmerged) plus the cross-family identity.campaign.assigned so
existing identity-stream subscribers see the badge update without
having to subscribe to campaign.>. Wiki Service-Bus.md updated in
wiki-checkout in the same wave per the project's bus-signals
discipline.

CLI: decnet campaign-clusterer registered as master-only via
MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity
clusterer command surface.
2026-04-26 09:04:00 -04:00
0946bab424 feat(clustering): campaign-level similarity primitives
The signal taxonomy for the campaign clusterer (next commit). Mirror
of the identity-layer module but with edge families that don't
translate 1:1: phase-handoff (load-bearing for F5 multi_operator —
the signal the identity-side fingerprint-disagreement veto deliberately
isn't), shared-infra (vetoed at identity level, primary positive
signal here), temporal-overlap (pairwise-relative — F7 invariance
preserved), cohort (weak supporting weight only).

Tier weights tuned so phase-handoff alone crosses threshold (F5),
shared-infra + temporal-overlap together cross (canonical co-op
pattern), and shared-infra + cohort together do NOT (F1
shared_wordlist's failure mode). The F7 time-shift invariant is
explicitly tested on every time-bearing edge and on the combined
weight.
2026-04-26 08:57:46 -04:00
0a1cf65ddb feat(db): Campaign SQLModel + repo write/read methods
Adds the campaigns table and the BaseRepository / SQLModelRepository
methods that the campaign-clusterer worker (next commit) needs to
populate it. Mirrors the AttackerIdentity layer: schema_version from
day one for federation gossip, soft-merge via merged_into_uuid with a
chain-walking get_campaign_by_uuid, list_campaigns excluding merged-
out rows while list_all_campaigns returns the unfiltered set for the
revoke pass. attacker_identities.campaign_id gets a real FK now that
the target table exists.
2026-04-26 08:54:28 -04:00