Worker unit mirrors decnet-webhook.service shape: simple type, runs
as the decnet user/group, append-style log file, full security
hardening (NoNewPrivileges/ProtectSystem/ProtectHome/PrivateTmp/
LockPersonality + the rest). Added /var/lib/decnet to ReadWritePaths
because the API process persists operator-uploaded canary blobs there.
CAP_NET_BIND_SERVICE granted (ambient + bounded) so an operator who
overrides DECNET_CANARY_DNS_PORT to 53 or HTTP_PORT to 80/443 in
.env.local doesn't need to fight systemd. The defaults stay
unprivileged (5353 / 8088).
Added decnet-canary.service to decnet.target so 'systemctl start
decnet.target' brings it up alongside the rest of the workers.
decnet init auto-discovers deploy/decnet-*.service.j2 files (per
decnet/cli/init.py:_install_units), so no further wiring is needed:
running 'decnet init' on a fresh host installs the new unit.
Static tests confirm the unit references decnet canary, depends on
the bus, carries the standard security directives, and is listed
in the master target.
Hooks decnet.canary.planter.seed_baseline into the deploy() flow's
fleet-mirror step. After upserting a FleetDecky as 'running' we seed
the configured baseline canary set on the freshly-deployed decky.
Persona detection: read d.nmap_os (Windows -> windows path-mapping,
otherwise linux). Failures are logged and surface as state=failed
rows in the UI; the deploy itself MUST NOT abort (resilience
principle in CLAUDE.md).
Tests confirm:
- seed_baseline produces one row per configured generator per decky;
- the deployer source wires seed_baseline inside a try/except so a
failure can't abort the deploy.
Two sub-routers under /api/v1/canary:
blobs (operator-uploaded artifacts, deduped by sha256):
- POST /blobs (multipart upload; admin)
- GET /blobs (list with token_count; admin)
- DELETE /blobs/{uuid} (refcount-aware; 409 when referenced; admin)
tokens (per-decky planted artifacts):
- POST /tokens (generate or instrument + plant; admin)
- GET /tokens?decky_name=&kind=&state= (filter; viewer)
- GET /tokens/{uuid} (detail; viewer)
- GET /tokens/{uuid}/preview (instrumented bytes; admin)
- GET /tokens/{uuid}/triggers (paged callback log; viewer)
- DELETE /tokens/{uuid} (revoke + bus event; admin)
XOR validation: exactly one of blob_uuid / generator must be set.
Path validation rejects relative paths, NUL bytes, newlines, and '..'
segments. Every body-bearing route documents 400 plus 401/403/404 as
applicable.
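The two validation rules above can be sketched as plain functions. This is a minimal illustration, not the API's actual code: the function names are hypothetical, and the path check assumes POSIX-style placement paths only (Windows-shaped persona paths would need their own rule).

```python
from typing import Optional


def validate_source(blob_uuid: Optional[str], generator: Optional[str]) -> None:
    """Require exactly one of blob_uuid / generator (XOR)."""
    # Both None or both set compare equal here, which is the invalid case.
    if (blob_uuid is None) == (generator is None):
        raise ValueError("exactly one of blob_uuid or generator must be set")


def validate_placement_path(path: str) -> str:
    """Reject relative paths, NUL bytes, newlines, and '..' segments."""
    if not path.startswith("/"):
        raise ValueError("placement_path must be absolute")
    if "\x00" in path or "\n" in path or "\r" in path:
        raise ValueError("placement_path contains a control character")
    if ".." in path.split("/"):
        raise ValueError("placement_path contains a '..' segment")
    return path
```

In a FastAPI route either ValueError would typically be mapped to the documented 400 response.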
Stdlib MIME sniffer (no python-magic dep) covers PNG, JPEG, GIF, PDF,
HTML, XML, DOCX, XLSX, JSON, YAML, TOML, and text/plain; everything
else falls through to the passthrough instrumenter.
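A stdlib-only sniffer of this kind usually combines magic-byte checks with a peek inside ZIP containers. The sketch below covers only a subset of the listed types (no JSON/YAML/TOML detection), and the function names and the octet-stream fallback label are assumptions made for illustration.

```python
import io
import zipfile

DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
XLSX = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def _sniff_zip(data: bytes) -> str:
    """Tell DOCX and XLSX apart by their well-known member names."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return "application/octet-stream"
    if "word/document.xml" in names:
        return DOCX
    if "xl/workbook.xml" in names:
        return XLSX
    return "application/zip"


def sniff_mime(data: bytes) -> str:
    """Best-effort MIME guess from leading bytes; unknown binary is octet-stream."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    if data.startswith(b"PK\x03\x04"):
        return _sniff_zip(data)
    head = data[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "text/html"
    if head.startswith(b"<?xml"):
        return "application/xml"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "application/octet-stream"
    return "text/plain"
```

A caller would map "application/octet-stream" (and any other unrecognised type) to the passthrough instrumenter.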
Tests run end-to-end through the live FastAPI app (planter docker
exec is patched); 17 cases covering dedup, refcount, lifecycle,
XOR validation, path validation, and 404 paths.
decnet canary launches the HTTP + DNS callback receiver via
decnet.canary.worker.run. Mirrors the shape of decnet webhook
(typer command with --daemon flag, asyncio.run in the foreground).
Deliberately NOT added to MASTER_ONLY_COMMANDS — every host that
hosts deckies runs its own canary worker, and the bus events stay
local to that host (per-host webhook fanout handles SIEM egress).
decnet canary worker hosts both callback surfaces in one process:
- HTTP: a tiny FastAPI app on its own port (default 8088). The only
meaningful route is GET /c/{slug} which looks up the slug, persists
a CanaryTrigger, publishes canary.<id>.triggered, and returns a 1x1
transparent GIF. Unknown slugs return the same response (stealth);
no decnet strings leak in headers/banners; docs/openapi/redoc are
disabled. X-Forwarded-For is honored.
- DNS: an authoritative UDP server for *.<canary_zone> using
asyncio.DatagramProtocol with stdlib-only DNS wire-format parsing
(no dnslib dep). Same lookup -> persist -> publish flow, plus a
sinkhole A record (192.0.2.1) so the attacker's resolver doesn't
loop on NXDOMAIN. Single-label slugs only; multi-label probes
return NXDOMAIN. Pointer loops in malformed queries are caught
(10-hop cap) so an adversarial packet can't wedge the parser.
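The pointer-loop guard can be illustrated with a small stdlib-only name reader. This is a sketch of the general technique (RFC 1035 compression pointers with a hop cap), not the worker's actual parser; the function name and error messages are hypothetical.

```python
from typing import List, Optional, Tuple

MAX_POINTER_HOPS = 10  # cap from the description above


def read_name(packet: bytes, offset: int) -> Tuple[List[str], int]:
    """Decode a DNS name at offset; return (labels, offset after the name)."""
    labels: List[str] = []
    hops = 0
    end: Optional[int] = None  # where parsing resumes after the first pointer
    while True:
        if offset >= len(packet):
            raise ValueError("name runs past end of packet")
        length = packet[offset]
        if (length & 0xC0) == 0xC0:  # compression pointer: 14-bit target offset
            if offset + 1 >= len(packet):
                raise ValueError("truncated compression pointer")
            hops += 1
            if hops > MAX_POINTER_HOPS:
                raise ValueError("too many compression pointers")
            if end is None:
                end = offset + 2
            offset = ((length & 0x3F) << 8) | packet[offset + 1]
            continue
        if length & 0xC0:
            raise ValueError("unsupported label type")
        offset += 1
        if length == 0:  # root label terminates the name
            break
        if offset + length > len(packet):
            raise ValueError("label runs past end of packet")
        labels.append(packet[offset:offset + length].decode("ascii", "replace"))
        offset += length
    return labels, (end if end is not None else offset)
```

A self-referencing pointer such as b"\xc0\x00" exhausts the hop budget and raises ValueError instead of spinning forever.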
Tests cover both surfaces without privileged sockets:
- HTTP via Starlette TestClient: known/unknown slug, headers, XFF,
stealth-string assertions.
- DNS via direct DatagramProtocol drive: known slug -> ANSWER,
unknown -> NXDOMAIN, pointer-loop -> ValueError, malformed
packet -> silent drop.
Plant / revoke / seed_baseline using the same docker-exec-with-sh-c
pattern proven by decnet/orchestrator/drivers/ssh.py:_run_file.
Each plant call composes a single sh script:
    mkdir -p <dirname> && printf %s <base64> | base64 -d > <path> &&
    chmod <mode> <path> && touch -d @<mtime> <path>
Base64-on-the-host / decode-in-the-container keeps binary artifacts
(DOCX/PDF/PNG) safe across the argv boundary; the placement_path,
mode, and mtime are shlex-quoted.
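As a rough sketch of how such a script could be composed, assuming hypothetical helper names (build_plant_script, build_docker_argv) rather than the planter's real ones:

```python
import base64
import posixpath
import shlex
from typing import List


def build_plant_script(path: str, payload: bytes, mode: str, mtime: int) -> str:
    """Compose the single sh script that writes payload to path in the container."""
    # The base64 alphabet (A-Z, a-z, 0-9, +, /, =) needs no shell quoting
    # when it appears as an argument word.
    encoded = base64.b64encode(payload).decode("ascii")
    dirname = posixpath.dirname(path) or "/"
    q_path = shlex.quote(path)
    return (
        f"mkdir -p {shlex.quote(dirname)} && "
        f"printf %s {encoded} | base64 -d > {q_path} && "
        f"chmod {shlex.quote(mode)} {q_path} && "
        f"touch -d @{shlex.quote(str(mtime))} {q_path}"
    )


def build_docker_argv(container: str, script: str) -> List[str]:
    """argv for subprocess.run; the script travels as one argument to sh -c."""
    return ["docker", "exec", container, "sh", "-c", script]
```

Passing argv as a list (no shell on the host side) means only the in-container sh ever interprets the script, which is why quoting is applied exactly once.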
State transitions hit the repo: planted -> failed on docker error
with stderr captured into last_error. Bus events fire on success
(canary.<id>.placed) and on revoke (canary.<id>.revoked) — wrapped
in try/except so a downed bus never blocks a placement.
seed_baseline(decky_name, repo) is the deploy-hook entry point —
reads DECNET_CANARY_BASELINE (default git_config,env_file,honeydoc,
aws_creds), persists one row per generator, plants each. Failed
placements are logged but do NOT abort; the deployer hook treats
the return list as informational.
Seven instrumenters that mutate operator-supplied artifacts to
embed the callback URL:
- passthrough — bytes unchanged; only DNS-callback tokens trip
detection, with the slug embedded in the placement path
- plain — substitutes {{CANARY_URL}}/{{CANARY_HOST}} placeholders;
falls back to appending a comment line whose prefix adapts to the
apparent file syntax (#, //, ;)
- html — injects a 1x1 tracking pixel before </body>, appends
if the close tag is missing
- docx — direct zipfile manipulation (no python-docx dep):
inserts an external-image Relationship into word/_rels/document.xml.rels
and a matching <w:drawing> element before </w:body>
- xlsx — sibling of docx; injects an external-image relationship
into xl/_rels/workbook.xml.rels (orphan rels are still fetched on
open by most viewers)
- pdf — uses pikepdf to install /OpenAction /URI on the catalog;
rejects with a clear message when pikepdf isn't installed
- image — uses Pillow to embed slug + URL in PNG tEXt / JPEG
comment; rejects with a clear message when Pillow isn't installed
DOCX and XLSX share the rId allocator + relationship injector via
the docx module; both work on stdlib zipfile only.
Tests synthesise minimal real DOCX/XLSX fixtures inline, round-trip
each instrumenter, and assert the callback URL ends up in the
mutated bytes while the file still parses.
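For illustration, the html instrumenter's inject-or-append behaviour might look roughly like the sketch below. The function name and the exact pixel markup are assumptions; only the "before </body>, else append" rule comes from the description above.

```python
import re
from html import escape

_BODY_CLOSE = re.compile(rb"</body\s*>", re.IGNORECASE)


def inject_pixel(document: bytes, callback_url: str) -> bytes:
    """Insert a 1x1 tracking pixel before the last </body>, or append it."""
    pixel = (
        f'<img src="{escape(callback_url, quote=True)}" '
        'width="1" height="1" alt="">'
    ).encode("utf-8")
    matches = list(_BODY_CLOSE.finditer(document))
    if not matches:
        # No closing tag: append so the pixel still lands in the document.
        return document + pixel
    cut = matches[-1].start()
    return document[:cut] + pixel + document[cut:]
```

Working on bytes keeps the operator's original encoding untouched outside the inserted fragment.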
Five built-in generators that produce deterministic fake artifacts
keyed by the token slug:
- aws_creds — passive [default]/[prod] credentials block, no
callback wiring (AWS-key tokens require an external
trap, which is post-v1)
- git_config — .git/config with origin url = http_base/c/<slug>/repo.git
- env_file — .env with API_BASE_URL + WEBHOOK_NOTIFY_URL embedding
the callback URL plus inert realism filler
- ssh_key — PEM-shaped fake private key whose host comment carries
<slug>.<dns_zone> when DNS is deployed, else the
http_base host
- honeydoc — minimal HTML report with a 1x1 tracking-pixel <img>
whose src is the callback URL; fallback for the
deploy-time baseline before the operator uploads a
real DOCX/PDF
Tests assert byte-stability (same ctx -> same bytes), slug presence
in the embedded fields, that aws_creds is intentionally URL-free,
and that every artifact carries operator-facing notes for the
preview endpoint.
Mirrors the decnet.intel layout (base + factory + lazy concrete
imports). Defines:
- CanaryArtifact / CanaryContext dataclasses + the generator and
instrumenter ABCs they share
- factory dispatch for generators (git_config/env_file/ssh_key/
aws_creds/honeydoc) and instrumenters (docx/xlsx/pdf/html/image/
plain/passthrough), plus pick_instrumenter_for_mime() for MIME-driven
dispatch on operator uploads
- persona-aware default placement paths (Linux vs. Windows-shaped)
and absolute-path validation that the API will use to validate
operator-supplied placement_path values
- on-disk blob store: sha256-keyed two-level fan-out, idempotent
writes, refcount-aware unlink (the DB row is the source of truth)
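A content-addressed store with these properties can be sketched in a few lines. The ab/cd/<sha256> directory layout and the function names are assumptions for illustration; the refcount is passed in by the caller because the DB row, not the filesystem, is the source of truth.

```python
import hashlib
import os
from pathlib import Path


def blob_path(root: Path, sha256_hex: str) -> Path:
    """Two-level fan-out: <root>/<first 2 hex>/<next 2 hex>/<full digest>."""
    return root / sha256_hex[:2] / sha256_hex[2:4] / sha256_hex


def put_blob(root: Path, data: bytes) -> str:
    """Idempotent write keyed by content hash; returns the sha256 hex digest."""
    digest = hashlib.sha256(data).hexdigest()
    target = blob_path(root, digest)
    if target.exists():
        return digest  # same bytes already stored
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_suffix(".tmp")
    tmp.write_bytes(data)
    os.replace(tmp, target)  # atomic rename within one filesystem
    return digest


def unlink_blob(root: Path, sha256_hex: str, refcount: int) -> bool:
    """Remove the file only when the DB reports no remaining references."""
    if refcount > 0:
        return False
    blob_path(root, sha256_hex).unlink(missing_ok=True)
    return True
```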
Also covers prior commits' tests (bus topics, models, repo CRUD)
under tests/canary/. 79 tests, all pass.
Adds the abstract surface on BaseRepository and the SQLModel-backed
implementation (shared by SQLite and MySQL) for:
- canary blobs (upsert-by-sha256, list-with-refcount, refcount-aware delete)
- canary tokens (create, slug lookup, list with filters, state update)
- canary triggers (record+bump-counters atomically, list, attribute)
The triggers path is a single session that inserts the row and bumps the
parent token's counters together, so a subscriber that reads the token
right after the bus event sees the updated count. Blob delete refuses
while any token (including revoked) still references the blob; pre-v1
revoked tokens stick around for forensic value.
Three new tables for the canary tokens feature:
- canary_blobs — operator-uploaded source artifacts, deduped by sha256
- canary_tokens — one planted artifact in one decky; carries the
callback slug, generator/instrumenter, and lifecycle
- canary_triggers — append-only log of every callback hit; attacker_id
back-filled by the correlator
Pydantic request/response shapes live in the same file per the
single-source-of-truth convention. No migrations file — pre-v1
SQLModel.metadata.create_all() covers it.
Reserved topic family for the upcoming canary-tokens feature so the
correlator and webhook fanout can subscribe to canary.> from day one.
No producers yet; planter, decnet canary worker, and API will publish
in subsequent commits.
* Dashboard / Layout / index CSS — flexbox cleanup so the sidebar
scrolls independently and dashboard panels fill available height
without overflowing the viewport (min-height: 0 on the flex
ancestors that were collapsing).
* pyproject.toml — add sqlite_vec runtime dep (groundwork for an
embeddings-backed feature ANTI is wiring up separately).
* decnet/templates/{rdp,smb}/ntlmssp.py — minimal Type 3 (Authenticate)
parser shared between the SMB and RDP-NLA templates. Lands NTLM
creds in the universal Credential table with secret_kind=ntlmssp_v1
/ ntlmssp_v2 and secret_b64 = base64 of the NtChallengeResponse so
the bounty pipeline can feed the right hashcat mode.
* scripts/decnet-init.sh — convenience wrapper around `sudo decnet init
--force` that targets the current working directory; saves operators
retyping the install paths during dev iterations.
New dashboard surface for editing the global emailgen persona pool:
the JSON file that fleet (MACVLAN/IPVLAN) and SWARM-shard mail deckies
pull from. MazeNET topology personas are out of scope here; they're
configured per-topology in the topology editor.
Backend:
* GET/PUT /api/v1/emailgen/personas — admin-write, viewer-read. PUT
validates with the same Pydantic schema the worker uses
(parse_personas), drops invalid entries with a warning, returns 400
only when the entire payload fails. Every response includes the pool
file's on-disk path, so a CLI-driven backup workflow stays visible.
Frontend:
* PersonaGeneration.tsx + .css — table + add/edit modal with the full
EmailPersona schema (name, email, role, tone, mannerisms list,
language, signature, active hours, reply latency, uses_llms_heavily).
Local edits are batched; explicit "SAVE CHANGES" writes back, with a
dirty-indicator pill and a "DISCARD" reset. Email uniqueness is
enforced client-side so the scheduler never picks the same persona
as both sender + recipient.
* Sidebar AUTOMATION group gains a "Persona Generation" entry next to
Orchestrator; route registered at /persona-generation.
The worker reads the same on-disk file the API writes — see
decnet.orchestrator.emailgen.global_pool. The API resets the
in-process cache on every read/write so the worker picks up dashboard
edits within its next tick rather than waiting on mtime.
The SSE pipe at /orchestrator/events/stream was already streaming
'orchestrator.email.{decky_uuid}' events (the subscription is for the
'orchestrator.>' wildcard), but the consumer side dropped them on the
floor. Three fixes to close the loop:
* useOrchestratorStream.ts now registers an 'email' SSE listener — the
EventSource silently ignores frames whose event name has no listener,
so missing this entry meant every email frame was dropped before
reaching the page's onEvent handler.
* /api/v1/orchestrator/events accepts kind=email and dispatches to
list_orchestrator_emails, adapting rows to the existing wire shape:
subject -> action, sender_email -> src_decky_uuid, recipient_email
-> dst_decky_uuid. Email-specific extras (thread_id, language,
mail_decky_uuid, message_id, in_reply_to) ride along as top-level
keys.
* Orchestrator.tsx gains an 'email' tab in the kind filter and a
branch in the row renderer / inspector that:
- shows full sender / recipient (no UUID truncation),
- chips the language code next to the subject,
- relabels ACTION as SUBJECT in the inspector and surfaces
thread / in-reply-to / mail-decky details.
The 'all' tab continues to show traffic+file only (today's behavior);
operators see emails by switching to the email tab. A union view at
the API layer is the obvious follow-up but not necessary for now.
Plug emailgen into the systemd-supervised fleet:
- New deploy/decnet-emailgen.service.j2 mirroring decnet-orchestrator's
shape: simple service, restart-on-failure, docker supplementary group
(driver shells `docker exec` to drop EMLs into the spool), the same
hardening directives as the rest of the fleet.
- decnet.target now Wants both decnet-emailgen.service and
decnet-orchestrator.service. Orchestrator's absence from the target
was a historical oversight — fixing it here while the file is open.
`decnet init` already globs deploy/decnet-*.service.j2 so the new unit
ships automatically; no init-side change needed. Emailgen-specific env
knobs (DECNET_EMAILGEN_LLM, _MODEL, _PERSONAS, _TIMEOUT) are documented
in the unit and operator-tunable via /opt/decnet/.env.local.
Two-layer gating per CLAUDE.md:
- registration-time: emailgen added to MASTER_ONLY_GROUPS so agents
don't see the sub-app in 'decnet --help' at all.
- body-guard: _require_master_mode('emailgen ...') at the top of every
sub-command body so a direct callable import (third-party tooling)
still bails on agent hosts.
Matches the convention used for 'swarm', 'topology', 'geoip'. SWARM
agents push their generated mail through the master's emailgen worker
(or none at all); cross-agent emailgen federation stays out of scope.
Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:
    decnet/orchestrator/emailgen/llm/
      base.py         LLMBackend Protocol + LLMResult + LLMTimeout
      factory.py      get_llm() reads DECNET_EMAILGEN_LLM
      impl/ollama.py  current 'ollama run' subprocess path
      impl/fake.py    canned-output backend used by tests
Driver now takes an LLMBackend on construction (or inherits the
factory default). Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster. Swapping
Ollama for the Anthropic API / vLLM / llama.cpp is now a third branch
in factory.py; no driver rewrite needed.
Mirrors the convention used by decnet.web.db.factory + decnet.bus.factory
per the provider-subpackages-from-day-one rule in memory.
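The shape described above could be sketched as follows. Only the names LLMBackend, LLMResult, LLMTimeout, get_llm, FakeBackend, and DECNET_EMAILGEN_LLM come from this message; the generate() signature, the LLMResult fields, the registry dict, and the "fake" default are assumptions chosen so the sketch runs on its own.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Protocol


class LLMTimeout(Exception):
    """Raised when a backend exceeds its time budget."""


@dataclass(frozen=True)
class LLMResult:
    text: str
    backend: str


class LLMBackend(Protocol):
    def generate(self, prompt: str, timeout: float) -> LLMResult: ...


class FakeBackend:
    """Canned-output backend for tests; never touches a subprocess."""

    def __init__(self, canned: str = "Hi team, notes attached.") -> None:
        self.canned = canned

    def generate(self, prompt: str, timeout: float) -> LLMResult:
        return LLMResult(text=self.canned, backend="fake")


# Hypothetical registry: adding a backend means adding one entry here.
_BACKENDS: Dict[str, Callable[[], LLMBackend]] = {"fake": FakeBackend}


def get_llm(name: Optional[str] = None) -> LLMBackend:
    """Pick a backend by explicit name, else by DECNET_EMAILGEN_LLM."""
    key = (name or os.environ.get("DECNET_EMAILGEN_LLM", "fake")).lower()
    try:
        return _BACKENDS[key]()
    except KeyError:
        raise ValueError(f"unknown LLM backend: {key!r}") from None
```

A driver that accepts any LLMBackend on construction can then be tested with FakeBackend() directly, without patching subprocess calls.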
Two changes that unwind earlier MazeNET-only assumptions and fix a
realism tell:
1. Persona resolution is now per-decky-source, not topology-only. The
scheduler walks the union view (list_running_deckies, including
fleet MACVLAN/IPVLAN + SWARM shards) and picks the right persona
list for each source:
* topology decky -> Topology.email_personas (per-topology richness
preserved)
* fleet / shard -> a single host-wide pool loaded from disk
(DECNET_EMAILGEN_PERSONAS, /etc/decnet/email_personas.json, or
~/.decnet/email_personas.json)
Operators install the global pool via 'decnet emailgen
import-personas <file>' which validates with the same Pydantic
schema the worker uses.
2. The driver now runs 'touch -d <Date>' inside the docker exec right
after the EML write so file mtime matches the email's RFC 2822
Date: header. Without this, an attacker running 'ls -lt' on the spool
sees every email clustered inside the worker's tick window, and that
cluster is itself a timing tell.
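Deriving the touch argument from the message itself might look like this sketch. The helper name is hypothetical, and it uses the '@<epoch>' form of 'touch -d' (supported by GNU coreutils) rather than passing the raw Date string, which is an assumption about the driver.

```python
import shlex
from email import message_from_bytes
from email.utils import parsedate_to_datetime


def touch_command_for(eml: bytes, path: str) -> str:
    """Build a 'touch -d' command that sets mtime to the EML's Date: header."""
    date_header = message_from_bytes(eml)["Date"]
    if date_header is None:
        raise ValueError("message has no Date: header")
    # Aware datetimes convert exactly; a header without a zone offset would
    # be interpreted in the local timezone by timestamp().
    epoch = int(parsedate_to_datetime(date_header).timestamp())
    return f"touch -d @{epoch} {shlex.quote(path)}"
```

Appending this command to the same 'sh -c' script that writes the file keeps the write and the mtime fix in one docker exec.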
CLI now exposes 'decnet emailgen' as a sub-app with 'run' (default,
backwards-compatible with bare 'decnet emailgen') and 'import-personas'.
list_running_deckies carries topology_id through so consumers can resolve
the parent topology without a second round-trip.
When IMAP_EMAIL_SEED / POP3_EMAIL_SEED points at a directory of .eml
files (the orchestrator emailgen worker's drop path,
/var/spool/decnet-emails/ by convention), the bait mailbox is replaced
with those LLM-generated, persona-driven, threaded messages. Empty /
missing dir keeps the hardcoded fallback so a fresh deployment is never
silent. Cached with mtime invalidation + a short TTL so a hot mailbox
doesn't pay the parse cost on every IMAP/POP3 command.
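One way to combine a short TTL with mtime invalidation is sketched below. The class name, the loader callback, and the 5-second default are assumptions; the templates' real cache may differ.

```python
import os
import time
from typing import Callable, List, Optional


class SpoolCache:
    """Cache a parsed spool; re-stat at most once per TTL, reload on mtime change."""

    def __init__(
        self,
        directory: str,
        loader: Callable[[str], List[str]],
        ttl: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._directory = directory
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[List[str]] = None
        self._mtime: Optional[int] = None
        self._checked_at = 0.0

    def get(self) -> List[str]:
        now = self._clock()
        # Hot path: inside the TTL window, skip even the stat() call.
        if self._value is not None and now - self._checked_at < self._ttl:
            return self._value
        try:
            mtime: Optional[int] = os.stat(self._directory).st_mtime_ns
        except FileNotFoundError:
            mtime = None  # missing dir behaves like an empty spool
        if self._value is None or mtime != self._mtime:
            self._value = self._loader(self._directory) if mtime is not None else []
            self._mtime = mtime
        self._checked_at = now
        return self._value
```

An empty result is the caller's cue to serve the hardcoded fallback mailbox. Note that a directory's mtime only changes when its direct entries change, so a spool with per-thread subdirectories would need to stat those as well.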
Replaces the DEBT-026 stub on both templates that named the env var but
never wired it through.
Second orchestrator worker (decnet emailgen) that drips persona-driven,
threaded, multi-language fake emails into running mail deckies. Personas
live on Topology.email_personas; topology-wide language_default falls
through to any persona that doesn't pin its own. Em-dashes are
suppressed at the prompt layer by default and only lifted for personas
explicitly marked uses_llms_heavily — em-dashes are an LLM tell and a
flat corpus of em-dashed mail is a giveaway.
EML delivery writes into /var/spool/decnet-emails/<thread>/<msg>.eml on
the mail decky via docker exec; wiring the IMAP/POP3 templates to read
from that spool (replacing the hardcoded _BAIT_EMAILS) is the next step.
Mirrors the CredentialsInspector pattern: clicking a row opens a
right-edge drawer with the full event payload pretty-printed and
copyable. The table view truncates the src/dst id to 8 chars; the
drawer shows the full identifier plus a SOURCE chip
(TOPOLOGY / FLEET / SHARD) so operators can tell at a glance whether
the orchestrator hit a MazeNET decky, a unihost fleet decky, or a
SWARM shard.
Source detection is purely client-side based on id shape — bare UUID
→ topology, "local:*" → fleet, "<host>:*" → shard. The server
already returns a normalized id from list_running_deckies; this
inspector just labels it.
Backdrop click closes via target===currentTarget guard (per the
React stop-propagation memory: never use stopPropagation on drawer
panels — it breaks native event delegation).
Live (in-flight stream) events use synthetic uuids prefixed "live-";
the drawer hides the EVENT UUID row and shows "LIVE EVENT" in the
header for those, since the server-side id won't exist until the
backend persists the row.
Once the orchestrator started seeing fleet + SWARM shard sources via
list_running_deckies (a844148), every event row landing on a fleet decky
broke the FK to topology_deckies — the column now carries opaque ids
("local:omega-decky" for fleet, "host_uuid:decky_name" for shards) that
will never match topology_deckies.uuid.
Symptom on the operator's mothership:
IntegrityError 1452 — orchestrator_events_ibfk_2 FK violated on every
tick once the reconciler populated fleet_deckies.
Index on dst_decky_uuid is preserved (the dashboard reads
"events for this decky" frequently); only the FK is removed. Keeps
data integrity loose by design — events are append-only history that
should outlive the deckies they reference.
Existing MySQL deployments need the FK dropped manually:
    ALTER TABLE orchestrator_events
      DROP FOREIGN KEY orchestrator_events_ibfk_2,
      DROP FOREIGN KEY orchestrator_events_ibfk_1;
SQLite users are unaffected — SQLite doesn't enforce FKs by default.
The Workers panel (Config → Workers tab) hardcodes its row list in
KNOWN_WORKERS — by design, so a rogue publisher can't inject UI rows.
Three heartbeat-emitting workers were missing:
* clusterer — behavioral clustering (decnet/clustering/)
* campaign-clusterer — campaign assembly (decnet/clustering/campaign/)
* reconciler — host-local fleet convergence (added in 430262e)
Each already publishes on system.<name>.health via run_health_heartbeat,
so they show up live the moment they're added to the registry — no
frontend or subscriber wiring needed (Config.tsx renders whatever
/workers returns).
Also added to _PREFERRED_ORDER in start-all so START ALL WORKERS brings
them up in dependency-friendly order: data-plane → reconciler → intel
→ clustering → output → orchestrator.
Three deployable units (listener, web, swarmctl) intentionally remain
absent from KNOWN_WORKERS — they don't emit heartbeats (CLI / static
server / one-shot tooling), so they'd permanently render as UNKNOWN
and confuse operators. Adding them is a separate decision that needs
a "synthesize installed-but-silent rows" pass on the registry.
Two pieces, one PR because they share a deployment surface:
1. systemd. decnet-reconciler.service.j2 mirrors the orchestrator unit
shape (docker group, hardened sandbox, append-logs). Read-only
/var/lib/decnet so it can read decnet-state.json without write
access. Auto-discovered by `decnet init` via the existing
decnet-*.service.j2 glob — no init.py change needed. Added to
decnet.target so `systemctl start decnet.target` brings it up
alongside collector / sniffer / mutator / etc. Also added to the
agent reaper script so self-destruct cleans it up on workers.
2. Bus signal. reconcile_once now publishes
`decky.<host_uuid:name>.state` on every insert / delete /
state-changed transition. Reuses the existing DECKY_STATE topic
family (no bus/topics.py change → no wiki update needed per the
bus-signals doc rule). Composite host_uuid:name segment keeps
fleet rows distinguishable from MazeNET TopologyDecky rows whose
ids are bare UUIDs. Quiet ticks publish nothing — convergence
means silence.
Bus is plumbed through the worker, defaults to None for unit-test
callers. publish_safely keeps the source-of-truth contract: DB write
is authoritative, the publish is best-effort notification.
Captures previous_state into a local before update_fleet_decky_state
runs — a fake repo that mutates rows in-place would otherwise see the
post-update state and report previous == current. Real repos don't
have this concern but the fix is cheap and makes the function less
order-dependent.
Switches _one_tick from list_running_topology_deckies to
list_running_deckies (the union view added in 095500a). Resolves the
permanent "no actionable deckies (running+ssh count=0)" log on hosts
running only unihost MACVLAN / IPVLAN decoys — the orchestrator now
sees fleet_deckies rows alongside MazeNET topology rows and SWARM
DeckyShard rows.
Also fixes the misleading log message: the old "running+ssh count=N"
reported the *pre-filter* total (count of all running deckies, not
the SSH-eligible subset that scheduler.pick actually evaluates). New
line breaks down running, ssh_eligible, and per-source counts so
debugging "why isn't it picking?" no longer requires reading
scheduler internals.
Regression test: orchestrator integration suite now seeds fleet_deckies
rows (not just topology_deckies) and verifies a tick picks them and
records an event with dst="local:fleet-*" — proving the original bug
on the operator's mothership is fixed.
Adds decnet.fleet.reconciler — a pure async function plus a long-lived
worker — that periodically reconciles the three sources of truth on a
DECNET host:
1. decnet-state.json (CLI-canonical fleet record)
2. fleet_deckies table (DB mirror, written by engine.deployer)
3. docker inspect (actual per-container runtime state)
Drift handling:
* JSON has X, DB doesn't → INSERT (deploy ran with DB offline)
* DB has X (this host), JSON doesn't → DELETE (teardown ran with DB offline)
* Both have X, docker disagrees → flip state to running/failed/degraded
* Docker socket unreachable → leave existing state alone (don't
torch every row to torn_down)
Cross-host safety: deletions are scoped to host_uuid for the local host;
a master that runs both a local fleet and swarm workers will never
clobber a peer's slice.
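The drift rules and the host_uuid scoping can be expressed as a pure planning function. The sketch below is an illustration only: the Plan type, the function name, the pre-aggregated per-decky docker state, and treating a decky missing from docker as "failed" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Plan:
    inserts: List[str] = field(default_factory=list)
    deletes: List[str] = field(default_factory=list)
    state_changes: Dict[str, str] = field(default_factory=dict)


def plan_reconcile(
    json_names: Set[str],
    db_rows: Dict[Tuple[str, str], str],    # (host_uuid, name) -> state
    docker_states: Optional[Dict[str, str]],  # None when the socket is unreachable
    host_uuid: str = "local",
) -> Plan:
    """Compute inserts/deletes/state flips for this host's slice only."""
    local = {name: state for (host, name), state in db_rows.items() if host == host_uuid}
    plan = Plan(
        inserts=sorted(json_names - set(local)),  # JSON has X, DB doesn't
        deletes=sorted(set(local) - json_names),  # DB has X, JSON doesn't
    )
    if docker_states is None:
        return plan  # docker unreachable: leave existing states alone
    for name in sorted(json_names & set(local)):
        observed = docker_states.get(name, "failed")
        if observed != local[name]:
            plan.state_changes[name] = observed
    return plan
```

Keeping the planning step free of I/O makes each drift direction testable with plain dicts and sets.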
CLI:
    decnet reconcile --once          # one-shot, prints counts
    decnet reconcile [--interval N]  # long-lived worker, mirrors
                                     # orchestrator's lifecycle (control
                                     # listener + heartbeat + tick loop)
Promotes decnet/fleet.py → decnet/fleet/ package so the reconciler can
live alongside it without name collision (build_deckies_from_ini and
all_service_names re-exported unchanged via __init__.py).
14 new tests cover state aggregation rules, all four drift directions,
host_uuid scoping, docker-unreachable safety, and worker shutdown via
the bus control event.
The unihost API path delegates to engine.deployer.deploy(), which now
writes both decnet-state.json (existing) and the fleet_deckies DB
table (added in 646aeec). A code comment now makes the single-sink
design explicit so future maintainers don't add a parallel save_state /
upsert_fleet_decky call here.
No behavioral change — every fleet-creation path on every host (CLI
deploy, this unihost API path, and per-worker SWARM agent deploys)
already routes through the engine.deployer single sink.
CLI deploy now writes both surfaces: decnet-state.json (existing,
canonical for offline / no-API hosts) and the new fleet_deckies DB
table (visible to orchestrator, web dashboard, REST API).
Best-effort: a DB outage logs a warning and returns. The JSON file
remains the source of truth for `decnet status`, `decnet teardown`,
sniffer, and collector — operators on a CLI-only host keep working.
_run_async helper bridges sync deploy() into the async repository.
Always uses a fresh thread because the API handler at
web.router.fleet.api_deploy_deckies invokes deploy() from inside a
FastAPI event loop, which would otherwise break asyncio.run.
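The fresh-thread bridge can be sketched as below. asyncio.run raises RuntimeError when called from a thread that already has a running event loop, so the coroutine is run on a new thread with its own loop. The function name and the result-box mechanics are illustrative, not the project's actual _run_async.

```python
import asyncio
import threading
from typing import Any, Callable, Coroutine, Dict


def run_async(factory: Callable[[], Coroutine[Any, Any, Any]]) -> Any:
    """Run a coroutine to completion from sync code, even under a running loop."""
    box: Dict[str, Any] = {}

    def runner() -> None:
        try:
            # The new thread has no running loop, so asyncio.run is safe here.
            box["value"] = asyncio.run(factory())
        except BaseException as exc:  # re-raised in the calling thread
            box["error"] = exc

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    thread.join()  # blocks the caller until the coroutine finishes
    if "error" in box:
        raise box["error"]
    return box["value"]
```

Taking a factory rather than a coroutine object means the coroutine is created on the thread that runs it. The join blocks the caller, which is acceptable for a short best-effort DB mirror but would stall an event loop on long operations.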
Verified end-to-end against MySQL: deploy mirror inserts rows, union
view (list_running_deckies) returns them with source="fleet",
teardown mirror removes them. Works from both sync (CLI) and async
(API handler) call sites.
Adds a fleet_deckies table so DB-only consumers (orchestrator, web
dashboard, REST API) can see unihost / MACVLAN / IPVLAN deckies
without reading the JSON state file. Mirrors DeckyShard field-for-field.
Composite PK (host_uuid, name) future-proofs for a mothership that
runs both a local fleet and acts as a swarm master. host_uuid defaults
to the "local" sentinel — no FK to swarm_hosts because the local
mothership isn't enrolled as a worker.
Repo additions: upsert_fleet_decky, delete_fleet_decky,
list_fleet_deckies, list_running_fleet_deckies,
update_fleet_decky_state, plus list_running_deckies which unions
topology + fleet + shard sources for the orchestrator.
Smoke-tested round-trip against MySQL: upsert, list_running, union
view (source="fleet"), delete.
TTL extraction was already wired in the active prober and passive sniffer
plus profiler rollup; the checkbox was just stale. TCP/IP stack now
includes ToS/DSCP/ECN, IP-ID sequence classification, and ISN sequence
classification as of the previous three commits.
Mirrors the IP-ID classifier for TCP ISN values: per-source-IP rolling
deque (maxlen=8) populated from each inbound SYN's tcp.seq, classified
on every emission. A 'random' verdict is the modern norm; 'incremental',
'zero', or 'constant' indicates legacy stacks or hand-rolled raw-socket
tooling — a strong fingerprint signal.
Active prober now also captures server_isn (single sample, not classified
in-flight; downstream consumers correlating multi-probe results can apply
seq_class.classify_sequence themselves).
Profiler rollup carries the latest non-'unknown' label into
attacker.tcp_fingerprint. Dedup key already covers isn_class from
the previous commit, so transitions emit cleanly.
UI surfaces ISN class as a colour-coded tag with a ⚠ glyph for
non-random verdicts, since they're the genuinely interesting case.
Adds a per-source-IP rolling sample buffer (deque, maxlen=8) for IP-ID
values seen on attacker SYNs and a stdlib-only classifier in
decnet/sniffer/seq_class.py. Each new SYN appends ip.id and re-classifies
the buffer; the result is logged on tcp_syn_fingerprint events alongside
sample count.
The dedup key now folds in ipid_class so a transition from 'unknown' to
a definitive verdict emits exactly one fresh event instead of being
suppressed by the old (os|options) key. Profiler rollup carries the
latest non-'unknown' label into attacker.tcp_fingerprint.
UI surfaces it as a colour-coded tag in the TCP STACK panel: random is
neutral, incremental is amber, and zero/constant is green (the strong
signal).
Active prober now reads ip.tos from the SYN-ACK and emits tos/dscp/ecn
alongside the existing TTL/window/options fields. dscp is folded into the
fingerprint hash so different DSCP markings produce distinct signatures.
Passive sniffer logs the same three fields on tcp_syn_fingerprint events;
profiler rollup carries them into the attacker tcp_fingerprint snapshot;
AttackerDetail's TCP STACK panel now surfaces DSCP and ECN cells.
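The ToS byte splits into a 6-bit DSCP (upper bits) and a 2-bit ECN field (lower bits). The sketch below shows that split plus one hypothetical way to fold dscp into a fingerprint hash; the hash recipe and both function names are assumptions, not the prober's actual scheme.

```python
import hashlib
from typing import Dict


def split_tos(tos: int) -> Dict[str, int]:
    """Split the IPv4 ToS byte into its DSCP and ECN components."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    return {"tos": tos, "dscp": tos >> 2, "ecn": tos & 0x03}


def fingerprint_hash(ttl: int, window: int, options: str, dscp: int) -> str:
    """Hypothetical fingerprint: any change in dscp changes the digest."""
    material = f"{ttl}|{window}|{options}|{dscp}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]
```

For example, a ToS byte of 0xB8 yields dscp 46 (Expedited Forwarding) with ecn 0.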
Replaces inline styles + .bounty-root reuse with a dedicated
.orchestrator-root scope. Adds animated status pill (live/connecting/
error), bordered seg-group kind filter that matches DeckyFleet's
fleet-filter-group, dedicated kind chips (matrix-green for traffic,
violet for file), failure-row tint, and a brief 'fresh' tint for
just-prepended live rows that fades after 5s.
DEBT-042 — orchestrator failure-count badge is computed from the
in-memory SSE window; remediation is a dedicated stats endpoint.
DEBT-043 — no frontend test framework configured; the planned
Orchestrator.tsx component test couldn't be written without first
adding vitest + RTL.
New /orchestrator route. Paginated read-only event list with kind
filter (all|traffic|file), pause-stream toggle, in-window failure
badge ('X failures / 1h'), and an SSE-driven 'live' status pill.
Streamed rows prepend on top up to a 500-row in-memory cap.
Sidebar gains an AUTOMATION nav group; Orchestrator is the first
child. Future workers (mutator/prober activity) plug in as siblings.
Every 100 ticks, trim per-dst_decky_uuid history down to 10000 rows
(oldest first). Keeps the events table bounded on long-running fleets
without paying the cost on every write.
GET /api/v1/orchestrator/events — paginated list with optional
kind=traffic|file filter. GET /api/v1/orchestrator/events/stream —
SSE: snapshot on connect, live forward of orchestrator.> bus events
mapped to 'traffic' / 'file' SSE event names.
Repo gains list_orchestrator_events(limit, offset, kind?, since_ts?),
count_orchestrator_events(kind?), and prune_orchestrator_events
(per_dst_cap=10000) for periodic worker-side trimming.
Aligns the bus token with the DB column value; OrchestratorEvent.kind
is 'traffic'/'file' but the topic was 'activity'/'file'. The asymmetry
made consumer code (UI filter, SSE event names) need a translation
layer. No external subscribers existed yet.
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.
MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
to TCP-connect dst:22 and read the SSH banner — real on-the-wire
SSH-protocol traffic without shipping creds. File action drops or
refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
are running). Diurnal shaping, role-aware pairing, and session-aware
backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
end-to-end one-tick integration with FakeBus + SQLite.
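The scheduler policy and the traffic action above can be sketched as
follows. This is not the worker's code: pick_action and
read_ssh_banner are hypothetical names, and idling when fewer than two
SSH-capable deckies are running is an assumption, since the MVP scope
only states the 50/50 split for the >=2 case.

```python
import random
import socket
from typing import Optional, Sequence, Tuple

Action = Tuple[str, Optional[str], str]  # (kind, src, dst)


def pick_action(ssh_deckies: Sequence[str], rng: random.Random) -> Optional[Action]:
    """Pick a 50/50 traffic/file action among running SSH-capable deckies.

    Returns None with fewer than two deckies (assumed idle behaviour).
    """
    if len(ssh_deckies) < 2:
        return None
    pool = list(ssh_deckies)
    if rng.random() < 0.5:
        src, dst = rng.sample(pool, 2)  # two distinct deckies
        return ("traffic", src, dst)
    return ("file", None, rng.choice(pool))


def read_ssh_banner(host: str, port: int = 22, timeout: float = 3.0) -> str:
    """TCP-connect and read the server's SSH identification string.

    No credentials are sent; the server speaks first in the SSH protocol.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = sock.recv(256)
    return data.decode("ascii", errors="replace").strip()
```

Passing the random.Random instance in keeps the policy testable with a
fixed seed.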
Adds proper /identities and /campaigns list pages following the
Bounty/Attackers convention (page-header + page-title-group +
controls-row + logs-section + logs-table + EmptyState). Both pages
live-update via the existing identity / campaign SSE streams.
Sidebar: Attackers, Identities, Campaigns now group under a
THREAT DATA NavGroup, matching the SWARM grouping pattern.
CampaignDetail and IdentityDetail rewritten to use the house class
system (page-header / logs-section / chip / dim-chip) instead of
inline styles. The campaign chip on IdentityDetail navigates to
/campaigns/:uuid; both pages share a small fp-group helper for
fingerprint listings (added to Dashboard.css).
decnet-clusterer.service.j2 ships the unit for the identity clusterer
that landed last session (the unit was overlooked then): bus-woken on
attacker.>, publishes identity.> events.
decnet-campaign-clusterer.service.j2 ships the campaign clusterer
from this session: bus-woken on identity.>, publishes campaign.>
events plus the cross-family identity.campaign.assigned. Ordered
After=decnet-clusterer.service so the identity layer is up before
the campaign layer reads its rows.
decnet.target Wants both new units. Both follow the same security
hardening profile as enrich + reuse-correlator.
API: /api/v1/campaigns (paginated list), /api/v1/campaigns/{uuid}
(soft-merge chain follow), /api/v1/campaigns/{uuid}/identities
(member identities), and /api/v1/campaigns/events (SSE under
campaign.> + JWT-via-?token=, snapshot-on-connect). Mirror of the
identity router; same auth, same shape, same OpenAPI tags pattern.
Frontend: CampaignDetail.tsx page (same visual vocabulary as
IdentityDetail), useCampaignStream hook (mirror of
useIdentityStream), /campaigns/:id route, IdentityDetail's
CAMPAIGN badge becomes clickable and navigates to the campaign.
useIdentityStream now listens for identity.campaign.assigned so
the badge appears live without a manual refresh.
Runs the chained identity + campaign clustering pipeline against all
seven fixtures via from_synthetic / from_synthetic_identity adapters
and ratchets every YAML floor to 1.0: the production clusterer and
the reference clusterers used in the per-fixture tests all score 1.0
on ARI / homogeneity / completeness / singleton_recall for every
fixture.
Three substrate fixes surfaced by the ratchet:
- Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved
into cohort_weight to prevent fleet-scarcity false-merges (F1's
shared_wordlist failure mode). Tier weight raised to 1.0 so
shared payload+C2 alone crosses threshold (F5's intended pass).
- Adapter: from_synthetic_identity now reads SyntheticSession
started_at + duration_s for session_windows and per-decky
timestamps (the production-row adapter still uses start_ts/end_ts
when available).
- Fixture data: paused_campaign.yaml's JA3 collided exactly with
vpn_hopping.yaml's (same TLS extension list). The collision
fused two unrelated campaigns under the chained identity layer
in the noise_floor composite. Made paused's JA3 distinct.
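To make the tuning change concrete, a sketch of a Jaccard score over
payload and C2 indicators only. The field names (payloads, c2,
deckies) and the choice to pool both indicator types into one tagged
set are assumptions for illustration; the real shared_infra may
combine them differently.

```python
from typing import Dict, Set


def jaccard(a: Set, b: Set) -> float:
    """|a & b| / |a | b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shared_infra(a: Dict[str, Set[str]], b: Dict[str, Set[str]]) -> float:
    """Jaccard over payload + C2 indicators; decky sets are ignored here."""
    feats_a = {("payload", p) for p in a["payloads"]} | {("c2", c) for c in a["c2"]}
    feats_b = {("payload", p) for p in b["payloads"]} | {("c2", c) for c in b["c2"]}
    return jaccard(feats_a, feats_b)
```

Tagging each indicator with its type keeps a payload hash and a C2
address from matching by accident if their strings coincide.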
Also adds Campaign / CampaignsResponse to __all__ in
models/__init__.py, which the schema commit missed.
The campaign clusterer worker mirrors the identity-side worker shell
(bus connect, heartbeat, control listener, slow-tick fallback) but
wakes on identity.> instead of attacker.> — campaign-level work is
gated on identity-layer changes, not raw observations.
The connected-components implementation reads identities via
list_identities_for_clustering, projects them with from_identity_row,
runs union-find over combined_campaign_weight, writes campaigns rows,
sets attacker_identities.campaign_id, and runs the same revocable-
merge pass as the identity layer (a merged-out campaign whose
identities no longer co-cluster with the winner gets revoked).
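A minimal union-find sketch of the connected-components step, under
stated assumptions: weight stands in for combined_campaign_weight, the
explicit threshold parameter and the all-pairs loop are illustrative,
and the function name connected_components is hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def connected_components(
    ids: Sequence[str],
    weight: Callable[[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Group ids whose pairwise weight reaches threshold, transitively."""
    parent: Dict[str, str] = {i: i for i in ids}

    def find(x: str) -> str:
        # Path halving: point each visited node at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link every pair whose weight crosses the threshold.
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if weight(a, b) >= threshold:
                union(a, b)

    groups: Dict[str, List[str]] = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Each returned group would map to one campaigns row; identities that
link to nothing come back as singleton groups.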
Bus: adds campaign.> family (formed / identity.assigned / merged /
unmerged) plus the cross-family identity.campaign.assigned so
existing identity-stream subscribers see the badge update without
having to subscribe to campaign.>. Wiki Service-Bus.md updated in
wiki-checkout in the same wave per the project's bus-signals
discipline.
CLI: decnet campaign-clusterer registered as master-only via
MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity
clusterer command surface.