First contract commit of TTP tagging. Shapes only — no behavior.
- TTPTag SQLModel: deterministic UUIDv5 PK; (source_kind, source_id)
discriminated provenance; nullable attacker_uuid + identity_uuid
with ON DELETE CASCADE; native sqlalchemy.JSON evidence column;
required attack_release; CheckConstraint('attacker_uuid IS NOT
NULL OR identity_uuid IS NOT NULL'); composite indexes for the
primary query patterns (identity_uuid+technique_id,
attacker_uuid+technique_id, technique_id+created_at); __init__
guard raising ValueError with both anchor names in the message
(belt-and-braces for MySQL <8.0.16 where CHECK is silent).
- compute_tag_uuid(): RFC-4122 UUIDv5 over the six tag-identity
fields under a fixed _TTP_TAG_NS. Pure, deterministic, replay-safe.
- Per-source_kind evidence TypedDicts (CommandEvidence,
IntelEvidence, EmailEvidence, CanaryFingerprintEvidence) — PII
rule lives in the type: EmailEvidence has no field for raw rcpt
addresses or body bytes.
- TTPRule + TTPRuleState tables for the DatabaseRuleStore (E.1.11).
- All symbols re-exported from decnet.web.db.models per the
package's existing convention.
Tests for invariants (CHECK behavior, evidence round-trip across
SQLite+MySQL, idempotency property, init-guard ordering) land in
E.2.1/E.2.2 with xfail-strict markers per Appendix E discipline.
- Add _MixinBase abstract class to _helpers.py: declares _session(),
_deserialize_attacker(), _assert_pending(), _check_and_bump_version(),
and list_running_topology_deckies() so mypy can see cross-mixin contracts
- Add _require(val, msg) helper for narrowing T | None → T
- Inherit _MixinBase in all 26 leaf mixin classes
- Wrap SQLAlchemy column method calls (.is_(), .like(), .notin_(), .in_(),
.contains()) with col() from sqlmodel — fixes attr-defined false positives
caused by pydantic plugin typing class-level fields as Python value types
- Wrap select(Model.field) with select(col(Model.field)) for column projections
- Add pyproject.toml [[tool.mypy.overrides]] to disable arg-type in
sqlmodel_repo.*: pydantic plugin resolves .where(Model.field == v) as
where(bool), a false positive; call-arg still catches real argument errors
- Remove 9 stale # type: ignore comments (logging, helpers, credentials)
- Fix telemetry.py traced() overload no-redef + misc
- Fix logs.py datetime/str operator and nullable PK comparison with col()
- sqlmodel_repo/ now has 0 mypy errors
Replace repo: BaseRepository with a structural TopologyRepository protocol
in persistence.py and allocator.py. All read methods now return typed DTOs
(TopologySummary, LANRow, DeckyRow, EdgeRow) instead of raw dicts, eliminating
silent field-shape regressions across the topology subsystem.
TopologySummary gains email_personas and language_default so api_personas.py
can continue reading those fields via attribute access. hydrate() converts
DTOs to dicts before passing to _backfill_decky_configs, keeping the mutable
working-state function dict-based at its boundary. All production callers
(router handlers, mutator, CLI, heartbeat) migrated from dict/get access to
attribute access. 134 tests pass.
MutationRow.op was str despite _MUTATION_OPS existing; Topology.mode/status,
TopologyDecky.state, TopologyMutation.op/state carried valid values only in
comments; deferred json import had no justification.
- Promote _MUTATION_OPS before table classes so table fields can reference it
- Add sa_column=Column(String) on each Literal-annotated table field to satisfy
SQLModel 0.0.38 column-type inference
- Move import json to module top; remove deferred import inside _decode_json_payload
- MutationRow.op: str -> _MUTATION_OPS
Pure tarball construction (_build_tarball, _render_*, _iter_included,
_SYSTEMD_UNITS) moved to decnet/swarm/bundle_builder.py — no FastAPI
dependency, independently testable. EnrollBundleRequest/Response moved
to decnet/web/db/models/swarm.py alongside the other swarm DTOs.
Router drops from 504 to 260 lines; keeps only the in-memory token
registry, sweeper, and endpoints.
Attacker probe emails are now forwarded by the master (realism worker)
rather than inside the MACVLAN container, which has no internet gateway.
- New smtp.probe.pending bus topic: ingester publishes when smtp_relay
message_stored fires; worker subscribes and does the actual delivery
- decnet/orchestrator/drivers/smtp_relay.py: pure-sync forward_probe()
reads the .eml from disk and sends via smtplib on a thread executor
- worker.py: _run_smtp_probe_listener + _handle_probe_pending subtask;
limit enforced via count_probe_relays() (DB-backed, restart-safe)
- bounties.py: count_probe_relays() query on probe_relay bounty type
- fleet.py: get_fleet_decky_by_name() to pull service config from DB
- services/smtp_relay.py: upstream_* and probe_limit fields defined in
config_schema but NOT injected into container env (credentials stay
out of docker env vars)
- ingester.py: stripped of smtplib; publishes probe.pending and exits
- tests: assert upstream keys absent from container environment
Once a fingerprint canary's HTTP beacon passes all 4 validation layers
and the trigger row lands, the token is immediately set to state=revoked
and canary.<id>.revoked is published on the bus. The slug lookup is
tightened to only return planted tokens, so subsequent requests to the
same URL silently return the transparent GIF without persisting anything
(stealth posture preserved). Plain http/dns canaries with no
fingerprint_nonce are not affected.
Changes:
- sqlmodel_repo/canary.py: add state == "planted" filter to
get_canary_token_by_slug so revoked slugs resolve to None
- worker.py: after record_canary_trigger, if parsed_fp survived all
layers and token has a fingerprint_nonce, call
update_canary_token_state("revoked") + publish CANARY_REVOKED; errors
are best-effort (trigger row already landed)
- test_worker_http.py: assert state=revoked in test_fp_valid_nonce_persists;
new test_fp_deregisters_slug_after_valid_hit (second hit records nothing);
new test_plain_http_canary_not_deregistered (env_file stays planted)
Adds per-mint nonce gating, structural shape validation, mint UUID
consistency checks, and a per-(token, IP) rate limiter to the canary
worker so attackers who extract a canary from a decky filesystem cannot
poison fingerprint forensics by replaying or forging ?d= submissions.
Changes:
base.py
fingerprint_nonce: Optional[str] added to CanaryArtifact so generators
can surface the nonce to the cultivator without coupling the generator
directly to DB code.
obfuscator.py
nonce_for(callback_token, mint_uuid): HMAC-SHA256 keyed on
DECNET_CANARY_FINGERPRINT_SECRET, truncated to 16 hex chars.
FingerprintSecretMissing raised at mint time if env var is unset.
render_fingerprint_js() now accepts nonce= and substitutes MINT_NONCE.
fingerprint_payload.js
New MINT_NONCE placeholder. Appended as &k= on all beacon URLs (bare-open,
single-shot, chunked). Using &k= avoids colliding with &n= (chunk total).
fingerprint_html.py / fingerprint_svg.py
Derive nonce via nonce_for() and pass to render_fingerprint_js(). Set
artifact.fingerprint_nonce so the cultivator can persist it.
cultivator.py
Passes fingerprint_nonce into create_canary_token() when present on the
artifact; NULL for all non-fingerprint generators.
canary.py (model)
fingerprint_nonce: Optional[str] = Field(default=None, max_length=16)
added to CanaryToken. None for non-fingerprint tokens.
worker.py
_extract_fingerprint now returns (meta_dict, parsed_fp) tuple.
_record_hit accepts parsed_fp + raw_nonce and runs 4 layers after
token lookup: nonce match, shape check, mint UUID consistency, rate limit.
Each failure sets _fp_invalid_* flag and drops structured _fp.
Trigger row always lands regardless.
tests/canary/conftest.py
Session-scoped autouse fixture sets DECNET_CANARY_FINGERPRINT_SECRET so
fingerprint generator and worker tests work offline.
tests
5 new worker HTTP tests and 2 new generator tests covering each
validation layer.
- DeckyServiceAddRequest gains an optional `config: dict` field, validated
against the service's config_schema before any state mutation (400 on
bad type, no half-written rows).
- Engine: add_service threads `config` into _add_topology_service /
_add_fleet_service, persisting validated cfg to decky_config.service_config
BEFORE compose regen so the first `up -d --build` materialises the env on
the new container. No follow-up apply needed.
- Frontend: shared AddServiceConfigModal — same wizard accordion shape, used by:
* DeckyCard's ADD SERVICE picker (Fleet & MazeNET inspectors via shared component)
* MazeNET Inspector's ADD SERVICE picker
* MazeNET palette drag-drop onto a deployed decky
Empty-schema services short-circuit to a one-click add (no modal flash).
Operator can cancel; errors surface in the modal.
- Tests: add_service config plumbing — persist, drop unknown keys, 400-equivalent
on bad types, back-compat empty-config.
- Drive-by: fix stale repo-method names in test_services_live.py
(create_topology_decky → add_topology_decky, get_topology_decky → list+pick helper,
service.added → service_added topic).
- GET /topologies/services/{name}/schema serves the declared ServiceConfigField
metadata so the Inspector can auto-render forms.
- PUT /(topologies/{id}/)deckies/{decky}/services/{svc}/config persists the
validated dict (DB + compose); container untouched (Save).
- POST /(topologies/{id}/)deckies/{decky}/services/{svc}/apply persists then
force-recreates <decky>-<svc> so the new env takes effect (Apply, destructive).
- New engine helper update_service_config wires both fleet and topology paths
through the existing _persist_fleet_change / _rerender_topology_compose
machinery; emits decky.<name>.service_config_changed on the bus.
Two related fixes that came out of running the W5 tests locally:
1. tests/__init__.py — empty file, makes 'tests/' a package so pytest
stops inserting it into sys.path. Without it, 'tests/docker/'
(the docker-image test category) shadowed the installed docker SDK
on every engine-touching test in the repo:
module 'docker' has no attribute 'DockerClient'
Pytest's default --import-mode=prepend was the culprit; making
tests/ a package is the cheapest fix and doesn't change
--import-mode for the whole tree.
2. delete_topology_decky / delete_topology_edge / delete_lan grow an
'enforce_pending: bool = True' kwarg. Default preserves the HTTP
CRUD guard (api_decky_crud / api_edge_crud / api_lan_crud get the
409 for free). apply_remove_decky / apply_detach_decky /
apply_remove_lan now pass enforce_pending=False — the mutator
queue is the live-editing surface and has its own active-topology
gating; the repo's pending-only guard was for design-time CRUD
that mustn't bypass it. Without this, apply_remove_decky was
silently broken on active topologies pre-W5; W5's new test
surfaced it on first run.
10/10 new W5 tests pass; 58/58 across mutator + topology suites.
Adds a fleet_singletons array to ServiceCatalogResponse so per-decky
add UIs can filter out services like LLMNR that run once fleet-wide
(and would 422 server-side at the live add endpoint).
The existing 'services: list[str]' field is unchanged for back-compat
with MazeNET/useMazeApi.ts:257; the new field is additive.
decnet_web/src/hooks/useServiceRegistry.ts wraps the endpoint with a
module-scoped cache (registry only changes on BYOS install / plugin
drop, neither of which happens mid-session) and exposes a precomputed
.perDecky list so consumers don't need to re-derive the diff.
decnet.engine.services_live exposes add_service / remove_service for
both fleet and topology decky scopes. The host's _compose() wrapper
already supported per-service targeting (up --no-deps -d <svc>,
stop, rm -f); what was missing was the orchestration around it:
* add: validate against decnet.services.registry (rejects unknown +
fleet_singleton); persist the new services list; re-render the
per-scope compose file (so future redeploys reflect the change);
run docker compose up -d --no-deps --build <decky>-<svc>.
* remove: stop + rm -f the service container; persist; re-render
compose so a future up -d doesn't bring it back.
Both publish decky.<name>.service.added / .removed on the bus, with
the post-mutation services list. Topic constants added to
decnet.bus.topics; the matching wiki entry in wiki-checkout/Service-Bus.md
ships in a separate commit on the wiki repo (wiki-checkout/ is gitignored).
Four new admin endpoints:
* POST/DELETE /api/v1/deckies/{name}/services{,/svc}
* POST/DELETE /api/v1/topologies/{id}/deckies/{name}/services{,/svc}
ServiceMutationError messages are mapped at the API boundary to 404
(decky/topology missing), 409 (idempotency violation), 422 (unknown
or fleet_singleton service).
Extracts the docker-exec-with-base64-stdin pattern out of canary/planter
and orchestrator/drivers/ssh into a shared decnet.decky_io package.
Both consumers now delegate; the canary planter test still proves the
contract end-to-end.
Adds POST/DELETE /api/v1/deckies/files for arbitrary file drops.
Container resolution is shared with the canary path: topology_id absent
means fleet (<name>-ssh), present routes through resolve_decky_container
which picks <name>-ssh when the topology decky exposes ssh, else the
topology base container decnet_t_<id8>_<name>.
Path validation rejects relative paths and '..' traversal at the request
model layer. Bad base64 → 400; unknown topology → 404; decky not in
topology → 422; docker exec failure → 409.
POST /api/v1/canary/tokens grows an optional topology_id field. When
present, the server hydrates the topology, validates the named decky is
in it, and resolves the docker container via
planter.resolve_topology_container — <name>-ssh if the decky exposes ssh,
else the topology base container. Absent ⇒ fleet semantics, unchanged.
The token row gets a nullable topology_id column (no migration helper
per pre-v1 policy). GET /api/v1/canary/tokens accepts ?topology_id= as
a filter. DELETE re-resolves the container at revoke time so a
redeployed topology is still reachable.
422 when the named decky isn't in the topology; 404 when the topology
itself doesn't exist.