Move artifact path validation + symlink-escape check out of the
admin-gated download endpoint into decnet/artifacts/paths.py so the
TTP EmailLifter can disk-reach .eml files at tag-time without
duplicating regex/root logic (DEBT-047).
The router now catches ArtifactPathError and re-raises HTTPException(400);
behavior is unchanged.
"T1595" alone is opaque; "T1595 — Active Scanning" tells you the
story at a glance. The names come from a backend-side static catalogue
pinned to the same ATT&CK release as the rule engine
(_ATTACK_RELEASE = "v15.1") — names are the canonical MITRE labels,
not author-supplied strings on rules, so a rule author can't typo a
name and the entire fleet sees the typo.
- New `decnet/ttp/attack_catalog.py` with `TECHNIQUE_NAMES` covering
every technique_id + sub_technique_id emitted by `rules/ttp/`
(R0001..R0058 → 69 IDs in the v0 pack).
- `IdentityTechniqueRow` / `TechniqueRollupRow` / `CampaignTechniqueRow`
/ `TTPTagDetailRow` gain optional `technique_name` /
`sub_technique_name` fields. Repo + router populate them from the
catalogue at row-construction time. None when an ID isn't in the
catalogue — UI falls back to the bare ID.
- Coverage test (`tests/ttp/test_attack_catalog.py`) walks every
YAML rule and asserts every emitted ID has a catalogue entry, so
a future rule author who forgets to update the catalogue gets a
loud failure rather than a silent UI fallback.
Frontend:
- `TTPsObservedSection` shows "T1595.002 — Active Scanning:
Vulnerability Scanning" instead of just the ID, with overflow
ellipsis + tooltip for narrow viewports. Inspector header /
TECHNIQUE row also surface the names.
The TTPsObservedSection rollup tells the operator "we saw T1059" but
not why. Click any technique row → side drawer opens listing every
ttp_tag row in scope with the persisted evidence JSON, firing
rule_id / rule_version, source_kind / source_id, confidence, and
created_at. Mirrors the CredentialReuseInspector / BountyInspector
pattern (drawer-backdrop + bd-head/bd-body + kvs grid).
Backend:
- New `GET /api/v1/ttp/tags/by-{scope}/{uuid}/{technique_id}`
(`scope ∈ {identity, attacker, session}`, optional
`?sub_technique_id=`, `?limit=` capped to 1000). Returns raw
TTPTag rows newest-first.
- New `TTPTagDetailRow` Pydantic model + re-export.
- New repo method `list_tags_by_scope_and_technique` on
TTPMixin (+ abstract on BaseRepository) — single query branched
on scope; identity scope projects through `Attacker.identity_id`
the same way `list_techniques_by_identity` does.
- Tests: evidence round-trips, sub_technique filter, JWT-required,
empty scope, unknown scope rejected.
Frontend:
- New `TTPInspector.tsx` + `TTPInspector.css` (violet accent, slide
animation, focus-trapped panel matching the existing inspector
family).
- `TTPsObservedSection`'s TechniqueBar is now click+keyboard
activatable; clicking opens the inspector for that
(technique, sub_technique) tuple.
mypy clean. 532 passed in the targeted sweep.
The endpoint was a contract-phase stub returning `[]` even though the
RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty
table; operators couldn't tell whether anything was wired up.
- `api_list_rules` now calls `get_rule_store().load_compiled()` and
serializes each CompiledRule + its operational state into a
RuleCatalogueRow. Sorted by rule_id for stable golden snapshots.
- Add `description: str` to RuleSchema (pydantic) and CompiledRule
(NamedTuple, defaulted) + propagate through `_compile_one` so the
catalogue surfaces the human-readable YAML description, not just
the slug-style `name`.
- Update `tests/ttp/test_rule_engine.py` _fields assertion for the
new column; new `tests/api/ttp/test_rules_catalogue.py` pins the
catalogue contents (R0001/R0014 presence, row shape, sort order).
Worker behaviour is unchanged: it was already loading rules
correctly. This is purely a read-side wiring fix on the operator API.
Five GET rollup endpoints (techniques, by-identity, by-attacker,
by-campaign, by-session) and the Navigator export (fleet +
per-identity) now call into the TTPMixin methods. Rule catalogue
endpoint still returns [] — backed by the RuleStore which lands
at E.3.5/E.3.6.
Mounts /api/v1/ttp/* with empty-list / empty-Navigator responses.
GET endpoints viewer-gated; POST/DELETE /rules/{rule_id}/state
admin-gated server-side. POST parses JSON manually so a malformed
body returns the documented 400 (per feedback_schemathesis_400).
Drops xfail-strict markers from E.2.8 tests now that the router is
mounted; 26 tests pass against the contract handlers.
Replace repo: BaseRepository with a structural TopologyRepository protocol
in persistence.py and allocator.py. All read methods now return typed DTOs
(TopologySummary, LANRow, DeckyRow, EdgeRow) instead of raw dicts, eliminating
silent field-shape regressions across the topology subsystem.
TopologySummary gains email_personas and language_default so api_personas.py
can continue reading those fields via attribute access. hydrate() converts
DTOs to dicts before passing to _backfill_decky_configs, keeping the mutable
working-state function dict-based at its boundary. All production callers
(router handlers, mutator, CLI, heartbeat) migrated from dict/get access to
attribute access. 134 tests pass.
await inside a threading.Lock yields to the event loop while the OS
thread still holds the lock — potential deadlock under FastAPI thread
pool dispatch. asyncio.Lock is the correct primitive for async
critical sections. Also fixed stale diurnal.py docstring that had the
delegation direction backwards.
_parse_weights was silently dropping content_class values that don't
belong on their target list with no operator feedback. Changed it to
return (weights, dropped), apply_payload to collect and return all
dropped names, and put_config to include dropped_entries in the
response when non-empty.
get_config was calling planner.apply_payload on every GET request, racing
concurrent reads on module-level globals. Added a _hydrated flag + lock
so DB hydration runs at most once per process lifetime; put_config marks
it done too. Test fixture resets the flag between tests.
ServiceNotFoundError (→ 404) and ServiceConflictError (→ 409) replace the
"not found" / "already on" / "not on" substring checks in _map_mutation_error;
base ServiceMutationError still maps to 422. Fixes three pre-existing test
status-code assertions (201 vs 200 on POST endpoints).
Pure tarball construction (_build_tarball, _render_*, _iter_included,
_SYSTEMD_UNITS) moved to decnet/swarm/bundle_builder.py — no FastAPI
dependency, independently testable. EnrollBundleRequest/Response moved
to decnet/web/db/models/swarm.py alongside the other swarm DTOs.
Router drops from 504 to 260 lines; keeps only the in-memory token
registry, sweeper, and endpoints.
- DeckyServiceAddRequest gains an optional `config: dict` field, validated
against the service's config_schema before any state mutation (400 on
bad type, no half-written rows).
- Engine: add_service threads `config` into _add_topology_service /
_add_fleet_service, persisting validated cfg to decky_config.service_config
BEFORE compose regen so the first `up -d --build` materialises the env on
the new container. No follow-up apply needed.
- Frontend: shared AddServiceConfigModal — same wizard accordion shape, used by:
* DeckyCard's ADD SERVICE picker (Fleet & MazeNET inspectors via shared component)
* MazeNET Inspector's ADD SERVICE picker
* MazeNET palette drag-drop onto a deployed decky
Empty-schema services short-circuit to a one-click add (no modal flash).
Operator can cancel; errors surface in the modal.
- Tests: add_service config plumbing — persist, drop unknown keys, 400-equivalent
on bad types, back-compat empty-config.
- Drive-by: fix stale repo-method names in test_services_live.py
(create_topology_decky → add_topology_decky, get_topology_decky → list+pick helper,
service.added → service_added topic).
- GET /topologies/services/{name}/schema serves the declared ServiceConfigField
metadata so the Inspector can auto-render forms.
- PUT /(topologies/{id}/)deckies/{decky}/services/{svc}/config persists the
validated dict (DB + compose); container untouched (Save).
- POST /(topologies/{id}/)deckies/{decky}/services/{svc}/apply persists then
force-recreates <decky>-<svc> so the new env takes effect (Apply, destructive).
- New engine helper update_service_config wires both fleet and topology paths
through the existing _persist_fleet_change / _rerender_topology_compose
machinery; emits decky.<name>.service_config_changed on the bus.
Bus topic segments are NATS-style tokens and the validator at
bus/topics.py:402 rejects '.', '*', '>', whitespace. My W3 constants
'service.added' / 'service.removed' tripped this on every live
add/remove call:
ValueError: topic segment 'service.added' may not contain '.', ...
Renamed both to underscore form: DECKY_SERVICE_ADDED = 'service_added'.
Aligned the SSE forwarder's name mapping (decky.<name>.service_added →
SSE event 'decky.service_added') and the frontend's
useTopologyStream listener + MazeNET.tsx event handler. Also updated
the wiki entry with a note about the underscore.
The /topologies/{id}/events SSE proxy now subscribes to two bus
patterns concurrently and merges them through a bounded asyncio.Queue:
* topology.{id}.> — lifecycle (status, mutation.*) — unchanged.
* decky.> — per-decky events, filtered by payload.topology_id
so a fleet decky sharing a name with a topology
decky doesn't leak across.
_sse_name_for routes 'decky.<name>.service.added' to the SSE event
name 'decky.service.added' (kept the prefix so the frontend doesn't
collide with topology lifecycle events that share leaf names like
'status').
useTopologyStream surfaces the two new event names; MazeNET.tsx's
onStreamEvent optimistically patches the matching node's services
list so a second tab reflects shape changes without a refetch.
Adds a fleet_singletons array to ServiceCatalogResponse so per-decky
add UIs can filter out services like LLMNR that run once fleet-wide
(and would 422 server-side at the live add endpoint).
The existing 'services: list[str]' field is unchanged for back-compat
with MazeNET/useMazeApi.ts:257; the new field is additive.
decnet_web/src/hooks/useServiceRegistry.ts wraps the endpoint with a
module-scoped cache (registry only changes on BYOS install / plugin
drop, neither of which happens mid-session) and exposes a precomputed
.perDecky list so consumers don't need to re-derive the diff.
decnet.engine.services_live exposes add_service / remove_service for
both fleet and topology decky scopes. The host's _compose() wrapper
already supported per-service targeting (up --no-deps -d <svc>,
stop, rm -f); what was missing was the orchestration around it:
* add: validate against decnet.services.registry (rejects unknown +
fleet_singleton); persist the new services list; re-render the
per-scope compose file (so future redeploys reflect the change);
run docker compose up -d --no-deps --build <decky>-<svc>.
* remove: stop + rm -f the service container; persist; re-render
compose so a future up -d doesn't bring it back.
Both publish decky.<name>.service.added / .removed on the bus, with
the post-mutation services list. Topic constants added to
decnet.bus.topics; the matching wiki entry in wiki-checkout/Service-Bus.md
ships in a separate commit on the wiki repo (wiki-checkout/ is gitignored).
Four new admin endpoints:
* POST/DELETE /api/v1/deckies/{name}/services{,/svc}
* POST/DELETE /api/v1/topologies/{id}/deckies/{name}/services{,/svc}
ServiceMutationError messages are mapped at the API boundary to 404
(decky/topology missing), 409 (idempotency violation), 422 (unknown
or fleet_singleton service).
Extracts the docker-exec-with-base64-stdin pattern out of canary/planter
and orchestrator/drivers/ssh into a shared decnet.decky_io package.
Both consumers now delegate; the canary planter test still proves the
contract end-to-end.
Adds POST/DELETE /api/v1/deckies/files for arbitrary file drops.
Container resolution is shared with the canary path: topology_id absent
means fleet (<name>-ssh), present routes through resolve_decky_container
which picks <name>-ssh when the topology decky exposes ssh, else the
topology base container decnet_t_<id8>_<name>.
Path validation rejects relative paths and '..' traversal at the request
model layer. Bad base64 → 400; unknown topology → 404; decky not in
topology → 422; docker exec failure → 409.
POST /api/v1/canary/tokens grows an optional topology_id field. When
present, the server hydrates the topology, validates the named decky is
in it, and resolves the docker container via
planter.resolve_topology_container — <name>-ssh if the decky exposes ssh,
else the topology base container. Absent ⇒ fleet semantics, unchanged.
The token row gets a nullable topology_id column (no migration helper
per pre-v1 policy). GET /api/v1/canary/tokens accepts ?topology_id= as
a filter. DELETE re-resolves the container at revoke time so a
redeployed topology is still reachable.
422 when the named decky isn't in the topology; 404 when the topology
itself doesn't exist.
Services now print RFC 5424 to stdout; Docker captures via json-file driver.
A new host-side collector (decnet.web.collector) streams docker logs from all
running decky service containers and writes RFC 5424 + parsed JSON to the host
log file. The existing ingester continues to tail the .json file unchanged.
rsyslog can consume the .log file independently — no DECNET involvement needed.
Removes: bind-mount volume injection, _LOG_NETWORK bridge, log_target config
field and --log-target CLI flag, TCP syslog forwarding from service templates.