Land the `decnet bus` worker and `get_bus()` factory. Transport is a host-local UNIX-domain socket (0660, group=decnet); authz is the file mode. Wire framing is a tiny verb-line + 4-byte-BE length + orjson body. NATS-style wildcard topics (`*`, `>`). At-most-once, fire-and-forget — DB stays the source of truth. `FakeBus` / `NullBus` for tests and the disabled path. Cross-host federation is deferred to a future `--bridge-tcp` mode; DEBT-030 is master-only and unblocked.
17 KiB
DECNET — Technical Debt Register
Last updated: 2026-04-21 — All addressable debt cleared. Severity: 🔴 Critical · 🟠 High · 🟡 Medium · 🟢 Low
🔴 Critical
DEBT-001 — Hardcoded JWT fallback secret ✅ RESOLVED
File: decnet/env.py:15
Fixed in commit b6b046c. DECNET_JWT_SECRET is now required; startup raises ValueError if unset or set to a known-bad value.
DEBT-002 — Default admin credentials in code ✅ CLOSED (by design)
DECNET_ADMIN_PASSWORD defaults to "admin" intentionally — the web dashboard enforces a password change on first login (must_change_password=1). Startup enforcement removed as it broke tooling without adding meaningful security.
DEBT-003 — Hardcoded LDAP password placeholder ✅ CLOSED (false positive)
templates/ldap/server.py:73 — "<sasl_or_unknown>" is a log label for SASL auth attempts, not an operational credential. The LDAP template is a honeypot; it has no bind password of its own.
DEBT-004 — Wildcard CORS with no origin restriction ✅ RESOLVED
File: decnet/web/api.py:48-54
Fixed in commit b6b046c. allow_origins now uses DECNET_CORS_ORIGINS (env var, defaults to http://localhost:8080). allow_methods and allow_headers tightened to explicit allowlists.
🟠 High
DEBT-005 — Auth module has zero test coverage ✅ RESOLVED
File: decnet/web/auth.py
Comprehensive test suite added in tests/api/ covering login, password change, token validation, and JWT edge cases.
DEBT-006 — Database layer has zero test coverage ✅ RESOLVED
File: decnet/web/sqlite_repository.py
tests/api/test_repository.py added — covers log insertion, bounty CRUD, histogram queries, stats summary, and fuzz testing of all query paths. In-memory SQLite with StaticPool ensures full isolation.
DEBT-007 — Web API routes mostly untested ✅ RESOLVED
Files: decnet/web/router/ (all sub-modules)
Full coverage added across tests/api/ — fleet, logs, bounty, stream, auth all have dedicated test modules with both functional and fuzz test cases.
DEBT-008 — Auth token accepted via query string ✅ RESOLVED
File: decnet/web/dependencies.py:33-34
Query-string token fallback removed. get_current_user now accepts only Authorization: Bearer <token> header. Tokens no longer appear in access logs or browser history.
DEBT-009 — Inconsistent and unstructured logging across templates ✅ CLOSED (false positive)
All service templates already import from decnet_logging and use syslog_line() for structured output. The print(line, flush=True) present in some templates is the intentional Docker stdout channel for container log forwarding — not unstructured debug output.
DEBT-010 — decnet_logging.py duplicated across all 19 service templates ✅ RESOLVED
decnet_logging.py duplicated across all 19 service templatesFiles: templates/*/decnet_logging.py
All 22 per-directory copies deleted. Canonical source lives at templates/decnet_logging.py. deployer.py now calls _sync_logging_helper() before docker compose up — it copies the canonical file into each active template build context automatically.
🟡 Medium
DEBT-011 — No database migration system
File: decnet/web/db/sqlite/repository.py
Schema is created during startup via SQLModel.metadata.create_all. There is no Alembic or equivalent migration layer. Schema changes across deployments require manual intervention or silently break existing databases.
Status: Architectural. Deferred — requires Alembic integration and migration history bootstrapping.
DEBT-012 — No environment variable validation schema ✅ RESOLVED
File: decnet/env.py
DECNET_API_PORT and DECNET_WEB_PORT now validated via _port() — enforces integer type and 1–65535 range, raises ValueError with a clear message on bad input.
DEBT-013 — Unvalidated input on decky_name route parameter ✅ RESOLVED
decky_name route parameterFile: decnet/web/router/fleet/api_mutate_decky.py:10
decky_name now declared as Path(..., pattern=r"^[a-z0-9\-]{1,64}$") — FastAPI rejects non-matching values with 422 before any downstream processing.
DEBT-014 — Streaming endpoint has no error handling ✅ RESOLVED
File: decnet/web/router/stream/api_stream_events.py
event_generator() now wrapped in try/except. asyncio.CancelledError is handled silently (clean disconnect). All other exceptions log server-side via log.exception() and yield an event: error SSE frame to the client.
DEBT-015 — Broad exception detail leaked to API clients ✅ RESOLVED
File: decnet/web/router/fleet/api_deploy_deckies.py:78
Raw exception message no longer returned to client. Full exception now logged server-side via log.exception(). Client receives generic "Deployment failed. Check server logs for details.".
DEBT-016 — Unvalidated log query parameters ✅ RESOLVED
File: decnet/web/router/logs/api_get_logs.py:12-19
search capped at max_length=512. start_time and end_time validated against ^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}$ regex pattern. FastAPI rejects invalid input with 422.
DEBT-017 — Silent DB lock retry during startup ✅ RESOLVED
File: decnet/web/api.py:20-26
Each retry attempt now emits log.warning("DB init attempt %d/5 failed: %s", attempt, exc). After all retries exhausted, log.error() is emitted so degraded startup is always visible in logs.
DEBT-018 — No Docker HEALTHCHECK in any template ✅ RESOLVED
Files: All 20 templates/*/Dockerfile
All 24 Dockerfiles updated with:
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
CMD kill -0 1 || exit 1
DEBT-019 — Most template containers run as root ✅ RESOLVED
Files: All templates/*/Dockerfile except Cowrie
All 24 Dockerfiles now create a decnet system user, use setcap cap_net_bind_service+eip on the Python binary (allows binding ports < 1024 without root), and drop to USER decnet before ENTRYPOINT.
DEBT-020 — Swagger/OpenAPI disabled in production ✅ RESOLVED
File: decnet/web/api.py:43-45
All route decorators now declare responses={401: {"description": "Not authenticated"}, 422: {"description": "Validation error"}}. OpenAPI schema is complete for all endpoints.
DEBT-021 — sqlite_repository.py is a god module ✅ RESOLVED
sqlite_repository.py is a god module~File: decnet/web/sqlite_repository.py (400 lines)
Fully refactored to decnet/web/db/ modular layout: models.py (SQLModel schema), repository.py (abstract base), sqlite/repository.py (SQLite implementation), sqlite/database.py (engine/session factory). Commit de84cc6.
DEBT-026 — IMAP/POP3 bait emails not configurable via service config
Files: templates/imap/server.py, templates/pop3/server.py, decnet/services/imap.py, decnet/services/pop3.py
Bait emails are hardcoded. A stub env var IMAP_EMAIL_SEED is read but currently ignored. Full implementation requires:
IMAP_EMAIL_SEEDpoints to a JSON file with a list of{from_, to, subject, date, body}dicts.templates/imap/server.pyloads and merges/replaces_BAIT_EMAILSfrom that file at startup.decnet/services/imap.pycompose_fragment()readsservice_cfg["email_seed"]and injectsIMAP_EMAIL_SEED+ a bind-mount for the seed file into the compose fragment.- Same pattern for POP3 (
POP3_EMAIL_SEED).
Status: Stub in place — full wiring deferred to next session.
DEBT-027 — Dynamic Bait Store
Files: templates/redis/server.py, templates/ftp/server.py
The bait store and honeypot files are hardcoded. A dynamic injection framework should be created to populate this payload across different honeypots.
Status: Deferred — out of current scope.
DEBT-028 — Test coverage for api_deploy_deckies.py
File: decnet/web/router/fleet/api_deploy_deckies.py (24% coverage)
The deploy endpoint exercises Docker Compose orchestration via decnet.engine.deploy, which creates MACVLAN/IPvlan networks and runs docker compose up. Meaningful tests require mocking the entire Docker SDK + subprocess layer, coupling tightly to implementation details.
Status: Deferred — test after Docker-in-Docker CI is available.
DEBT-029 — Service-wide pub/sub bus worker (decnet bus) ✅ RESOLVED
Files: decnet/bus/ (worker.py, factory.py, unix_client.py, unix_server.py, protocol.py, fake.py, base.py, topics.py), decnet/cli/bus.py, deploy/decnet-bus.service, tests/bus/ (62 tests green).
CLAUDE.md promises a ServiceBus worker and a get_bus() factory, but neither exists. Today there is no event plumbing between workers: mutator, correlator, profiler, sniffer, and prober cannot publish state transitions to interested consumers. The web SSE endpoint (/stream) polls the DB every ~1s inside its generator loop as a result. Downstream features that need this infrastructure: live topology mutations (DEBT-030), pulsating/live topology visualization, automatic mutations, network traffic simulation, attacker-pool push updates.
MVP scope (host-local):
decnet buslong-running worker, systemd-supervised like every other worker. Runs on every host — master and each swarm agent — independently.- Transport: UNIX-domain socket (default
/run/decnet/bus.sock, fallback~/.decnet/bus.sockin dev). Kernel-authenticated peer delivery; authorization is socket file permissions (0660, group=decnet). No TCP, no mTLS, no external broker. - Wire protocol: tiny hand-rolled framing — 1 ASCII verb line (
PUB <topic>,SUB <pattern>,EVT <topic>,HELLO,BYE) + 4-byte big-endian body length + orjson body. Sharedmatches(pattern, topic)helper implements NATS-style wildcards (*= one token,>= one-or-more trailing tokens). - Factory
get_bus()returns a client withpublish(topic, payload)/subscribe(pattern) -> Subscription(async ctx + async iterator). In-processFakeBusfor unit tests;NullBuswhenDECNET_BUS_ENABLED=false. - Topic hierarchy locked early:
topology.{id}.mutation.{state},topology.{id}.status,decky.{id}.state,decky.{id}.traffic,attacker.observed,system.log,system.bus.health. - Delivery semantics: at-most-once, fire-and-forget. Per-subscriber bounded queue with drop-oldest on overflow. No replay, no persistence, no queue groups, no ordering guarantees. DB remains the source of truth; the bus is the notification layer only.
- First consumer proving end-to-end: SSE route for topology events (DEBT-030).
- Later: migrate
/streamoff its internal poll loop onto the bus for global events.
Cross-host federation is out of MVP scope. Each host runs its own bus — swarm agents and the master do not share a bus substrate. If a use case emerges that requires cross-host pub/sub, it will land as a decnet bus --bridge-tcp mode that proxies the UNIX socket over the existing swarm mTLS infra. DEBT-030 is master-only and therefore unblocked by this deferral.
Status: ✅ Resolved — MVP shipped. Host-local UNIX-socket bus, get_bus() factory, decnet bus worker with heartbeats, systemd unit, 62 unit/integration tests green. DEBT-030 is now unblocked.
DEBT-030 — Live (hot) topology mutations via web UI
Files: decnet/web/router/topology/api_mutations.py (enqueue endpoint already exists), decnet/mutator/engine.py + ops.py (reconciler already applies all 7 ops), web/src/hooks/useMazeApi.ts (missing enqueue methods), web/src/components/MazeNET.tsx (editor treats every topology as pending).
Backend is already there:
TopologyMutationtable (decnet/web/db/models.py:322-358) supportsadd_lan,remove_lan,attach_decky,detach_decky,remove_decky,update_decky,update_lan.POST /topologies/{id}/mutationsenqueues, gated toactive|degraded.- Mutator watch loop (
decnet/mutator/engine.py:136-190) claims atomically, dispatches toops.py, does Docker best-effort, flips topology todegradedon failure.
Gap is entirely in the frontend + event delivery:
useMazeApi.tshas noenqueueMutation()peer todeployTopology(); editor edits onactivetopologies currently no-op / 4xx.- No mutation-status UI (pending / applying / applied / failed badges, audit log).
- No server→client push channel for mutation state transitions — depends on DEBT-029.
Design (agreed):
- Staged buffer (client-side, Zustand, not persisted): every editor action pushes a
TopologyMutationontopendingOps[]. Undo = pop. Reset = clear. - Apply (N changes) button opens a diff modal rendering ops in plain English, then POSTs the batch. Batch carries the
topology.versionobserved when staging began; server returns 409 on drift. - Batch atomicity = honest partial. Server enqueues N rows in order; mutator applies one-by-one. If op 3 fails, 1-2 stay applied, topology flips to
degraded, user decides to fix-forward or enqueue a manual revert. (Docker ops aren't transactional; pretending otherwise causes worse bugs than honesty.) - Visual states compose per existing rule:
pending-mutation,applying,failedlayer on top ofrunning / inactive / selected, never replace them. - Push via SSE over the bus (not polling): new route
GET /api/v1/topologies/{id}/eventssubscribes totopology.{id}.*on the service bus and forwards as SSE. Envelope:{v, type, ts, payload}. Day-one event types:mutation.enqueued|applying|applied|failed,topology.status_changed,topology.version_bumped. Room to grow:decky.state_changed,decky.traffic,attacker.observed. - Separate from
/streamdeliberately: different auth scopes, different fan-out shape (per-topology vs global), different failure isolation. Two routes, one bus.
Status: Deferred — blocked on DEBT-029 (bus worker). Once the bus exists, this is ~1-2 days for route + frontend + tests.
🟢 Low
DEBT-022 — Debug print() in correlation engine ✅ CLOSED (false positive)
print() in correlation enginedecnet/correlation/engine.py:20 — The print() call is inside the module docstring as a usage example, not in executable code. No production code path affected.
DEBT-023 — Unpinned base Docker images
Files: All templates/*/Dockerfile
debian:bookworm-slim and similar tags are used without digest pinning. Image contents can silently change on docker pull, breaking reproducibility and supply-chain integrity.
Status: Deferred — requires docker pull access to resolve current digests for each base image.
DEBT-024 — Stale service version hardcoded in Redis template ✅ RESOLVED
File: templates/redis/server.py:15
REDIS_VERSION updated from "7.0.12" to "7.2.7" (current stable).
DEBT-025 — No lock file for Python dependencies ✅ RESOLVED
Files: Project root
requirements.lock generated via pip freeze. Reproducible installs now available via pip install -r requirements.lock.
Summary
| ID | Severity | Area | Status |
|---|---|---|---|
| ✅ | Security / Auth | resolved b6b046c |
|
| ✅ | Security / Auth | closed (by design) | |
| ✅ | Security / Infra | closed (false positive) | |
| ✅ | Security / API | resolved b6b046c |
|
| ✅ | Testing | resolved | |
| ✅ | Testing | resolved | |
| ✅ | Testing | resolved | |
| ✅ | Security / Auth | resolved | |
| ✅ | Observability | closed (false positive) | |
| ✅ | Code Duplication | resolved | |
| DEBT-011 | 🟡 Medium | DB / Migrations | deferred (Alembic scope) |
| ✅ | Config | resolved | |
| ✅ | Security / Input | resolved | |
| ✅ | Reliability | resolved | |
| ✅ | Security / API | resolved | |
| ✅ | Security / API | resolved | |
| ✅ | Reliability | resolved | |
| ✅ | Infra | resolved | |
| ✅ | Security / Infra | resolved | |
| ✅ | Docs | resolved | |
| ✅ | Architecture | resolved de84cc6 |
|
| ✅ | Code Quality | closed (false positive) | |
| DEBT-023 | 🟢 Low | Infra | deferred (needs docker pull) |
| ✅ | Infra | resolved | |
| ✅ | Build | resolved | |
| DEBT-026 | 🟡 Medium | Features | deferred (out of scope) |
| DEBT-027 | 🟡 Medium | Features | deferred (out of scope) |
| DEBT-028 | 🟡 Medium | Testing | deferred (needs DinD CI) |
| DEBT-029 | 🟡 Medium | Architecture / Bus | ✅ resolved |
| DEBT-030 | 🟡 Medium | Web / Live mutations | deferred (unblocked) |
Remaining open: DEBT-011 (Alembic), DEBT-023 (image pinning), DEBT-026 (modular mailboxes), DEBT-027 (Dynamic bait store), DEBT-028 (deploy endpoint tests), DEBT-029 (service bus worker), DEBT-030 (live topology mutations) Estimated remaining effort: ~12 hours + ~3 days for DEBT-029/030