The /topologies/{id}/events SSE proxy now subscribes to two bus
patterns concurrently and merges them through a bounded asyncio.Queue:
* topology.{id}.> — lifecycle (status, mutation.*) — unchanged.
* decky.> — per-decky events, filtered by payload.topology_id
so a fleet decky sharing a name with a topology
decky doesn't leak across.
_sse_name_for routes 'decky.<name>.service.added' to the SSE event
name 'decky.service.added' (kept the prefix so the frontend doesn't
collide with topology lifecycle events that share leaf names like
'status').
useTopologyStream surfaces the two new event names; MazeNET.tsx's
onStreamEvent optimistically patches the matching node's services
list so a second tab reflects shape changes without a refetch.
Adds a fleet_singletons array to ServiceCatalogResponse so per-decky
add UIs can filter out services like LLMNR that run once fleet-wide
(and would 422 server-side at the live add endpoint).
The existing 'services: list[str]' field is unchanged for back-compat
with MazeNET/useMazeApi.ts:257; the new field is additive.
decnet_web/src/hooks/useServiceRegistry.ts wraps the endpoint with a
module-scoped cache (registry only changes on BYOS install / plugin
drop, neither of which happens mid-session) and exposes a precomputed
.perDecky list so consumers don't need to re-derive the diff.
decnet.engine.services_live exposes add_service / remove_service for
both fleet and topology decky scopes. The host's _compose() wrapper
already supported per-service targeting (up --no-deps -d <svc>,
stop, rm -f); what was missing was the orchestration around it:
* add: validate against decnet.services.registry (rejects unknown +
fleet_singleton); persist the new services list; re-render the
per-scope compose file (so future redeploys reflect the change);
run docker compose up -d --no-deps --build <decky>-<svc>.
* remove: stop + rm -f the service container; persist; re-render
compose so a future up -d doesn't bring it back.
Both publish decky.<name>.service.added / .removed on the bus, with
the post-mutation services list. Topic constants added to
decnet.bus.topics; the matching wiki entry in wiki-checkout/Service-Bus.md
ships in a separate commit on the wiki repo (wiki-checkout/ is gitignored).
Four new admin endpoints:
* POST/DELETE /api/v1/deckies/{name}/services{,/svc}
* POST/DELETE /api/v1/topologies/{id}/deckies/{name}/services{,/svc}
ServiceMutationError messages are mapped at the API boundary to 404
(decky/topology missing), 409 (idempotency violation), 422 (unknown
or fleet_singleton service).
Extracts the docker-exec-with-base64-stdin pattern out of canary/planter
and orchestrator/drivers/ssh into a shared decnet.decky_io package.
Both consumers now delegate; the canary planter test still proves the
contract end-to-end.
Adds POST/DELETE /api/v1/deckies/files for arbitrary file drops.
Container resolution is shared with the canary path: topology_id absent
means fleet (<name>-ssh), present routes through resolve_decky_container
which picks <name>-ssh when the topology decky exposes ssh, else the
topology base container decnet_t_<id8>_<name>.
Path validation rejects relative paths and '..' traversal at the request
model layer. Bad base64 → 400; unknown topology → 404; decky not in
topology → 422; docker exec failure → 409.
POST /api/v1/canary/tokens grows an optional topology_id field. When
present, the server hydrates the topology, validates the named decky is
in it, and resolves the docker container via
planter.resolve_topology_container — <name>-ssh if the decky exposes ssh,
else the topology base container. Absent ⇒ fleet semantics, unchanged.
The token row gets a nullable topology_id column (no migration helper
per pre-v1 policy). GET /api/v1/canary/tokens accepts ?topology_id= as
a filter. DELETE re-resolves the container at revoke time so a
redeployed topology is still reachable.
422 when the named decky isn't in the topology; 404 when the topology
itself doesn't exist.
Topology deploys now plant the configured canary baseline set on every
decky in the topology, mirroring the fleet-deploy hook. Containers are
resolved via resolve_topology_container — <decky>-ssh when the decky
exposes an ssh service, else the topology base container
decnet_t_<id8>_<decky>.
The planter's plant/revoke/seed_baseline grow an optional container=
kwarg; default preserves the fleet <name>-ssh resolution.
The Bounty Vault page only read from the Bounty table, but
inotifywait-captured file drops (event_type=file_captured) and SMTP
quarantined messages (event_type=message_stored) were only landing in
the Logs table. AttackerDetail's tabs queried logs directly, so they
showed up per-attacker but were invisible on the global Vault page.
Mirror both events into Bounty as bounty_type=artifact with
payload.kind ∈ {file, mail} so the existing dedup
(bounty_type, attacker_ip, payload) collapses repeats by sha256. Add an
ARTIFACTS segment to the Vault filter row, plus dedicated render
branches: file drops show orig_path + size + writer attribution; mail
shows subject + From + attachment count + size, with the Mail icon
distinguishing them from FileText for file drops.
Forward-only — existing logs stay where they are. A backfill pass would
be straightforward (read Log WHERE event_type IN ('file_captured',
'message_stored') and feed each row through _extract_bounty) but is out
of scope here.
sshd, pam_unix, sudo, CRON, systemd, kernel, rsyslogd, and dbus-daemon
all share the SSH/telnet decky containers and write to the same syslog
socket as DECNET's own emitters. Their output was being parsed and
ingested into the JSON stream, the dashboard, and the profiler — pure
noise: sshd's "Failed password for root from X" duplicates the
auth-helper's structured auth_attempt event, pam_unix repeats it again,
CRON/systemd say nothing about attacker behavior.
Drop these APP-NAMEs in _should_ingest before the JSON write and bus
publish. Raw .log file still captures everything for forensics. The
denylist is overridable with DECNET_COLLECTOR_DROP_APPS so operators
can extend it without code changes.
Add --rfc5424 --msgid command to the logger invocation in SSH and telnet
decky bashrc. MSGID arrives as "command" instead of NIL, which is what
the profiler's _COMMAND_EVENT_TYPES filter expects. The parser heuristic
shipped in d4591b3 stays as a safety net for any future emitter that
forgets the flags or for inflight pre-rebuild containers.
SSH/telnet decky containers emit shell commands via `logger -t bash "CMD …"`
which produces RFC 5424 lines with MSGID=NIL. Both parsers were leaving
event_type="-", so the behavioral profiler's `_COMMAND_EVENT_TYPES` filter
silently dropped them — the IP profile existed but no command transcripts
or artifacts. Confirmed in the wild: 44/48 events from one attacker were
event_type="-".
Rewrite event_type to "command" in both parsers when MSGID=NIL and the
msg starts with "CMD ". Correlation parser also extracts the cmd= payload
into fields["command"] so the profiler can build the transcript; collector
parser leaves fields={} to avoid duplicate pills in the dashboard.
Cowrie was exposing an SSH daemon on port 22 alongside the telnet service
even when COWRIE_SSH_ENABLED=false, contaminating deployments that did not
request an SSH service.
New implementation mirrors the SSH service pattern:
- busybox telnetd in foreground mode on port 23
- /bin/login for real PAM authentication (brute-force attempts logged)
- rsyslog RFC 5424 bridge piped to stdout for Docker log capture
- Configurable root password and hostname via env vars
- No Cowrie dependency
real_ssh was a separate service name pointing to the same template and
behaviour as ssh. Merged them: ssh is now the single real-OpenSSH service.
- Rename templates/real_ssh/ → templates/ssh/
- Remove decnet/services/real_ssh.py
- Deaddeck archetype updated: services=["ssh"]
- Merge test_real_ssh.py into test_ssh.py (includes deaddeck + logging tests)
- Drop decnet.services.real_ssh from test_build module list
The collector subprocess was spawned via 'python3 -m decnet.cli collect'
but cli.py had no 'if __name__ == __main__: app()' guard. Python executed
the module, defined all functions, then exited cleanly with code 0 without
ever calling the collect command. No output, no log file, exit 0 — silent
non-start every time.
Also route collector stderr to <log_file>.collector.log so future crashes
are visible instead of disappearing into DEVNULL.
Collector and mutator watcher subprocesses were spawned without
start_new_session=True, leaving them in the parent's process group.
SIGHUP (sent when the controlling terminal closes) killed both
processes silently — stdout/stderr were DEVNULL so the crash was
invisible.
Also update test_services and test_composer to reflect the ssh plugin
no longer using Cowrie env vars (replaced with SSH_ROOT_PASSWORD /
SSH_HOSTNAME matching the real_ssh plugin).
Scraps the Cowrie emulation layer. The real_ssh template now runs a
genuine sshd backed by a three-layer logging stack forwarded to stdout
as RFC 5424 for the DECNET collector:
auth,authpriv.* → rsyslogd → named pipe → stdout (logins/failures)
user.* → rsyslogd → named pipe → stdout (PROMPT_COMMAND cmds)
sudo syslog=auth → rsyslogd → named pipe → stdout (privilege escalation)
sudo logfile → /var/log/sudo.log (local backup with I/O)
The ssh.py service plugin now points to templates/real_ssh and drops all
COWRIE_* / NODE_NAME env vars, sharing the same compose fragment shape as
real_ssh.py.
_load_service_container_names() reads decnet-state.json and builds the
exact set of expected container names ({decky}-{service}). is_service_container()
and is_service_event() do a direct set lookup — no regex, no label
inspection, no heuristics.
Two bugs caused the log file to never be written:
1. is_service_container() used regex '^decky-\d+-\w' which only matched
the old decky-01-smtp naming style. Actual containers are named
omega-decky-smtp, relay-decky-smtp, etc. Fixed by using Docker Compose
labels instead: com.docker.compose.project=decnet + non-empty
depends_on discriminates service containers from base (sleep infinity)
containers reliably regardless of decky naming convention.
Added is_service_event() for the Docker events path.
2. The collector was only started when --api was used. Added a 'collect'
CLI subcommand (decnet collect --log-file <path>) and wired it into
deploy as an auto-started background process when --api is not in use.
Default log path: /var/log/decnet/decnet.log
When --parallel is set:
- DOCKER_BUILDKIT=1 is injected into the subprocess environment to
ensure BuildKit is active regardless of host daemon config
- docker compose build runs first (all images built concurrently)
- docker compose up -d follows without --build (no redundant checks)
Without --parallel the original up --build path is preserved.
--parallel and --no-cache compose correctly (build --no-cache).
The BASE_IMAGE build arg was being unconditionally overwritten by
composer.py with the decky's distro build_base (debian:bookworm-slim),
turning the conpot container into a bare Debian image with no conpot
installation — hence the silent restart loop.
Two fixes:
1. composer.py: use args.setdefault() so services that pre-declare
BASE_IMAGE in their compose_fragment() win over the distro default.
2. conpot.py: pre-declare BASE_IMAGE=honeynet/conpot:latest in build
args so it always uses the upstream image regardless of decky distro.
Also removed the USER decnet switch from the conpot Dockerfile. The
upstream image already runs as the non-root 'conpot' user; switching to
'decnet' broke pkg_resources because conpot's eggs live under
/home/conpot/.local and are only on sys.path for that user.
Windows: both 0 (no ICMP rate limiting — matches real Windows behavior)
Linux: 1000ms / mask 6168 (kernel defaults)
BSD: 250ms / mask 6168 (FreeBSD default is faster than Linux)
Embedded/Cisco: both 0 (most firmware doesn't rate-limit ICMP)
These affect nmap's IE and U1 probe groups which measure ICMP error
response timing to closed UDP ports. Windows responds to all probes
instantly while Linux throttles to ~1/sec.
Tests: 10 new cases (5 per sysctl). Suite: 822 passed.
ip_no_pmtu_disc controls PMTU discovery for UDP/ICMP paths only.
TI=Z originates from ip_select_ident() in the kernel TCP stack setting
IP ID=0 for DF=1 TCP packets — a namespace-scoped sysctl cannot change this.
The previous commit was based on incorrect root-cause analysis.
When ip_no_pmtu_disc=0 the Linux kernel sets DF=1 on TCP packets and uses
IP ID=0 (RFC 6864). nmap's TI=Z fingerprint has no Windows match in its DB,
causing 91% confidence guesses of 'Linux 2.4/2.6 embedded' regardless of
TTL being 128. Setting ip_no_pmtu_disc=1 allows non-zero IP ID generation.
Trade-off: DF bit is not set on outgoing packets (slightly wrong for Windows)
but TI=Z is far more damaging to the spoof than losing DF accuracy.
Add tcp_timestamps, tcp_window_scaling, tcp_sack, tcp_ecn, ip_no_pmtu_disc,
and tcp_fin_timeout to every OS profile in OS_SYSCTLS.
All 6 are network-namespace-scoped and safe to set per-container without
--privileged. They directly influence nmap's OPS, WIN, ECN, and T2-T6
probe groups, making OS family detection significantly more convincing.
Key changes:
- tcp_timestamps=0 for windows/embedded/cisco (strongest Windows discriminator)
- tcp_ecn=2 for linux (ECN offer), 0 for all others
- tcp_sack=0 / tcp_window_scaling=0 for embedded/cisco
- ip_no_pmtu_disc=1 for embedded/cisco (DF bit ICMP behaviour)
- Expose _REQUIRED_SYSCTLS frozenset for completeness assertions
Tests: 88 new test cases across all OS families and composer integration.
Total suite: 812 passed.
- decnet/services/smtp_relay.py: open relay variant of smtp, same template
with SMTP_OPEN_RELAY=1 baked into the environment
- tests/service_testing/__init__.py: init so pytest discovers the subdirectory
Services now print RFC 5424 to stdout; Docker captures via json-file driver.
A new host-side collector (decnet.web.collector) streams docker logs from all
running decky service containers and writes RFC 5424 + parsed JSON to the host
log file. The existing ingester continues to tail the .json file unchanged.
rsyslog can consume the .log file independently — no DECNET involvement needed.
Removes: bind-mount volume injection, _LOG_NETWORK bridge, log_target config
field and --log-target CLI flag, TCP syslog forwarding from service templates.
- decnet/env.py: DECNET_JWT_SECRET and DECNET_ADMIN_PASSWORD are now
required env vars; startup raises ValueError if unset or set to a
known-bad default ("admin", "password", etc.)
- decnet/env.py: add DECNET_CORS_ORIGINS (comma-separated, defaults to
http://localhost:8080) replacing the previous allow_origins=["*"]
- decnet/web/api.py: use DECNET_CORS_ORIGINS and tighten allow_methods
and allow_headers to explicit lists
- tests/conftest.py: set required env vars at module level so test
collection works without real credentials
- tests/test_web_api.py, test_web_api_fuzz.py: use DECNET_ADMIN_PASSWORD
from env instead of hardcoded "admin"
Closes DEBT-001, DEBT-002, DEBT-004