Replaces the single persistent open() with inode-based reopen logic.
If decnet.log or decnet.json is deleted or renamed by logrotate, the
next write detects the stale inode, closes the old handle, and creates
a fresh file — preventing silent data loss to orphaned inodes.
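The reopen check described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual writer: the class name `ReopeningWriter` and its methods are hypothetical, and it assumes POSIX semantics where a renamed file keeps its inode.

```python
import os


class ReopeningWriter:
    """Append-only writer that reopens its file when the inode changes."""

    def __init__(self, path):
        self.path = path
        self._fh = None
        self._inode = None

    def _ensure_open(self):
        try:
            current = os.stat(self.path).st_ino
        except FileNotFoundError:
            current = None  # file was deleted or renamed away
        if self._fh is None or current != self._inode:
            if self._fh is not None:
                self._fh.close()  # drop the handle to the orphaned inode
            self._fh = open(self.path, "a", encoding="utf-8")
            self._inode = os.fstat(self._fh.fileno()).st_ino

    def write(self, line):
        self._ensure_open()
        self._fh.write(line + "\n")
        self._fh.flush()

    def close(self):
        if self._fh is not None:
            self._fh.close()
            self._fh = None
```

One `os.stat()` per write is cheap next to the write itself, and it means rotation is picked up without any signal from logrotate.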
- Ingester now loads byte-offset from DB on startup (key: ingest_worker_position)
and saves it after each batch — prevents full re-read on every API restart
- On file truncation/rotation the saved offset is reset to 0
- Profiler worker now loads last_log_id from DB on startup — every restart
becomes an incremental update instead of a full cold rebuild
- Updated all affected tests to mock get_state/set_state; added new tests
covering position restore, set_state call, truncation reset, and cursor
restore/cold-start paths
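A rough sketch of the offset handling, assuming truncation is detected by comparing the file size with the saved offset (the commit does not say how detection works). The helper names `resume_offset` and `read_batch` are hypothetical; the real ingester persists the offset via get_state/set_state under the key ingest_worker_position.

```python
import os


def resume_offset(path, saved_offset):
    """Return the byte offset to resume from, or 0 after truncation/rotation."""
    size = os.path.getsize(path)
    # A file shorter than the saved offset cannot be the file we last read.
    return 0 if size < saved_offset else saved_offset


def read_batch(path, saved_offset):
    """Read complete lines past the saved offset; return (lines, new_offset)."""
    offset = resume_offset(path, saved_offset)
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    end = data.rfind(b"\n") + 1  # ignore a trailing partial line
    return data[:end].splitlines(), offset + end
```

The caller would store the returned offset after each batch, so a restart continues from the last complete line.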
Cold start fetched all logs in one bulk query, then processed them in a tight
synchronous loop with no yields, blocking the asyncio event loop for seconds
on datasets of 30K+ rows. This stalled every concurrent await, including the
SSE stream generator's initial DB calls, so the dashboard showed
INITIALIZING SENSORS indefinitely.
Changes:
- Drop _cold_start() and get_all_logs_raw(); uninitialized state now runs the
same cursor loop as incremental, starting from last_log_id=0
- Yield to the event loop after every _BATCH_SIZE rows (asyncio.sleep(0))
- Add SSE keepalive comment as first yield so the connection flushes before
any DB work begins
- Add Cache-Control/X-Accel-Buffering headers to StreamingResponse
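The two event-loop changes can be illustrated with a small sketch. The `_BATCH_SIZE` value, the function names, and the `data:` framing of events are assumptions made for the example; only the `asyncio.sleep(0)` yield and the leading SSE comment come from the change list above.

```python
import asyncio

_BATCH_SIZE = 500  # assumed value; the real constant is not given here


async def process_rows(rows, handle):
    """Process rows synchronously but yield to the loop every _BATCH_SIZE rows."""
    for count, row in enumerate(rows, start=1):
        handle(row)
        if count % _BATCH_SIZE == 0:
            await asyncio.sleep(0)  # let other tasks (e.g. SSE) run


async def sse_stream(fetch_events):
    """Emit an SSE comment first so the connection flushes before DB work."""
    yield ": keepalive\n\n"
    async for event in fetch_events():
        yield f"data: {event}\n\n"
```

A line starting with a colon is a comment in the SSE wire format, so clients ignore it while proxies still see bytes on the connection.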
Existing MySQL databases hit a DataError when the commands/fingerprints
JSON blobs exceed 64 KiB (TEXT limit). _BIG_TEXT emits MEDIUMTEXT only
at CREATE TABLE time; create_all() is a no-op on existing columns.
Add MySQLRepository._migrate_column_types() that queries
information_schema and issues ALTER TABLE … MODIFY COLUMN … MEDIUMTEXT
for the five affected columns (commands, fingerprints, services, deckies,
state.value) whenever they are still TEXT. Called from an overridden
initialize() after _migrate_attackers_table() and before create_all().
Add tests/test_mysql_migration.py covering: ALTER issued for TEXT columns,
no-op for already-MEDIUMTEXT, idempotency, DEFAULT clause correctness,
and initialize() call order.
- test_mysql_backend_live.py: live integration tests for MySQL connections
- test_mysql_histogram_sql.py: dialect-specific histogram query tests
- test_mysql_url_builder.py: MySQL connection string construction
- mysql_spinup.sh: Docker spinup script for local MySQL testing
Connection-lifecycle events (connect, disconnect, accept, close) fire once
per TCP connection. During a portscan or credential-stuffing run this
firehoses the SQLite ingester with tiny WAL writes and starves all reads
until the queue drains.
The collector now deduplicates these events by
(attacker_ip, decky, service, event_type) over a 1-second window before
writing to the .json ingestion stream. The raw .log file is untouched, so
rsyslog/SIEM still see every event for forensic fidelity.
Tunable via DECNET_COLLECTOR_RL_WINDOW_SEC and DECNET_COLLECTOR_RL_EVENT_TYPES.
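A minimal sketch of windowed deduplication on that key. The class `WindowDeduper` and the event dict layout are hypothetical. The sketch takes the window and event types as constructor arguments instead of reading the two environment variables, since their format is not specified here.

```python
import time


class WindowDeduper:
    """Suppress repeats of the same lifecycle event inside a time window."""

    def __init__(self, window_sec=1.0,
                 event_types=("connect", "disconnect", "accept", "close"),
                 clock=time.monotonic):
        self.window_sec = window_sec
        self.event_types = frozenset(event_types)
        self.clock = clock
        self._last_emit = {}  # key -> time of last emitted event

    def should_emit(self, event):
        if event["event_type"] not in self.event_types:
            return True  # only lifecycle events are rate-limited
        key = (event["attacker_ip"], event["decky"],
               event["service"], event["event_type"])
        now = self.clock()
        last = self._last_emit.get(key)
        if last is not None and now - last < self.window_sec:
            return False
        self._last_emit[key] = now
        return True
```

A long-running collector would also need to evict old keys from `_last_emit`; that is left out to keep the sketch short.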
The live test modules set DECNET_CONTRACT_TEST=true at module level,
which persisted across xdist workers and caused the mutate endpoint
to short-circuit before the mock was reached. Clear the env var in
affected tests with monkeypatch.delenv.
21 live tests covering all background workers against real resources:
collector (real Docker daemon), ingester (real filesystem + DB),
attacker worker (real DB profiles), sniffer (real network interfaces),
API lifespan (real health endpoint), and cross-service cascade isolation.
9 tests covering auth enforcement, component reporting, status
transitions, degraded mode, and real DB/Docker state validation.
Runs with -m live alongside other live service tests.
23 tests verifying that each background worker degrades gracefully
when its dependencies are unavailable, and that failures don't cascade:
- Collector: Docker unavailable, no state file, empty fleet
- Ingester: missing log file, unset env var, malformed JSON, fatal DB error
- Attacker: DB errors, empty database
- Sniffer: missing interface, no state, scapy crash, non-decky traffic
- API lifespan: all workers failing, DB init failure, sniffer import failure
- Cascade: collector→ingester, ingester→attacker, sniffer→collector, DB→sniffer
Replace per-decky sniffer containers with a single host-side sniffer
that monitors all traffic on the MACVLAN interface. Runs as a background
task in the FastAPI lifespan alongside the collector, fully fault-isolated
so failures never crash the API.
- Add fleet_singleton flag to BaseService; sniffer marked as singleton
- Composer skips fleet_singleton services in compose generation
- Fleet builder excludes singletons from random service assignment
- Extract TLS fingerprinting engine from templates/sniffer/server.py
into decnet/sniffer/ package (parameterized for fleet-wide use)
- Sniffer worker maps packets to deckies via IP→name state mapping
- Original templates/sniffer/server.py preserved for future use
Extends the prober with two new active probe types alongside JARM:
- HASSHServer: SSH server fingerprinting via KEX_INIT algorithm ordering
(MD5 hash of kex;enc_s2c;mac_s2c;comp_s2c, pure stdlib)
- TCP/IP stack: OS/tool fingerprinting via SYN-ACK analysis using scapy
(TTL, window size, DF bit, MSS, TCP options ordering, SHA256 hash)
Worker probe cycle now runs three phases per IP with independent
per-type port tracking. Ingester extracts bounties for all three
fingerprint types.
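The HASSHServer hash itself is simple to sketch once the KEX_INIT name-lists have been parsed. The function name `hassh_server` is hypothetical, and the sketch assumes each list is joined with commas in the order the server sent it, as in the published HASSH definition.

```python
import hashlib


def hassh_server(kex, enc_s2c, mac_s2c, comp_s2c):
    """MD5 over 'kex;enc_s2c;mac_s2c;comp_s2c', preserving server ordering."""
    raw = ";".join(",".join(algs) for algs in (kex, enc_s2c, mac_s2c, comp_s2c))
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Because ordering is preserved, two servers offering the same algorithms in a different preference order produce different fingerprints.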
Reverts commits 8c249f6, a6c7cfd, 7ff5703. The SSH log relay approach
requires container redeployment and doesn't retroactively fix existing
attacker profiles. Rolling back to reassess the approach.
New log_relay.py replaces raw 'cat' on the rsyslog pipe. Intercepts
sshd and bash lines and re-emits them as structured RFC 5424 events:
login_success, session_opened, disconnect, connection_closed, command.
Parsers updated to accept non-nil PROCID (sshd uses PID).
The SSH honeypot logs commands via PROMPT_COMMAND logger as:
    <14>1 ... bash - - - CMD uid=0 pwd=/root cmd=ls
These lines had service=bash and event_type=-, so the attacker worker
never recognized them as commands. Both the collector and correlation
parsers now detect the CMD pattern and normalize to service=ssh,
event_type=command, with uid/pwd/command in fields.
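The normalization step might look like the sketch below. The regex and the function name `normalize_cmd` are illustrative assumptions, not the parsers' actual code; in particular it assumes `pwd` contains no spaces.

```python
import re

# Matches the MSG part emitted by the PROMPT_COMMAND logger.
_CMD_RE = re.compile(r"\bCMD uid=(?P<uid>\d+) pwd=(?P<pwd>\S*) cmd=(?P<command>.*)$")


def normalize_cmd(msg):
    """Return a normalized ssh/command event, or None if msg is not a CMD line."""
    match = _CMD_RE.search(msg)
    if match is None:
        return None
    return {
        "service": "ssh",
        "event_type": "command",
        "fields": match.groupdict(),  # uid, pwd, command
    }
```

Everything after `cmd=` is kept verbatim, so commands with arguments or embedded `=` signs survive intact.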
New GET /attackers/{uuid}/commands?limit=&offset=&service= endpoint
serves commands with server-side pagination and optional service filter.
AttackerDetail frontend fetches commands from this endpoint with
page controls. Service badge filter now drives both the API query
and the local fingerprint filter.
API now accepts ?service=https to filter attackers by targeted service.
Service badges are clickable in both the attacker list and detail views,
navigating to a filtered view. Active filter shows as a dismissable tag.
TLS-wrapped variant of the HTTP honeypot. Auto-generates a self-signed
certificate on startup if none is provided. Supports all the same persona
options (fake_app, server_header, custom_body, etc.) plus TLS_CERT,
TLS_KEY, and TLS_CN configuration.
EHLO/HELO require a domain or address-literal argument (RFC 5321).
Previously the server accepted a bare EHLO with no argument and responded
250, which deviates from the spec and makes the honeypot easier to
fingerprint.
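A small sketch of the argument check. The function name `helo_reply`, the default hostname, and the reply texts are assumptions; 501 is the standard SMTP code for a syntax error in parameters, but the honeypot's exact wording is not given here.

```python
def helo_reply(line, hostname="mx.example.test"):
    """Reject EHLO/HELO without an argument; otherwise greet with 250."""
    verb, _, arg = line.strip().partition(" ")
    verb = verb.upper()
    if verb not in ("EHLO", "HELO"):
        raise ValueError("not a greeting command")
    if not arg.strip():
        return f"501 Syntax: {verb} hostname"
    return f"250 {hostname}"
```

The sketch only checks that an argument is present; it does not validate that it is a well-formed domain or address literal.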
The collector kept streaming stale container IDs after a redeploy,
causing new service logs to never reach decnet.log. Now _kill_api()
also matches and SIGTERMs any running decnet.cli collect process.
Two bugs fixed:
- data_received only split on CRLF, so clients sending bare LF (telnet, nc,
some libraries) got no responses at all. Now splits on LF and strips
trailing CR, matching real Postfix behavior.
- AUTH PLAIN without inline credentials set state to "await_plain" but no
handler existed for that state, causing the next line to be dispatched as
a normal command. Added the missing state handler.
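The line-splitting fix can be sketched as a small buffer that splits on LF and strips a trailing CR. `LineBuffer` is a hypothetical helper for illustration, not the server's actual data_received implementation.

```python
class LineBuffer:
    """Accumulate bytes and return complete lines, accepting CRLF or bare LF."""

    def __init__(self):
        self._buf = b""

    def feed(self, data):
        self._buf += data
        # Everything before the last LF is complete; the tail stays buffered.
        *lines, self._buf = self._buf.split(b"\n")
        return [line.rstrip(b"\r") for line in lines]
```

Keeping the unterminated tail in the buffer matters because TCP can split a command across several data_received calls.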
Migrate Attacker model from IP-based to UUID-based primary key with
auto-migration for old schema. Add GET /attackers (paginated, search,
sort) and GET /attackers/{uuid} API routes. Rewrite Attackers.tsx as
a card grid with full threat info and create AttackerDetail.tsx as a
dedicated detail page with back navigation, stats, commands table,
and fingerprints.
- Modify Rfc5424Formatter to read decnet_component from LogRecord
and use it as RFC 5424 APP-NAME field (falls back to 'decnet')
- Add get_logger(component) factory in decnet/logging/__init__.py
with _ComponentFilter that injects decnet_component on each record
- Wire all five layers to their component tag:
cli -> 'cli', engine -> 'engine', api -> 'api' (api.py, ingester,
routers), mutator -> 'mutator', collector -> 'collector'
- Add structured INFO/DEBUG/WARNING/ERROR log calls throughout each
layer per the defined vocabulary; DEBUG calls are suppressed unless
DECNET_DEVELOPER=true
- Add tests/test_logging.py covering factory, filter, formatter
component-awareness, fallback behaviour, and level gating
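A condensed sketch of the factory and filter pattern. The names `get_logger`, `_ComponentFilter`, and `decnet_component` follow the description above. The `decnet.<component>` logger name and the `app_name` helper are assumptions added for the example.

```python
import logging


class _ComponentFilter(logging.Filter):
    """Stamp every record passing through the logger with its component."""

    def __init__(self, component):
        super().__init__()
        self.component = component

    def filter(self, record):
        record.decnet_component = self.component
        return True  # never drops records, only annotates them


def get_logger(component):
    logger = logging.getLogger(f"decnet.{component}")  # assumed naming scheme
    if not any(isinstance(f, _ComponentFilter) for f in logger.filters):
        logger.addFilter(_ComponentFilter(component))
    return logger


def app_name(record):
    """APP-NAME as a formatter might resolve it, falling back to 'decnet'."""
    return getattr(record, "decnet_component", "decnet")
```

The `isinstance` guard keeps repeated `get_logger("api")` calls from stacking duplicate filters on the same logger.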
- Fixed CLI tests by patching local imports at source (psutil, os, Path).
- Fixed Collector tests by globalizing docker.from_env mock.
- Stabilized SSE stream tests via AsyncMock and immediate generator termination to prevent hangs.
- Achieved >80% coverage on CLI (84%), Collector (97%), and DB Repository (100%).
- Implemented SMTP Relay service tests (100%).
Spins up each service's server.py in a real subprocess via a free ephemeral
port (PORT env var), connects with real protocol clients, and asserts both
correct protocol behavior and RFC 5424 log output.
- 44 live tests across 10 services: http, ftp, smtp, redis, mqtt,
mysql, postgres, mongodb, pop3, imap
- Shared conftest.py: _ServiceProcess (bg reader thread + queue),
free_port, live_service fixture, assert_rfc5424 helper
- PORT env var added to all 10 targeted server.py templates
- New pytest marker `live`; excluded from default addopts run
- requirements-live-tests.txt: flask, twisted + protocol clients
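The ephemeral-port trick behind a helper like free_port is commonly written as below. This is a sketch of the general technique, not the fixture from conftest.py.

```python
import socket


def free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

There is a small race between closing this socket and the service binding the port, which is usually acceptable in a test suite.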
MongoDB had the same infinite-loop bug as MSSQL (msg_len=0 → buffer never
shrinks in while loop). Postgres, MySQL, and MQTT had related length-field
issues (stuck state, resource exhaustion, overlong remaining-length).
Also fixes an existing MongoDB _op_reply struct.pack format bug (extra 'q'
specifier caused struct.error on any OP_QUERY response).
Adds 53 regression + protocol boundary tests across MSSQL, MongoDB,
Postgres, MySQL, and MQTT, including a _run_with_timeout threading harness
to catch infinite loops and @pytest.mark.fuzz hypothesis tests for each.
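The zero-length bug class can be guarded against with a framing loop like this sketch. It assumes MongoDB-style framing (a little-endian int32 total length that includes the 16-byte header); the size cap and all names are illustrative, not the templates' actual values.

```python
import struct

_HEADER_LEN = 16            # MongoDB standard message header size
_MAX_MSG_LEN = 48_000_000   # assumed upper bound for this sketch


class ProtocolError(Exception):
    pass


def split_messages(buf):
    """Split buf into complete messages; return (messages, remainder)."""
    messages = []
    while len(buf) >= 4:
        (msg_len,) = struct.unpack_from("<i", buf, 0)
        # A length below the header size (including 0 or negative values)
        # would never shrink the buffer, so reject it instead of looping.
        if msg_len < _HEADER_LEN or msg_len > _MAX_MSG_LEN:
            raise ProtocolError(f"invalid message length {msg_len}")
        if len(buf) < msg_len:
            break  # wait for more data
        messages.append(buf[:msg_len])
        buf = buf[msg_len:]
    return messages, buf
```

Every iteration either consumes at least 16 bytes, breaks, or raises, so the loop cannot spin on a fixed buffer.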
Cowrie was exposing an SSH daemon on port 22 alongside the telnet service
even when COWRIE_SSH_ENABLED=false, contaminating deployments that did not
request an SSH service.
New implementation mirrors the SSH service pattern:
- busybox telnetd in foreground mode on port 23
- /bin/login for real PAM authentication (brute-force attempts logged)
- rsyslog RFC 5424 bridge piped to stdout for Docker log capture
- Configurable root password and hostname via env vars
- No Cowrie dependency
real_ssh was a separate service name pointing to the same template and
behaviour as ssh. Merged them: ssh is now the single real-OpenSSH service.
- Rename templates/real_ssh/ → templates/ssh/
- Remove decnet/services/real_ssh.py
- Deaddeck archetype updated: services=["ssh"]
- Merge test_real_ssh.py into test_ssh.py (includes deaddeck + logging tests)
- Drop decnet.services.real_ssh from test_build module list
Collector and mutator watcher subprocesses were spawned without
start_new_session=True, leaving them in the parent's process group.
SIGHUP (sent when the controlling terminal closes) killed both
processes silently — stdout/stderr were DEVNULL so the crash was
invisible.
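Spawning a watcher in its own session looks roughly like the sketch below (POSIX only). The helper name `spawn_detached` and the optional `log_path` parameter are hypothetical; sending output to a file instead of DEVNULL is a suggestion for making crashes visible, not something this change is stated to do.

```python
import os
import subprocess


def spawn_detached(argv, log_path=os.devnull):
    """Start argv in a new session so terminal SIGHUP does not reach it."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=log,
            start_new_session=True,  # child calls setsid() before exec
        )
    finally:
        log.close()  # the child holds its own copy of the descriptor
```

With `start_new_session=True` the child becomes a session leader, so it no longer shares the parent's process group or controlling terminal.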
Also update test_services and test_composer to reflect the ssh plugin
no longer using Cowrie env vars (replaced with SSH_ROOT_PASSWORD /
SSH_HOSTNAME matching the real_ssh plugin).
Scraps the Cowrie emulation layer. The real_ssh template now runs a
genuine sshd backed by a three-layer logging stack forwarded to stdout
as RFC 5424 for the DECNET collector:
  auth,authpriv.*  → rsyslogd → named pipe → stdout  (logins/failures)
  user.*           → rsyslogd → named pipe → stdout  (PROMPT_COMMAND cmds)
  sudo syslog=auth → rsyslogd → named pipe → stdout  (privilege escalation)
  sudo logfile     → /var/log/sudo.log               (local backup with I/O)
The ssh.py service plugin now points to templates/real_ssh and drops all
COWRIE_* / NODE_NAME env vars, sharing the same compose fragment shape as
real_ssh.py.
_load_service_container_names() reads decnet-state.json and builds the
exact set of expected container names ({decky}-{service}). is_service_container()
and is_service_event() do a direct set lookup — no regex, no label
inspection, no heuristics.
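A sketch of the set-lookup approach. The layout of decnet-state.json is not shown in this log, so the `deckies`, `name`, and `services` keys below are a hypothetical schema, and the function signatures are illustrative.

```python
import json


def load_service_container_names(state_path):
    """Build the exact set of expected '{decky}-{service}' container names."""
    with open(state_path, encoding="utf-8") as fh:
        state = json.load(fh)
    return {
        f"{decky['name']}-{service}"
        for decky in state.get("deckies", [])      # hypothetical schema
        for service in decky.get("services", [])
    }


def is_service_container(name, expected):
    # Docker inspect output prefixes names with '/', so normalize first.
    return name.lstrip("/") in expected
```

Because the expected names come straight from the deployment state, a base container such as `omega-decky` is never mistaken for a service container.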
Two bugs caused the log file to never be written:
1. is_service_container() used regex '^decky-\d+-\w' which only matched
   the old decky-01-smtp naming style. Actual containers are named
   omega-decky-smtp, relay-decky-smtp, etc. Fixed by using Docker Compose
   labels instead: com.docker.compose.project=decnet + non-empty
   depends_on discriminates service containers from base (sleep infinity)
   containers reliably regardless of decky naming convention.
   Added is_service_event() for the Docker events path.
2. The collector was only started when --api was used. Added a 'collect'
   CLI subcommand (decnet collect --log-file <path>) and wired it into
   deploy as an auto-started background process when --api is not in use.
   Default log path: /var/log/decnet/decnet.log