Commit Graph

254 Commits

Author SHA1 Message Date
c384a3103a refactor: separate engine, collector, mutator, and fleet into independent subpackages
- decnet/engine/ — container lifecycle (deploy, teardown, status); _kill_api removed
- decnet/collector/ — Docker log streaming (moved from web/collector.py)
- decnet/mutator/ — mutation engine (no longer imports from cli or duplicates deployer code)
- decnet/fleet.py — shared decky-building logic extracted from cli.py

Cross-contamination eliminated:
- web router no longer imports from decnet.cli
- mutator no longer imports from decnet.cli
- cli no longer imports from decnet.web
- _kill_api() moved to cli (process management, not engine concern)
- _compose_with_retry duplicate removed from mutator
2026-04-12 00:26:22 -04:00
c79f96f321 refactor(ssh): consolidate real_ssh into ssh, remove duplication
real_ssh was a separate service name pointing to the same template and
behaviour as ssh. Merged them: ssh is now the single real-OpenSSH service.

- Rename templates/real_ssh/ → templates/ssh/
- Remove decnet/services/real_ssh.py
- Deaddeck archetype updated: services=["ssh"]
- Merge test_real_ssh.py into test_ssh.py (includes deaddeck + logging tests)
- Drop decnet.services.real_ssh from test_build module list
2026-04-11 19:51:41 -04:00
d77def64c4 fix(cli): import Path locally in deploy to fix NameError 2026-04-11 19:46:58 -04:00
ce182652ad fix(cli): add __main__ guard so python -m decnet.cli actually runs the app
The collector subprocess was spawned via 'python3 -m decnet.cli collect'
but cli.py had no 'if __name__ == __main__: app()' guard. Python executed
the module, defined all functions, then exited cleanly with code 0 without
ever calling the collect command. No output, no log file, exit 0 — silent
non-start every time.

Also route collector stderr to <log_file>.collector.log so future crashes
are visible instead of disappearing into DEVNULL.
2026-04-11 19:42:10 -04:00
a6063efbb9 fix(collector): daemonize background subprocesses with start_new_session
Collector and mutator watcher subprocesses were spawned without
start_new_session=True, leaving them in the parent's process group.
SIGHUP (sent when the controlling terminal closes) killed both
processes silently — stdout/stderr were DEVNULL so the crash was
invisible.

Also update test_services and test_composer to reflect the ssh plugin
no longer using Cowrie env vars (replaced with SSH_ROOT_PASSWORD /
SSH_HOSTNAME matching the real_ssh plugin).
2026-04-11 19:36:46 -04:00
d4ac53c0c9 feat(ssh): replace Cowrie with real OpenSSH + rsyslog logging pipeline
Scraps the Cowrie emulation layer. The real_ssh template now runs a
genuine sshd backed by a three-layer logging stack forwarded to stdout
as RFC 5424 for the DECNET collector:

  auth,authpriv.*  → rsyslogd → named pipe → stdout  (logins/failures)
  user.*           → rsyslogd → named pipe → stdout  (PROMPT_COMMAND cmds)
  sudo syslog=auth → rsyslogd → named pipe → stdout  (privilege escalation)
  sudo logfile     → /var/log/sudo.log               (local backup with I/O)

The ssh.py service plugin now points to templates/real_ssh and drops all
COWRIE_* / NODE_NAME env vars, sharing the same compose fragment shape as
real_ssh.py.
2026-04-11 19:12:54 -04:00
9ca3b4691d docs(roadmap): tick completed service implementations 2026-04-11 04:02:50 -04:00
babad5ce65 refactor(collector): use state file for container detection, drop label heuristics
_load_service_container_names() reads decnet-state.json and builds the
exact set of expected container names ({decky}-{service}). is_service_container()
and is_service_event() do a direct set lookup — no regex, no label
inspection, no heuristics.
2026-04-11 03:58:52 -04:00
7abae5571a fix(collector): fix container detection and auto-start on deploy
Two bugs caused the log file to never be written:

1. is_service_container() used regex '^decky-\d+-\w' which only matched
   the old decky-01-smtp naming style. Actual containers are named
   omega-decky-smtp, relay-decky-smtp, etc. Fixed by using Docker Compose
   labels instead: com.docker.compose.project=decnet + non-empty
   depends_on discriminates service containers from base (sleep infinity)
   containers reliably regardless of decky naming convention.
   Added is_service_event() for the Docker events path.

2. The collector was only started when --api was used. Added a 'collect'
   CLI subcommand (decnet collect --log-file <path>) and wired it into
   deploy as an auto-started background process when --api is not in use.
   Default log path: /var/log/decnet/decnet.log
2026-04-11 03:56:53 -04:00
377ba0410c feat(deploy): add --parallel flag for concurrent image builds
When --parallel is set:
- DOCKER_BUILDKIT=1 is injected into the subprocess environment to
  ensure BuildKit is active regardless of host daemon config
- docker compose build runs first (all images built concurrently)
- docker compose up -d follows without --build (no redundant checks)

Without --parallel the original up --build path is preserved.
--parallel and --no-cache compose correctly (build --no-cache).
2026-04-11 03:46:52 -04:00
5ef48d60be fix(conpot): add syslog bridge entrypoint for logging pipeline
Conpot is a third-party app with its own Python logger — it never calls
decnet_logging. Added entrypoint.py as a subprocess wrapper that:
- Launches conpot and captures its stdout/stderr
- Classifies each line (startup/request/warning/error/log)
- Extracts source IPs via regex
- Emits RFC 5424 syslog lines to stdout for Docker/collector pickup

Entrypoint is self-contained (no import of shared decnet_logging.py)
because the conpot base image runs Python 3.6, which cannot parse the
dict[str, Any] / str | None type syntax used in the canonical file.
2026-04-11 03:44:41 -04:00
fe46b8fc0b fix(conpot): use honeynet/conpot:latest base, run as conpot user
The BASE_IMAGE build arg was being unconditionally overwritten by
composer.py with the decky's distro build_base (debian:bookworm-slim),
turning the conpot container into a bare Debian image with no conpot
installation — hence the silent restart loop.

Two fixes:
1. composer.py: use args.setdefault() so services that pre-declare
   BASE_IMAGE in their compose_fragment() win over the distro default.
2. conpot.py: pre-declare BASE_IMAGE=honeynet/conpot:latest in build
   args so it always uses the upstream image regardless of decky distro.

Also removed the USER decnet switch from the conpot Dockerfile. The
upstream image already runs as the non-root 'conpot' user; switching to
'decnet' broke pkg_resources because conpot's eggs live under
/home/conpot/.local and are only on sys.path for that user.
2026-04-11 03:32:11 -04:00
c7713c6228 feat(imap,pop3): full IMAP4rev1 + POP3 bait mailbox implementation
IMAP: extended to full IMAP4rev1 — 10 bait emails (AWS keys, DB creds,
tokens, VPN config, root pw etc.), LIST/LSUB/STATUS/FETCH/UID FETCH/
SEARCH/CLOSE/NOOP, proper SELECT untagged responses (EXISTS, UIDNEXT,
FLAGS, PERMANENTFLAGS), CAPABILITY with IDLE/LITERAL+/AUTH=PLAIN.
FETCH correctly handles sequence sets (1:*, 1:3, *), item dispatch
(FLAGS, ENVELOPE, BODY[], RFC822, RFC822.SIZE), and places body literals
last per RFC 3501.

POP3: extended with same 10 bait emails, fixed banner env var key
(POP3_BANNER not IMAP_BANNER), CAPA fully populated (TOP/UIDL/USER/
RESP-CODES/SASL), TOP (headers + N body lines), UIDL (msg-N format),
DELE/RSET with _deleted set tracking, NOOP. _active_messages() helper
excludes DELE'd messages from STAT/LIST/UIDL.

Both: DEBT-026 stub added (_EMAIL_SEED_PATH env var, documented in
DEBT.md for next-session JSON seed file wiring).

Tests: test_imap.py expanded to 27 cases, test_pop3.py to 22 cases —
860 total tests passing.
2026-04-11 03:12:32 -04:00
1196363d0b feat(os_fingerprint): Phase 2 — add icmp_ratelimit + icmp_ratemask sysctls
Windows: both 0 (no ICMP rate limiting — matches real Windows behavior)
Linux: 1000ms / mask 6168 (kernel defaults)
BSD: 250ms / mask 6168 (FreeBSD default is faster than Linux)
Embedded/Cisco: both 0 (most firmware doesn't rate-limit ICMP)

These affect nmap's IE and U1 probe groups which measure ICMP error
response timing to closed UDP ports. Windows responds to all probes
instantly while Linux throttles to ~1/sec.

Tests: 10 new cases (5 per sysctl). Suite: 822 passed.
2026-04-10 16:41:23 -04:00
62a67f3d1d docs(HARDENING): rewrite roadmap based on live scan findings
Phase 1 is complete. Live testing revealed:
- Window size (64240) is already correct — Phase 2 window mangling unnecessary
- TI=Z (IP ID = 0) is the single remaining blocker for Windows spoofing
- ip_no_pmtu_disc does NOT fix TI=Z (tested and confirmed)

Revised phase plan:
- Phase 2: ICMP tuning (icmp_ratelimit + icmp_ratemask sysctls)
- Phase 3: NFQUEUE daemon for IP ID rewriting (fixes TI=Z)
- Phase 4: diminishing returns, not recommended

Added detailed NFQUEUE architecture, TCPOPTSTRIP notes, and
note clarifying P= field in nmap output.
2026-04-10 16:38:27 -04:00
6df2c9ccbf revert(os_fingerprint): undo ip_no_pmtu_disc=1 for windows — was incorrect
ip_no_pmtu_disc controls PMTU discovery for UDP/ICMP paths only.
TI=Z originates from ip_select_ident() in the kernel TCP stack setting
IP ID=0 for DF=1 TCP packets — a namespace-scoped sysctl cannot change this.
The previous commit was based on incorrect root-cause analysis.
2026-04-10 16:29:44 -04:00
b1f6c3b84a fix(os_fingerprint): set ip_no_pmtu_disc=1 for windows to eliminate TI=Z
When ip_no_pmtu_disc=0 the Linux kernel sets DF=1 on TCP packets and uses
IP ID=0 (RFC 6864). nmap's TI=Z fingerprint has no Windows match in its DB,
causing 91% confidence guesses of 'Linux 2.4/2.6 embedded' regardless of
TTL being 128. Setting ip_no_pmtu_disc=1 allows non-zero IP ID generation.

Trade-off: DF bit is not set on outgoing packets (slightly wrong for Windows)
but TI=Z is far more damaging to the spoof than losing DF accuracy.
2026-04-10 16:19:32 -04:00
5fdfe67f2f fix(cowrie): add missing COPY+chmod for entrypoint.sh in Dockerfile
The entrypoint.sh was present in the build context but never COPYed into
the image, causing 'stat /entrypoint.sh: no such file or directory' at
container start. Added COPY+chmod before the USER decnet instruction so
the script is installed as root and is executable by all users.
2026-04-10 16:15:05 -04:00
4fac9570ec chore: add arche-test.ini OS fingerprint smoke-test fleet 2026-04-10 16:11:18 -04:00
5e83c9e48d feat(os_fingerprint): Phase 1 — extend OS sysctls with 6 new fingerprint knobs
Add tcp_timestamps, tcp_window_scaling, tcp_sack, tcp_ecn, ip_no_pmtu_disc,
and tcp_fin_timeout to every OS profile in OS_SYSCTLS.

All 6 are network-namespace-scoped and safe to set per-container without
--privileged. They directly influence nmap's OPS, WIN, ECN, and T2-T6
probe groups, making OS family detection significantly more convincing.

Key changes:
- tcp_timestamps=0 for windows/embedded/cisco (strongest Windows discriminator)
- tcp_ecn=2 for linux (ECN offer), 0 for all others
- tcp_sack=0 / tcp_window_scaling=0 for embedded/cisco
- ip_no_pmtu_disc=1 for embedded/cisco (DF bit ICMP behaviour)
- Expose _REQUIRED_SYSCTLS frozenset for completeness assertions

Tests: 88 new test cases across all OS families and composer integration.
Total suite: 812 passed.
2026-04-10 16:06:36 -04:00
d8457c57f3 docs: add OS fingerprint spoofing hardening roadmap 2026-04-10 16:02:00 -04:00
38d37f862b docs: Detail attachable Swarm overlay backend in FUTURE.md 2026-04-10 03:00:03 -04:00
fa8b0f3cb5 docs: Add latency simulation to FUTURE.md 2026-04-10 02:53:00 -04:00
db425df6f2 docs: Add FUTURE.md to capture long-term architectural visions 2026-04-10 02:48:28 -04:00
73e68388c0 fix(conpot): Refactor permissions to use dedicated decnet user via chown 2026-04-10 02:27:02 -04:00
682322d564 fix(conpot): Resolve silent crash by running as nobody and ensuring permissions 2026-04-10 02:25:45 -04:00
33885a2eec fix(conpot): Keep container as root to allow port 502 binding and fix user not found error 2026-04-10 02:20:46 -04:00
f583b3d699 fix(services): Resolve protocol realism gaps and update technical debt register
- Add dynamic challenge nonces to Postgres, VNC, and SIP.
- Add basic keyspace lookup and mock data to Redis.
- Correct MSSQL TDS pre-login offset bounds.
- Support MongoDB OP_MSG handshake version checking.
- Suppress Werkzeug HTTP server headers and normalize FTPAnonymousShell response.
- Add tracking for Dynamic Bait Store (DEBT-027) via DEBT.md.
2026-04-10 02:16:42 -04:00
5cb6666d7b docs: Append bug ledger implementation plan to REALISM_AUDIT.md 2026-04-10 01:58:23 -04:00
25b6425496 Update REALISM_AUDIT.md with completed tasks 2026-04-10 01:55:14 -04:00
08242a4d84 Implement ICS/SCADA and IMAP Bait features 2026-04-10 01:50:08 -04:00
63fb477e1f feat: add smtp_relay service; add service_testing/ init
- decnet/services/smtp_relay.py: open relay variant of smtp, same template
  with SMTP_OPEN_RELAY=1 baked into the environment
- tests/service_testing/__init__.py: init so pytest discovers the subdirectory
2026-04-10 01:09:15 -04:00
94f82c9089 feat(smtp): fix DATA state machine; add SMTP_OPEN_RELAY mode
- Buffer DATA body until CRLF.CRLF terminator — fixes 502-on-every-body-line bug
- SMTP_OPEN_RELAY=1: AUTH accepted (235), RCPT TO accepted for any domain,
  full DATA pipeline with queued-as message ID
- Default (SMTP_OPEN_RELAY=0): credential harvester — AUTH rejected (535)
  but connection stays open, RCPT TO returns 554 relay denied
- SASL PLAIN and LOGIN multi-step AUTH both decoded and logged
- RSET clears all per-transaction state
- Add development/SMTP_RELAY.md, IMAP_BAIT.md, ICS_SCADA.md, BUG_FIXES.md
  (live-tested service realism plans)
2026-04-10 01:03:47 -04:00
40cd582253 fix: restore forward_syslog as no-op stub; all service server.py files import it 2026-04-10 00:43:50 -04:00
24f02c3466 fix: resolve all bandit SAST findings in templates/
- Add # nosec B104 to all intentional 0.0.0.0 binds in honeypot servers
  (hardcoded_bind_all_interfaces is by design — deckies must accept attacker connections)
- Add # nosec B101 to assert statements used for protocol validation in ldap/snmp
- Add # nosec B105 to fake SASL placeholder in ldap
- Add # nosec B108 to /tmp usage in smb template
- Exclude root-owned auto-generated decnet_logging.py copies from bandit scan
  via pyproject.toml [tool.bandit] config (synced by _sync_logging_helper at deploy)
2026-04-10 00:24:40 -04:00
25ba3fb56a feat: replace bind-mount log pipeline with Docker log streaming
Services now print RFC 5424 to stdout; Docker captures via json-file driver.
A new host-side collector (decnet.web.collector) streams docker logs from all
running decky service containers and writes RFC 5424 + parsed JSON to the host
log file. The existing ingester continues to tail the .json file unchanged.
rsyslog can consume the .log file independently — no DECNET involvement needed.

Removes: bind-mount volume injection, _LOG_NETWORK bridge, log_target config
field and --log-target CLI flag, TCP syslog forwarding from service templates.
2026-04-10 00:14:14 -04:00
8d023147cc fix: chmod 777 log dir on compose generation so container decnet user can write logs 2026-04-09 19:36:53 -04:00
14f7a535db fix: use model_dump(mode='json') to serialize datetime fields; fixes SSE stream silently dying post-ORM migration 2026-04-09 19:29:27 -04:00
cea6279a08 fix: add Last-Event-ID to CORS allow_headers to unblock SSE reconnects 2026-04-09 19:26:24 -04:00
6b8392102e fix: emit stats/histogram snapshot on SSE connect; remove polling api.get('/stats') from Dashboard 2026-04-09 19:23:24 -04:00
d2a569496d fix: add get_stream_user dependency for SSE endpoint; allow query-string token for EventSource 2026-04-09 19:20:38 -04:00
f20e86826d fix: derive default CORS origin from DECNET_WEB_HOST/PORT instead of hardcoded ports 2026-04-09 19:15:45 -04:00
29da2a75b3 fix: add localhost:9090 to CORS defaults; revert broken relative-URL and proxy changes 2026-04-09 19:14:40 -04:00
3362325479 fix: resolve CORS blocking Vite dev server (add 5173 to defaults, add proxy) 2026-04-09 19:10:10 -04:00
34a57d6f09 fix: make setcap resilient — no-op when Python absent or symlink-only 2026-04-09 19:04:52 -04:00
016115a523 fix: clear all addressable technical debt (DEBT-005 through DEBT-025)
Security:
- DEBT-008: remove query-string token auth; header-only Bearer now enforced
- DEBT-013: add regex constraint ^[a-z0-9\-]{1,64}$ on decky_name path param
- DEBT-015: stop leaking raw exception detail to API clients; log server-side
- DEBT-016: validate search (max_length=512) and datetime params with regex

Reliability:
- DEBT-014: wrap SSE event_generator in try/except; yield error frame on failure
- DEBT-017: emit log.warning/error on DB init retry; silent failures now visible

Observability / Docs:
- DEBT-020: add 401/422 response declarations to all route decorators

Infrastructure:
- DEBT-018: add HEALTHCHECK to all 24 template Dockerfiles
- DEBT-019: add USER decnet + setcap cap_net_bind_service to all 24 Dockerfiles
- DEBT-024: bump Redis template version 7.0.12 → 7.2.7

Config:
- DEBT-012: validate DECNET_API_PORT and DECNET_WEB_PORT range (1-65535)

Code quality:
- DEBT-010: delete 22 duplicate decnet_logging.py copies; deployer injects canonical
- DEBT-022: closed as false positive (print only in module docstring)
- DEBT-009: closed as false positive (templates already use structured syslog_line)

Build:
- DEBT-025: generate requirements.lock via pip freeze

Testing:
- DEBT-005/006/007: comprehensive test suite added across tests/api/
- conftest: in-memory SQLite + StaticPool + monkeypatched session_factory
- fuzz mark added; default run excludes fuzz; -n logical parallelism

DEBT.md updated: 23/25 items closed; DEBT-011 (Alembic) and DEBT-023 (digest pinning) remain
2026-04-09 19:02:51 -04:00
0166d0d559 fix: clean up db layer — model_dump, timezone-aware timestamps, unified histogram, async load_state 2026-04-09 18:46:35 -04:00
dbf6d13b95 fix: use :memory: + StaticPool for test DBs, eliminates file:testdb_* garbage 2026-04-09 18:39:36 -04:00
d15c106b44 test: fix async fixture isolation, add fuzz marks, parallelize with xdist
- Rebuild repo.engine and repo.session_factory per-test using unique
  in-memory SQLite URIs — fixes KeyError: 'access_token' caused by
  stale session_factory pointing at production DB
- Add @pytest.mark.fuzz to all Hypothesis and Schemathesis tests;
  default run excludes them (addopts = -m 'not fuzz')
- Add missing fuzz tests to bounty, fleet, histogram, and repository
- Use tmp_path for state file in patch_state_file/mock_state_file to
  eliminate file-path race conditions under xdist parallelism
- Set default addopts: -v -q -x -n logical (26 tests in ~7s)
2026-04-09 18:32:46 -04:00
6fc1a2a3ea test: refactor suite to use AsyncClient, in-memory DBs, and parallel coverage 2026-04-09 16:43:49 -04:00