Commit Graph

27 Commits

Author SHA1 Message Date
42b5d97a50 fix(syslog_bridge): rewrite both templates with from __future__ annotations, fp socket imports, and start_fp_socket_reader 2026-05-10 02:06:53 -04:00
1669f25733 fix(syslog_bridge): add from __future__ import annotations for Python <3.9 compat 2026-05-10 01:58:43 -04:00
255ccebf29 fix(entrypoint): fail-fast if Flask does not bind within timeout instead of silently starting Caddy with no backend 2026-05-10 01:51:09 -04:00
d4f391bab1 fix(caddy): remove explicit tls from listener_wrappers — Caddy applies it by default 2026-05-10 01:45:03 -04:00
38cf1e6c6d fix(caddy+syslog): add UnmarshalCaddyfile to H2FP/FP handlers; add start_fp_socket_reader to syslog_bridge 2026-05-10 01:39:04 -04:00
f2b0d286b3 fix(caddy): correct caddyhttp import path to modules/caddyhttp 2026-05-10 01:22:00 -04:00
3154224f68 fix(docker): hoist ARG BASE_IMAGE before first FROM so it scopes to all stages 2026-05-10 01:05:00 -04:00
92632d7afd feat(pr2): HTTP/2+HTTP/3 fingerprint extractors — JA4H, H2 SETTINGS, JA4-QUIC 2026-05-10 00:47:19 -04:00
0653e500b5 feat(services): HTTP/2 + HTTP/3 support via Caddy reverse-proxy
Swap Werkzeug for Caddy as the protocol layer for http and https decoy
services. Flask keeps owning app logic (fake_app, custom_body, headers,
syslog) on 127.0.0.1:8080; Caddy terminates h1/h2/h2c/h3 on the wire
with real-world TLS/QUIC fingerprints.

- Add `multi_enum` FieldType to ServiceConfigField + _coerce
- Add `http_versions` field to HTTPService (h1/h2c) and HTTPSService
  (h1/h2/h3); selecting h3 emits UDP/443 port mapping in compose
- Rewrite both Dockerfiles with multi-stage Caddy binary copy +
  setcap for port binding as the logrelay user
- Entrypoints parse HTTP_VERSIONS JSON, render a Caddyfile, start
  Flask in background, wait for it, then exec Caddy
- https/server.py drops direct TLS handling; Caddy owns the cert
- Add ProxyFix to both server.py so Flask sees real attacker IPs
- Frontend: multi_enum checkbox-group renderer in ServiceConfigFields;
  FormValue union extended to string[]; compactPayload skips []
- Fix stale test_smtp_relay_schema_matches_smtp: relay schema is a
  superset of smtp, not equal; update assertions accordingly
2026-05-10 00:04:37 -04:00
289a64014c feat(profiler/behave_shell): G.0 intent lexicon + lexical counter pass
Phase G shared infrastructure (no primitive yet emitted):

* New `_intent.py` — five precomputed first-token-hash sets (recon /
  exfil / persistence / lateral / destructive) with documented
  precedence, plus opsec-history and three lexeme sets (positive /
  negative / obscenity) for the typed-text counter pass. Stop words
  that collide with registry value vocabulary (`no`, `hell`, `ok`)
  are deliberately excluded — the PII regression test catches such
  collisions.

* `_typed_char_histograms()` extended with five integer counters
  populated in the same single-pass walk: `obscenity_hits`,
  `positive_lex_hits`, `negative_lex_hits`, `caps_run_max`,
  `bang_run_max`. Longest-suffix match against bounded lexicon
  (`LEXEME_MAX_LEN`); paste-class events excluded.

* `SessionContext` widened by the same five fields. Drives G.5
  (valence), G.6 (arousal), G.8 (frustration_venting) without retaining
  raw operator text.

* Bump twisted >= 26.4.0rc2 to clear CVE-2026-42304 (pre-existing,
  caught by pre-commit pip-audit). Adjust ftp template type-ignore
  code from attr-defined to misc to match the new Twisted typing.

PII discipline: same shape as F.4 — fixed-vocabulary integer counters
on ctx, never on observations.
2026-05-08 16:27:25 -04:00
dcd558fd91 chore(infra): pin Docker base images by digest (DEBT-023)
All base images (debian:bookworm-slim, ubuntu:22.04, ubuntu:20.04,
rockylinux:9-minimal, centos:7, alpine:3.19, fedora:39,
kalilinux/kali-rolling, archlinux:latest, honeynet/conpot:latest)
now carry their resolved sha256 digest so 'docker pull' is
deterministic. :tag retained for human readability; @sha256 is what
Docker actually resolves. Refresh procedure documented at the top of
decnet/distros.py.
2026-05-03 04:38:39 -04:00
b88d67794d feat(mail): operator-tunable IMAP/POP3 email seed (DEBT-026)
IMAP_EMAIL_SEED / POP3_EMAIL_SEED accept a directory (rglob *.eml +
*.json) or a single .json/.eml. Loaded entries CONCATENATE with the
hardcoded _BAIT_EMAILS — additive to the realism-engine emailgen
output rather than replacing it. JSON dicts require from_addr /
to_addr / subject / body; bare bodies are wrapped into RFC 5322 on
load. compose_fragment reads service_cfg["email_seed"] and bind-mounts
the host path read-only at /var/spool/decnet-emails/seed.
2026-05-03 02:47:06 -04:00
cdbb3d3571 fix(ssh,telnet): move PROMPT_COMMAND out of /root/.bashrc + pin readonly
ANTI flagged two regressions in the existing command-event capture:

1. **Tell**: PROMPT_COMMAND lived in /root/.bashrc, the FIRST file
   an attacker greps after landing root. The logger invocation
   sitting there is plain-text honeypot signage.
2. **Bypass**: even when missed, `export PROMPT_COMMAND=""` silently
   disables capture. ANTI personally bypasses this on engagements.

Reshape:

* Move the assignment to **/etc/environment** — read by pam_env at
  session open (sshd via /etc/pam.d/sshd, telnet via
  /etc/pam.d/login), before any shell rc file fires. Far less
  obvious than .bashrc; a casual `cat .bashrc` no longer surfaces
  the capture.
* Define the helper as a function `__bash_history_sync` in
  **/etc/bash.bashrc** (system-wide bashrc, sourced by every
  interactive bash). Function name reads as generic bash
  housekeeping; no DECNET branding in the symbol.
* Pin both the function and PROMPT_COMMAND **readonly** so
  `export PROMPT_COMMAND=""` fails with "readonly variable"
  instead of silently winning. Mitigation, not airtight —
  `bash --norc` still bypasses — but the passive `export`
  bypass is closed.

The actual `logger --rfc5424 --msgid command ... CMD ...` invocation
is preserved exactly; only its location and the readonly guard
change. R0001–R0030 (command-rule pack) consume the same syslog
shape as before.

Three new tests assert: the value lands in /etc/environment, the
function body lives in /etc/bash.bashrc, no PROMPT_COMMAND line
remains in /root/.bashrc, and `readonly PROMPT_COMMAND` /
`readonly -f __bash_history_sync` are both present. Mirror
assertions added on the Telnet Dockerfile via
test_config_schema.py.
2026-05-02 19:50:24 -04:00
3e9c4c29b9 feat(ssh,telnet): add non-root user account for privesc + enum lure
Real Linux deployments (especially Ubuntu cloud images) ship a non-
root admin user; honeypots that only accept root logins are a tell.
Add a second account on both SSH and Telnet decoys, configurable
via service_cfg keys `user` / `user_password`, defaulting to
`ubuntu` / `admin` so the lure is live on every fresh deploy.

* `decnet/services/{ssh,telnet}.py` — two new ServiceConfigFields
  (`user` string, `user_password` secret) and matching env vars
  (`SSH_USER` / `SSH_USER_PASSWORD`, mirror for telnet) propagated
  via the compose fragment.
* `decnet/templates/ssh/entrypoint.sh` — runtime `useradd -m -s
  /usr/libexec/login-session -G sudo "$SSH_USER"` so the new user
  inherits the same sessrec pty-recording shell as root and lands
  in the sudo group. Privesc attempts (`sudo`) flow through the
  existing sudo-log capture; network-enum from the user's shell
  rides the recorded transcript.
* `decnet/templates/telnet/entrypoint.sh` — same useradd pattern
  (no sudo group — busybox+login telnet image has no sudo
  package; privesc rides `su -` which itself flows through the
  existing PAM auth-helper at /etc/pam.d/login).
* New tests for default + custom user / password + independence
  from root password. Updated the schema-keys assertion to match
  the four-field shape.

The new account is ALSO the natural home for the body-aware
predicates that were previously gated on root-only sessions —
attackers who land on `ubuntu@host` and run network-recon /
privesc commands now generate the same structured TTP-rule
events as root sessions did, captured via the same auth-helper
+ sessrec + sudo-log pipes.
2026-05-02 19:48:03 -04:00
291b78c1d0 feat(smtp): extract body_simhash + base64-bytes + html-smuggling + per-attachment macro/encrypted
Heavyweight Layer-2 extractors land alongside the cheap projections
shipped in commit e9324aca, so the EmailLifter R0042 / R0046 (macros
/ password / smuggling lanes) / R0048 fire from the bus payload
without the lifter having to reach back to disk.

Extractors:
* body_simhash — inlined 64-bit Charikar simhash (md5-keyed,
  frequency-weighted) over word tokens of the union of text/* body
  parts. Inlined rather than pulling the `simhash` PyPI dep, which
  transitively brings numpy ~50 MB into a slim decky container; the
  algorithm is ~15 lines and identical in extraction quality.
* body_base64_bytes — largest decoded base64 chunk's byte count,
  scanning text body parts with the same `_BASE64_RE` the lifter's
  `_p_encoded_payload` fallback uses. R0048 fires from this scalar
  alone; the lifter's body_text fallback becomes dead in normal
  operation.
* attachment_macro_indicator — stdlib zipfile sniff for
  `vbaProject.bin` inside OOXML containers. Catches modern .docm /
  .xlsm / .pptm and macro-injected .docx; legacy .xls (CFBF) is a
  follow-up.
* attachment_encrypted — flag_bits & 0x01 on any ZIP / OOXML entry's
  central directory; magic-byte match for 7z / RAR / CFBF (encrypted
  Office wrap).
* html_smuggling — structural lxml parse first: fires when an `<a
  download>` element coexists with a `<script>` referencing
  `Blob` / `Uint8Array` / `URL.createObjectURL`. Regex pair-check
  fallback on lxml parse failure (real-world phish HTML is often
  malformed). Cuts the FP rate that pure-regex would produce on
  legitimate "click to download" links.

Add `python3-lxml` (~5 MB Debian package, C-extension, no transitive
Python deps) to the SMTP decky's Dockerfile. simhash stays inline.
Per the dependency rule: lxml earns its weight by cutting R0046's
OR-combined FP rate; a heavier macro-detection lib (oletools ~5 MB
pure-python with msoffcrypto) would not measurably improve the
boolean signal we need, so stdlib stays for that lane.
2026-05-02 19:08:37 -04:00
e9324acac7 feat(smtp): emit X-Mailer / Return-Path / dkim+spf / URLs on message_stored
The EmailLifter (R0041–R0048) keys on header-derived signals that the
v0 _summarize_message did not extract. Add cheap Layer 2 projections
inside the existing single-pass parse:

* return_path / x_mailer — direct header reads, decoded RFC 2047
* dkim_signed / spf_pass — booleans derived from any
  Authentication-Results header (multiple lines tolerated; positive
  verdict on any line wins)
* urls — http(s) URLs lifted from text/* body parts via a tight
  regex, deduplicated first-seen-wins, capped at 64 in the wire
  payload to bound the syslog SD value

Heavyweight extraction (body simhash, office-macro detection,
HTML-smuggling, password-protected archives, mal-hash-match,
body_text projection) stays deferred per the EmailLifter heavyweight
DEBT entry — those rules need privacy / extractor decisions before
they ship.
2026-05-02 18:37:11 -04:00
19271f9319 fix(types): P3 — annotate transport in all template protocol servers; 0 errors in templates/
- asyncio.Protocol (TCP): _transport: asyncio.Transport | None = None + cast() in
  connection_made; assert guards in every method that directly accesses the field.
  Files: pop3, smtp, mqtt, postgres, mssql, mongodb, imap, ldap, redis, mysql, sip, vnc.
- asyncio.DatagramProtocol (UDP): _transport: asyncio.DatagramTransport | None = None.
  Files: snmp, tftp, SIPUDPProtocol.
- RDP: assert new_transport is not None after start_tls() to narrow Transport | None.
- FTP (Twisted): assert self.transport is not None + targeted type: ignore for imprecise
  Twisted stubs (misc/override/arg-type/attr-defined), IReactorTCP cast for listenTCP.
- conpot: proc.stdout is None guard before iteration.
- Bonus fixes surfaced by annotation:
  - smtp: get_payload(decode=True) bytes narrowing (arg-type on sha256)
  - postgres: rename shadowed `msg` param to `err_msg` in _handle_startup
  - mongodb: base64.binascii.Error → import binascii; binascii.Error
  - imap: result: list[int] = [] (var-annotated)
2026-05-01 01:09:14 -04:00
5a240a3d55 fix(types): P1 — widen _send_json data param to dict | list in elasticsearch server 2026-05-01 00:22:16 -04:00
909913e912 fix(types): P0 mypy — explicit binascii import, drop dead or None in ntlmssp
syslog_bridge.py: base64.binascii is not a public mypy-visible attribute;
import binascii directly and reference binascii.Error at the except clause.
Propagated to all 26 template subdirectory copies (all were drift-free).

ntlmssp.py: `principal = username or None` widened the type to str | None
for no runtime reason — _decode_str() always returns str.  Drop the `or None`.
Propagated to smb/ and rdp/ copies.

762 → 722 mypy errors (-40).
2026-05-01 00:09:00 -04:00
761c23a07c fix(smtp_relay): emit service=smtp_relay in syslog so ingester can gate probe publish
SERVICE_NAME was hardcoded to 'smtp' in server.py; the ingester's probe
publish guard checked service == 'smtp_relay' and never matched.

Read SMTP_SERVICE_NAME from env (default 'smtp'); smtp_relay compose
fragment sets it to 'smtp_relay' so the two services are distinguishable.
2026-04-30 12:31:29 -04:00
f0d47c5195 fix(smtp): chmod quarantine dir before dropping to logrelay
The bind-mounted quarantine dir is owned by the host decnet user; the
logrelay process had no write access because the Dockerfile USER directive
pre-applied before the entrypoint could fix permissions.

Run entrypoint as root, chmod 0777 the quarantine dir, then exec the
server under logrelay via su.
2026-04-30 12:25:37 -04:00
4c0a1309f0 fix(smtp_relay): log upstream error reason in probe_forwarded event
forwarded=0 was silent — now fwd_error carries the exception string so
you can see exactly why the upstream refused (auth failure, connection
refused, timeout, etc).
2026-04-30 11:57:07 -04:00
fdf38a9d8c feat(smtp_relay): add upstream_sender to fix SPF on probe forwarding
Override the envelope MAIL FROM with a domain we own when talking to the
upstream relay. SPF passes at the recipient; the attacker's From: header
inside the message body is untouched so they see their own address in their
inbox and believe the relay is real.
2026-04-30 11:47:18 -04:00
9a4fe2677b feat(smtp_relay): forward probe emails upstream so attackers verify relay works
First SMTP_PROBE_LIMIT messages per source IP are forwarded via a real
upstream relay (SMTP_UPSTREAM_HOST/PORT/USER/PASS) so the attacker's
test email actually lands in their inbox. All subsequent messages from
the same IP get 250 Ok but only hit the quarantine — campaign content
captured, nothing delivered.
2026-04-30 11:21:04 -04:00
77ceb9d6f3 feat(services): config schemas for the rest of the registry + textarea base64 transport
- Declarative config_schema on RDP, Telnet, MySQL, Redis, SMTP, SMTP_Relay
  matching the keys each service already reads at compose time.
- TODO marker on the 19 services that accept service_cfg but never read it,
  so future contributors know where to plug schemas in.
- Wizard base64-wraps all textarea values at INI emit (DeckyFleet
  buildIni); validate_cfg detects the b64: sentinel and decodes back to
  UTF-8. Plain raw strings still pass through for direct API submitters.
- HTTPS image entrypoint accepts PEM content or path in TLS_CERT/TLS_KEY:
  detects a BEGIN header, writes content to /opt/tls/, and re-exports
  the on-disk path so server.py keeps reading paths.
- Tests cover schema/compose alignment for each new service plus
  textarea base64 round-trip (incl. UTF-8) and HTTPS PEM end-to-end.
2026-04-29 12:23:56 -04:00
6055f9c837 fix(deckies): set MSGID=command on bash PROMPT_COMMAND syslog lines
Add --rfc5424 --msgid command to the logger invocation in SSH and telnet
decky bashrc. MSGID arrives as "command" instead of NIL, which is what
the profiler's _COMMAND_EVENT_TYPES filter expects. The parser heuristic
shipped in d4591b3 stays as a safety net for any future emitter that
forgets the flags or for inflight pre-rebuild containers.
2026-04-28 19:12:11 -04:00
862e4dbb31 merge: testing → main (reconcile 2-week divergence) 2026-04-28 18:36:00 -04:00