13 Commits

Author SHA1 Message Date
f2b3393669 chore: relicense to AGPL-3.0-or-later and add SPDX headers
Replaces LICENSE (GPLv3 -> AGPLv3) and prepends
`SPDX-License-Identifier: AGPL-3.0-or-later` to every source file
across decnet/, decnet_web/, tests/, scripts/, and tools/.

Rationale: closes the GPLv3 ASP loophole so any party operating a
modified DECNET as a network service must offer their modified
source. Personal copyright (Samuel Paschuan) + inbound=outbound
contributions make a future unilateral relicense infeasible.

- LICENSE: full AGPL-3.0 text (gnu.org/licenses/agpl-3.0.txt)
- COPYRIGHT: project copyright notice
- tools/add_spdx_headers.py: idempotent header injector
  (shebang- and PEP 263-aware)

Touches 1565 source files (.py, .ts, .tsx, .js, .jsx, .css, .sh).
No behavior change; comments only.
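
A minimal sketch of what an idempotent, shebang- and PEP 263-aware injector could look like for Python files. This is not the actual tools/add_spdx_headers.py; the function name `add_spdx_header`, the five-line lookahead, and the `#` comment syntax are assumptions for illustration (other file types would need their own comment style).

```python
import re

SPDX_LINE = "# SPDX-License-Identifier: AGPL-3.0-or-later"
# PEP 263 encoding declaration; only honoured on line 1 or 2 of a file.
_CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*[-_.a-zA-Z0-9]+")


def add_spdx_header(source: str) -> str:
    """Return source with the SPDX line added; unchanged if already present."""
    lines = source.splitlines(keepends=True)
    # Idempotence: an existing identifier near the top means nothing to do.
    if any("SPDX-License-Identifier:" in line for line in lines[:5]):
        return source
    insert_at = 0
    # A shebang must stay on line 1 so the script remains executable.
    if lines and lines[0].startswith("#!"):
        insert_at = 1
    # A coding declaration must stay within the first two lines.
    for i in range(min(2, len(lines))):
        if _CODING_RE.match(lines[i]):
            insert_at = max(insert_at, i + 1)
    # Make sure the line we insert after is newline-terminated.
    if insert_at > 0 and not lines[insert_at - 1].endswith("\n"):
        lines[insert_at - 1] += "\n"
    lines.insert(insert_at, SPDX_LINE + "\n")
    return "".join(lines)
```

Running the function twice on the same text yields the same result as running it once, which is what makes a repository-wide re-run safe.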
2026-05-22 21:04:16 -04:00
a009746dd1 feat(fingerprint): extend syslog_bridge with HTTP/3 and JA4H fingerprinting emission 2026-05-10 22:27:22 -04:00
dcd558fd91 chore(infra): pin Docker base images by digest (DEBT-023)
All base images (debian:bookworm-slim, ubuntu:22.04, ubuntu:20.04,
rockylinux:9-minimal, centos:7, alpine:3.19, fedora:39,
kalilinux/kali-rolling, archlinux:latest, honeynet/conpot:latest)
now carry their resolved sha256 digest so 'docker pull' is
deterministic. :tag retained for human readability; @sha256 is what
Docker actually resolves. Refresh procedure documented at the top of
decnet/distros.py.
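
As an illustration of the `:tag@sha256` form described above, the following sketch validates that a reference is digest-pinned. The helper name `parse_pinned_image` and the simplified regex are assumptions (registries with a port in the host part are not handled), and the all-zero digest in the usage note is a placeholder, not a real image digest.

```python
from __future__ import annotations

import re

# name[:tag]@sha256:<64 hex chars>; the tag is optional and informational only.
_PINNED_RE = re.compile(
    r"^(?P<name>[a-z0-9./_-]+)"
    r"(?::(?P<tag>[A-Za-z0-9_.-]+))?"
    r"@sha256:(?P<digest>[0-9a-f]{64})$"
)


def parse_pinned_image(ref: str) -> tuple[str, str | None, str]:
    """Split a pinned reference into (name, tag, digest) or raise ValueError."""
    m = _PINNED_RE.match(ref)
    if m is None:
        raise ValueError(f"image reference is not digest-pinned: {ref!r}")
    return m["name"], m["tag"], m["digest"]
```

For example, `parse_pinned_image("debian:bookworm-slim@sha256:" + "0" * 64)` returns the name, the human-readable tag, and the digest, while a bare `debian:bookworm-slim` is rejected.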
2026-05-03 04:38:39 -04:00
291b78c1d0 feat(smtp): extract body_simhash + base64-bytes + html-smuggling + per-attachment macro/encrypted
Heavyweight Layer-2 extractors land alongside the cheap projections
shipped in commit e9324aca, so EmailLifter rules R0042, R0046 (macro
/ password / smuggling lanes), and R0048 fire from the bus payload
without the lifter having to reach back to disk.

Extractors:
* body_simhash — inlined 64-bit Charikar simhash (md5-keyed,
  frequency-weighted) over word tokens of the union of text/* body
  parts. Inlined rather than pulling the `simhash` PyPI dep, which
  transitively brings numpy ~50 MB into a slim decky container; the
  algorithm is ~15 lines and identical in extraction quality.
* body_base64_bytes — largest decoded base64 chunk's byte count,
  scanning text body parts with the same `_BASE64_RE` the lifter's
  `_p_encoded_payload` fallback uses. R0048 fires from this scalar
  alone; the lifter's body_text fallback becomes dead in normal
  operation.
* attachment_macro_indicator — stdlib zipfile sniff for
  `vbaProject.bin` inside OOXML containers. Catches modern .docm /
  .xlsm / .pptm and macro-injected .docx; legacy .xls (CFBF) is a
  follow-up.
* attachment_encrypted — flag_bits & 0x01 on any ZIP / OOXML entry's
  central directory; magic-byte match for 7z / RAR / CFBF (encrypted
  Office wrap).
* html_smuggling — structural lxml parse first: fires when an `<a
  download>` element coexists with a `<script>` referencing
  `Blob` / `Uint8Array` / `URL.createObjectURL`. Regex pair-check
  fallback on lxml parse failure (real-world phish HTML is often
  malformed). Cuts the FP rate that pure-regex would produce on
  legitimate "click to download" links.
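
To make the body_simhash bullet concrete, here is a rough sketch of a 64-bit Charikar simhash that is md5-keyed and frequency-weighted. It is not the code from this commit: the tokenizer (`\w+` on lowercased text), the choice of the first 8 digest bytes, and the names `simhash64` and `hamming` are assumptions.

```python
import hashlib
import re
from collections import Counter

_WORD_RE = re.compile(r"\w+")


def simhash64(text: str) -> int:
    """64-bit simhash over word tokens, weighted by token frequency."""
    weights = Counter(_WORD_RE.findall(text.lower()))
    acc = [0] * 64
    for token, weight in weights.items():
        digest = hashlib.md5(token.encode("utf-8"), usedforsecurity=False).digest()
        h = int.from_bytes(digest[:8], "big")  # assumption: first 8 bytes
        for bit in range(64):
            # Each token votes +weight or -weight on every bit position.
            acc[bit] += weight if (h >> bit) & 1 else -weight
    # A bit is set in the fingerprint when the positive votes win.
    return sum(1 << bit for bit in range(64) if acc[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Near-duplicate bodies tend to produce fingerprints with a small Hamming distance, which is why a single 64-bit scalar is enough to carry on the bus payload.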

Add `python3-lxml` (~5 MB Debian package, C-extension, no transitive
Python deps) to the SMTP decky's Dockerfile. simhash stays inline.
Per the dependency rule: lxml earns its weight by cutting R0046's
OR-combined FP rate; a heavier macro-detection lib (oletools ~5 MB
pure-python with msoffcrypto) would not measurably improve the
boolean signal we need, so stdlib stays for that lane.
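
The stdlib-only macro and encryption lane could look roughly like the sketch below. It covers only the ZIP / OOXML case (no magic-byte checks for 7z / RAR / CFBF), and the function name `zip_indicators` is hypothetical rather than taken from the commit.

```python
import io
import zipfile


def zip_indicators(data: bytes) -> tuple[bool, bool]:
    """Return (macro_indicator, encrypted) for a ZIP / OOXML attachment."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            infos = zf.infolist()  # entries read from the central directory
    except zipfile.BadZipFile:
        return False, False
    # OOXML macro projects are stored as .../vbaProject.bin inside the archive.
    has_macro = any(i.filename.lower().endswith("vbaproject.bin") for i in infos)
    # Bit 0 of the general-purpose flags marks an encrypted entry.
    encrypted = any(i.flag_bits & 0x01 for i in infos)
    return has_macro, encrypted
```

Both checks read only the archive's directory metadata, so no entry is ever decompressed.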
2026-05-02 19:08:37 -04:00
e9324acac7 feat(smtp): emit X-Mailer / Return-Path / dkim+spf / URLs on message_stored
The EmailLifter (R0041–R0048) keys on header-derived signals that the
v0 _summarize_message did not extract. Add cheap Layer 2 projections
inside the existing single-pass parse:

* return_path / x_mailer — direct header reads, decoded RFC 2047
* dkim_signed / spf_pass — booleans derived from any
  Authentication-Results header (multiple lines tolerated; positive
  verdict on any line wins)
* urls — http(s) URLs lifted from text/* body parts via a tight
  regex, deduplicated first-seen-wins, capped at 64 in the wire
  payload to bound the syslog SD value
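
The projections listed above might be sketched as follows. The regexes, the reading of `dkim=pass` / `spf=pass` as the "positive verdict", and the helper names `auth_verdicts` and `extract_urls` are assumptions; only the first-seen-wins deduplication and the cap of 64 come from the commit message.

```python
import re
from email import message_from_string
from email.message import Message

_URL_RE = re.compile(r"https?://[^\s<>\"']+")
MAX_URLS = 64  # cap stated in the commit message


def auth_verdicts(msg: Message) -> tuple[bool, bool]:
    """Return (dkim_signed, spf_pass); a positive verdict on any line wins."""
    lines = [str(v).lower() for v in msg.get_all("Authentication-Results", [])]
    dkim = any(re.search(r"\bdkim=pass\b", line) for line in lines)
    spf = any(re.search(r"\bspf=pass\b", line) for line in lines)
    return dkim, spf


def extract_urls(body: str) -> list[str]:
    """Deduplicate first-seen-wins, then cap the list for the wire payload."""
    return list(dict.fromkeys(_URL_RE.findall(body)))[:MAX_URLS]
```

`dict.fromkeys` preserves insertion order, so the first occurrence of each URL fixes its position before the cap is applied.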

Heavyweight extraction (body simhash, office-macro detection,
HTML-smuggling, password-protected archives, mal-hash-match,
body_text projection) stays deferred per the EmailLifter heavyweight
DEBT entry — those rules need privacy / extractor decisions before
they ship.
2026-05-02 18:37:11 -04:00
19271f9319 fix(types): P3 — annotate transport in all template protocol servers; 0 errors in templates/
- asyncio.Protocol (TCP): _transport: asyncio.Transport | None = None + cast() in
  connection_made; assert guards in every method that directly accesses the field.
  Files: pop3, smtp, mqtt, postgres, mssql, mongodb, imap, ldap, redis, mysql, sip, vnc.
- asyncio.DatagramProtocol (UDP): _transport: asyncio.DatagramTransport | None = None.
  Files: snmp, tftp, SIPUDPProtocol.
- RDP: assert new_transport is not None after start_tls() to narrow Transport | None.
- FTP (Twisted): assert self.transport is not None + targeted type: ignore for imprecise
  Twisted stubs (misc/override/arg-type/attr-defined), IReactorTCP cast for listenTCP.
- conpot: proc.stdout is None guard before iteration.
- Bonus fixes surfaced by annotation:
  - smtp: get_payload(decode=True) bytes narrowing (arg-type on sha256)
  - postgres: rename shadowed `msg` param to `err_msg` in _handle_startup
  - mongodb: base64.binascii.Error → import binascii; binascii.Error
  - imap: result: list[int] = [] (var-annotated)
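
The TCP annotation pattern from the first bullet can be shown on a toy protocol. `EchoProtocol` is a made-up class for illustration and is not one of the template servers.

```python
from __future__ import annotations

import asyncio
from typing import cast


class EchoProtocol(asyncio.Protocol):
    # Declared Optional because no transport exists before connection_made().
    _transport: asyncio.Transport | None = None

    def connection_made(self, transport: asyncio.BaseTransport) -> None:
        # asyncio hands over a BaseTransport; for TCP it is a full Transport.
        self._transport = cast(asyncio.Transport, transport)

    def data_received(self, data: bytes) -> None:
        # The assert narrows Transport | None to Transport for the type checker.
        assert self._transport is not None
        self._transport.write(data)
```

The `cast()` has no runtime effect; it only tells the type checker which transport subtype to expect, while the `assert` is what lets each method touch the field without an Optional error.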
2026-05-01 01:09:14 -04:00
909913e912 fix(types): P0 mypy — explicit binascii import, drop dead or None in ntlmssp
syslog_bridge.py: base64.binascii is not a public mypy-visible attribute;
import binascii directly and reference binascii.Error at the except clause.
Propagated to all 26 template subdirectory copies (all were drift-free).
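
A small sketch of the corrected pattern, assuming a helper named `decode_b64` (hypothetical) and strict validation, which the original code may or may not use:

```python
from __future__ import annotations

import base64
import binascii  # imported directly; base64.binascii is not a public attribute


def decode_b64(value: str) -> bytes | None:
    """Decode base64, returning None on malformed input."""
    try:
        return base64.b64decode(value, validate=True)
    except binascii.Error:
        return None
```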

ntlmssp.py: `principal = username or None` widened the type to str | None
for no runtime reason — _decode_str() always returns str.  Drop the `or None`.
Propagated to smb/ and rdp/ copies.

762 → 722 mypy errors (-40).
2026-05-01 00:09:00 -04:00
761c23a07c fix(smtp_relay): emit service=smtp_relay in syslog so ingester can gate probe publish
SERVICE_NAME was hardcoded to 'smtp' in server.py; the ingester's probe
publish guard checked service == 'smtp_relay' and never matched.

Read SMTP_SERVICE_NAME from env (default 'smtp'); smtp_relay compose
fragment sets it to 'smtp_relay' so the two services are distinguishable.
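
In outline, the fix amounts to something like the following. Only the variable name SMTP_SERVICE_NAME, its default, and the `smtp_relay` comparison come from the commit message; both function names are hypothetical.

```python
import os
from typing import Mapping


def resolve_service_name(environ: Mapping[str, str] = os.environ) -> str:
    """Service name emitted in syslog; defaults to plain 'smtp'."""
    return environ.get("SMTP_SERVICE_NAME", "smtp")


def should_publish_probe(service: str) -> bool:
    """Ingester-side gate: only the relay variant publishes probes."""
    return service == "smtp_relay"
```

With the name hardcoded to `smtp`, the gate could never return true; reading it from the environment lets the relay's compose fragment opt in.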
2026-04-30 12:31:29 -04:00
f0d47c5195 fix(smtp): chmod quarantine dir before dropping to logrelay
The bind-mounted quarantine dir is owned by the host decnet user; the
logrelay process had no write access because the Dockerfile USER directive
took effect before the entrypoint could fix permissions.

Run entrypoint as root, chmod 0777 the quarantine dir, then exec the
server under logrelay via su.
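
The order of operations can be sketched in Python, although the real entrypoint is presumably a shell script. The function name `prepare_quarantine` and the exact `su` arguments are assumptions.

```python
import os


def prepare_quarantine(quarantine_dir: str, user: str, command: str) -> list[str]:
    """Open up the bind mount while still root, then build the su command."""
    # Must run as root: the directory is owned by a host user.
    os.chmod(quarantine_dir, 0o777)
    # Assumed invocation; the caller would replace the current process with
    # os.execvp(argv[0], argv) so the server runs unprivileged.
    return ["su", "-s", "/bin/sh", "-c", command, user]
```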
2026-04-30 12:25:37 -04:00
4c0a1309f0 fix(smtp_relay): log upstream error reason in probe_forwarded event
forwarded=0 was silent; fwd_error now carries the exception string so
you can see exactly why the upstream refused (auth failure, connection
refused, timeout, etc.).
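
One way the event fields might be assembled is shown below. The `forwarded` and `fwd_error` keys come from the commit message; the wrapper `forward_with_reason`, the caught exception types, and the fallback to the exception class name are assumptions.

```python
import smtplib
from typing import Callable


def forward_with_reason(send: Callable[[], None]) -> dict[str, str]:
    """Run the upstream send and record why it failed, if it did."""
    fields = {"forwarded": "0"}
    try:
        send()
    except (smtplib.SMTPException, OSError) as exc:
        # Some exceptions have an empty message; fall back to the class name.
        fields["fwd_error"] = str(exc) or type(exc).__name__
    else:
        fields["forwarded"] = "1"
    return fields
```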
2026-04-30 11:57:07 -04:00
fdf38a9d8c feat(smtp_relay): add upstream_sender to fix SPF on probe forwarding
Override the envelope MAIL FROM with a domain we own when talking to the
upstream relay. SPF passes at the recipient; the attacker's From: header
inside the message is untouched, so the attacker sees their own address in
their inbox and believes the relay is real.
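
The distinction between the envelope sender and the From: header can be sketched as below. The helper `build_upstream_envelope` and the example addresses are hypothetical.

```python
from email.message import EmailMessage


def build_upstream_envelope(
    msg: EmailMessage, rcpt_tos: list[str], upstream_sender: str
) -> tuple[str, list[str], bytes]:
    """Return (envelope_from, recipients, data) for the upstream hop."""
    # Only the envelope sender changes; the serialized message, including
    # its From: header, is passed through byte for byte.
    return upstream_sender, list(rcpt_tos), msg.as_bytes()
```

The returned triple maps directly onto `smtplib.SMTP.sendmail(from_addr, to_addrs, msg)`, where `from_addr` becomes the MAIL FROM that the recipient's SPF check evaluates.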
2026-04-30 11:47:18 -04:00
9a4fe2677b feat(smtp_relay): forward probe emails upstream so attackers verify relay works
First SMTP_PROBE_LIMIT messages per source IP are forwarded via a real
upstream relay (SMTP_UPSTREAM_HOST/PORT/USER/PASS) so the attacker's
test email actually lands in their inbox. All subsequent messages from
the same IP get 250 Ok but only hit the quarantine — campaign content
captured, nothing delivered.
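
The per-IP gating described here reduces to a counter, sketched below with an in-memory dictionary. The class name `ProbeGate` is hypothetical, and whether the real counter persists across restarts is not stated in the commit.

```python
from collections import defaultdict


class ProbeGate:
    """Allow forwarding for the first `limit` messages from each source IP."""

    def __init__(self, limit: int) -> None:
        self._limit = limit  # would come from SMTP_PROBE_LIMIT
        self._seen: dict[str, int] = defaultdict(int)

    def should_forward(self, source_ip: str) -> bool:
        self._seen[source_ip] += 1
        # Later messages still receive 250 Ok but are only quarantined.
        return self._seen[source_ip] <= self._limit
```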
2026-04-30 11:21:04 -04:00
862e4dbb31 merge: testing → main (reconcile 2-week divergence) 2026-04-28 18:36:00 -04:00