DEBT-042 — orchestrator failure-count badge is computed from the
in-memory SSE window; remediation is a dedicated stats endpoint.
DEBT-043 — no frontend test framework configured; the planned
Orchestrator.tsx component test couldn't be written without first
adding vitest + RTL.
The threat-intel surface was IP-keyed on day one as an expedient — the
worker is woken by IP-bearing bus events. ANTI's call: don't carry that
debt. NO IPs as primary keys anywhere on the attacker-intel surface.
Schema:
- attacker_uuid is now the canonical key — UNIQUE + FK to attackers.uuid.
- attacker_ip stays as a denormalised, indexed, NON-UNIQUE value column.
Updated on every upsert; useful for SIEM payloads and audit lookups,
but explicitly NOT a key. Model docstring says so.
- Pre-v1, no Alembic migration needed. SQLModel.metadata.create_all()
builds the new shape on fresh DBs.
Repo:
- upsert_attacker_intel now keys on attacker_uuid.
- get_attacker_intel_by_ip → get_attacker_intel_by_uuid.
- get_unenriched_attacker_ips → get_unenriched_attackers, returning
[{uuid, ip}] tuples so the worker writes by UUID and dispatches
provider calls by IP without a second round-trip.
Worker:
- _enrich_one(uuid, ip, ...) — UUID lands on the row, IP rides for
provider egress.
- attacker.intel.enriched bus payload gains attacker_uuid alongside
attacker_ip — webhook → SIEM consumers benefit; no removal.
API:
- GET /api/v1/attackers/{ip}/intel deleted outright (rip-and-replace,
never deployed beyond dev).
- GET /api/v1/attackers/{uuid}/intel is the only public route, matching
every other /attackers/* route.
Frontend:
- <IntelPanel uuid={id!} /> uses the URL param directly, fetches in
parallel with the rest of AttackerDetail rather than waiting on
attacker.ip.
Tests: re-keyed in place, 39 passed (same coverage as before the
refactor). Provider-impl tests untouched.
DEBT-041: closed in DEBT.md (entry preserved as historical rationale,
summary table flipped to ✅, remaining-open list shortened by one).
Read-only IP-keyed intel surface on the attacker detail page. Renders
the aggregate verdict (color-coded MALICIOUS/SUSPICIOUS/BENIGN/NO SIGNAL)
plus a per-provider row with verdict, queried-at timestamp, and
provider-specific detail (GreyNoise classification, AbuseIPDB
0-100 score, Feodo C2 listing + malware family, ThreatFox IOC match
+ malware family). 404 from the API renders as 'NO INTEL CACHED YET'
with a hint that decnet enrich will populate it on the next pass —
TTL drives the refresh, no manual button.
DEBT-041 documents the API/UI IP-keying as a v1 expedient that will
need a UUID-keyed sibling endpoint before federation lands. NAT
collisions, attacker.uuid consistency across attacker routes, and the
sequential-fetch UX are all callouts on that ticket; the migration
sketch is laid out so the v1.x followup is unambiguous.
Frontend build: clean (55.58 kB AttackerDetail bundle, +~5kB for the
panel). Note: not browser-tested in this session — recommend a manual
smoke against a deployed master before tagging.
Library shape (decnet/correlation/) consumed by profiler + reuse
correlator is the right model. The `decnet correlate` CLI helper has
been removed in the previous commit.
When RDP_ENABLE_NLA=true (service_cfg.nla=true on the topology side),
confirm PROTOCOL_HYBRID on the X.224 Connection Confirm, upgrade the
socket to TLS using a self-signed cert generated at first start by
the entrypoint, then drive a tiny CredSSP loop:
- Read inbound TSRequest DER (bounded to MAX_TSREQUEST_LEN).
- Scan for the NTLMSSP signature, dispatch on message type:
Type 1 -> respond with a hand-built TSRequest carrying our Type 2
challenge. Type 3 -> parse_type3() and emit auth_attempt with the
universal credential SD shape (secret_kind = ntlmssp_v2).
- Hand-built DER: no pyasn1 dependency.
Also folds in a small fix-up to commit 1: SMB SERVER_CHALLENGE was
hardcoded to 0x11..0x88 across the fleet, which would let a scanner
fingerprint every DECNET decky by its NTLM challenge. Both SMB and
RDP now derive the 8-byte challenge from
instance_seed.random_bytes(8, "ntlm_challenge"), giving each decky a
deterministic-but-distinct value. SMB Dockerfile gets the
instance_seed.py copy too (was synced into the build context but not
COPYed into the image).
- decnet/services/rdp.py: optional service_cfg.nla bool flips
RDP_ENABLE_NLA in the compose env.
- decnet/templates/rdp/Dockerfile + entrypoint.sh: openssl install +
per-decky cert generation gated on RDP_ENABLE_NLA.
- 9 NLA unit tests cover the DER reader/builder, _handle_nla round-
trip with Type 1 / Type 3, oversized-DER rejection, and per-
NODE_NAME challenge divergence.
- DEBT.md: DEBT-040 closed; full TS_INFO_PACKET capture documented as
a follow-up if attacker telemetry justifies it.
Ships the load-bearing primitive both Phase 5 (SMB) and Phase 7
(RDP NLA) need: a standalone NTLMSSP Type 3 (AUTHENTICATE_MESSAGE)
parser per MS-NLMP §2.2.1.3.
Surface:
parse_type3(blob) -> dict | None
find_ntlmssp(buf) -> int # locate NTLMSSP\\0 inside SPNEGO outer
Returns the universal Credential SD shape:
username + domain (decoded UTF-16-LE or ASCII per NEGOTIATE_UNICODE)
principal = "DOMAIN\\\\username"
secret_kind = "ntlmssp_v1" (24-byte fixed) or "ntlmssp_v2" (variable)
secret_b64 = base64 of NtChallengeResponse — canonical hashcat input
(-m 5500 v1, -m 5600 v2)
Bounds-checked for untrusted-input safety. Anonymous binds (empty NT
response) return None — no credential to record.
7 unit tests cover NTLMv1/v2 distinction, ASCII vs Unicode strings,
empty-domain shape, malformed signature/type rejection, and SPNEGO-
wrapped find_ntlmssp() lookup.
DEBT-040 opens to track the three remaining protocol framers that
will consume this parser:
- SMB: hand-rolled SMB2 + Session Setup framer (~200 LoC) replacing
Impacket's opaque SimpleSMBServer
- RDP basic auth: TPKT/X.224/MCS framer for legacy plaintext path
(~150 LoC)
- RDP NLA: TLS upgrade + CredSSP TSRequest parser, reuses parse_type3
via the SPNEGO inner blob (~250 LoC)
These are substantial protocol implementations each — landing them
inline with Phase 1-3+6's cred coverage rollout would have inflated
the session beyond reasonable scope. Cred-reuse analytics already work
across the 12 services covered in this session; the deferred three
just round out the fleet.
Honest correction to the "every cred-emitting service" claim. Audit
of templates/* found three gaps:
1. MQTT — was working through the legacy adapter, silently dropped
when Phase 3 (e696c2b) deleted it. Now migrated to encode_secret()
alongside the others.
2. Postgres — `auth, pw_hash=…` event captures the MD5
challenge-response the attacker sent. Plaintext irrecoverable, so
it never fit the (principal, secret_b64=raw_bytes) shape. Lands
in Credential as secret_kind="postgres_md5_challenge".
3. VNC — `auth_response, response=…hex` event captures the 16-byte
DES-encrypted challenge. Same situation as Postgres: plaintext
irrecoverable. Lands as secret_kind="vnc_des_response".
Adds a `secret_kind` discriminator column to Credential (default
"plaintext", indexed). The dedup tuple gains secret_kind so two
credentials with the same sha256 but different kinds are
fundamentally different rows — different challenges produce
different bytes for the same plaintext password, so cross-kind
reuse matches are meaningless and would only confuse analytics.
The model now genuinely covers every cred-emitting service in the
fleet:
plaintext SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP,
MQTT
postgres_md5_* Postgres
vnc_des_response VNC
Username-only services (MySQL/MSSQL — TDS pre-encryption captures
the user but never sees the password byte) intentionally don't feed
Credential — they're recon signals, not cred attempts.
40 tests pass in the touched scope. New cases: secret_kind dedups
independently in the repo; Postgres MD5 + VNC DES emitters thread
through; MQTT round-trips through the native branch.
Phase 3/3 of DEBT-039. Now that all six cred-emitting services
(SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP) emit the universal
`secret_b64`-bearing SD shape, the ingester's legacy fork has no
live emitters to handle. Deletes:
- `_ingest_credential_legacy()` — synthesized native fields from
username+password
- The `elif _fields.get("username") and _fields.get("password")`
branch in `_extract_bounty`
- `_printable_filter()` — only the legacy adapter called it; the
native branch trusts the emitter (encode_secret() in Python or
sd_escape() in C) to have already sanitized
- The legacy-adapter test cases in tests/web/test_ingester.py;
their coverage moved to tests/services/test_cred_emitters.py
per-service in Phase 2
The cred path is now single-shape end-to-end. A pre-migration log
row carrying only username+password silently produces no Credential
write — by design, since no current emitter writes that shape and
keeping a code path alive for theoretical legacy data risks masking
emitter regressions. Pre-v1: any historical Bounty cred rows from
before commit 2f47f67 stay untouched.
DEBT-039 marked resolved with summary of the three commits and the
silent-loss bug fix for Redis + LDAP that fell out of execution.
Replaces the opaque Bounty.bounty_type='credential' path with a
dedicated `credentials` table whose schema is forward-compatible
across every auth-bearing service in the fleet. Hoisted indexed
columns (secret_sha256, principal, service, attacker_ip) carry the
universal reuse-analytics signal; service-specific JSON keys ride
in `fields`. Cross-service reuse queries become an indexed lookup
on secret_sha256 instead of JSON_EXTRACT scans.
Schema decisions baked in (per ANTI):
- New `Credential` table, not extension to Bounty
- Hoisted `principal` column for cross-service principal-reuse
- Standardized JSON keys: every payload carries secret_b64 +
secret_printable + principal universally; service-specific extras
(user, domain, dn, mech, …) ride alongside
The auth-helper SD-block emits the new shape natively. The ingester
forks at _extract_bounty:
- Native shape (SSH/Telnet, future emitters): secret_b64 present →
direct upsert_credential
- Legacy shape (FTP/POP3/IMAP/SMTP today): username + password →
adapter synthesizes secret_{b64,sha256,printable} on the fly,
upserts into the same Credential table. Tracked as DEBT-039;
one-shot bridge until those service templates migrate.
Defense-in-depth across five layers (input validation):
- C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424
escape rules for \\, ", ]; b64 preserves exact bytes
- Ingester native branch: rejects malformed secret_b64 (regex), drops
the credential row but keeps the underlying Log
- Ingester legacy adapter: same printable-ASCII filter as the C
code; sha256 + b64 over the original utf-8 bytes (lossless, even
when secret_printable is sanitized)
- DB column caps with truncation warning; sha256 always over the
full pre-truncation bytes so reuse queries match across truncation
- JSON serialized with ensure_ascii=True so utf8mb4 columns stay
safe even with non-ASCII service-specific keys
Bounty.bounty_type='credential' is no longer written. Pre-v1: no
historical backfill; existing rows stay untouched but unused.
595 tests pass; new tests cover the model + repo (upsert dedup,
null-principal independence, cross-service reuse, filters), both
ingester branches, b64 validation, sanitization preserving the
fingerprinting signal in b64.
After DECNET_WEBHOOK_CIRCUIT_THRESHOLD (default 5) consecutive failed
deliveries, the worker calls trip_webhook_circuit(uuid, ts) which
flips enabled=False and stamps auto_disabled_at. The worker sets its
reload flag so the next dispatch epoch stops consuming events for the
tripped sub entirely — one dead receiver can't poison the shared
egress pool anymore.
Operator clears the trip via PATCH — setting enabled=True when the
sub was previously disabled clears auto_disabled_at, zeros
consecutive_failures, and clears last_error. Admin-pause → re-enable
hits the same path harmlessly.
Three observable states now distinguishable in the UI:
- Active enabled=True, auto_disabled_at=NULL
- Admin-paused enabled=False, auto_disabled_at=NULL
- Tripped enabled=False, auto_disabled_at=<ts>
UI surfaces a TRIPPED · <ts> chip on the row (red, alert-styled) and
a "N TRIPPED" count in the page header. Hover tooltip tells the
operator how to reset ("Re-enable via Edit").
record_webhook_failure now returns the new consecutive_failures count
so the worker can compare against the threshold without a second
roundtrip. trip_webhook_circuit is idempotent — re-tripping just
re-stamps auto_disabled_at.
Closes THREAT_MODEL WH-02 and DEBT-037 §1.
The webhook MVP shipped with deliberate deferrals; this entry names
them so future PRs know exactly what's left to close: circuit
breaker, dead-letter table, delivery audit log, batch/coalescing,
per-subscription rate limiting, payload templates per destination,
and secret encryption at rest.
Non-negotiable even at MVP scope (HMAC signing, bus-off degraded
mode, jittered retry backoff) is called out explicitly to prevent
future contributors from weakening it under the banner of
"simplification."
The SessionProfile SQLModel table has shipped with every column
nullable since session-recording v1 landed — because the ingester
that populates them from the [t,"i",d] events in the transcript
shards does not exist yet (known as gap #2 in SIGNAL_CAPTURE_AUDIT).
A manual keystroke-dynamics pass over one real session (wget scanme.
nmap.orgh) trivially recovered CoV ≈ 0.74 (human band), a 467 ms
semantic pause before the URL argument, tight intra-word bigrams
(ge 79 ms, t<space> 83 ms), and slow start-of-action latency (w→g
225 ms) — all signals the existing schema columns were designed to
hold. So the missing piece is purely the ingester.
Entry captures:
- the manual case as the motivating + sanity-check target
(ingester should produce CoV ≈ 0.74 ± 0.05 on the same shard),
- three schema extensions the manual analysis suggests beyond what
the table carries today: kd_start_of_action_latency_ms,
kd_pause_hist_{burst,think,distracted}, kd_top_bigrams,
- a non-PII discipline line: raw keystroke content (including
captured passwords) MUST NOT land in SessionProfile columns —
only timing and frequency aggregates.
Poll-driven ingestion can ship first; the bus-trigger path
piggybacks on DEBT-031's deferred session-boundary topics.
Tracks the durable follow-up to 323077b. The transcripts soft-fail
shipped in that commit keeps the API from 500-ing on
/var/lib/decnet/artifacts/** permission mismatches, but the real
issue is that decoy containers write artifacts under a uid the API
can't read — today's workaround is a manual `sudo chown -R` after
every new deploy.
Three design options documented (container-runs-as-host-uid, setgid
+ shared group, inotify sidecar) with a recommendation, plus an
acceptance criterion: fresh init + deploy + record session → the
API can read the transcripts with no manual chown.
Units + polkit rule + systemd_control helper + start endpoints +
installed flag + UI wiring all landed. SWARM-host start/stop and
crash-quarantine policy stay as named deferrals.
New decnet/templates/_shared/sessrec/ — a small C program installed as the
login shell in SSH / Telnet deckies. Forkpty-relays /bin/bash, records each
chunk as an asciinema v2 event into a shared JSONL day-shard keyed by sid,
and emits one RFC 5424 session_recorded line on exit (direct to PID 1's
stdout, same pattern syslog_bridge.py uses).
Storage: one shard per (decky, UTC day) at
/var/lib/systemd/coredump/transcripts/sessions-YYYY-MM-DD.jsonl. Concurrent
appends are lock-free: each write is chunked below PIPE_BUF so O_APPEND
interleaves atomically. Per-session cap 10 MB with a trunc sentinel; disk-
free precheck (<200 MB) falls through to plain bash with a session_skipped
log event. Attacker src_ip resolves from \$SSH_CONNECTION, getpeername(0),
or utmp in that order. SIGWINCH appends a 'r' resize event so ncurses
replays stay aligned.
Stealth for v1: /etc/passwd shell-swap to /usr/libexec/login-session
(plausible login-machinery path) + prctl comm disguise. Full LD_PRELOAD
argv-zap is deferred — sshd strips LD_PRELOAD from the session env, so
wiring the existing argv_zap.so into this path needs a separate wrapper.
DEBT-033 opened for size-based day-shard rotation; v1's disk-free precheck
covers the worst case but can be blinded by a one-shot disk fill.
The mutation-event stream landed this session closes the "deckies are
atomic nodes" gap for service-list changes, but substrate identity is
really ``(service, implementation_fingerprint)``. A base-image
rebuild that rotates OpenSSH 8.4 → 9.2 without changing the service
list is invisible to the correlation graph today because the prober's
dedup set is in-memory and per-run — no cross-run diff, no
"fingerprint changed" event.
DEBT-032 documents the required piece: a per-(decky, service,
probe_type) persistence layer + diff-on-change emission, with the
correlator's existing mutation-marker interleaving pattern as the
model for fingerprint markers. A mutation-vs-fingerprint divergence
detector then falls out of the data model for free — fingerprint drift
without a preceding mutation ⇒ substrate_divergence finding.
All nine service workers now participate in the host-local bus: sniffer,
prober, correlator (via profiler), profiler, collector, ingester, agent,
forwarder, updater. Pre-bus behavior is preserved end-to-end for
DECNET_BUS_ENABLED=false and get_bus() failures.
Three items intentionally deferred: realism-probe decky.{id}.state
(needs a realism probe path that doesn't exist yet), correlator session
boundaries (needs session state), and bus-wake subscriptions (publishes
landed; wake side wired to no subscriber today).
Per-worker integration of the service bus shipped in DEBT-029. Publishes
are fire-and-forget; subscribes wake polling loops. Bus stays optional —
if get_bus() fails or DECNET_BUS_ENABLED=false, workers log once and
continue in poll-only mode (mirrors decnet/mutator/engine.py:run_watch_loop).
- scripts/bus/smoke-mutator.sh: boots decnet bus, subscribes to
topology.>, publishes one event per mutation-lifecycle state plus
a topology.status transition, asserts all four land on the
subscriber. Cheap E2E for the topic hierarchy the mutator + SSE
route rely on.
- development/DEBT.md: mark DEBT-030 ✅ resolved (Phase A) with a
summary of what shipped; flag the optimistic staged-buffer editor
as Phase B follow-up, not debt.
Land the `decnet bus` worker and `get_bus()` factory. Transport is a
host-local UNIX-domain socket (0660, group=decnet); authz is the file
mode. Wire framing is a tiny verb-line + 4-byte-BE length + orjson body.
NATS-style wildcard topics (`*`, `>`). At-most-once, fire-and-forget —
DB stays the source of truth. `FakeBus` / `NullBus` for tests and the
disabled path. Cross-host federation is deferred to a future
`--bridge-tcp` mode; DEBT-030 is master-only and unblocked.