DECNET

Author	SHA1	Message	Date
anti	c39802a4bb	feat(correlation/attribution): hash + numeric merge functions (Phase 3) aggregate_numeric(): EWMA + dispersion (CV) over numeric primitive values. Stable when CV < 20% AND mean shift < 30%; drifting on >= 30% mean shift; conflicted on CV > 100%. Confidence is 1 - min(CV, 1). multi_actor is intentionally NOT a numeric state — bimodal distributions belong to the categorical detector once the value space is bucketed. aggregate_hash(): counts distinct hash values within HASH_DRIFT_WINDOW_SECS of the most recent observation. 0 rotations = stable, 1..HASH_DRIFT_MAX = drifting, > HASH_DRIFT_MAX = conflicted. Reads rotation events; never recomputes hashes (DEBT-032 already produces them via decnet.correlation.fingerprint_rotation). aggregate_observations() dispatcher now routes "categorical" | "numeric" | "hash" | None and rejects unknown kinds with ValueError (louder than NotImplementedError now that all three v0 mergers exist). 17 synthetic-input tests cover both new mergers and the dispatcher.	2026-05-09 01:59:11 -04:00
anti	4956977739	feat(correlation/attribution): categorical merge state machine (Phase 2) aggregate_categorical(): pure function over a per-(identity, primitive) observation list. Five-state vocabulary, last-N=5 window comparison with one-outlier-tolerant majority threshold: * unknown — < 3 observations * stable — recent 5 agree (≥ 4 of 5 share top value), older 5 same * drifting — recent 5 stable but disagrees with older 5, or older was conflicted and recent stabilised * conflicted — recent 5 split, no two-value alternation pattern * multi_actor — recent 5 split + alternation between exactly two values (operator A↔B handoff). Confidence capped at 0.6 per _thresholds.MULTI_ACTOR_MAX_CONFIDENCE; flapping primitives on flaky networks would otherwise look like two operators. aggregate_observations() dispatcher honours value_kind="categorical" (or None) and raises NotImplementedError for "numeric" / "hash" so Phase 3 lands cleanly. 14 synthetic-input tests cover every state + boundary condition.	2026-05-08 23:18:22 -04:00
anti	c2891d6cca	feat(correlation/attribution): substrate + idle handler (Phase 1) v0 Phase 1 of ATTRIBUTION-ENGINE.md: * AttributionStateRow SQLModel keyed on (identity_uuid, primitive) per ANTI direction — re-keying state rows when the v1 clusterer merges attackers is the migration debt v0 should not bake in. ATTRIBUTION-ENGINE.md updated with the deviation note. * AttributionMixin: ensure_stub_identity_for_attacker, idempotent upsert_attribution_state, get_attribution_state[_for_identity], list_multi_actor_identities (the Phase 5 correlator's read). * attribution.profile.{state_changed,multi_actor_suspected} bus topics + builder; wiki Service-Bus.md updated separately. * attribution_worker.py: subscribes to attacker.observation.>, ensures stub identity per event, logs and continues. No merger, no state writes, no derived events — Phase 4 wires those. * attribution/{aggregate.py,_thresholds.py} skeletons: Phase 2 fills _aggregate_categorical, Phase 3 adds numeric+hash+dispatcher.	2026-05-08 23:16:13 -04:00
anti	6c6f97e840	feat(prober,correlation): attacker fingerprint rotation detection (DEBT-032) When the prober observes a NEW hash for an (attacker_uuid, port, probe_type) triple it has seen before — VPS rotation, SSH server rebuild, TLS cert swap — emit a derived attacker.fingerprint_rotated event carrying both old and new hash. Detection is a small library (decnet.correlation.fingerprint_rotation) called inline from the prober at each of the three emit sites (JARM/HASSH/TCPFP). No new daemon. New AttackerFingerprintState table holds per-triple last-hash state; Attacker.rotation_count and Attacker.last_rotation_at are stamped on every diff. Library is sync, fully unit-tested via injected publish_fn / syslog_fn callbacks.	2026-05-03 05:12:51 -04:00
anti	b5ce236cab	test(bus): pin scope-(2) producer wiring for reuse / clusterer / intel Three producer-side regression guards. Each drives the worker's run loop with a fake bus + stubbed repo and asserts the documented topic fires when the producer has data: - reuse correlator → credential.reuse.detected (one finding row) - clusterer → identity.formed + identity.merged (one ClusterResult) - intel worker → attacker.intel.enriched (one unenriched attacker + a fake provider returning a "malicious" verdict) These complement commit 1's attacker.session.ended producer test — together the four cover every TTP-relevant publisher in the tree (modulo email.received, which has no producer yet; tracked in DEBT.md).	2026-05-02 02:38:24 -04:00
anti	d9d2a80573	fix(collector): unwrap double-wrapped RFC5424 around bash PROMPT_COMMAND Honeypot SSH containers run `PROMPT_COMMAND` that calls `logger --rfc5424 --msgid command -t bash "CMD …"`. The Docker-stdout reader prepends an outer RFC5424 envelope (HOSTNAME=<decky>, APP-NAME=1, MSGID=NIL) around that inner syslog line. Both the collector parser (`parse_rfc5424`) and the correlation parser (`parse_line`) saw the outer NIL MSGID and emitted `event_type="-"` for every shell command — which: - kept `Attacker.commands` rows missing `command_text` - left R0001–R0030 (the pattern rule pack that matches shell commands) with no haystack - made `decnet.collector.log` show `event written … type=-` for the very lines that should be `type=command` Both parsers now detect the inner-RFC5424 shape (`<TS> <HOST> <APP> <PROCID> <MSGID> <rest>`) when the outer MSGID is NIL and the SD-arm is also NIL, and re-extract HOSTNAME / APP-NAME / MSGID / remainder from the body. The collector parser also recovers the post-SD msg tail when the SD block isn't `relay@55555` (the bash CMD line carries a `[timeQuality …]` block) so the kv-fallback can find `src_ip`. Mirroring tests in tests/collector and tests/correlation pin both the unwrap and the regression guard for non-double-wrapped lines.	2026-05-02 02:32:21 -04:00
anti	d4591b38dc	fix(profiler): aggregate bash PROMPT_COMMAND lines into attacker profile SSH/telnet decky containers emit shell commands via `logger -t bash "CMD …"` which produces RFC 5424 lines with MSGID=NIL. Both parsers were leaving event_type="-", so the behavioral profiler's `_COMMAND_EVENT_TYPES` filter silently dropped them — the IP profile existed but no command transcripts or artifacts. Confirmed in the wild: 44/48 events from one attacker were event_type="-". Rewrite event_type to "command" in both parsers when MSGID=NIL and the msg starts with "CMD ". Correlation parser also extracts the cmd= payload into fields["command"] so the profiler can build the transcript; collector parser leaves fields={} to avoid duplicate pills in the dashboard.	2026-04-28 19:09:41 -04:00
anti	862e4dbb31	merge: testing → main (reconcile 2-week divergence)	2026-04-28 18:36:00 -04:00
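The numeric thresholds described in commit c39802a4bb can be illustrated with a minimal sketch. This is not the repository's aggregate_numeric(); the function name sketch_aggregate_numeric, the EWMA smoothing factor, the choice to measure mean shift as the relative gap between the EWMA and the plain mean, the order in which the checks run, and the fallback for a CV between 20% and 100% with a small mean shift are all assumptions, since the commit message does not specify them.

```python
from statistics import fmean, pstdev

# Thresholds quoted in the commit message.
CV_STABLE_MAX = 0.20         # stable requires CV < 20%
MEAN_SHIFT_DRIFT_MIN = 0.30  # drifting on >= 30% mean shift
CV_CONFLICTED_MIN = 1.00     # conflicted on CV > 100%

# Assumption: the real smoothing factor is not given in the commit message.
EWMA_ALPHA = 0.3


def ewma(values, alpha=EWMA_ALPHA):
    """Exponentially weighted moving average, oldest value first."""
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1 - alpha) * avg
    return avg


def sketch_aggregate_numeric(values):
    """Return (state, confidence) for a list of numeric observations."""
    if len(values) < 2:
        return "unknown", 0.0
    mean = fmean(values)
    if mean == 0:
        # CV is undefined for a zero mean; treated as unknown (assumption).
        return "unknown", 0.0
    cv = pstdev(values) / abs(mean)
    # Assumption: "mean shift" is the relative gap between EWMA and mean.
    shift = abs(ewma(values) - mean) / abs(mean)
    confidence = 1.0 - min(cv, 1.0)  # formula from the commit message
    if cv > CV_CONFLICTED_MIN:
        state = "conflicted"
    elif shift >= MEAN_SHIFT_DRIFT_MIN:
        state = "drifting"
    elif cv < CV_STABLE_MAX:
        state = "stable"
    else:
        # Assumption: the commit message leaves this band unspecified.
        state = "drifting"
    return state, confidence
```

Checking "conflicted" before "drifting" means a wildly dispersed series is never reported as a clean drift; the real implementation may order these differently.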

8 Commits
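The hash merge rule from commit c39802a4bb (distinct hashes inside a window anchored at the most recent observation) might look roughly like the sketch below. The input shape of (timestamp, hash) pairs, the function name sketch_aggregate_hash, the two constant values, and the reading of "rotations" as distinct hashes minus one are assumptions; the commit message names HASH_DRIFT_WINDOW_SECS and HASH_DRIFT_MAX but does not give their values.

```python
# Placeholder values: the real constants are not given in the commit message.
HASH_DRIFT_WINDOW_SECS = 24 * 60 * 60
HASH_DRIFT_MAX = 2


def sketch_aggregate_hash(observations):
    """Classify a list of (unix_timestamp, hash_value) observations.

    Only observations within HASH_DRIFT_WINDOW_SECS of the newest one count.
    """
    if not observations:
        return "unknown"
    latest = max(ts for ts, _ in observations)
    recent_hashes = {
        h for ts, h in observations
        if latest - ts <= HASH_DRIFT_WINDOW_SECS
    }
    # Assumption: one distinct hash means zero rotations.
    rotations = len(recent_hashes) - 1
    if rotations == 0:
        return "stable"
    if rotations <= HASH_DRIFT_MAX:
        return "drifting"
    return "conflicted"
```

Anchoring the window at the latest observation rather than at wall-clock time keeps the function pure, so the same input always yields the same state.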