DECNET

Author	SHA1	Message	Date
anti	771944830a	docs(behave): close Phase B in BEHAVE-EXTRACTOR.md Tick the four Phase B checkboxes (B.1-B.4) and append a Phase B completion log inline (per the "append phase logs to design docs" memory rule). Captures per-primitive confidence ranges, source signals, and the PII-discipline regression that all four primitives uphold. Phase A + Phase B = 10 primitives emitting on every shard; PHASE_AB_PRIMITIVES is binding for every subsequent phase. Phase C (motor.shell_mastery.*) lands next.	2026-05-03 21:30:13 -04:00
anti	8161c67ec5	feat(profiler/behave_shell): emit motor.command_chunking BEHAVE-EXTRACTOR.md Phase B Step B.4. First implementation — prototype doesn't ship this primitive. * SessionContext gains intra_command_iats: per-command tuple of IATs between consecutive input events whose timestamps fall inside [cmd.start_ts, cmd.end_ts). Excludes the terminator IAT. Built by _per_command_iats. * _features/motor.py:command_chunking(ctx) emits one Observation in {fluent, fragmented, single_command}. - 0 commands → skip emit - 1 command → single_command (registry-allowed point) - ≥2 commands → median CV across per-command typed-IATs; < CMD_CHUNKING_FLUENT_CV_MAX (0.50) → fluent, else fragmented - paste-only sessions (no command has ≥3 typed IATs) → skip emit (no honest within-command rhythm to measure) Confidence 0.80 / 0.65 / 0.60. * Calibration grid widened to include motor.command_chunking; green across all five shards. Phase B primitive set complete. Tests: no commands → skip, 1 command → single_command, uniform typing → fluent, alternating fast/slow → fragmented, paste-only multi-command → skip emit.	2026-05-03 21:29:31 -04:00
anti	d04f91cd8c	feat(profiler/behave_shell): emit motor.error_correction BEHAVE-EXTRACTOR.md Phase B Step B.3. Replaces the prototype's two-line "0 vs >0 backspaces" placeholder with a backspace-timing classifier that honours the registry's full vocabulary. * SessionContext gains backspace_count, backspace_iats (IAT from each backspace back to the preceding non-backspace input event), and kill_line_count (^U / ^W). Built by _scan_correction_signals, which retains only counts and timing aggregates — no character data leaves the helper, in line with the BEHAVE PII discipline. * _features/motor.py:error_correction(ctx) emits one Observation in {immediate, deferred, absent, route_around}. - 0 backspaces + ≥1 ^U/^W → route_around (rewrite, not correct) - 0 backspaces + 0 kill-lines → absent - backspaces with median IAT ≤ 500 ms → immediate - slower → deferred Confidence 0.65 / 0.65 / 0.55 / 0.55. * < 3 inputs → skip emit. * Calibration grid widened to include motor.error_correction; green across all five shards. Tests cover all four buckets, the < 3 inputs skip, and the PII regression (raw command body never appears in the serialised observation).	2026-05-03 21:27:46 -04:00
anti	0737fcfe93	feat(profiler/behave_shell): emit motor.motor_stability BEHAVE-EXTRACTOR.md Phase B Step B.2. First principled implementation — the prototype doesn't ship this primitive at all. * _features/motor.py:motor_stability(ctx) emits one Observation in {steady, variable, tremor}. Reuses ctx.typing_bursts from B.1. * Tremor proxy: fraction of within-burst IATs below TREMOR_FAST_FLOOR_S (30 ms — humans can't sustain sub-50 ms IATs). ≥ TREMOR_RATE_MIN (10%) sub-floor → tremor (double-press / motor twitch / stuck-key). * Otherwise median burst CV decides: < CV_STEADY_MAX → steady, else → variable. Confidence 0.70 / 0.60 / 0.65. * No typing bursts or fewer than 5 within-burst IATs → skip emit. * Calibration grid widened to include motor.motor_stability; green across all five shards. Tests cover all three buckets + skip paths.	2026-05-03 21:25:54 -04:00
anti	d90c8b70ce	feat(profiler/behave_shell): emit motor.keystroke_cadence BEHAVE-EXTRACTOR.md Phase B Step B.1. * SessionContext gains typing_bursts: tuple[tuple[float, ...], ...] built by _split_typing_bursts(iats) — splits at gaps > IKI_THINK_MAX_S (1.5s) and drops bursts of fewer than 3 IATs. Mirrors prototype's _split_into_bursts at BEHAVE/prototype_extractors/shell/extract.py:275. * _features/motor.py:keystroke_cadence(ctx) emits one Observation in {steady, bursty, hunt_and_peck, machine}. Median CV across typing bursts; mean IKI < IKI_MACHINE_MAX_S paired with CV < CV_MACHINE_MAX → machine. Confidence 0.85/0.70/0.65/0.60 per the prototype's calibration history. * < MIN_INPUTS_FOR_CADENCE inputs or zero typing bursts → skip emission. v0.1 emits only the burst-CV variant; the prototype's NAIVE session-CV variant is parked for v0.2. * Calibration grid widened (PHASE_A_PRIMITIVES → PHASE_AB_PRIMITIVES) to include motor.keystroke_cadence. Grid green across all five shards. Tests: too-few-inputs → no emit, all-think-pauses → no burst → no emit, uniform IATs → steady, sub-5ms → machine, mixed-pace → bursty, extreme bimodal → hunt_and_peck.	2026-05-03 21:24:13 -04:00
anti	0510cde073	feat(profiler/behave_shell): Phase A — calibration floor green BEHAVE-EXTRACTOR.md Phase A Step 10. Closes the discriminative floor: six primitives emit, the five-class calibration grid is the binding regression test for every subsequent phase. * Phase A checklist boxes (Steps 0-10) ticked in development/BEHAVE-EXTRACTOR.md. * Phase A completion log appended inline to the design doc per the "append phase logs to design docs" memory rule — captures per-primitive confidence ranges and the 2026-05-02 empirical anchors that drove threshold calibration. * Hard gate: tests/profiler/behave_shell/test_calibration_grid.py parametrised over five class shards, all green; skips cleanly on BEHAVE_CALIBRATION_DIR unset. Phases B-G expand horizontally across the registry. Phase H is the full-corpus lockdown + v0 release. Worker (BEHAVE-INTEGRATION.md Phase 4) is unblocked at this milestone — it can wire per-session production against the Phase A engine without waiting for the rest of the Tier-A corpus.	2026-05-03 08:02:02 -04:00
anti	640294f3dc	test(profiler/behave_shell): five-class calibration grid lockdown BEHAVE-EXTRACTOR.md Phase A Step 9 — the gate. Runs the pure engine against each of the five 2026-05-02 calibration shards and pins the contract that all subsequent Phase B-G PRs must keep green: every Phase A primitive (motor.input_modality, motor.paste_burst_rate, cognitive.inter_command_latency_class, cognitive.command_branch_diversity, cognitive.feedback_loop_engagement, cognitive.inter_command_consistency) fires at least once per shard. * tests/profiler/behave_shell/test_calibration_grid.py parametrized over (shard_file, class_label) for HUMAN / YOU-sim / LW-sim / CLAUDE-FF / CLAUDE-CL. Skips entirely when BEHAVE_CALIBRATION_DIR is unset (CI provides the path; local dev doesn't have to). * Plus a discrimination-smoke check: at least one primitive produces different majority values across present classes — catches the "constant-output regression" failure mode where the engine quietly degenerates to a stub. Calibration tweak: BRANCH_DIVERSITY_LINEAR_MIN dropped from 0.80 to 0.70 to align with the prototype's empirical anchors (CLAUDE-CL ≈ 0.55-0.60 adaptive; YOU-sim / CLAUDE-FF scripted recon ≈ 0.75+ linear). Test for the middle band re-pinned at the new boundary. Per-class value pinning (e.g. HUMAN must emit inter_command_consistency=bimodal) is intentionally NOT a hard gate yet — v0.1 thresholds put real human sessions in "variable", and true bimodal detection (Hartigan dip / two-peak) is registry-flagged for v0.2. Tighter pinning lands as the corpus grows.	2026-05-03 08:00:50 -04:00
anti	842b7de950	feat(profiler/behave_shell): emit cognitive.inter_command_consistency BEHAVE-EXTRACTOR.md Phase A Step 8. Dispersion / bimodality of inter-command pauses. HUMAN-bimodal vs LLM-metronomic. * _features/cognitive.py:inter_command_consistency(ctx) emits one Observation in {metronomic, variable, bimodal}. * CV = stdev / mean of ctx.inter_cmd_iats. CV < 0.40 → metronomic (LLM-pure; corpus anchor 0.24); CV ≥ 1.50 → bimodal heuristic (LLM-assisted human; v0.1 placeholder, true bimodal via Hartigan dip is registry-flagged for v0.2); else → variable (human; corpus anchor 0.94). * < 2 IATs or zero mean → skip emission. < 5 commands halves confidence (0.40 vs 0.75) per sample-size honesty. Tests: too-few IATs → no emission, uniform → metronomic, human-like dispersion → variable, extreme bursts+gaps → bimodal, low-sample-count → reduced confidence. Step 8 closes the six-primitive calibration floor for Phase A. Step 9 (calibration grid lockdown) is the gate that pins it.	2026-05-03 07:56:49 -04:00
anti	2f8c107e70	feat(profiler/behave_shell): emit cognitive.feedback_loop_engagement BEHAVE-EXTRACTOR.md Phase A Step 7. The orthogonal axis — does the operator's pause-after-command correlate with bytes of output they just saw? Splits HUMAN/CLAUDE-CL (closed_loop) from LW-sim/CLAUDE-FF (fire_and_forget); cuts ACROSS the LLM/human axis. * _features/cognitive.py:feedback_loop_engagement(ctx) emits one Observation in {closed_loop, fire_and_forget, unknown}. * Pearson correlation between ctx.output_per_cmd[i] and ctx.inter_cmd_iats[i] (paired by construction in Step 4); via statistics.correlation with constant-series fallback to "unknown". * r > FEEDBACK_CORRELATION_MIN (0.30) → closed_loop; otherwise (zero, negative, or undefined) → fire_and_forget. * First primitive that depends on output events: zero output events in the shard or fewer than FEEDBACK_MIN_PAIRS (5) pairs → emit "unknown" at confidence 1.0 (the absence-of-data is itself a high-confidence answer). Zero-command session skips entirely. Tests: no-output → unknown, few-pairs → unknown, strong positive r → closed_loop, constant pace → fire_and_forget/unknown, negative r → fire_and_forget.	2026-05-03 07:55:38 -04:00
anti	3fc6ea5f75	feat(profiler/behave_shell): emit cognitive.command_branch_diversity BEHAVE-EXTRACTOR.md Phase A Step 6. Content-based playbook-vs-adaptive split. Splits CLAUDE-FF (linear_playbook, ~10 distinct tools) from CLAUDE-CL (adaptive_branching, 5-6 tools with curl re-invoked) per the 2026-05-02 empirical anchor. * _features/cognitive.py:command_branch_diversity(ctx) emits one Observation in {linear_playbook, adaptive_branching, unknown}. * unique_first_token_hashes / total_commands ratio. ≥ 0.80 → linear_playbook, otherwise adaptive_branching (the doc instructs bias-to-adaptive in the middle band — that's the discriminative signal we actually want). * < 5 commands → "unknown" at confidence 1.0 (the absence of data is itself a high-confidence answer per the registry's allowed vocabulary). Zero-command session skips emission entirely. Tests cover unique-tokens → linear, repeated-tokens → adaptive, middle band → adaptive (bias), under-floor → unknown @ 1.0, plus PII regression: raw tokens never appear in the serialised observation.	2026-05-03 07:54:13 -04:00
anti	e52a0e0381	feat(profiler/behave_shell): emit cognitive.inter_command_latency_class BEHAVE-EXTRACTOR.md Phase A Step 5. Classifies the operator's thinking pace between commands. Splits LW-sim / CLAUDE-FF / CLAUDE-CL. * _features/cognitive.py:inter_command_latency_class(ctx) emits one Observation in {instant, typing_speed, deliberate, llm_lightweight, llm_heavyweight, long}, computed as the median of ctx.inter_cmd_iats bucketed against the prototype thresholds (v0.2 split: lightweight 2-8s, heavyweight 8-30s). * Sample-size honesty: < 5 commands halves confidence (0.40 vs 0.80) per BEHAVE-EXTRACTOR.md. * Threshold consts (INTER_CMD_*_MAX, MIN_COMMANDS_FOR_FULL_CONFIDENCE, plus parked Step 6/7/8 thresholds for the next three commits) added to _thresholds.py. Tests cover all six buckets at empirically-anchored IATs (15s ≈ Claude Opus driving recon via tmux send-keys), plus the single-command no-IAT and low-sample-count paths.	2026-05-03 07:52:39 -04:00
anti	f3880b24d1	feat(profiler/behave_shell): command segmentation in SessionContext BEHAVE-EXTRACTOR.md Phase A Step 4. Pure refactor inside _ctx.py — no new feature emits. Lays the shared utility for the three cognitive primitives next in line (Steps 5-7). * Command dataclass (frozen): start_ts, end_ts, first_token_hash. PII-safe by construction — only the first whitespace-delimited token of the command is retained, and only as a sha256 hash (decnet/profiler/behave_shell/_parse.py:hash_token). * _segment_commands walks input events char-by-char, splits on \r / \n, hashes the first token, drops the rest. * SessionContext gains commands, inter_cmd_iats, output_per_cmd. output_per_cmd[i] counts bytes between commands[i].end_ts and commands[i+1].start_ts — the natural pairing for Step 7 (feedback_loop_engagement). Tests: empty / unterminated streams, single command (CR + LF terminators), paste-with-newline, multi-command IAT pairing, output-byte counting between boundaries, blank-line skip, first-token-only PII discipline.	2026-05-03 07:50:55 -04:00
anti	6763fceb0b	feat(profiler/behave_shell): emit motor.paste_burst_rate BEHAVE-EXTRACTOR.md Phase A Step 3. Same paste-event ratio as motor.input_modality but coarser-bucketed: this is the habit signal (does the operator reach for paste at all?), where input_modality is the dominant-channel signal. * _features/motor.py:paste_burst_rate(ctx) emits one Observation per session in {none, occasional, habitual} with confidence 0.70 / 0.70 / 0.80. * Thresholds: PASTE_RATE_OCCASIONAL_MIN=0.10, PASTE_RATE_HABITUAL_MIN=0.50. Splits YOU-sim from LW/CLAUDE-FF/CLAUDE-CL — LLM-driven sessions paste habitually, real humans rarely paste. Tests: pure-typed → none; 1-paste-in-10 → occasional; paste-majority → habitual; output-only → no observation; habitual confidence > occasional confidence.	2026-05-03 07:49:03 -04:00
anti	879f5e731b	feat(profiler/behave_shell): emit motor.input_modality BEHAVE-EXTRACTOR.md Phase A Step 2. The first primitive — picked first because it has the highest discriminative value (HUMAN vs everyone) and the simplest implementation (paste-event ratio over total inputs). * _features/motor.py:input_modality(ctx) emits one Observation per session in {typed, pasted, mixed} with confidence 0.75 / 0.70. * _features/_emit.py centralises the make_observation helper so every feature module gets the same Window/source/evidence_ref boilerplate without copy-paste. * Thresholds inherited from the prototype's calibration history (MODALITY_PASTED_MIN=0.40, MODALITY_TYPED_MAX=0.05). * Zero-input session skips emission — registry doesn't admit "unknown" here. Tests: pure-typed → typed, pure-pasted → pasted, mixed → mixed, output-only session → no observation, full envelope round-trip.	2026-05-03 07:47:38 -04:00
anti	c9a81a23c2	feat(profiler/behave_shell): asciinema parser + paste-burst detection BEHAVE-EXTRACTOR.md Phase A Step 1. Lays the shared primitives that Steps 2-3 (motor.input_modality, motor.paste_burst_rate) will consume: * parse_shard_line / parse_shard turn a shard JSONL line/file into AsciinemaEvents, skipping headers and malformed records. * PasteBurst dataclass + _detect_paste_bursts group consecutive paste-class input events (len(d) >= 4 chars per the prototype's empirical floor) into contiguous bursts, splitting on IAT gaps larger than PASTE_BURST_MAX_IAT_S (200ms). * SessionContext now carries iats and paste_bursts derivations. * Threshold constants harvested from BEHAVE/prototype_extractors/shell/extract.py — calibrated against the five 2026-05-02 shards. Tests cover pure-typed, pure-pasted, mixed streams; close vs far paste events; typed events breaking a burst; PasteBurst immutability; and the JSON parser's junk handling.	2026-05-03 07:46:01 -04:00
anti	f8eae04e5d	feat(profiler/behave_shell): scaffold extract_session entry point BEHAVE-EXTRACTOR.md Phase A Step 0. Lays the package skeleton (__init__/extract/_parse/_ctx/_thresholds/_features) with empty FEATURES = (), so the worker plumbing in BEHAVE-INTEGRATION Phase 4 has a stable import path before any primitive lands. extract_session() builds a SessionContext once and fans the registered feature functions across it; at Step 0 that fan-out is empty and the function yields nothing. Step 1 (asciinema parser + paste-burst detector) and Step 2 (motor.input_modality) land next. Smoke suite asserts the empty contract: empty stream → no observations, single event → t_start == t_end, multi-event → events routed into input_events / output_events by kind, evidence_ref defaults to "session:<sid>" or honours an explicit override.	2026-05-03 07:42:09 -04:00
anti	a2a61b636e	feat(web): drop SessionProfile, wire observations into AttackerDetail (DEBT-050 / DEBT-036 closure) Destructive half of BEHAVE-INTEGRATION.md Phase 1. SessionProfile + its kd_* columns + the dialect ALTER TABLE migration helpers are deleted outright; pre-v1, the table shipped empty, no migration ceremony required (per the no-new-_migrate_-pre-v1 memory rule). DEBT-036 closes via DEBT-050 supersedure. AttackerDetail's ``observations`` field is wired to the new ``observations`` table and returns an empty list until the BEHAVE-SHELL extractor (DEBT-050 Phase 2) starts emitting. decnet/web/db/models/attackers.py — SessionProfile class deleted (~135 lines), KD_PAUSE_*/KD_START_OF_ACTION_IDLE_S module constants deleted, module docstring updated to point at the observations table. AttackerIdentity.kd_digraph_simhash is KEPT — it's the v2 federation centroid hook, not a SessionProfile field; docstring repointed to the BEHAVE primitive that will populate it. decnet/web/db/sqlmodel_repo/attackers/sessions.py — DELETED. SessionProfilesMixin dropped from the AttackersMixin MRO. decnet/web/db/repository.py — abstract upsert_session_profile + get_session_profile removed. decnet/web/db/sqlite/repository.py + mysql/repository.py — _migrate_session_profile_table helpers and their initialize() calls removed. mysql initialize() now goes attackers → column_types → admin (no session_profile step). decnet/web/db/models/__init__.py — SessionProfile re-export gone. decnet/web/db/models/attacker_intel.py — docstring cross-reference to SessionProfile.schema_version retargeted to AttackerIdentity. decnet/web/router/attackers/api_get_attacker_detail.py — adds ``observations: []`` to the response by calling ``repo.latest_observation_per_primitive(uuid)`` and projecting to a list sorted by primitive path. Empty until the extractor lands; shape matches BEHAVE-INTEGRATION.md §"AttackerDetail consumer". tests/profiler/test_session_profile.py — DELETED (56 lines). tests/db/test_base_repo.py — DummyRepo loses upsert_session_profile and get_session_profile overrides. tests/db/mysql/test_mysql_migration.py — initialize-call-order assertion updated; session_profile step removed from the expected sequence; docstring records why. tests/ttp/test_lifter_absence.py — docstring "no SessionProfile" → "no ObservationRow".	2026-05-03 07:33:37 -04:00
anti	0972325527	feat(web/db): observations table + repo + bus prefix (BEHAVE-INTEGRATION Phase 1) Additive Phase 1 of BEHAVE-INTEGRATION.md. Lays the storage layer the BEHAVE-SHELL extractor (DEBT-050) will write into. Nothing breaks; SessionProfile coexists for now and is dropped in the follow-up commit. decnet/web/db/models/observations.py — new ObservationRow SQLModel mirroring the BEHAVE Observation envelope field-for-field (core/decnet_behave_core/spec/envelope.py). ``id`` is a hex-string UUID (matching BEHAVE), not a typed UUID column. ``identity_ref`` is str | None — written by the future attribution engine, NULL until then. ``attacker_uuid`` is the one DECNET-side denormalisation; FK'd to attackers.uuid for cheap AttackerDetail joins. ``evidence_ref`` is NOT NULL for DECNET emissions even though the upstream envelope makes it optional — the worker's "already profiled?" check keys on it. UniqueConstraint(evidence_ref, primitive) enforces idempotency at the schema level so re-running the extractor on the same shard+sid produces a DB-side conflict the upsert path resolves deterministically. Class is named ``ObservationRow`` (not ``Observation``) to avoid colliding with the BEHAVE Pydantic envelope at sites that import both. decnet/web/db/sqlmodel_repo/observations.py — ObservationsMixin. Three public methods backing the canonical queries from BEHAVE-INTEGRATION.md §"Storage": ``upsert_observation`` (idempotent on the natural key), ``latest_observation_per_primitive`` (per-primitive MAX(ts) subquery, portable across SQLite and MySQL — no DISTINCT ON), ``observations_time_series`` (asc-by-ts). Plus ``has_observations_for_evidence`` for the worker's session-already-profiled check. decnet/bus/topics.py — ATTACKER_OBSERVATION_PREFIX = "observation" constant + ``attacker_observation(primitive)`` builder. Full topic shape ``attacker.observation.<primitive>`` matches what BEHAVE's spec.event_adapter.event_topic_for produces upstream. Documentation + pattern matching only — bus auth is socket file perms (DEBT-029 §2), not topic-level. decnet/web/db/repository.py — abstract ``upsert_observation``, ``latest_observation_per_primitive``, ``observations_time_series`` on BaseRepository. tests/db/test_observations.py — 11 tests covering upsert round-trip, idempotency under the unique constraint, latest-per-primitive ordering across multiple sessions, time-series asc-ordering, empty-attacker contract, every BEHAVE ValueKind round-tripping through the JSON column, and the has_observations_for_evidence check. tests/db/test_base_repo.py — DummyRepo gains the three new abstract overrides so its coverage suite still instantiates.	2026-05-03 07:25:10 -04:00
anti	11f474556c	docs(behave): integration + extractor + attribution design (DEBT-050 / 051) Three sibling design docs plus DEBT.md updates that supersede the stale DEBT-036 with a BEHAVE-aligned plan. development/BEHAVE-INTEGRATION.md — five-phase rollout: storage (observations table mirroring the BEHAVE Observation envelope plus one DECNET-side denorm; UniqueConstraint(evidence_ref, primitive) enforcing idempotency); engine (in decnet/profiler/behave_shell/ sublibrary, no new daemon, not in BEHAVE — DECNET is the engine); BEHAVE pin; worker wire; UI panel + per-attacker SSE route; live smoke. Bus payload merges id/ts/v back in to preserve sensor identifiers across the bus envelope. development/BEHAVE-EXTRACTOR.md — engine route in eight phases (A–H). Phase A locks the 6-primitive calibration grid; Phases B–G expand horizontally; Phase H is the full Tier-A corpus + v0 release. v0 ships every shell-extractable primitive (37 of them); Tier B is cross-session and lives in the attribution engine; Tier C is network-domain (toolchain.*) and lives elsewhere. development/ATTRIBUTION-ENGINE.md — sublibrary inside decnet/correlation/ that consumes attacker.observation.* events and emits attribution.profile.* derived state. Five-state machine (unknown / stable / drifting / conflicted / multi_actor) with per-ValueKind merge functions. v0 closes DEBT-051; v1 adds the real clusterer; v2 federation gossip. The bright line forbidding attribution to natural persons is lifted directly from BEHAVE's envelope docstring. development/DEBT.md — DEBT-036 marked STALE; DEBT-050 and DEBT-051 entries added; summary table + open list updated.	2026-05-03 07:24:19 -04:00
anti	3f080f601d	feat(intel,ingester): mal_hash feed + observed_attachments table (DEBT-046) New MalHashProvider sibling ABC (decnet/intel/base.py) since SHA-256 is a different keyspace from IntelProvider's IPs. MalwareBazaarProvider mirrors FeodoProvider's bulk-feed shape: 24h refresh via _ensure_fresh / _refresh, in-memory set[str] of hex-lowercased hashes, set-membership lookup. Auth-keyed via DECNET_MALWAREBAZAAR_AUTH_KEY; absent key silent-no-ops the lane (single warning, no HTTP traffic). Per-hash observations persist to a new observed_attachments table. DECNET is a honeypot platform — every attachment hash an attacker delivers is intel, regardless of whether anyone classified it. Verdict is sticky: True never downgrades to False/None on subsequent observations. Out of scope: API surface, federation export, retention. Ingester _publish_email_received calls the provider for each attachment sha256, sets mal_hash_match on the bus payload (omitted entirely when the message had no attachments — keeps R0046's `is True` predicate silent on hash-less mail, matching pre-paydown behavior), and upserts the row regardless of provider availability.	2026-05-03 05:56:46 -04:00
anti	03beff3840	feat(orchestrator): authoritative failure-count badge endpoint (DEBT-042) New GET /api/v1/orchestrator/events/stats?since=1h&success=false&kind=... backed by repo.count_orchestrator_failures(since_ts, kind), which counts failed rows across both orchestrator_events and orchestrator_emails since the cutoff. Window parser accepts ^\d+[smhd]$, capped at 7d. Today only success=false is accepted on this surface so the endpoint isn't accidentally repurposed before the next consumer is properly designed. Orchestrator.tsx polls the endpoint on mount + every 30 s and renders the authoritative DB-derived count instead of deriving from the in-memory SSE buffer + one paginated page (which silently excluded failures older than the local window).	2026-05-03 05:26:45 -04:00
anti	866a76eccf	test(web): scaffold vitest + RTL with Orchestrator seed suite (DEBT-043) Wire vitest 4 + jsdom + @testing-library/{react,jest-dom,user-event} + @vitest/coverage-v8 through vite.config.ts (defineConfig from vitest/config). src/test/setup.ts registers jest-dom matchers and RTL cleanup. tsconfig.app.json picks up vitest/globals types. Seed suite Orchestrator.test.tsx covers the three regressions called out in DEBT-043: empty-state render, kind-filter toggling triggers a scoped refetch, mocked stream callback prepends a row.	2026-05-03 05:20:01 -04:00
anti	6c6f97e840	feat(prober,correlation): attacker fingerprint rotation detection (DEBT-032) When the prober observes a NEW hash for an (attacker_uuid, port, probe_type) triple it has seen before — VPS rotation, SSH server rebuild, TLS cert swap — emit a derived attacker.fingerprint_rotated event carrying both old and new hash. Detection is a small library (decnet.correlation.fingerprint_rotation) called inline from the prober at each of the three emit sites (JARM/HASSH/TCPFP). No new daemon. New AttackerFingerprintState table holds per-triple last-hash state; Attacker.rotation_count and Attacker.last_rotation_at are stamped on every diff. Library is sync, fully unit-tested via injected publish_fn / syslog_fn callbacks.	2026-05-03 05:12:51 -04:00
anti	dcd558fd91	chore(infra): pin Docker base images by digest (DEBT-023) All base images (debian:bookworm-slim, ubuntu:22.04, ubuntu:20.04, rockylinux:9-minimal, centos:7, alpine:3.19, fedora:39, kalilinux/kali-rolling, archlinux:latest, honeynet/conpot:latest) now carry their resolved sha256 digest so 'docker pull' is deterministic. :tag retained for human readability; @sha256 is what Docker actually resolves. Refresh procedure documented at the top of decnet/distros.py.	2026-05-03 04:38:39 -04:00
anti	6e19d3a25a	chore(bait): scaffold default seed dir with README Empty directory tracked via .gitkeep so operators see it on first clone; README documents the .eml/.json drop-in flow that the IMAP/POP3 compose fragments wire up by default.	2026-05-03 04:30:09 -04:00
anti	b3a96a045f	feat(mail): default email_seed → $PROJROOT/bait/ when unset When service_cfg["email_seed"] is absent, compose_fragment now falls back to $PROJROOT/bait/ if that directory exists on the host. Lets operators drop a deployment-wide bait corpus into one place without threading email_seed through every decky's config. Missing dir keeps old no-op behavior.	2026-05-03 04:25:24 -04:00
anti	b88d67794d	feat(mail): operator-tunable IMAP/POP3 email seed (DEBT-026) IMAP_EMAIL_SEED / POP3_EMAIL_SEED accept a directory (rglob .eml + .json) or a single .json/.eml. Loaded entries CONCATENATE with the hardcoded _BAIT_EMAILS — additive to the realism-engine emailgen output rather than replacing it. JSON dicts require from_addr / to_addr / subject / body; bare bodies are wrapped into RFC 5322 on load. compose_fragment reads service_cfg["email_seed"] and bind-mounts the host path read-only at /var/spool/decnet-emails/seed.	2026-05-03 02:47:06 -04:00
anti	e0b07651fd	docs(debt): mark DEBT-047 resolved (EmailLifter disk-reach + ttp agent gate)	2026-05-02 20:07:54 -04:00
anti	79674026dd	feat(cli): allow `decnet ttp` on agents (DEBT-047) The TTP-tagging worker is now safe to run on agent hosts: EmailLifter disk-reaches body-aware predicates from the local artifacts tree (DEBT-035 unblocked filesystem access; DEBT-047 added the helper). Drop `ttp` from MASTER_ONLY_COMMANDS in cli/gating.py and remove the defence-in-depth `_require_master_mode("ttp")` call in cli/ttp.py. `ttp-backfill` walks the master DB and stays master-only.	2026-05-02 20:07:03 -04:00
anti	e972d870de	feat(ttp): EmailLifter disk-reach for body-aware predicates (DEBT-047) R0047 (BEC) and the encoded-payload predicate substring-match against the email body. Shipping raw body text on the abstracted service bus is the wrong privacy stance — the bus transport may swap from UNIX socket to networked at any time, and "loopback today" is not a license to put PII on the wire. EmailLifter now opens the .eml lazily from /var/lib/decnet/artifacts/{decky_id}/smtp/{stored_as} when a body-aware predicate runs and parses the body in-process via stdlib email + policy.default. The decoded body is memoized into the payload dict so multiple body-aware predicates on the same event open the file once. Bus envelope only carries the artifact pointer (decky_id + stored_as); raw body bytes never cross the host disk boundary on the agent → master hop. Filesystem access on agents is unblocked by DEBT-035 (setgid + group-readable artifacts root, paid 2026-05-02). The legacy inline body_text path is preserved — when the producer ships body_text on the bus the helper short-circuits without opening the file.	2026-05-02 20:05:54 -04:00
anti	7036a86e76	refactor(artifacts): extract resolve_artifact_path to shared module Move artifact path validation + symlink-escape check out of the admin-gated download endpoint into decnet/artifacts/paths.py so the TTP EmailLifter can disk-reach .eml files at tag-time without duplicating regex/root logic (DEBT-047). The router now catches ArtifactPathError and re-raises HTTPException(400); behavior is unchanged.	2026-05-02 20:02:47 -04:00
anti	cdbb3d3571	fix(ssh,telnet): move PROMPT_COMMAND out of /root/.bashrc + pin readonly ANTI flagged two regressions in the existing command-event capture: 1. Tell: PROMPT_COMMAND lived in /root/.bashrc, the FIRST file an attacker greps after landing root. The logger invocation sitting there is plain-text honeypot signage. 2. Bypass: even when missed, `export PROMPT_COMMAND=""` silently disables capture. ANTI personally bypasses this on engagements. Reshape: * Move the assignment to /etc/environment — read by pam_env at session open (sshd via /etc/pam.d/sshd, telnet via /etc/pam.d/login), before any shell rc file fires. Far less obvious than .bashrc; a casual `cat .bashrc` no longer surfaces the capture. * Define the helper as a function `__bash_history_sync` in /etc/bash.bashrc (system-wide bashrc, sourced by every interactive bash). Function name reads as generic bash housekeeping; no DECNET branding in the symbol. * Pin both the function and PROMPT_COMMAND readonly so `export PROMPT_COMMAND=""` fails with "readonly variable" instead of silently winning. Mitigation, not airtight — `bash --norc` still bypasses — but the passive `export` bypass is closed. The actual `logger --rfc5424 --msgid command ... CMD ...` invocation is preserved exactly; only its location and the readonly guard change. R0001–R0030 (command-rule pack) consume the same syslog shape as before. Three new tests assert: the value lands in /etc/environment, the function body lives in /etc/bash.bashrc, no PROMPT_COMMAND line remains in /root/.bashrc, and `readonly PROMPT_COMMAND` / `readonly -f __bash_history_sync` are both present. Mirror assertions added on the Telnet Dockerfile via test_config_schema.py.	2026-05-02 19:50:24 -04:00
anti	3e9c4c29b9	feat(ssh,telnet): add non-root user account for privesc + enum lure Real Linux deployments (especially Ubuntu cloud images) ship a non- root admin user; honeypots that only accept root logins are a tell. Add a second account on both SSH and Telnet decoys, configurable via service_cfg keys `user` / `user_password`, defaulting to `ubuntu` / `admin` so the lure is live on every fresh deploy. * `decnet/services/{ssh,telnet}.py` — two new ServiceConfigFields (`user` string, `user_password` secret) and matching env vars (`SSH_USER` / `SSH_USER_PASSWORD`, mirror for telnet) propagated via the compose fragment. * `decnet/templates/ssh/entrypoint.sh` — runtime `useradd -m -s /usr/libexec/login-session -G sudo "$SSH_USER"` so the new user inherits the same sessrec pty-recording shell as root and lands in the sudo group. Privesc attempts (`sudo`) flow through the existing sudo-log capture; network-enum from the user's shell rides the recorded transcript. * `decnet/templates/telnet/entrypoint.sh` — same useradd pattern (no sudo group — busybox+login telnet image has no sudo package; privesc rides `su -` which itself flows through the existing PAM auth-helper at /etc/pam.d/login). * New tests for default + custom user / password + independence from root password. Updated the schema-keys assertion to match the four-field shape. The new account is ALSO the natural home for the body-aware predicates that were previously gated on root-only sessions — attackers who land on `ubuntu@host` and run network-recon / privesc commands now generate the same structured TTP-rule events as root sessions did, captured via the same auth-helper + sessrec + sudo-log pipes.	2026-05-02 19:48:03 -04:00
anti	c675bd26cf	docs(debt): mark DEBT-035 resolved; lift DEBT-047 filesystem-access blocker DEBT-035 (artifacts written as the container uid, not the API's) is resolved by the two preceding commits: * `39a298f6` — persists DECNET-service api-user/api-group as names in decnet.ini for any future composer / worker that wants to resolve the local uid via pwd.getpwnam. * `b2733216` — creates /var/lib/decnet/artifacts at init time with mode 0o2775 (setgid + group-write) owned by the DECNET-service user:group. The setgid bit is the load-bearing fix: Linux mkdir(2) propagates a parent's group AND its setgid bit to every new subdirectory. Docker auto-creates the per-decoy / per-service subtree as bind-mounts fire, so those subdirs come up with group=decnet and setgid set; container file writes (default umask 0o022 → mode 0o644) inherit the decnet group; the API process and the local TTP worker (both running as the DECNET-service user, primary group decnet) read via group-read. The original recommendation of compose `user:` injection turned out infeasible for SSH and Telnet — PAM's setuid(2) during login fundamentally cannot run from a non-root container. Setgid covers both root-internal and unprivileged-internal templates uniformly without requiring per-template carve-outs. DEBT-047 (R0047 BEC disk-reach) was gated on DEBT-035 for filesystem access. That blocker is lifted — `decnet ttp` running on agents as the local DECNET-service user can now read .eml files written by the SMTP decoy. The remaining DEBT-047 work is the master-only gate flip in decnet/cli/gating.py and the EmailLifter disk-reach helper itself (factor _resolve_artifact_path out of the artifacts API endpoint into a shared module). Soft-fail paths in api_get_transcript.py and api_get_artifact.py stay as defence-in-depth — option 2 should make them never fire on a healthy install but a misconfigured deploy must not 500 the API.	2026-05-02 19:40:12 -04:00
anti	b27332169d	feat(init): create /var/lib/decnet/artifacts with setgid + group-write DEBT-035 step 2. Today the artifacts subtree is auto-created by Docker as root when a decoy container's bind-mount fires for the first time. The resulting permissions are root:root 0o755 — the API process (running as the decnet user) hits PermissionError trying to read transcripts written by the container, and the soft-fail 404 path gets exercised on every fresh deploy. Add `/var/lib/decnet/artifacts` to init's dirs list with mode 0o2775: * 0o2000 — setgid bit. New files inherit the directory's group (decnet), regardless of which uid created them. This is the load- bearing bit for cross-container reads. * 0o0775 — owner+group rwx, world rx. Group-write lets the API process and the local TTP worker read each other's outputs without a manual chown. `_ensure_dir` already respects the full mode word via `os.chmod`, no helper change needed. Test asserts the resulting directory carries exactly 0o2775 after a fresh `decnet init --prefix`. Defence-in-depth: this works even if the per-decoy compose `user:` directive (next commit) misses a template — files still land in the decnet group.	2026-05-02 19:35:20 -04:00
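A minimal sketch of creating a directory with mode 0o2775, assuming a POSIX host. `ensure_dir` is a hypothetical stand-in for the `_ensure_dir` helper named in the commit, and the temporary prefix replaces /var/lib/decnet so the snippet can run unprivileged.

```python
import os
import stat
import tempfile

ARTIFACTS_MODE = 0o2775  # 0o2000 setgid bit + 0o0775 rwxrwxr-x


def ensure_dir(path: str, mode: int) -> int:
    """Create path if needed, apply the full mode word, return the result."""
    os.makedirs(path, exist_ok=True)
    # makedirs is subject to the umask and may ignore special bits,
    # so the full mode (including setgid) is applied with chmod.
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as prefix:
    resulting_mode = ensure_dir(os.path.join(prefix, "artifacts"), ARTIFACTS_MODE)
    has_setgid = bool(resulting_mode & stat.S_ISGID)
```

On Linux the kernel can silently clear the setgid bit when the caller is not a member of the directory's group, so checking the resulting mode (as the commit's test does) is more reliable than trusting the chmod call.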
anti	39a298f685	feat(init): persist DECNET-service api-user/api-group to decnet.ini DEBT-035 step 1. The composer needs to know which uid/gid to inject into each compose fragment's `user:` directive at deploy time. Today the resolved `--user` / `--group` values reach systemd unit rendering (init.py:349–354) but are not persisted anywhere the composer can read them. Persist as names (not numeric ids) under `[decnet] api-user` / `api-group` in the rendered decnet.ini placeholder. Resolution to uid/gid happens at deploy time on whichever host runs the deploy, via `pwd.getpwnam(...)` / `grp.getgrnam(...)` — so the same user name can have different uids on master vs agents (heterogeneous /etc/passwd) without breaking artifact ownership. The existing config_ini auto-translates kebab→DECNET_API_USER / DECNET_API_GROUP at load time; no domain-map changes needed. Two new tests: one asserting the rendered ini carries the `api-user` / `api-group` keys for the values passed to `--user` / `--group`; one round-tripping through `load_ini_config` to confirm the env vars land in `os.environ` for the composer to pick up.	2026-05-02 19:33:53 -04:00
anti	b3ea3fa925	docs(debt): merge rogue root DEBT.md into the canonical development/DEBT.md A previous agent (and several of my own commits) wrote to a top-level DEBT.md without seeing the existing development/DEBT.md — the canonical register since DEBT-001. Resulted in two parallel files, inconsistent numbering schemes, and references that resolved to the wrong place. Migrate the six entries that landed in the rogue file into the canonical register as DEBT-044 through DEBT-049, preserving their status (resolved / partial / open) and cross-references. The TTP_TAGGING.md references to "DEBT.md" already resolve to development/DEBT.md by virtue of being in the same directory; only the comment in decnet/ttp/impl/intel_lifter.py needed disambiguation to "development/DEBT.md DEBT-048". * DEBT-044 — `attacker.email.received` producer wiring (✅ RESOLVED 2026-05-02) * DEBT-045 — EmailLifter heavyweight feature extraction (PARTIAL PAID 2026-05-02) * DEBT-046 — EmailLifter mal-hash feed integration (open) * DEBT-047 — EmailLifter R0047 BEC unblock (open, gated on DEBT-035) * DEBT-048 — TTP intel provider mapping review (recurring quarterly) * DEBT-049 — TTP Sigma adapter — post-v1 (open) Summary table extended; "Remaining open" line updated; root file removed. The DEBT-047 entry now explicitly cross-references DEBT-035 as the gating dependency for the R0047 BEC unblock.	2026-05-02 19:17:20 -04:00
anti	17367d0a69	docs(debt,ttp): retire shipped lanes; file mal-hash-feed and R0047-disk-reach entries Mark the EmailLifter heavyweight follow-up as PARTIAL PAID — R0042 / R0046 (macro / password / smuggling lanes) / R0048 fire end-to-end after commits `291b78c1` (decky extractors) and the ingester producer projection that follows. Two narrower DEBT entries replace the lanes that remain gated: * "EmailLifter mal-hash feed integration" — R0046's mal_hash_match lane needs a curated bad-hash feed (MalwareBazaar SHA-256 dump as the v0 candidate, mirroring the FeodoProvider bulk-feed pattern at decnet/intel/feodo.py). Feed integration, not extraction. Lifter predicate already reads `payload.get("mal_hash_match")` — silent today only because the field is absent. * "EmailLifter R0047 BEC — unblock when artifact disk-reach lands" cross-references the agent UID/GID DEBT entry that blocks `decnet ttp` from reading artifacts written by deckies on the same host. Disk-reach is the intended solution; raw body_text on the bus is rejected because the bus transport is abstracted (the UNIX-socket implementation may swap to networked at any time, and privacy decisions must hold regardless of transport). Append to TTP_TAGGING.md §"Producer wiring": the email.received producer pointer (was "none — DEBT"), the full per-message payload shape with the new heavyweight fields, and an explanatory block on why the bus is body-text-free + how R0047 / R0048 each handle their body dependency (R0048 via the precomputed scalar; R0047 deferred).	2026-05-02 19:12:30 -04:00
anti	c714941069	feat(bus): project EmailLifter heavyweight fields onto email.received The decky's Layer-2 extension (commit `291b78c1`) emits body_simhash / body_base64_bytes / html_smuggling on the message_stored log and adds macro_indicator / encrypted booleans to each attachments_json manifest entry. Lift them all onto the email.received bus payload: * body_simhash — passes through as-is (16 hex chars or "") * body_base64_bytes — coerced to int (0 on absent / malformed) * attachment_macros / attachment_password_protected — OR-reduced across the per-attachment manifest booleans; matches R0046's matched_trigger semantics where a single positive lane fires the rule * html_smuggling — coerced bool from the decky's 0/1 int Pre-Layer-2 message_stored events (older deckies, malformed log rows) project to safe defaults: empty simhash, zero base64-bytes, all booleans False — the EmailLifter then stays silent, never fires a false positive on missing data. R0042 (mass-phish) / R0046 macro / R0046 password / R0046 smuggling / R0048 (encoded payload) all fire end-to-end after this commit. R0046 mal_hash_match and R0047 BEC remain deferred per their respective DEBT entries (filed in the next commit).	2026-05-02 19:10:30 -04:00
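The projection rules listed in this commit could look roughly like the sketch below. The function name `project_heavyweight` and the shape of the input dict are assumptions for illustration; the field names and the safe defaults follow the commit message.

```python
import json
from typing import Any


def _to_int(value: Any) -> int:
    """Coerce to int, falling back to 0 on absent or malformed input."""
    try:
        return int(value)
    except (TypeError, ValueError, OverflowError):
        return 0


def project_heavyweight(fields: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical helper: `fields` stands in for a parsed message_stored row.
    try:
        manifest = json.loads(fields.get("attachments_json") or "[]")
    except (TypeError, ValueError):
        manifest = []
    if not isinstance(manifest, list):
        manifest = []
    entries = [e for e in manifest if isinstance(e, dict)]
    return {
        "body_simhash": str(fields.get("body_simhash") or ""),
        "body_base64_bytes": _to_int(fields.get("body_base64_bytes")),
        # OR-reduce: one positive attachment is enough to set the flag.
        "attachment_macros": any(bool(e.get("macro_indicator")) for e in entries),
        "attachment_password_protected": any(bool(e.get("encrypted")) for e in entries),
        "html_smuggling": bool(_to_int(fields.get("html_smuggling"))),
    }
```

An empty or malformed row yields an empty simhash, zero bytes, and all booleans False, which is the "stay silent on missing data" behaviour the commit describes.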
anti	291b78c1d0	feat(smtp): extract body_simhash + base64-bytes + html-smuggling + per-attachment macro/encrypted Heavyweight Layer-2 extractors land alongside the cheap projections shipped in commit `e9324aca`, so the EmailLifter R0042 / R0046 (macros / password / smuggling lanes) / R0048 fire from the bus payload without the lifter having to reach back to disk. Extractors: * body_simhash — inlined 64-bit Charikar simhash (md5-keyed, frequency-weighted) over word tokens of the union of text/* body parts. Inlined rather than pulling the `simhash` PyPI dep, which transitively brings numpy ~50 MB into a slim decky container; the algorithm is ~15 lines and identical in extraction quality. * body_base64_bytes — largest decoded base64 chunk's byte count, scanning text body parts with the same `_BASE64_RE` the lifter's `_p_encoded_payload` fallback uses. R0048 fires from this scalar alone; the lifter's body_text fallback becomes dead in normal operation. * attachment_macro_indicator — stdlib zipfile sniff for `vbaProject.bin` inside OOXML containers. Catches modern .docm / .xlsm / .pptm and macro-injected .docx; legacy .xls (CFBF) is a follow-up. * attachment_encrypted — flag_bits & 0x01 on any ZIP / OOXML entry's central directory; magic-byte match for 7z / RAR / CFBF (encrypted Office wrap). * html_smuggling — structural lxml parse first: fires when an `<a download>` element coexists with a `<script>` referencing `Blob` / `Uint8Array` / `URL.createObjectURL`. Regex pair-check fallback on lxml parse failure (real-world phish HTML is often malformed). Cuts the FP rate that pure-regex would produce on legitimate "click to download" links. Add `python3-lxml` (~5 MB Debian package, C-extension, no transitive Python deps) to the SMTP decky's Dockerfile. simhash stays inline. 
Per the dependency rule: lxml earns its weight by cutting R0046's OR-combined FP rate; a heavier macro-detection lib (oletools ~5 MB pure-python with msoffcrypto) would not measurably improve the boolean signal we need, so stdlib stays for that lane.	2026-05-02 19:08:37 -04:00
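For readers unfamiliar with the technique, a 64-bit Charikar simhash over word tokens can be sketched in a few lines. This is an illustration under stated assumptions, not the decky's inlined code: "md5-keyed" is read here as "each token is hashed with MD5 and the first 64 bits are used", the tokenizer is a plain `\w+` regex, and the function name is reused from the field name only for readability.

```python
import hashlib
import re
from collections import Counter

_TOKEN_RE = re.compile(r"\w+")


def body_simhash(text: str) -> str:
    """Return a 16-hex-char simhash, or "" when the text has no tokens."""
    weights = Counter(_TOKEN_RE.findall(text.lower()))
    if not weights:
        return ""
    acc = [0] * 64
    for token, weight in weights.items():
        digest = hashlib.md5(token.encode("utf-8"), usedforsecurity=False).digest()
        h = int.from_bytes(digest[:8], "big")  # first 64 bits of the MD5
        for bit in range(64):
            # Each token votes +weight or -weight on every bit position.
            acc[bit] += weight if (h >> bit) & 1 else -weight
    fingerprint = sum(1 << bit for bit in range(64) if acc[bit] > 0)
    return f"{fingerprint:016x}"
```

Near-duplicate bodies produce fingerprints that differ in only a few bits, so a consumer can compare two values by Hamming distance rather than exact equality.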
anti	fb85762703	feat(bus): publish email.received from ingester after SMTP artifact persist Wires the EmailLifter (R0041–R0048) producer that DEBT.md item #3 deferred. After the existing add_bounty() call in _extract_bounty (line 615), call _publish_email_received() which: * resolves the attacker_uuid via repo.get_attacker_uuid_by_ip; drops the publish if unresolved (the TTP worker can't anchor orphan events) * projects the message_stored fields onto the EmailLifter wire contract: from_domain / mail_from_domain / return_path_domain parsed via _domain_of, rcpt_count + rcpt_domains via _rcpt_projection, attachment_sha256s + attachment_extensions derived from the existing attachments_json manifest, urls from urls_json, dkim_signed/spf_pass coerced from 0/1 ints to bool * mirrors _publish_probe_pending's bus-per-call pattern and swallows all exceptions (the bus is the notification layer, not the source of truth) Fires for both relay and non-relay SMTP services. R0041 / R0043 / R0044 / R0045 are now live end-to-end; R0046 partial (extension lane). Heavyweight predicates (R0042 simhash, R0046-deep, R0047 / R0048 body_text) stay deferred per the EmailLifter heavyweight DEBT entry.	2026-05-02 18:39:13 -04:00
anti	e9324acac7	feat(smtp): emit X-Mailer / Return-Path / dkim+spf / URLs on message_stored The EmailLifter (R0041–R0048) keys on header-derived signals that the v0 _summarize_message did not extract. Add cheap Layer 2 projections inside the existing single-pass parse: * return_path / x_mailer — direct header reads, decoded RFC 2047 * dkim_signed / spf_pass — booleans derived from any Authentication-Results header (multiple lines tolerated; positive verdict on any line wins) * urls — http(s) URLs lifted from text/* body parts via a tight regex, deduplicated first-seen-wins, capped at 64 in the wire payload to bound the syslog SD value Heavyweight extraction (body simhash, office-macro detection, HTML-smuggling, password-protected archives, mal-hash-match, body_text projection) stays deferred per the EmailLifter heavyweight DEBT entry — those rules need privacy / extractor decisions before they ship.	2026-05-02 18:37:11 -04:00
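The URL and authentication projections in this commit can be illustrated as follows. Both helpers are hypothetical: the regex is a simplified stand-in for the "tight regex" the commit mentions, and treating `dkim=pass` as the positive verdict behind `dkim_signed` is an assumption.

```python
import re

MAX_URLS = 64  # cap stated in the commit message
_URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
_DKIM_PASS = re.compile(r"\bdkim=pass\b", re.IGNORECASE)
_SPF_PASS = re.compile(r"\bspf=pass\b", re.IGNORECASE)


def extract_urls(body_parts: list[str]) -> list[str]:
    """Collect http(s) URLs, first-seen-wins, capped at MAX_URLS."""
    seen: dict[str, None] = {}  # dicts preserve insertion order
    for part in body_parts:
        for url in _URL_RE.findall(part):
            seen.setdefault(url, None)
    return list(seen)[:MAX_URLS]


def auth_verdicts(auth_results: list[str]) -> tuple[bool, bool]:
    """Return (dkim, spf); a positive verdict on any header line wins."""
    dkim = any(_DKIM_PASS.search(line) for line in auth_results)
    spf = any(_SPF_PASS.search(line) for line in auth_results)
    return dkim, spf
```

Capping after deduplication keeps the first 64 distinct URLs, which bounds the serialized payload without letting repeated links crowd out later ones.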
anti	2ce150a53e	docs(debt): mark email.received producer as paid; file heavyweight follow-up The 2026-05-02 paydown wires the producer at ingester.py after add_bounty(), with the cheap projections (domains, rcpt_count, attachment_count, x_mailer, dkim/spf, attachment shas + extensions, URLs). R0041 / R0043 / R0044 / R0045 fire end-to-end after this PR; R0046 partial. The remaining lanes (R0042 body_simhash, R0046 macro / smuggling / password / mal_hash, R0047 / R0048 body_text projection) are filed as a new entry "EmailLifter heavyweight feature extraction" with the field map and the privacy-vs-completeness fork on body_text called out for the next maintainer to pick a side.	2026-05-02 18:24:51 -04:00
anti	9a7d116351	docs(ttp): sync A.10 + rewrite §9 drift runbook + DEBT.md markers Appendix A.10 corrected to match the post-2026-05-02-audit reality: AbuseIPDB cat 7/13/16/17 land on their canonical AbuseIPDB names (Phishing / VPN IP / SQL Injection / Spoofing); cats 4 and 10 carry explicit "drop" annotations so the next reviewer sees the intent rather than guessing. ThreatFox table re-keys on `threat_type` (the canonical taxonomy field) and adds the `payload` and `cc_skimming` rows. GreyNoise table promotes bare-malicious to a half-multiplier emission of T1071. §"Hard parts §9 Intel provider drift" replaces the prose handwave with a runnable check: provider URLs, the ThreatFox curl invocation that needs DECNET_THREATFOX_API_KEY, the rule_version + emits + attack_catalog co-evolution rules, and the full chain of files to exercise. Adds a "Ship-time audit log" subsection so future quarterly runs have a known-good baseline to diff against. DEBT.md item #1 records LAST_REVIEWED: 2026-05-02 / NEXT_REVIEW: 2026-08-02 and points at §9 for the runbook. DEBT.md item #3 (the attacker.email.received producer) flags its gating premise as potentially stale — ANTI noted SMTP honeypots already persist received messages, contradicting the "no source row" claim that deferred the wiring.	2026-05-02 18:09:20 -04:00
anti	f8dee596e5	fix(ttp): expand R0054/R0055/R0057 emits + LAST_REVIEWED markers The IntelLifter's _emit_filtered fans out only the rule.emits entries whose technique_id appears in the predicate's decision set. v1's emits lists were narrow supersets of the common case, silently dropping the rest of the predicate's possible emissions: R0054 dropped: T1046 (cat 14), T1078 (cat 20), T1090 (cats 9/13), T1496 (cat 11), T1595 (cats 14/19) R0055 dropped: T1090 (tor_exit_node), T1110 (ssh_bruteforcer), T1588 (the second emit of every C2-framework tag) R0057 dropped: T1105 (payload_delivery, download_url) Bump rule_version 1->2 on R0054/R0055/R0057, expand emits to cover every technique the predicate produces. R0056 (Feodo) and R0058 (aggregate bump) carry no enum and stay at v1. All five YAMLs gain `last_reviewed: "2026-05-02"` and `next_review: "2026-08-02"` markers; the rule YAML is now the canonical record of when the mapping was last reconciled against upstream, with DEBT.md as the calendar reminder.	2026-05-02 18:09:03 -04:00
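The failure mode fixed here is easier to see in a toy model. The sketch below assumes a simplified `Emit` record and a free-standing `emit_filtered` function; the real `_emit_filtered` and rule schema are not shown in the commit, so every name and the confidence values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emit:
    technique_id: str
    confidence: float


def emit_filtered(emits: list[Emit], decided: set[str]) -> list[Emit]:
    """Keep only the declared emits whose technique the predicate decided on."""
    return [e for e in emits if e.technique_id in decided]


# A rule whose emits list is narrower than what its predicate can decide.
narrow_emits = [Emit("T1110", 0.5)]
decided = {"T1110", "T1046"}

fired = emit_filtered(narrow_emits, decided)
dropped = decided - {e.technique_id for e in fired}
```

Because the filter intersects with the declared list, any technique the predicate decides on but the rule does not declare ends up in `dropped` without an error, which is why the fix expands `emits` rather than changing the filter.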
anti	75ff0ede1f	fix(ttp): correct intel_lifter mappings + repoint ThreatFox to threat_type Three bug classes uncovered by the 2026-05-02 ship-time audit: * AbuseIPDB code/name mismatch in v1: cat 10 was treated as DDoS (it's Web Spam — DDoS is cat 4, intentionally unmapped per A.10) and cat 17 as VPN IP (it's Spoofing — VPN IP is cat 13). Both typos mirrored in code AND the design doc Appendix A.10. Code now matches the AbuseIPDB taxonomy exactly; cat 17 retargets to T1566 (email-spoofing as a phishing precursor), and cats 7 (Phishing) and 16 (SQL Injection) pick up T1566 / T1190 emissions that v1 didn't cover. * ThreatFox dispatch keyed on `ioc_type` in v1, but `ioc_type` is the indicator format (url / domain / hash variants) and carries no ATT&CK signal. The canonical taxonomy field per ThreatFox's API is `threat_type` (botnet_cc / payload_delivery / payload / cc_skimming). Repoint dispatch through the new `threatfox_threat_types` payload field; `ioc_type` rides as evidence only. Also adds the missing cc_skimming -> T1056 (Input Capture) mapping and registers T1056 in attack_catalog.py. * GreyNoise bare-malicious lane: a `classification == "malicious"` row with no recognised tag used to emit nothing. Now lights T1071 at a half multiplier, suppressed when a tag already fires T1071 to avoid double-stamping at conflicting confidence levels.	2026-05-02 18:08:48 -04:00
anti	a31ad82880	feat(intel): project per-provider taxonomy into attacker.intel.enriched payload The TTP worker forwards the bus payload verbatim to the IntelLifter as TaggerEvent.payload. The pre-audit publish payload only carried {attacker_uuid, attacker_ip, aggregate_verdict, providers}, so even with the new AttackerIntel taxonomy columns populated the lifter still saw nothing. Lift the relevant fields (categories / tags / threat_types / malware family / score / classification) into the bus event and decode JSON-string list columns back to native lists at the boundary.	2026-05-02 18:08:29 -04:00
anti	999d3494b4	feat(intel): persist per-provider taxonomy on AttackerIntel for TTP dispatch The 2026-05-02 ship-time audit of the R0054-R0058 intel rule pack found that AbuseIPDB / GreyNoise / ThreatFox stored only the aggregate verdict (score / classification / listed-bool) plus the raw response blob. The TTP IntelLifter expects per-provider taxonomy fields (categories, tags, threat_types) that were never populated, so R0054 / R0055 / R0057 emitted zero tags in production despite passing unit tests. Add typed columns: abuseipdb_categories, greynoise_tags, greynoise_name, feodo_malware_family, threatfox_threat_types, threatfox_ioc_types, threatfox_malware_families. Each provider now parses the relevant taxonomy out of the upstream response and writes it through column_updates. JSON-list columns ride as TEXT with default "[]" to keep the SQLite/MySQL backend split honest, deserialised back to native lists by the repo on read.	2026-05-02 18:07:57 -04:00
anti	d1c4a48963	feat(ttp): split bash CMD evidence into structured uid/user/src/pwd/cmd rows The inspector was dumping the whole `CMD uid=0 user=root src=… pwd=… cmd=nmap -p- 192.168.1.0/24` syslog body into a single ``command_text`` blob. ANTI: "I'd like to separate the fields." Done — three layers work together: 1. Collector session aggregator: new `_parse_cmd_msg` splits the bash PROMPT_COMMAND msg into `{uid, user, src, pwd, command}`. The session-ended envelope's per-command dict now carries the structured fields, with `command_text` set to just the cmd= value (preserving embedded whitespace — `nmap -p- 1.2.3.0/24` etc.). 2. Rule engine: per-source_kind auxiliary evidence list (`_AUX_EVIDENCE_FIELDS`). For `command` events the engine automatically promotes uid/user/src/pwd into the persisted `evidence` dict on top of the rule's explicit `evidence_fields`. Engine-controlled, not per-rule — adding a new aux field is one line here, not a 30-rule YAML sweep, and rule authors can't accidentally drop it. 3. TTPInspector frontend: evidence renders as a structured `kvs` grid (UID / USER / SRC / PWD / CMD rows) instead of pretty-printed JSON. Primary-order list keeps shell fields at the top; everything else falls below alphabetically so unfamiliar evidence shapes still surface predictably. Tests: - session_aggregator pins the structured-fields emit (uid/user/src/ pwd/command_text without "CMD" prefix, embedded whitespace preserved). - rule_engine_tagger pins the aux-field auto-promotion + the no-`None`-leakage path when payload doesn't carry an aux key.	2026-05-02 03:20:53 -04:00
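The field split in layer 1 might be implemented along these lines. This is a sketch, not the collector's `_parse_cmd_msg`: the regex, the return shape, and the sample src / pwd values are assumptions, since the commit elides them.

```python
import re
from typing import Optional

# Field order follows the message shape quoted in the commit:
# CMD uid=... user=... src=... pwd=... cmd=...
_CMD_RE = re.compile(
    r"^CMD uid=(?P<uid>\S*) user=(?P<user>\S*) src=(?P<src>\S*) "
    r"pwd=(?P<pwd>.*?) cmd=(?P<command>.*)$",
    re.DOTALL,
)


def parse_cmd_msg(msg: str) -> Optional[dict[str, str]]:
    """Split a CMD syslog body into uid/user/src/pwd/command, or None."""
    match = _CMD_RE.match(msg)
    if match is None:
        return None
    # Everything after "cmd=" is kept verbatim, so embedded whitespace survives.
    return match.groupdict()


# Sample values for src and pwd are made up for the example.
parsed = parse_cmd_msg(
    "CMD uid=0 user=root src=10.0.0.5 pwd=/root cmd=nmap -p- 192.168.1.0/24"
)
```

Anchoring on the literal ` cmd=` separator and taking the remainder greedily is what keeps a command such as `nmap -p- 192.168.1.0/24` intact instead of splitting it on spaces.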
anti	84699f89da	feat(ttp): show canonical ATT&CK technique names in the TTPs UI "T1595" alone is opaque; "T1595 — Active Scanning" tells you the story at a glance. The names come from a backend-side static catalogue pinned to the same ATT&CK release as the rule engine (_ATTACK_RELEASE = "v15.1") — names are the canonical MITRE labels, not author-supplied strings on rules, so a rule author can't typo a name and the entire fleet sees the typo. - New `decnet/ttp/attack_catalog.py` with `TECHNIQUE_NAMES` covering every technique_id + sub_technique_id emitted by `rules/ttp/` (R0001..R0058 → 69 IDs in the v0 pack). - `IdentityTechniqueRow` / `TechniqueRollupRow` / `CampaignTechniqueRow` / `TTPTagDetailRow` gain optional `technique_name` / `sub_technique_name` fields. Repo + router populate them from the catalogue at row-construction time. None when an ID isn't in the catalogue — UI falls back to the bare ID. - Coverage test (`tests/ttp/test_attack_catalog.py`) walks every YAML rule and asserts every emitted ID has a catalogue entry, so a future rule author who forgets to update the catalogue gets a loud failure rather than a silent UI fallback. Frontend: - `TTPsObservedSection` shows "T1595.002 — Active Scanning: Vulnerability Scanning" instead of just the ID, with overflow ellipsis + tooltip for narrow viewports. Inspector header / TECHNIQUE row also surface the names.	2026-05-02 03:10:07 -04:00


1146 Commits