DECNET

Author	SHA1	Message	Date
anti	0972325527	feat(web/db): observations table + repo + bus prefix (BEHAVE-INTEGRATION Phase 1) Additive Phase 1 of BEHAVE-INTEGRATION.md. Lays the storage layer the BEHAVE-SHELL extractor (DEBT-050) will write into. Nothing breaks; SessionProfile coexists for now and is dropped in the follow-up commit. decnet/web/db/models/observations.py — new ObservationRow SQLModel mirroring the BEHAVE Observation envelope field-for-field (core/decnet_behave_core/spec/envelope.py). ``id`` is a hex-string UUID (matching BEHAVE), not a typed UUID column. ``identity_ref`` is str | None — written by the future attribution engine, NULL until then. ``attacker_uuid`` is the one DECNET-side denormalisation; FK'd to attackers.uuid for cheap AttackerDetail joins. ``evidence_ref`` is NOT NULL for DECNET emissions even though the upstream envelope makes it optional — the worker's "already profiled?" check keys on it. UniqueConstraint(evidence_ref, primitive) enforces idempotency at the schema level so re-running the extractor on the same shard+sid produces a DB-side conflict the upsert path resolves deterministically. Class is named ``ObservationRow`` (not ``Observation``) to avoid colliding with the BEHAVE Pydantic envelope at sites that import both. decnet/web/db/sqlmodel_repo/observations.py — ObservationsMixin. Three public methods backing the canonical queries from BEHAVE-INTEGRATION.md §"Storage": ``upsert_observation`` (idempotent on the natural key), ``latest_observation_per_primitive`` (per-primitive MAX(ts) subquery, portable across SQLite and MySQL — no DISTINCT ON), ``observations_time_series`` (asc-by-ts). Plus ``has_observations_for_evidence`` for the worker's session-already-profiled check. decnet/bus/topics.py — ATTACKER_OBSERVATION_PREFIX = "observation" constant + ``attacker_observation(primitive)`` builder. Full topic shape ``attacker.observation.<primitive>`` matches what BEHAVE's spec.event_adapter.event_topic_for produces upstream. Documentation + pattern matching only — bus auth is socket file perms (DEBT-029 §2), not topic-level. decnet/web/db/repository.py — abstract ``upsert_observation``, ``latest_observation_per_primitive``, ``observations_time_series`` on BaseRepository. tests/db/test_observations.py — 11 tests covering upsert round-trip, idempotency under the unique constraint, latest-per-primitive ordering across multiple sessions, time-series asc-ordering, empty-attacker contract, every BEHAVE ValueKind round-tripping through the JSON column, and the has_observations_for_evidence check. tests/db/test_base_repo.py — DummyRepo gains the three new abstract overrides so its coverage suite still instantiates.	2026-05-03 07:25:10 -04:00
anti	3f080f601d	feat(intel,ingester): mal_hash feed + observed_attachments table (DEBT-046) New MalHashProvider sibling ABC (decnet/intel/base.py) since SHA-256 is a different keyspace from IntelProvider's IPs. MalwareBazaarProvider mirrors FeodoProvider's bulk-feed shape: 24h refresh via _ensure_fresh / _refresh, in-memory set[str] of hex-lowercased hashes, set-membership lookup. Auth-keyed via DECNET_MALWAREBAZAAR_AUTH_KEY; absent key silent-no-ops the lane (single warning, no HTTP traffic). Per-hash observations persist to a new observed_attachments table. DECNET is a honeypot platform — every attachment hash an attacker delivers is intel, regardless of whether anyone classified it. Verdict is sticky: True never downgrades to False/None on subsequent observations. Out of scope: API surface, federation export, retention. Ingester _publish_email_received calls the provider for each attachment sha256, sets mal_hash_match on the bus payload (omitted entirely when the message had no attachments — keeps R0046's `is True` predicate silent on hash-less mail, matching pre-paydown behavior), and upserts the row regardless of provider availability.	2026-05-03 05:56:46 -04:00
anti	03beff3840	feat(orchestrator): authoritative failure-count badge endpoint (DEBT-042) New GET /api/v1/orchestrator/events/stats?since=1h&success=false&kind=... backed by repo.count_orchestrator_failures(since_ts, kind), which counts failed rows across both orchestrator_events and orchestrator_emails since the cutoff. Window parser accepts ^\d+[smhd]$, capped at 7d. Today only success=false is accepted on this surface so the endpoint isn't accidentally repurposed before the next consumer is properly designed. Orchestrator.tsx polls the endpoint on mount + every 30 s and renders the authoritative DB-derived count instead of deriving from the in-memory SSE buffer + one paginated page (which silently excluded failures older than the local window).	2026-05-03 05:26:45 -04:00
anti	6c6f97e840	feat(prober,correlation): attacker fingerprint rotation detection (DEBT-032) When the prober observes a NEW hash for an (attacker_uuid, port, probe_type) triple it has seen before — VPS rotation, SSH server rebuild, TLS cert swap — emit a derived attacker.fingerprint_rotated event carrying both old and new hash. Detection is a small library (decnet.correlation.fingerprint_rotation) called inline from the prober at each of the three emit sites (JARM/HASSH/TCPFP). No new daemon. New AttackerFingerprintState table holds per-triple last-hash state; Attacker.rotation_count and Attacker.last_rotation_at are stamped on every diff. Library is sync, fully unit-tested via injected publish_fn / syslog_fn callbacks.	2026-05-03 05:12:51 -04:00
anti	b3a96a045f	feat(mail): default email_seed → $PROJROOT/bait/ when unset When service_cfg["email_seed"] is absent, compose_fragment now falls back to $PROJROOT/bait/ if that directory exists on the host. Lets operators drop a deployment-wide bait corpus into one place without threading email_seed through every decky's config. Missing dir keeps old no-op behavior.	2026-05-03 04:25:24 -04:00
anti	b88d67794d	feat(mail): operator-tunable IMAP/POP3 email seed (DEBT-026) IMAP_EMAIL_SEED / POP3_EMAIL_SEED accept a directory (rglob .eml + .json) or a single .json/.eml. Loaded entries CONCATENATE with the hardcoded _BAIT_EMAILS — additive to the realism-engine emailgen output rather than replacing it. JSON dicts require from_addr / to_addr / subject / body; bare bodies are wrapped into RFC 5322 on load. compose_fragment reads service_cfg["email_seed"] and bind-mounts the host path read-only at /var/spool/decnet-emails/seed.	2026-05-03 02:47:06 -04:00
anti	79674026dd	feat(cli): allow `decnet ttp` on agents (DEBT-047) The TTP-tagging worker is now safe to run on agent hosts: EmailLifter disk-reaches body-aware predicates from the local artifacts tree (DEBT-035 unblocked filesystem access; DEBT-047 added the helper). Drop `ttp` from MASTER_ONLY_COMMANDS in cli/gating.py and remove the defence-in-depth `_require_master_mode("ttp")` call in cli/ttp.py. `ttp-backfill` walks the master DB and stays master-only.	2026-05-02 20:07:03 -04:00
anti	e972d870de	feat(ttp): EmailLifter disk-reach for body-aware predicates (DEBT-047) R0047 (BEC) and the encoded-payload predicate substring-match against the email body. Shipping raw body text on the abstracted service bus is the wrong privacy stance — the bus transport may swap from UNIX socket to networked at any time, and "loopback today" is not a license to put PII on the wire. EmailLifter now opens the .eml lazily from /var/lib/decnet/artifacts/{decky_id}/smtp/{stored_as} when a body-aware predicate runs and parses the body in-process via stdlib email + policy.default. The decoded body is memoized into the payload dict so multiple body-aware predicates on the same event open the file once. Bus envelope only carries the artifact pointer (decky_id + stored_as); raw body bytes never cross the host disk boundary on the agent → master hop. Filesystem access on agents is unblocked by DEBT-035 (setgid + group-readable artifacts root, paid 2026-05-02). The legacy inline body_text path is preserved — when the producer ships body_text on the bus the helper short-circuits without opening the file.	2026-05-02 20:05:54 -04:00
anti	7036a86e76	refactor(artifacts): extract resolve_artifact_path to shared module Move artifact path validation + symlink-escape check out of the admin-gated download endpoint into decnet/artifacts/paths.py so the TTP EmailLifter can disk-reach .eml files at tag-time without duplicating regex/root logic (DEBT-047). The router now catches ArtifactPathError and re-raises HTTPException(400); behavior is unchanged.	2026-05-02 20:02:47 -04:00
anti	cdbb3d3571	fix(ssh,telnet): move PROMPT_COMMAND out of /root/.bashrc + pin readonly ANTI flagged two regressions in the existing command-event capture: 1. Tell: PROMPT_COMMAND lived in /root/.bashrc, the FIRST file an attacker greps after landing root. The logger invocation sitting there is plain-text honeypot signage. 2. Bypass: even when missed, `export PROMPT_COMMAND=""` silently disables capture. ANTI personally bypasses this on engagements. Reshape: * Move the assignment to /etc/environment — read by pam_env at session open (sshd via /etc/pam.d/sshd, telnet via /etc/pam.d/login), before any shell rc file fires. Far less obvious than .bashrc; a casual `cat .bashrc` no longer surfaces the capture. * Define the helper as a function `__bash_history_sync` in /etc/bash.bashrc (system-wide bashrc, sourced by every interactive bash). Function name reads as generic bash housekeeping; no DECNET branding in the symbol. * Pin both the function and PROMPT_COMMAND readonly so `export PROMPT_COMMAND=""` fails with "readonly variable" instead of silently winning. Mitigation, not airtight — `bash --norc` still bypasses — but the passive `export` bypass is closed. The actual `logger --rfc5424 --msgid command ... CMD ...` invocation is preserved exactly; only its location and the readonly guard change. R0001–R0030 (command-rule pack) consume the same syslog shape as before. Three new tests assert: the value lands in /etc/environment, the function body lives in /etc/bash.bashrc, no PROMPT_COMMAND line remains in /root/.bashrc, and `readonly PROMPT_COMMAND` / `readonly -f __bash_history_sync` are both present. Mirror assertions added on the Telnet Dockerfile via test_config_schema.py.	2026-05-02 19:50:24 -04:00
anti	3e9c4c29b9	feat(ssh,telnet): add non-root user account for privesc + enum lure Real Linux deployments (especially Ubuntu cloud images) ship a non-root admin user; honeypots that only accept root logins are a tell. Add a second account on both SSH and Telnet decoys, configurable via service_cfg keys `user` / `user_password`, defaulting to `ubuntu` / `admin` so the lure is live on every fresh deploy. * `decnet/services/{ssh,telnet}.py` — two new ServiceConfigFields (`user` string, `user_password` secret) and matching env vars (`SSH_USER` / `SSH_USER_PASSWORD`, mirror for telnet) propagated via the compose fragment. * `decnet/templates/ssh/entrypoint.sh` — runtime `useradd -m -s /usr/libexec/login-session -G sudo "$SSH_USER"` so the new user inherits the same sessrec pty-recording shell as root and lands in the sudo group. Privesc attempts (`sudo`) flow through the existing sudo-log capture; network-enum from the user's shell rides the recorded transcript. * `decnet/templates/telnet/entrypoint.sh` — same useradd pattern (no sudo group — busybox+login telnet image has no sudo package; privesc rides `su -` which itself flows through the existing PAM auth-helper at /etc/pam.d/login). * New tests for default + custom user / password + independence from root password. Updated the schema-keys assertion to match the four-field shape. The new account is ALSO the natural home for the body-aware predicates that were previously gated on root-only sessions — attackers who land on `ubuntu@host` and run network-recon / privesc commands now generate the same structured TTP-rule events as root sessions did, captured via the same auth-helper + sessrec + sudo-log pipes.	2026-05-02 19:48:03 -04:00
anti	b27332169d	feat(init): create /var/lib/decnet/artifacts with setgid + group-write DEBT-035 step 2. Today the artifacts subtree is auto-created by Docker as root when a decoy container's bind-mount fires for the first time. The resulting permissions are root:root 0o755 — the API process (running as the decnet user) hits PermissionError trying to read transcripts written by the container, and the soft-fail 404 path gets exercised on every fresh deploy. Add `/var/lib/decnet/artifacts` to init's dirs list with mode 0o2775: * 0o2000 — setgid bit. New files inherit the directory's group (decnet), regardless of which uid created them. This is the load-bearing bit for cross-container reads. * 0o0775 — owner+group rwx, world rx. Group-write lets the API process and the local TTP worker read each other's outputs without a manual chown. `_ensure_dir` already respects the full mode word via `os.chmod`, no helper change needed. Test asserts the resulting directory carries exactly 0o2775 after a fresh `decnet init --prefix`. Defence-in-depth: this works even if the per-decoy compose `user:` directive (next commit) misses a template — files still land in the decnet group.	2026-05-02 19:35:20 -04:00
anti	39a298f685	feat(init): persist DECNET-service api-user/api-group to decnet.ini DEBT-035 step 1. The composer needs to know which uid/gid to inject into each compose fragment's `user:` directive at deploy time. Today the resolved `--user` / `--group` values reach systemd unit rendering (init.py:349–354) but are not persisted anywhere the composer can read them. Persist as names (not numeric ids) under `[decnet] api-user` / `api-group` in the rendered decnet.ini placeholder. Resolution to uid/gid happens at deploy time on whichever host runs the deploy, via `pwd.getpwnam(...)` / `grp.getgrnam(...)` — so the same user name can have different uids on master vs agents (heterogeneous /etc/passwd) without breaking artifact ownership. The existing config_ini auto-translates kebab→DECNET_API_USER / DECNET_API_GROUP at load time; no domain-map changes needed. Two new tests: one asserting the rendered ini carries the `api-user` / `api-group` keys for the values passed to `--user` / `--group`; one round-tripping through `load_ini_config` to confirm the env vars land in `os.environ` for the composer to pick up.	2026-05-02 19:33:53 -04:00
anti	c714941069	feat(bus): project EmailLifter heavyweight fields onto email.received The decky's Layer-2 extension (commit `291b78c1`) emits body_simhash / body_base64_bytes / html_smuggling on the message_stored log and adds macro_indicator / encrypted booleans to each attachments_json manifest entry. Lift them all onto the email.received bus payload: * body_simhash — passes through as-is (16 hex chars or "") * body_base64_bytes — coerced to int (0 on absent / malformed) * attachment_macros / attachment_password_protected — OR-reduced across the per-attachment manifest booleans; matches R0046's matched_trigger semantics where a single positive lane fires the rule * html_smuggling — coerced bool from the decky's 0/1 int Pre-Layer-2 message_stored events (older deckies, malformed log rows) project to safe defaults: empty simhash, zero base64-bytes, all booleans False — the EmailLifter then stays silent, never fires a false positive on missing data. R0042 (mass-phish) / R0046 macro / R0046 password / R0046 smuggling / R0048 (encoded payload) all fire end-to-end after this commit. R0046 mal_hash_match and R0047 BEC remain deferred per their respective DEBT entries (filed in the next commit).	2026-05-02 19:10:30 -04:00
anti	291b78c1d0	feat(smtp): extract body_simhash + base64-bytes + html-smuggling + per-attachment macro/encrypted Heavyweight Layer-2 extractors land alongside the cheap projections shipped in commit `e9324aca`, so the EmailLifter R0042 / R0046 (macros / password / smuggling lanes) / R0048 fire from the bus payload without the lifter having to reach back to disk. Extractors: * body_simhash — inlined 64-bit Charikar simhash (md5-keyed, frequency-weighted) over word tokens of the union of text/* body parts. Inlined rather than pulling the `simhash` PyPI dep, which transitively brings numpy ~50 MB into a slim decky container; the algorithm is ~15 lines and identical in extraction quality. * body_base64_bytes — largest decoded base64 chunk's byte count, scanning text body parts with the same `_BASE64_RE` the lifter's `_p_encoded_payload` fallback uses. R0048 fires from this scalar alone; the lifter's body_text fallback becomes dead in normal operation. * attachment_macro_indicator — stdlib zipfile sniff for `vbaProject.bin` inside OOXML containers. Catches modern .docm / .xlsm / .pptm and macro-injected .docx; legacy .xls (CFBF) is a follow-up. * attachment_encrypted — flag_bits & 0x01 on any ZIP / OOXML entry's central directory; magic-byte match for 7z / RAR / CFBF (encrypted Office wrap). * html_smuggling — structural lxml parse first: fires when an `<a download>` element coexists with a `<script>` referencing `Blob` / `Uint8Array` / `URL.createObjectURL`. Regex pair-check fallback on lxml parse failure (real-world phish HTML is often malformed). Cuts the FP rate that pure-regex would produce on legitimate "click to download" links. Add `python3-lxml` (~5 MB Debian package, C-extension, no transitive Python deps) to the SMTP decky's Dockerfile. simhash stays inline. Per the dependency rule: lxml earns its weight by cutting R0046's OR-combined FP rate; a heavier macro-detection lib (oletools ~5 MB pure-python with msoffcrypto) would not measurably improve the boolean signal we need, so stdlib stays for that lane.	2026-05-02 19:08:37 -04:00
anti	fb85762703	feat(bus): publish email.received from ingester after SMTP artifact persist Wires the EmailLifter (R0041–R0048) producer that DEBT.md item #3 deferred. After the existing add_bounty() call in _extract_bounty (line 615), call _publish_email_received() which: * resolves the attacker_uuid via repo.get_attacker_uuid_by_ip; drops the publish if unresolved (the TTP worker can't anchor orphan events) * projects the message_stored fields onto the EmailLifter wire contract: from_domain / mail_from_domain / return_path_domain parsed via _domain_of, rcpt_count + rcpt_domains via _rcpt_projection, attachment_sha256s + attachment_extensions derived from the existing attachments_json manifest, urls from urls_json, dkim_signed/spf_pass coerced from 0/1 ints to bool * mirrors _publish_probe_pending's bus-per-call pattern and swallows all exceptions (the bus is the notification layer, not the source of truth) Fires for both relay and non-relay SMTP services. R0041 / R0043 / R0044 / R0045 are now live end-to-end; R0046 partial (extension lane). Heavyweight predicates (R0042 simhash, R0046-deep, R0047 / R0048 body_text) stay deferred per the EmailLifter heavyweight DEBT entry.	2026-05-02 18:39:13 -04:00
anti	e9324acac7	feat(smtp): emit X-Mailer / Return-Path / dkim+spf / URLs on message_stored The EmailLifter (R0041–R0048) keys on header-derived signals that the v0 _summarize_message did not extract. Add cheap Layer 2 projections inside the existing single-pass parse: * return_path / x_mailer — direct header reads, decoded RFC 2047 * dkim_signed / spf_pass — booleans derived from any Authentication-Results header (multiple lines tolerated; positive verdict on any line wins) * urls — http(s) URLs lifted from text/* body parts via a tight regex, deduplicated first-seen-wins, capped at 64 in the wire payload to bound the syslog SD value Heavyweight extraction (body simhash, office-macro detection, HTML-smuggling, password-protected archives, mal-hash-match, body_text projection) stays deferred per the EmailLifter heavyweight DEBT entry — those rules need privacy / extractor decisions before they ship.	2026-05-02 18:37:11 -04:00
anti	75ff0ede1f	fix(ttp): correct intel_lifter mappings + repoint ThreatFox to threat_type Three bug classes uncovered by the 2026-05-02 ship-time audit: * AbuseIPDB code/name mismatch in v1: cat 10 was treated as DDoS (it's Web Spam — DDoS is cat 4, intentionally unmapped per A.10) and cat 17 as VPN IP (it's Spoofing — VPN IP is cat 13). Both typos mirrored in code AND the design doc Appendix A.10. Code now matches the AbuseIPDB taxonomy exactly; cat 17 retargets to T1566 (email-spoofing as a phishing precursor), and cats 7 (Phishing) and 16 (SQL Injection) pick up T1566 / T1190 emissions that v1 didn't cover. * ThreatFox dispatch keyed on `ioc_type` in v1, but `ioc_type` is the indicator format (url / domain / hash variants) and carries no ATT&CK signal. The canonical taxonomy field per ThreatFox's API is `threat_type` (botnet_cc / payload_delivery / payload / cc_skimming). Repoint dispatch through the new `threatfox_threat_types` payload field; `ioc_type` rides as evidence only. Also adds the missing cc_skimming -> T1056 (Input Capture) mapping and registers T1056 in attack_catalog.py. * GreyNoise bare-malicious lane: a `classification == "malicious"` row with no recognised tag used to emit nothing. Now lights T1071 at a half multiplier, suppressed when a tag already fires T1071 to avoid double-stamping at conflicting confidence levels.	2026-05-02 18:08:48 -04:00
anti	a31ad82880	feat(intel): project per-provider taxonomy into attacker.intel.enriched payload The TTP worker forwards the bus payload verbatim to the IntelLifter as TaggerEvent.payload. The pre-audit publish payload only carried {attacker_uuid, attacker_ip, aggregate_verdict, providers}, so even with the new AttackerIntel taxonomy columns populated the lifter still saw nothing. Lift the relevant fields (categories / tags / threat_types / malware family / score / classification) into the bus event and decode JSON-string list columns back to native lists at the boundary.	2026-05-02 18:08:29 -04:00
anti	999d3494b4	feat(intel): persist per-provider taxonomy on AttackerIntel for TTP dispatch The 2026-05-02 ship-time audit of the R0054-R0058 intel rule pack found that AbuseIPDB / GreyNoise / ThreatFox stored only the aggregate verdict (score / classification / listed-bool) plus the raw response blob. The TTP IntelLifter expects per-provider taxonomy fields (categories, tags, threat_types) that were never populated, so R0054 / R0055 / R0057 emitted zero tags in production despite passing unit tests. Add typed columns: abuseipdb_categories, greynoise_tags, greynoise_name, feodo_malware_family, threatfox_threat_types, threatfox_ioc_types, threatfox_malware_families. Each provider now parses the relevant taxonomy out of the upstream response and writes it through column_updates. JSON-list columns ride as TEXT with default "[]" to keep the SQLite/MySQL backend split honest, deserialised back to native lists by the repo on read.	2026-05-02 18:07:57 -04:00
anti	d1c4a48963	feat(ttp): split bash CMD evidence into structured uid/user/src/pwd/cmd rows The inspector was dumping the whole `CMD uid=0 user=root src=… pwd=… cmd=nmap -p- 192.168.1.0/24` syslog body into a single ``command_text`` blob. ANTI: "I'd like to separate the fields." Done — three layers work together: 1. Collector session aggregator: new `_parse_cmd_msg` splits the bash PROMPT_COMMAND msg into `{uid, user, src, pwd, command}`. The session-ended envelope's per-command dict now carries the structured fields, with `command_text` set to just the cmd= value (preserving embedded whitespace — `nmap -p- 1.2.3.0/24` etc.). 2. Rule engine: per-source_kind auxiliary evidence list (`_AUX_EVIDENCE_FIELDS`). For `command` events the engine automatically promotes uid/user/src/pwd into the persisted `evidence` dict on top of the rule's explicit `evidence_fields`. Engine-controlled, not per-rule — adding a new aux field is one line here, not a 30-rule YAML sweep, and rule authors can't accidentally drop it. 3. TTPInspector frontend: evidence renders as a structured `kvs` grid (UID / USER / SRC / PWD / CMD rows) instead of pretty-printed JSON. Primary-order list keeps shell fields at the top; everything else falls below alphabetically so unfamiliar evidence shapes still surface predictably. Tests: - session_aggregator pins the structured-fields emit (uid/user/src/ pwd/command_text without "CMD" prefix, embedded whitespace preserved). - rule_engine_tagger pins the aux-field auto-promotion + the no-`None`-leakage path when payload doesn't carry an aux key.	2026-05-02 03:20:53 -04:00
anti	84699f89da	feat(ttp): show canonical ATT&CK technique names in the TTPs UI "T1595" alone is opaque; "T1595 — Active Scanning" tells you the story at a glance. The names come from a backend-side static catalogue pinned to the same ATT&CK release as the rule engine (_ATTACK_RELEASE = "v15.1") — names are the canonical MITRE labels, not author-supplied strings on rules, so a rule author can't typo a name and the entire fleet sees the typo. - New `decnet/ttp/attack_catalog.py` with `TECHNIQUE_NAMES` covering every technique_id + sub_technique_id emitted by `rules/ttp/` (R0001..R0058 → 69 IDs in the v0 pack). - `IdentityTechniqueRow` / `TechniqueRollupRow` / `CampaignTechniqueRow` / `TTPTagDetailRow` gain optional `technique_name` / `sub_technique_name` fields. Repo + router populate them from the catalogue at row-construction time. None when an ID isn't in the catalogue — UI falls back to the bare ID. - Coverage test (`tests/ttp/test_attack_catalog.py`) walks every YAML rule and asserts every emitted ID has a catalogue entry, so a future rule author who forgets to update the catalogue gets a loud failure rather than a silent UI fallback. Frontend: - `TTPsObservedSection` shows "T1595.002 — Active Scanning: Vulnerability Scanning" instead of just the ID, with overflow ellipsis + tooltip for narrow viewports. Inspector header / TECHNIQUE row also surface the names.	2026-05-02 03:10:07 -04:00
anti	42e9492118	feat(ttp): inspector drawer surfaces evidence + rule_id behind each technique The TTPsObservedSection rollup tells the operator "we saw T1059" but not why. Click any technique row → side drawer opens listing every ttp_tag row in scope with the persisted evidence JSON, firing rule_id / rule_version, source_kind / source_id, confidence, and created_at. Mirrors the CredentialReuseInspector / BountyInspector pattern (drawer-backdrop + bd-head/bd-body + kvs grid). Backend: - New `GET /api/v1/ttp/tags/by-{scope}/{uuid}/{technique_id}` (`scope ∈ {identity, attacker, session}`, optional `?sub_technique_id=`, `?limit=` capped to 1000). Returns raw TTPTag rows newest-first. - New `TTPTagDetailRow` Pydantic model + re-export. - New repo method `list_tags_by_scope_and_technique` on TTPMixin (+ abstract on BaseRepository) — single query branched on scope; identity scope projects through `Attacker.identity_id` the same way `list_techniques_by_identity` does. - Tests: evidence round-trips, sub_technique filter, JWT-required, empty scope, unknown scope rejected. Frontend: - New `TTPInspector.tsx` + `TTPInspector.css` (violet accent, slide animation, focus-trapped panel matching the existing inspector family). - `TTPsObservedSection`'s TechniqueBar is now click+keyboard activatable; clicking opens the inspector for that (technique, sub_technique) tuple. mypy clean. 532 passed in the targeted sweep.	2026-05-02 02:55:05 -04:00
anti	c4e29e3bf9	fix(ttp): resolve attacker_uuid from attacker_ip on bus-event consume The collector's `attacker.session.ended` envelope carries `attacker_uuid: null` and `attacker_ip: <ip>` because the collector doesn't talk to the DB. The TTP worker passed that null straight through, and `TTPTag.__init__` raised the documented invariant: ValueError: ttp_tag requires at least one of attacker_uuid / identity_uuid; both NULL is not a valid anchor. The worker now resolves `attacker_uuid` from `attacker_ip` via `BaseRepository.get_attacker_uuid_by_ip` before fanning out the event. When the IP isn't in the DB yet (profiler hasn't ingested the row), the event is dropped with one log line — better than exploding mid-tag. - New `get_attacker_uuid_by_ip(ip) -> str | None` on the repo (BaseRepository abstract + AttackersCoreMixin impl). - `_resolve_attacker_uuid` helper in `decnet/ttp/worker.py` runs before `_build_events`. Short-circuits when the payload already has either anchor; drops the event when neither anchor is resolvable. - Tests pin: short-circuit on existing uuid/identity, repo lookup, drop on unknown IP, drop on "Unknown" sentinel, drop on no-anchor payload, drop on repo failure.	2026-05-02 02:44:30 -04:00
anti	b5ce236cab	test(bus): pin scope-(2) producer wiring for reuse / clusterer / intel Three producer-side regression guards. Each drives the worker's run loop with a fake bus + stubbed repo and asserts the documented topic fires when the producer has data: - reuse correlator → credential.reuse.detected (one finding row) - clusterer → identity.formed + identity.merged (one ClusterResult) - intel worker → attacker.intel.enriched (one unenriched attacker + a fake provider returning a "malicious" verdict) These complement commit 1's attacker.session.ended producer test — together the four cover every TTP-relevant publisher in the tree (modulo email.received, which has no producer yet; tracked in DEBT.md).	2026-05-02 02:38:24 -04:00
anti	b043c96d29	feat(collector): publish attacker.session.ended on session_recorded events The TTP worker subscribes to attacker.session.ended but no upstream component published it — the rule pack (R0001–R0030) therefore never fired on live SSH traffic even after the consume-side wiring landed in E.3.18a/b/c. The collector now hosts a per-attacker_ip command index (_SessionAggregator) that watches the same parsed-event stream as _publish_log. Shell `command` events are appended to a per-IP list; on `session_recorded` the aggregator slices the list to commands inside the [ended_at - duration_s, ended_at] window and publishes attacker.session.ended with the session metadata + commands list. The TTP worker's _build_events fan-out (E.3.18b) turns each command into a source_kind="command" TaggerEvent that the RuleEngineTagger (E.3.18c) matches against R0001–R0030. Memory bound: per-IP entries TTL-evict at DECNET_COLLECTOR_SESSION_AGG_TTL_SEC (default 3600 s). Publish failures are swallowed in the aggregator — a misbehaving bus cannot stall the per-container stream threads.	2026-05-02 02:35:08 -04:00
anti	d9d2a80573	fix(collector): unwrap double-wrapped RFC5424 around bash PROMPT_COMMAND Honeypot SSH containers run `PROMPT_COMMAND` that calls `logger --rfc5424 --msgid command -t bash "CMD …"`. The Docker-stdout reader prepends an outer RFC5424 envelope (HOSTNAME=<decky>, APP-NAME=1, MSGID=NIL) around that inner syslog line. Both the collector parser (`parse_rfc5424`) and the correlation parser (`parse_line`) saw the outer NIL MSGID and emitted `event_type="-"` for every shell command — which: - kept `Attacker.commands` rows missing `command_text` - left R0001–R0030 (the pattern rule pack that matches shell commands) with no haystack - made `decnet.collector.log` show `event written … type=-` for the very lines that should be `type=command` Both parsers now detect the inner-RFC5424 shape (`<TS> <HOST> <APP> <PROCID> <MSGID> <rest>`) when the outer MSGID is NIL and the SD-arm is also NIL, and re-extract HOSTNAME / APP-NAME / MSGID / remainder from the body. The collector parser also recovers the post-SD msg tail when the SD block isn't `relay@55555` (the bash CMD line carries a `[timeQuality …]` block) so the kv-fallback can find `src_ip`. Mirroring tests in tests/collector and tests/correlation pin both the unwrap and the regression guard for non-double-wrapped lines.	2026-05-02 02:32:21 -04:00
anti	e08bfc4a73	fix(ttp): /api/v1/ttp/rules returns the live rule catalogue The endpoint was a contract-phase stub returning `[]` even though the RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty table; operators couldn't tell whether anything was wired up. - `api_list_rules` now calls `get_rule_store().load_compiled()` and serializes each CompiledRule + its operational state into a RuleCatalogueRow. Sorted by rule_id for stable golden snapshots. - Add `description: str` to RuleSchema (pydantic) and CompiledRule (NamedTuple, defaulted) + propagate through `_compile_one` so the catalogue surfaces the human-readable YAML description, not just the slug-style `name`. - Update `tests/ttp/test_rule_engine.py` _fields assertion for the new column; new `tests/api/ttp/test_rules_catalogue.py` pins the catalogue contents (R0001/R0014 presence, row shape, sort order). Worker behaviour is unchanged: it was already loading rules correctly. This is purely a read-side wiring fix on the operator API.	2026-05-02 01:54:06 -04:00
anti	301d3feee9	feat(ttp): E.4.a extract decnet/cli/ttp.py with worker run + backfill CLI The TTP worker entry moved out of decnet/cli/workers.py into its own module so the TTP CLI surface (worker + admin verbs) is colocated, mirroring decnet/cli/canary.py / webhook.py / swarm.py. - New `decnet/cli/ttp.py` with `decnet ttp` (worker, ExecStart-stable for decnet-ttp.service) and `decnet ttp-backfill --since-days N`. - `decnet ttp-backfill` walks Attacker.commands and CanaryTrigger history, dispatches each row through the live CompositeTagger, persists tags via repo.insert_tags (idempotent INSERT OR IGNORE). --dry-run / --source command\|canary\|all / --batch-size supported. - Backfill deliberately bypasses bus publish — historical replay must not re-trigger SIEM/webhook fan-out per TTP_TAGGING.md §"Bus topics" loop-prevention invariant. - Added `iter_attacker_commands_since` / `iter_canary_triggers_since` read-only iterators on TTPMixin + abstract bindings on BaseRepository. - Master-only via gating; both `ttp` and `ttp-backfill` listed in MASTER_ONLY_COMMANDS.	2026-05-02 01:35:17 -04:00
anti	e84b522fd3	feat(ttp): E.3.18c wire RuleEngine via RuleEngineTagger The canonical rule-based engine from §"Tagging engines, layered §1" of TTP_TAGGING.md was fully implemented but never instantiated as a composite child — pure pattern rules (R0014/R0017/R0023/... 23 rules total) had no tagger to dispatch them. - Add `RuleEngineTagger(Tagger)` adapter in rule_engine.py wrapping `RuleEngine.evaluate()`. `HANDLES = {command, http_request, auth_attempt, payload}` — the source kinds whose rules typically live outside any per-source lifter. - Adapter's `watch_store()` filters via `_is_engine_owned` so the engine's dispatch index excludes lifter-claimed rules (`match.kind: lifter:*`) and stays disjoint from per-lifter ownership. - Prepend `RuleEngineTagger` to the `CompositeTagger` lifter list so generic pattern rules dispatch before per-source cross-event logic. - Composes with E.3.18a (worker hydrates `watch_store`) and E.3.18b (worker fans session payloads into per-`command` events) — together these three commits make R0001–R0030 actually fire at runtime.	2026-05-02 01:29:58 -04:00
anti	65435f1427	feat(ttp): E.3.18b worker fans session-ended payloads into per-command events R0001–R0030 declare `applies_to: [command]` and match per command, not per session. The worker now translates one `attacker.session.ended` payload carrying a `commands: list` into: - one source_kind="session" event (behavioral / cross-event lifters) - one source_kind="command" event per command (RuleEngineTagger) Both string and dict command shapes are accepted; dicts contribute their `id` / `uuid` / `command_id` as the per-command source_id so the deterministic `compute_tag_uuid` keeps replays idempotent. Tags from session + per-command dispatch are aggregated into a single `ttp.tagged` envelope per upstream session.	2026-05-02 01:27:37 -04:00
anti	44ade3eb63	fix(ttp): E.3.18a worker hydrates per-lifter rule indexes via watch_store Each per-source lifter holds its own RuleIndex and exposes an `async watch_store()` that loads the corpus and drains store change events forever. Until this commit nothing called `watch_store()` in production — every dispatch index stayed empty and no rule fired. - Add `WatchableTagger` runtime-checkable Protocol in `decnet.ttp.base`. - `CompositeTagger.iter_watchables()` yields lifters that satisfy it. - `run_ttp_worker_loop` fans out one task per watchable, cancelled and awaited alongside pump/heartbeat/control in the existing finally. - Watch failures log and exit the watch task without taking the worker down — mirrors the pump-task tolerance contract.	2026-05-02 01:25:15 -04:00
anti	9a31d0e50c	feat(ttp): E.3.17 worker registration + scoped schemathesis suite Wires decnet-ttp as a first-class worker: * `decnet ttp` CLI command (master-only via MASTER_ONLY_COMMANDS) * deploy/decnet-ttp.service.j2 systemd unit (After= identity / intel / reuse-correlator workers; ProtectHome=read-only since FilesystemRuleStore only reads ./rules/ttp/) * deploy/decnet.target Wants= chain extended with decnet-ttp.service * `ttp` was already in web/worker_registry.KNOWN_WORKERS tests/api/test_schemathesis_ttp.py: TTP-routes-only schemathesis suite, filtered via the OpenAPI tags=["TTP Tagging"] annotation shared by the eight TTP routes. Reuses the live uvicorn subprocess the wider test_schemathesis spawns; max_examples=400 keeps the focused gate fast for E.3.13–E.3.16 iteration. wiki-checkout/Service-Bus.md committed in its own repo: ttp.tagged and ttp.rule.fired.<id> flipped from "reserved (TTP worker)" to "decnet.ttp.worker" now that the worker publishes them.	2026-05-01 21:26:46 -04:00
anti	403d83faba	feat(ttp): E.3.15 UKC bridge — production phase-handoff edge fires Add BaseRepository.list_ttp_decky_phases(identity_uuid) returning per-decky tag observations as (decky_id, tactic, created_at_ts) rows ordered by creation time. Rewrite from_identity_row() to project tactic → UKCPhase via tactic_to_ukc_phase and populate the four phase-handoff maps (first/last_phase_per_decky, first/last_seen_per_decky) so combined_campaign_weight finally lights up on real DB rows — not just synthetic fixtures. ConnectedComponentsCampaignClusterer.tick() pulls each active identity's per-decky phase observations before projecting features. Repo failures are non-fatal: a partial repo falls back to the empty phase-handoff signal (legacy behavior) so the worker stays up. tests/clustering/test_ttp_phase_handoff.py pins the production-row pair clearing CAMPAIGN_EDGE_THRESHOLD on a C2 → DISCOVERY hand-off — the trip-wire that says the whole project paid off. commands_by_phase_on_decky itself stays empty on the production path: it is consumed only by the synthetic-fixture similarity surface, and the phase-handoff edge does not use it. Synthetic fixtures still populate it directly via from_synthetic_identity.	2026-05-01 21:01:58 -04:00
anti	101127247e	feat(ttp): E.3.14 worker bootstrap (insert + ttp.tagged publish) Inner loop drains a per-process asyncio.Queue populated by one pump task per topic in _TOPICS, dispatches each event through CompositeTagger, persists via repo.insert_tags(), and publishes ttp.tagged + per-technique ttp.rule.fired.<id> only when the insert returned a non-zero rowcount. CompositeTagger seeded with all six lifters (Behavioral, Intel, CanaryFingerprint, Email, Identity, Credential). Loop-prevention invariant from TTP_TAGGING.md §"Bus topics" enforced: N replays of the same upstream event publish exactly one ttp.tagged event. test_worker_bus covers both the direct invocation path and the idempotency replay path. Intel catch-up via attacker.session.ended is intentionally deferred to E.3.14b — needs a session→intel join the repo doesn't expose yet.	2026-05-01 20:57:57 -04:00
anti	322fd44d72	feat(ttp): E.3.13 IdentityLifter + CredentialLifter (R0001-R0006) IdentityLifter owns lifter:identity_* — currently R0003 (password spraying). CredentialLifter owns lifter:credential_* — R0001 generic auth brute, R0002 password guessing, R0004 credential reuse, R0005 valid-account use, R0006 default credentials. YAMLs R0001/R0002/R0003/R0005/R0006 had their match.kind normalised to fit the lifter prefix scheme — the design doc's promised "YAMLs normalised in a separate refactor commit" lands here. Identity-rollup tags null out attacker_uuid on emit so the worked- example invariant holds (the tag belongs to the Identity, never to one member IP). Tests: test_identity_lifter.py + test_credential_lifter.py cover each predicate's positive/negative path, state modulation (disabled/clipped/expired), source-kind gating, and idempotent replay. test_lifter_absence and test_lifters updated for the new ctor signature.	2026-05-01 20:52:56 -04:00
anti	7a89fbb357	feat(ttp): E.3.12 EmailLifter (R0041-R0048) SMTP message-level technique tagger per Appendix A.6: open relay abuse (rcpt_count + foreign From), mass phishing (rcpt_count + body simhash), phishing-kit X-Mailer, IDN/punycode URL, sender masquerade composite (From/Return-Path/DKIM/SPF), malicious attachment (macro/.lnk/.iso/.img/ hash match), BEC subject+body composite, encoded payload in body. PII discipline (TTP_TAGGING.md §'Hard parts §6') is enforced at the lifter layer via _filter_evidence(): emitted TTPTag.evidence is restricted to the EmailEvidence-allowed allowlist (body_sha256, matched_headers — names only, rcpt_domain_set — domains only, attachment_sha256s, rcpt_count) plus PII-safe match discriminators (matched_kit, matched_trigger, matched_url_host, etc). Raw addresses, raw body bytes, full URLs, and decoded base64 previews NEVER appear in evidence — defense-in-depth over the YAML evidence_fields hint. Tests: tests/ttp/test_email_lifter.py per-rule positive + negative + PII allowlist guard + state modulation. tests/ttp/rule_precision/ test_email_rules.py xfail flipped to real precision (R0041-R0048 H-band ≥95%). Corpus rows updated to acknowledge that R0045 (masquerade) co-fires with R0041 / R0047 when the sender-masquerade signals are present alongside open-relay or BEC patterns — overlap is by design, not a precision bug.	2026-05-01 20:31:03 -04:00
anti	f211d394e6	feat(ttp): E.3.11 CanaryFingerprintLifter (R0049-R0053) Browser-payload derivations per Appendix A.9: navigator.webdriver flag, canvas/audio/WebGL automation hash matches (Puppeteer/Playwright/ Selenium/curl-impersonate), WebRTC IP leak, TZ/language vs source-IP geo mismatch, navigator.platform vs userAgent vs WebGL renderer inconsistency. Evidence shape pinned to CanaryFingerprintEvidence (metric + matched_signature) — raw fingerprint blobs (canvas hashes, full UAs, navigator.platform values) explicitly NOT carried into TTPTag.evidence per TTP_TAGGING.md §'Hard parts §7' (enrichment vs tag boundary). The identity-merge guard rail is preserved: composite fp.id matches across IPs are NOT a TTP, so no rule fires on the bare hash. Tests: tests/ttp/test_canary_fingerprint_lifter.py per-rule positive + negative + evidence-shape guard + state modulation. tests/ttp/rule_precision/test_canary_rules.py xfail flipped to real precision (R0049/R0050/R0051/R0053 H-band ≥95%; R0052 M-band ≥80%).	2026-05-01 20:25:57 -04:00
anti	7865e71aa9	feat(ttp): E.3.10 IntelLifter (R0054-R0058) Per-provider verdict translator for AbuseIPDB, GreyNoise, Feodo Tracker, and ThreatFox per Appendix A.10. Each rule's predicate inspects payload fields produced by the enrich worker (no DB I/O, no decnet.intel.* imports — E.2.7 decoupling guard preserved). AbuseIPDB confidence is scaled by abuse_confidence_score / 100; categories drive per-technique fan-out. R0058 aggregate-bump is a no-op in v0 (cross-tag bump deferred to E.3.14 worker bootstrap). Per-provider null tolerance is the steady state — a missing provider column produces zero tags from that rule, never an error. Tests: - tests/ttp/test_intel_lifter.py — per-provider positive + negative + state modulation + decoupling source-import guard. - tests/ttp/rule_precision/test_intel_rules.py — xfail flipped, real precision driven over seed_intel.jsonl (R0054-R0057 H-band ≥95%; R0058 skipped as bump-only). - tests/ttp/test_lifter_absence.py — IntelLifter all-populated test flipped from xfail-strict to real assertion with realistic payload. - tests/ttp/test_lifters.py — partial-null xfail flipped to real assertion.	2026-05-01 20:23:42 -04:00
anti	eff3e4bce7	feat(ttp): E.3.9 BehavioralLifter (R0031-R0040) Reads pre-shaped session aggregates from TaggerEvent.payload and emits techniques per Appendix A behavior tables. Per-rule predicates dispatch on match.kind (lifter:behavioral_<name>); the lifter holds its own RuleIndex watching the same RuleStore as the engine, so disable / clip / TTL state reaches lifter-bound rules through the same atomic-swap path. R0032/R0036/R0037/R0040 YAMLs had over-escaped regex strings (\\ instead of \\) — fixed in place. Factory wired so default get_tagger() returns CompositeTagger with BehavioralLifter shipped; remaining three lifters (E.3.10-E.3.12) land in subsequent commits. E.2.6 contract preserved via TolerantTagger: empty payload steady-state yields [] with zero ERROR records. Disabled / clipped / expired state verified.	2026-05-01 20:17:59 -04:00
anti	e7531ee756	refactor(ttp): extract RuleIndex from RuleEngine E.3.9.0 prerequisite for the per-source lifters (E.3.9-E.3.13). The dispatch index, install/evict/apply_change atomic-swap protocol, and state-modulation helpers (is_active / apply_ceiling) move out of rule_engine.py into _rule_index.py and _state.py. RuleEngine wraps a RuleIndex; back-compat shims preserve _by_kind / _by_rule / _install attribute access for tests poking at the dispatch internals. Lifters in E.3.9-E.3.12 will each hold their own RuleIndex, watching the same RuleStore via subscribe_changes() fan-out. Hot-reload semantics (disable / clip / TTL via set_state API) now reach lifter-bound rules through the same atomic-swap path the engine uses, not a future composite-rebuild compromise.	2026-05-01 20:09:18 -04:00
anti	b819dfefa3	feat(ttp): E.3.8 R0054-R0058 intel cohort + mark step done 5 YAMLs for the intel-verdict cohort per Appendix B / A.10: AbuseIPDB category mapping, GreyNoise classification, Feodo Tracker hit, ThreatFox IOC type, aggregate-malicious bump-only. IntelLifter (E.3.10) consumes by rule_id and tolerates absence silently (null provider column → no tag). R0058 is the meta bump-only rule — emits a single confidence=0.0 sentinel so it validates and surfaces in the catalogue, but the repository's sub-0.3 drop ensures no fresh tag persists if the fanout fires accidentally. test_intel_rules.py pins that zero-confidence invariant. Marks E.3.8 done in development/TTP_TAGGING.md with the cohort- split summary.	2026-05-01 09:22:48 -04:00
anti	dc1867315d	feat(ttp): E.3.8 R0049-R0053 canary fingerprint cohort 5 YAMLs for the canary-fingerprint cohort per Appendix B / A.9: navigator.webdriver flag, automation canvas/audio/WebGL hash match, WebRTC IP leak, TZ/lang vs geo mismatch, platform inconsistency. CanaryFingerprintLifter (E.3.11) consumes by rule_id. test_canary_rules.py: YAML-present + inert-in-v0 + xfail(strict) gated on E.3.11.	2026-05-01 09:21:01 -04:00
anti	1ad15470a1	feat(ttp): E.3.8 R0041-R0048 email cohort 8 YAMLs for the email cohort per Appendix B: open-relay abuse, mass phishing, phishing-kit X-Mailer signatures, IDN/punycode URLs, sender masquerade, malicious attachment, BEC, encoded payload in body. EmailLifter (E.3.12) consumes by rule_id. test_email_rules.py: YAML-present + inert-in-v0 + xfail(strict) precision case gated on E.3.12.	2026-05-01 09:19:56 -04:00
anti	806301e179	feat(ttp): E.3.8 R0031-R0040 behavioral cohort 10 YAMLs for the behavioral / cross-event cohort per Appendix B: beaconing, data destruction, ransom note, web exfil, DB mass-read, credentials-in-files, k8s SA token harvest, Docker host escape, LLMNR poisoning, TFTP router-config retrieval. Every rule is lifter-bound (BehavioralLifter / IdentityLifter) — the v0 RuleEngine cannot count, aggregate, or compose cross-event signals, so these YAMLs declare the technique mappings the lifter will consume by rule_id at E.3.9. Their match specs use a 'kind: lifter:*' shape inert to the regex matcher. test_behavioral_rules.py asserts each YAML compiles, none fire from the v0 engine (FP regression guard against a YAML drifting into a regex), and an xfail(strict=True, reason='impl phase E.3.9') precision case that will flip green when the lifter lands.	2026-05-01 09:18:27 -04:00
anti	b1fe1f9403	feat(ttp): E.3.8 R0001-R0030 command cohort 30 YAMLs for the shell/command rule cohort per Appendix B (rules/ttp/). Splits into engine-active (R0007-R0029, regex on command_text / raw_url / user_agent) and lifter-bound (R0001-R0006, R0030 — the v0 RuleEngine cannot count auth attempts, do identity rollups, or parse fingerprint blobs; the BehavioralLifter / IdentityLifter / CredentialLifter consume them by rule_id at E.3.9 / E.3.13). test_command_rules.py asserts: - every R000N has a YAML that compiles - lifter-bound rules NEVER fire from the v0 engine (regression guard against a YAML drifting into a regex match.spec) - engine-active rules meet their Appendix-C precision target against the seed corpus (≥0.95 high-conf, ≥0.80 medium) Conftest fixes: precision_engine moved to module-scope so module- scope precomputed dispatch fixture (fired_by_label) can request it; _RULES_DIR path bumped from parents[2] to parents[3] so the loader resolves the project root regardless of pytest cwd; make_event synthesizes attacker_uuid so TTPTag's anchor invariant is satisfied. Seed corpus broadened: positive examples for every regex rule plus 6 negative examples across innocuous shell verbs (ls, echo, cd, ps, df, free) so FPs surface in precision rather than passing vacuously.	2026-05-01 09:16:38 -04:00
anti	c635478442	feat(ttp): E.3.8 corpus + harness — labelled holdout fixture Sub-step preceding the rule-pack commits per TTP_TAGGING.md:2967. Adds the per-rule precision suite scaffolding under tests/ttp/rule_precision/: - conftest.py: precision_engine fixture (RuleEngine populated from ./rules/ttp/), corpus_loader (real → seed → empty fallback), precision_for() helper for TP/FP accounting. - _build_corpus.py: extractor for a real prod corpus pull. Mandatory --exclude-ip / DECNET_TTP_CORPUS_EXCLUDE_IPS — operator IPs never end up in the committed exclusion list. Pulls both 'command' and 'unknown_command' event types. - corpus/seed_*.jsonl: synthetic seed rows for each cohort so the harness exercises in clean checkouts. - corpus/*.jsonl (operator-built) is gitignored. - test_corpus_loads.py: sentinel that every seed file parses.	2026-05-01 09:08:07 -04:00
anti	ed3f340ea8	feat(ttp): E.3.7 RuleEngine — evaluate + atomic-swap watch_store Implements the rule engine body left empty at contract phase: evaluate() dispatches by source_kind through self._by_kind, runs the rule's match spec against event.payload, and emits one TTPTag per emits entry. watch_store() loads the initial corpus from RuleStore.load_compiled, then drains subscribe_changes, applying definition changes via single-statement dict assignment (atomic swap, GIL-atomic to readers) and state changes via NamedTuple._replace on the existing CompiledRule. Why: with the FS + DB stores in place (E.3.5/E.3.6), the engine is the last piece of the rule plane. Lifters (E.3.9–E.3.13) consume the engine; the worker bootstrap (E.3.14) wires watch_store into the asyncio event loop. After this commit a CompositeTagger constructed with a RuleEngine + a populated rules dir will produce real tags. Notes: - CompiledRule.emits extended to 4-tuple (technique_id, sub_technique_id, tactic, confidence). Tactic + confidence ride per-emit so a single rule can carry multiple precision targets (the "one event maps to many techniques" property). Compile helpers in both backends extract them from the YAML emits dict; missing tactic or confidence is a deploy-time error. - v0 match operator is "pattern" (regex). The field defaults per source_kind (command_text / raw_url / subject / verdict / …) and is overridable via match.field. Future ops (contains, equals, in_set) extend _match_event without touching the engine surface. - Confidence model: rules with state="clipped" + confidence_max set cap the per-emit confidence downward; clipped is a soft suppress, not a hard skip. Disabled rules are skipped wholly; expires_at past is re-checked at evaluate as defense-in-depth (the store auto-reverts, but a racing read between expiry and revert must not fire the rule). 
- _span(name, **attrs) helper in engine + both stores short-circuits on decnet.telemetry._ENABLED — matches the project's @traced / wrap_repository zero-overhead-when-disabled pattern instead of relying solely on the no-op tracer indirection. - Late-bound tracer (telemetry.get_tracer called per-span, not at module load) so test_tracing's monkeypatch reaches the production code path. xfails flipped: tests/ttp/test_rule_engine.py multi-emit fan-out + rule_version-collision-via-engine; tests/ttp/test_multi_mapping.py N×M engine fan-out + idempotent replay; tests/ttp/test_tracing.py ttp.eval span hierarchy + ttp.rule.fire span attributes. Tests: 214 passed, 19 xfailed (gated on E.3.8 lifters / rule pack / worker bootstrap). mypy: clean on prod code; pre-existing test-stub arg-type warnings unchanged.	2026-05-01 08:49:15 -04:00
anti	8a93ee3129	feat(ttp): E.3.6 DatabaseRuleStore — ttp_rule/ttp_rule_state + master sync Implements the DB-backed rule store body left empty at contract phase: load_compiled reads from ttp_rule + ttp_rule_state; get_state / set_state hit ttp_rule_state with the same expires_at auto-revert and bus-event semantics as the FS backend; subscribe_changes returns a per-subscriber queue. State persists across process restarts — the swarm property the FS backend deliberately doesn't have. Also lands two swarm-mode helpers: - sync_from_filesystem(fs_store) — master-side, subscribes to a FilesystemRuleStore and projects each RuleChange onto a ttp_rule upsert/delete. - tail_db(poll_interval) — worker-side, watermark poll over ttp_rule.updated_at; emits RuleChange("definition", ...) for each row that moved. Why: swarm mode needs rule definitions and operator state to propagate across hosts. The filesystem backend (E.3.5) was the single-host-dev variant; this one survives restart and serves N workers from a shared DB. Notes: - DatabaseRuleStore() with no args lazy-inits an in-memory SQLite repo so the conformance fixture works without test plumbing. In production the worker bootstrap (E.3.14) passes an explicit repo. - The conftest.py rule_store fixture became async (pytest_asyncio), per-backend creates/initializes a SQLite repo for the DB run. - Adds a `seed_rule(store, rule_id, yaml)` helper to bridge backend semantics: drop a YAML file (FS) vs insert a ttp_rule row (DB). Used by the parametrized load_compiled conformance test. - Late-bound _tracer() in both backends (was module-level get_tracer binding) so test_tracing's monkeypatch of decnet.telemetry.get_tracer actually affects span output. 
xfails flipped: tests/ttp/store/test_database.py set_state-writes-to- ttp_rule_state + filesystem-to-DB sync; tests/ttp/store/test_conformance.py DB-side load_compiled / set_state isolation / round-trip / per-rule fan-out / expired-state revert / set_state failure / get_state default (was xfail-only-on-DB); tests/ttp/test_tracing.py set_state span hierarchy. Tests: 208 passed, 25 xfailed (gated on E.3.7 + lifters). mypy: clean on all touched files.	2026-05-01 08:39:46 -04:00
anti	f41995a229	feat(ttp): E.3.5 FilesystemRuleStore — inotify hot-reload + per-rule events Implements the filesystem-backed rule store body left empty at contract phase: YAML parse + Pydantic validation, asyncinotify watch over ./rules/ttp/, in-process state cache with auto-revert on expires_at, and a subscribe_changes() async iterator yielding one RuleChange per per-rule edit. Bus topic builders ttp_rule_reloaded / ttp_rule_state ship alongside. Why: the rule plane needed a store before the engine (E.3.7) could consume RuleChange events and atomically swap compiled rules into its dispatch index. Notes: - Linux-only by construction (asyncinotify wheel gated by sys_platform marker; FilesystemRuleStore.__init__ raises on non-Linux). - Filename allowlist is the FIRST check on every inotify event. - Content-hash dedup so a single write firing IN_CREATE + IN_CLOSE_WRITE produces exactly one RuleChange. - All compile work serializes on a single asyncio.Lock. - Subscribers register their queue eagerly so events fired between subscribe_changes() and the first __anext__() are buffered. xfails flipped: per-save-style + filter-ordering + atomic-swap in test_filesystem.py; load_compiled / set_state isolation / round-trip / per-rule fan-out / expired-state revert / set_state failure semantics in test_conformance.py (FS side; DB side stays xfail until E.3.6); malformed-YAML compile-time check in test_rule_engine.py. Tests: 197 passed, 35 xfailed (gated on E.3.6 / E.3.7 / lifters). mypy + bandit: clean on all touched files. Wiki update for the per-rule reload + state-change topics lands in a matching wiki-checkout/Service-Bus.md edit (separate repo).	2026-05-01 08:31:05 -04:00
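
The content-hash dedup noted in f41995a229 (one write firing IN_CREATE + IN_CLOSE_WRITE yields exactly one RuleChange) can be sketched roughly as follows. This is an illustration only, not the DECNET implementation: `ContentHashDedup`, `should_emit`, and `forget` are hypothetical names, the choice of SHA-256 is an assumption, and the inotify plumbing is replaced by a hand-written event list.

```python
import hashlib


class ContentHashDedup:
    """Suppress repeat notifications for unchanged file content (hypothetical helper)."""

    def __init__(self) -> None:
        # Maps a rule file path to the SHA-256 hex digest last emitted for it.
        self._last_digest: dict[str, str] = {}

    def should_emit(self, path: str, content: bytes) -> bool:
        """Return True only when content differs from what was last emitted for path."""
        digest = hashlib.sha256(content).hexdigest()
        if self._last_digest.get(path) == digest:
            return False  # same bytes as last time: duplicate event, skip it
        self._last_digest[path] = digest
        return True

    def forget(self, path: str) -> None:
        """Drop the stored digest, e.g. after the file is deleted."""
        self._last_digest.pop(path, None)


dedup = ContentHashDedup()
yaml_bytes = b"rule_id: R0014\n"
# One save may surface as two filesystem events carrying identical bytes.
events = [("IN_CREATE", yaml_bytes), ("IN_CLOSE_WRITE", yaml_bytes)]
emitted = [name for name, body in events if dedup.should_emit("R0014.yaml", body)]
print(emitted)  # ['IN_CREATE']
```

Keying the cache on path plus digest means a genuine edit still produces a change event, while a re-save of identical bytes stays silent.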

178 Commits