feat(ttp): split bash CMD evidence into structured uid/user/src/pwd/cmd rows

The inspector was dumping the whole `CMD uid=0 user=root src=… pwd=…
cmd=nmap -p- 192.168.1.0/24` syslog body into a single `command_text`
blob. ANTI: "I'd like to separate the fields." Done. Three layers now
work together:

1. Collector session aggregator: new `_parse_cmd_msg` splits the bash
   PROMPT_COMMAND msg into `{uid, user, src, pwd, command}`. The
   session-ended envelope's per-command dict now carries the
   structured fields, with `command_text` set to just the cmd= value
   (preserving embedded whitespace — `nmap -p- 1.2.3.0/24` etc.).

2. Rule engine: per-source_kind auxiliary evidence list
   (`_AUX_EVIDENCE_FIELDS`). For `command` events the engine
   automatically promotes uid/user/src/pwd into the persisted
   `evidence` dict on top of the rule's explicit `evidence_fields`.
   Engine-controlled, not per-rule — adding a new aux field is one
   line here, not a 30-rule YAML sweep, and rule authors can't
   accidentally drop it.

3. TTPInspector frontend: evidence renders as a structured
   `kvs` grid (UID / USER / SRC / PWD / CMD rows) instead of
   pretty-printed JSON. Primary-order list keeps shell fields at
   the top; everything else falls below alphabetically so unfamiliar
   evidence shapes still surface predictably.
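
A minimal sketch of the split described in layer 1, assuming the
`key=value` layout shown in the example message above and that
`cmd=` is always the last field. `parse_cmd_msg_sketch` is a
hypothetical stand-in written for illustration; it is not the
`_parse_cmd_msg` added by this commit, and it assumes `pwd` contains
no spaces.

```python
from typing import Optional

# Header keys expected before the trailing cmd= field (assumed layout).
_HEADER_KEYS = ("uid", "user", "src", "pwd")


def parse_cmd_msg_sketch(msg: str) -> Optional[dict[str, str]]:
    """Split a 'CMD uid=… user=… src=… pwd=… cmd=…' body into fields.

    Returns None when the message is not a CMD line or has no cmd= part.
    """
    if not msg.startswith("CMD "):
        return None
    # Cut at the first " cmd=" so everything after it, including any
    # embedded whitespace, stays intact as the command.
    head, sep, command = msg.partition(" cmd=")
    if not sep:
        return None
    fields: dict[str, str] = {}
    for token in head.split()[1:]:  # skip the leading "CMD" marker
        key, eq, value = token.partition("=")
        if eq and key in _HEADER_KEYS:
            fields[key] = value
    fields["command"] = command
    return fields
```

Partitioning on the first ` cmd=` rather than splitting on every
space is what keeps a command such as `nmap -p- 1.2.3.0/24` in one
piece.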

Tests:
- session_aggregator pins the structured-fields emit (uid/user/src/
  pwd/command_text without "CMD" prefix, embedded whitespace
  preserved).
- rule_engine_tagger pins the aux-field auto-promotion + the
  no-`None`-leakage path when payload doesn't carry an aux key.
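
The aux-field promotion and the no-`None`-leakage path can be
sketched in isolation as below. This is an illustrative reduction,
not the engine code: `build_evidence_sketch` and the plain-dict
payload are assumptions standing in for the `TaggerEvent` /
`CompiledRule` objects used in `_evaluate_rules`.

```python
from typing import Any, Mapping, Sequence

# Mirrors the shape of _AUX_EVIDENCE_FIELDS from this commit.
AUX_EVIDENCE_FIELDS: dict[str, tuple[str, ...]] = {
    "command": ("uid", "user", "src", "pwd"),
}


def build_evidence_sketch(
    source_kind: str,
    payload: Mapping[str, Any],
    evidence_fields: Sequence[str],
) -> dict[str, Any]:
    """Collect rule-declared fields, then add aux fields for the kind."""
    # Explicit per-rule fields first; absent keys are skipped.
    evidence = {f: payload[f] for f in evidence_fields if f in payload}
    # Aux fields are added only when present in the payload, so a
    # missing key never shows up as a None entry.
    for aux in AUX_EVIDENCE_FIELDS.get(source_kind, ()):
        if aux in payload and aux not in evidence:
            evidence[aux] = payload[aux]
    return evidence
```

With a `command` payload that carries `uid` and `user` but no `src`
or `pwd`, the result holds only the keys that were present; for any
other `source_kind` no aux fields are added.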
2026-05-02 03:20:53 -04:00
parent 84699f89da
commit d1c4a48963
6 changed files with 268 additions and 4 deletions


@@ -296,6 +296,21 @@ _DEFAULT_MATCH_FIELD: dict[str, str] = {
}
# Per-``source_kind`` auxiliary evidence fields that the engine
# auto-promotes onto every emitted tag, on top of the rule's
# explicit ``evidence_fields`` list. The point is operator UX: when
# a shell rule fires on ``cat /etc/shadow``, the inspector should
# show *who* ran it (``user``), *where from* (``src``), *as whom*
# (``uid``), and the working directory (``pwd``) — without forcing
# every rule author to add the same four fields to every shell
# rule's ``evidence_fields`` list. Engine-controlled, not per-rule:
# adding a new aux field is a one-line edit here, not a 30-rule
# YAML sweep.
_AUX_EVIDENCE_FIELDS: dict[str, tuple[str, ...]] = {
    "command": ("uid", "user", "src", "pwd"),
}


def _evaluate_rules(
    rules: list[CompiledRule], event: TaggerEvent,
) -> list[TTPTag]:
@@ -330,6 +345,12 @@ def _evaluate_rules(
                for field in rule.evidence_fields
                if field in event.payload
            }
            # Engine-controlled auxiliary fields per source_kind —
            # added on top of the rule's explicit list so the
            # inspector always sees uid/user/src/pwd on shell tags.
            for aux in _AUX_EVIDENCE_FIELDS.get(event.source_kind, ()):
                if aux in event.payload and aux not in evidence:
                    evidence[aux] = event.payload.get(aux)
            out.append(TTPTag(
                uuid=tag_uuid,
                source_kind=event.source_kind,