test(clustering): fixture 4 paused_campaign + active_days/time_window

Adds the actor.active_days primitive to the campaign factory so a DSL actor can be bound to specific day indexes. When absent, the factory falls back to the non-paused day pool, so existing fixtures are unchanged. When present, it is intersected with pause_windows, so the campaign-wide silence still wins if both are set.

Adds a time_window_clusterer reference to fixture_harness: union-find over attackers, with an edge between two attackers if their session time ranges are within gap_days of each other. It is a deliberately bad reference for fixture 4: multi-day silent stretches fragment a single campaign because the clusterer has no signal that bridges the gap.

Fixture 4 (paused_campaign): one campaign modeled as two DSL actors representing the operator's two operational windows (active days 1-2 and 6-7), separated by a silent stretch (days 3-5). Both share JA3 + HASSH + payload + C2 callback; only their active_days differ.

Five tests:
- corpus shape (rows in their windows, shared signals)
- pipeline pass via fingerprint_clusterer at level=campaign
- adversarial fragmentation via time_window_clusterer (a 1-day union threshold cannot bridge the roughly 4-day gap between the two sprints, so completeness collapses)
- huge-gap sanity (gap_days=10 unions both halves)
- silent-stretch invariant (no session leaks into the configured pause window)

Identity-level scoring is fixture 2's job; this fixture is campaign-level only. The modeling caveat is documented in the YAML.
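For reviewers, a minimal sketch of the union-find idea behind the time-window reference, not the harness code itself. The function name, the `(first_ts, last_ts)` range representation, and the sample timestamps (derived from the fixture's day indexes and 09:00-16:59 activity hours) are assumptions made for illustration.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600


def time_window_clusters(ranges, gap_days):
    """Cluster attackers whose session time ranges lie within gap_days.

    `ranges` maps attacker id -> (first_ts, last_ts) in seconds (assumed shape).
    """
    parent = {attacker: attacker for attacker in ranges}

    def find(node):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    ids = sorted(ranges)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            a_start, a_end = ranges[a]
            b_start, b_end = ranges[b]
            # Distance between the two ranges; 0 when they overlap.
            gap = max(0, max(a_start, b_start) - min(a_end, b_end))
            if gap <= gap_days * SECONDS_PER_DAY:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for attacker in ids:
        clusters[find(attacker)].append(attacker)
    return sorted(clusters.values())


# Illustrative ranges: sprint 1 on day indexes 0-1, sprint 2 on 5-6.
ranges = {
    "ops-sprint-1": (0 * SECONDS_PER_DAY + 9 * SECONDS_PER_HOUR,
                     1 * SECONDS_PER_DAY + 17 * SECONDS_PER_HOUR),
    "ops-sprint-2": (5 * SECONDS_PER_DAY + 9 * SECONDS_PER_HOUR,
                     6 * SECONDS_PER_DAY + 17 * SECONDS_PER_HOUR),
}

fragmented = time_window_clusters(ranges, gap_days=1)   # two clusters
merged = time_window_clusters(ranges, gap_days=10)      # one cluster
```

With these ranges the sprints sit a little under four days apart, so a 1-day threshold leaves them in separate clusters while a 10-day threshold unions them, mirroring the adversarial and huge-gap tests.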
2026-04-26 07:39:46 -04:00
parent 0def6f7e37
commit 304592abfe
5 changed files with 334 additions and 11 deletions
--- a/tests/fixtures/campaigns/paused_campaign.expected.yaml
+++ b/tests/fixtures/campaigns/paused_campaign.expected.yaml
@@ -0,0 +1,24 @@
+# Bounds for fixture 4 (paused_campaign).
+#
+# Ground truth at campaign-level: 1 campaign of 2 observation rows
+# (one per DSL actor — modeling the operator's two operational
+# windows). A correct algorithm scores 1.0 on every metric.
+#
+# Completeness is the load-bearing metric: a clusterer that lets a
+# multi-day silent period split the campaign tanks completeness
+# (the one true class is split across two predicted clusters,
+# matching the gap). The adversarial time_window_clusterer
+# demonstrates this and the bound below rejects it.
+#
+# This fixture is CAMPAIGN-LEVEL ONLY (see the fixture YAML for
+# why). No identity-level scoring.
+#
+# Bounds are loose at v1; tighten as the algorithm matures.
+adjusted_rand_index:
+  min: 0.85
+homogeneity:
+  min: 0.90
+completeness:
+  min: 0.80
+singleton_recall:
+  min: 0.95
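A worked illustration of why the completeness floor rejects a split campaign. This assumes the harness scores completeness with the standard V-measure definition, 1 - H(predicted | truth) / H(predicted); the helper names and label strings are hypothetical and the actual scoring code may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label list."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) for two parallel label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def completeness(truth, predicted):
    """1.0 when every true class lands in a single predicted cluster."""
    h_pred = entropy(predicted)
    if h_pred == 0:
        return 1.0  # one predicted cluster cannot split any class
    return 1.0 - conditional_entropy(predicted, truth) / h_pred


# Two observation rows, one true campaign.
truth = ["paused-campaign-001", "paused-campaign-001"]

split_score = completeness(truth, ["before-pause", "after-pause"])
merged_score = completeness(truth, ["cluster-0", "cluster-0"])
```

Under that assumed definition, splitting the one true class across two predicted clusters drives completeness to zero, well below the 0.80 floor, while a single merged cluster scores 1.0.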
--- a/tests/fixtures/campaigns/paused_campaign.yaml
+++ b/tests/fixtures/campaigns/paused_campaign.yaml
@@ -0,0 +1,85 @@
+# Fixture 4 (paused_campaign) — see development/CAMPAIGN_CLUSTERING.md §2.
+#
+# One campaign that operates in two sprints with a multi-day silence
+# between them:
+#
+#   active days 1-2 (0-indexed [0, 1]) — Delivery, Exploitation
+#   silent days 3-5 (0-indexed [2, 3, 4]) — pause window
+#   active days 6-7 (0-indexed [5, 6]) — Discovery, Lateral Movement,
+#                                        Exfiltration
+#
+# Modeled as TWO DSL actors representing the same operator's two
+# operational windows. Both share JA3, HASSH, payload, and C2
+# callback — the stable signals a fingerprint-driven clusterer
+# resolves on. Their ``active_days`` differ so each operator-half
+# emits sessions in disjoint time ranges, which is what makes the
+# adversarial time-window clusterer fragment the campaign.
+#
+# Two-actor modeling caveat: the factory mints a separate
+# ``truth_identity_id`` per DSL actor by design (see
+# IDENTITY_RESOLUTION.md: identities are recovered from signals,
+# not declared in the DSL). This is a CAMPAIGN-LEVEL fixture only;
+# identity-level scoring is fixture 2's job. The bound floors in
+# paused_campaign.expected.yaml apply at level=campaign.
+#
+# Pass condition: a fingerprint-driven clusterer must fold both
+# operational windows into one cluster (shared JA3 + HASSH +
+# payload). A clusterer that lets a multi-day quiet period split
+# the campaign fails the completeness floor.
+#
+# Adversarial condition: ``time_window_clusterer`` (unions attackers
+# whose session ranges are at most 1 day apart) cannot bridge the
+# roughly 4-day gap between the two sprints and splits the campaign
+# into "before pause" and "after pause" clusters. Completeness
+# collapses; the bound floor rejects this clusterer.
+campaign:
+  id: paused-campaign-001
+  duration_days: 7
+  pause_windows:
+    - [2, 4]            # campaign-wide silence: days 3-5 (0-indexed 2..4, inclusive)
+  actors:
+    - id: ops-sprint-1
+      asn: 64520
+      ip_pool: sticky
+      ja3: "771,4865-4866-4867-49195-49199-49196-49200,0-23-65281-10-11-35-16-5-13-18-51-45-43-27,29-23-24,0"
+      hassh: "paused-op-dddddddd-dddddddd-dddddddd"
+      hours_active_utc: [9, 10, 11, 12, 13, 14, 15, 16]
+      jitter_seconds: 60
+      active_days: [0, 1]
+    - id: ops-sprint-2
+      asn: 64520        # same ASN — operator stays on same egress
+      ip_pool: sticky
+      ja3: "771,4865-4866-4867-49195-49199-49196-49200,0-23-65281-10-11-35-16-5-13-18-51-45-43-27,29-23-24,0"
+      hassh: "paused-op-dddddddd-dddddddd-dddddddd"
+      hours_active_utc: [9, 10, 11, 12, 13, 14, 15, 16]
+      jitter_seconds: 60
+      active_days: [5, 6]
+  phases:
+    - name: delivery
+      actor: ops-sprint-1
+      target_selector: { service: ssh, count: 2 }
+      dwell_seconds: 1
+    - name: exploitation
+      actor: ops-sprint-1
+      tool_signature:
+        payload_hash: "paused-op-stage1-payload"
+        c2_callback: "c2.paused-op.example"
+      target_selector: { service: ssh, count: 2 }
+      dwell_seconds: 5
+    - name: discovery
+      actor: ops-sprint-2
+      target_selector: { service: ssh, count: 2 }
+      dwell_seconds: 5
+    - name: lateral_movement
+      actor: ops-sprint-2
+      tool_signature:
+        payload_hash: "paused-op-stage1-payload"
+        c2_callback: "c2.paused-op.example"
+      target_selector: { service: ssh, count: 2 }
+      dwell_seconds: 5
+    - name: exfiltration
+      actor: ops-sprint-2
+      tool_signature:
+        c2_callback: "c2.paused-op.example"
+      target_selector: { service: ssh, count: 2 }
+      dwell_seconds: 5
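A small sketch of how the per-actor day pool could be resolved from the settings above. This is an illustration under stated assumptions, not the factory's code: `resolve_day_pool` is a hypothetical name, and pause windows are treated as inclusive `[start, end]` pairs, as the fixture's "silent days [2, 3, 4]" comment implies.

```python
def resolve_day_pool(duration_days, pause_windows, active_days=None):
    """Return the sorted 0-indexed days on which an actor may emit sessions."""
    # Expand each inclusive [start, end] pause window into individual days.
    paused = {
        day
        for start, end in pause_windows
        for day in range(start, end + 1)
    }
    non_paused = [day for day in range(duration_days) if day not in paused]

    # No active_days: fall back to every non-paused day.
    if active_days is None:
        return non_paused

    # active_days is intersected with the non-paused pool, so a
    # campaign-wide pause wins over a conflicting actor binding.
    wanted = set(active_days)
    return [day for day in non_paused if day in wanted]


sprint_1 = resolve_day_pool(7, [(2, 4)], active_days=[0, 1])
sprint_2 = resolve_day_pool(7, [(2, 4)], active_days=[5, 6])
unbound = resolve_day_pool(7, [(2, 4)])
conflicting = resolve_day_pool(7, [(2, 4)], active_days=[1, 2, 3])
```

The last call shows the intersection rule: an actor bound to days 1-3 only keeps day 1, because days 2 and 3 fall inside the pause window.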