docs(wiki): Campaign-Clustering page + sidebar link

Documents the pre-implementation test infrastructure: UKC vocabulary, synthetic campaign factory + DSL, metric harness, fixture layout, and how to run the suite. Algorithm itself isn't built yet — the simulator ships first per the design doc.
2026-04-26 06:34:45 -04:00
parent b0de3cd182
commit ba1862f380
2 changed files with 229 additions and 0 deletions
--- a/Campaign-Clustering.md
+++ b/Campaign-Clustering.md
@@ -0,0 +1,228 @@
+# Campaign Clustering
+
+Pre-implementation feature. Goal: graduate per-attacker attribution into
+**campaign-level grouping**: recovering the fact that a set of distinct
+attacker rows is one coordinated operation, even when IPs,
+ASNs, and toolchains diverge.
+
+The full design lives in the repo at
+[`development/CAMPAIGN_CLUSTERING.md`](https://github.com/dec-net/decnet/blob/main/development/CAMPAIGN_CLUSTERING.md).
+This page documents the **test infrastructure** that ships ahead of
+the algorithm.
+
+The order is deliberate: simulator first, algorithm second. If we cannot
+write down what a campaign *is* in code that produces ground-truth
+labels, we cannot validate any clusterer we build. The simulator is the
+specification.
+
+---
+
+## Vocabulary — Unified Kill Chain
+
+`decnet/clustering/ukc.py` defines `UKCPhase`, the canonical phase enum.
+Pols' Unified Kill Chain (2017): 19 phases across three stages.
+
+| Stage | Phases | Honeypot-observable? |
+|---|---|---|
+| **In** (initial foothold) | reconnaissance, resource_development, weaponization, delivery, social_engineering, exploitation, persistence, defense_evasion, command_and_control | partial — pre-target phases (recon/resource_dev/weaponization/social_eng) happen before any decky is touched |
+| **Through** (network propagation) | pivoting, discovery, privilege_escalation, execution, credential_access, lateral_movement | yes — MazeNET-segmented topologies make this a strength, not a gap |
+| **Out** (action on objectives) | collection, exfiltration, impact, objectives | yes |
+
+`OBSERVABLE_PHASES` is the frozenset (15 of 19) the synthetic generator
+will emit events for. The DSL accepts the full enum so a campaign spec
+can describe an end-to-end story; unobservable phases parse and validate
+but produce no synthetic events. UKC vocabulary aligns with MITRE ATT&CK
+tactics, so the same labels will be produced by the future TTP-tagging
+worker — fixtures don't need renaming when that lands.
+
+```python
+from decnet.clustering.ukc import UKCPhase, OBSERVABLE_PHASES, stage_of
+
+stage_of(UKCPhase.LATERAL_MOVEMENT)  # → "through"
+UKCPhase.RECONNAISSANCE in OBSERVABLE_PHASES  # → False
+```
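+
+For orientation, here is a minimal sketch of how such an enum could be
+laid out. It is not a copy of `decnet/clustering/ukc.py`: the member
+values, the `_PRE_TARGET` helper, and the slicing-based `stage_of` are
+assumptions derived only from the table above.
+
+```python
+from enum import Enum
+
+
+class UKCPhase(str, Enum):
+    # Stage "in" (9 phases)
+    RECONNAISSANCE = "reconnaissance"
+    RESOURCE_DEVELOPMENT = "resource_development"
+    WEAPONIZATION = "weaponization"
+    DELIVERY = "delivery"
+    SOCIAL_ENGINEERING = "social_engineering"
+    EXPLOITATION = "exploitation"
+    PERSISTENCE = "persistence"
+    DEFENSE_EVASION = "defense_evasion"
+    COMMAND_AND_CONTROL = "command_and_control"
+    # Stage "through" (6 phases)
+    PIVOTING = "pivoting"
+    DISCOVERY = "discovery"
+    PRIVILEGE_ESCALATION = "privilege_escalation"
+    EXECUTION = "execution"
+    CREDENTIAL_ACCESS = "credential_access"
+    LATERAL_MOVEMENT = "lateral_movement"
+    # Stage "out" (4 phases)
+    COLLECTION = "collection"
+    EXFILTRATION = "exfiltration"
+    IMPACT = "impact"
+    OBJECTIVES = "objectives"
+
+
+# Enum iteration follows definition order, so slices give the stages.
+_ORDER = list(UKCPhase)
+_IN, _THROUGH = _ORDER[:9], _ORDER[9:15]
+
+# Hypothetical helper: phases that happen before any decky is touched.
+_PRE_TARGET = frozenset({
+    UKCPhase.RECONNAISSANCE,
+    UKCPhase.RESOURCE_DEVELOPMENT,
+    UKCPhase.WEAPONIZATION,
+    UKCPhase.SOCIAL_ENGINEERING,
+})
+OBSERVABLE_PHASES = frozenset(UKCPhase) - _PRE_TARGET
+
+
+def stage_of(phase: UKCPhase) -> str:
+    """Return "in", "through", or "out" for a phase."""
+    if phase in _IN:
+        return "in"
+    if phase in _THROUGH:
+        return "through"
+    return "out"
+```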
+
+---
+
+## The synthetic campaign factory
+
+`tests/factories/campaign_factory.py` parses a YAML DSL describing
+actors, UKC phases, and tool signatures, and emits truth-labeled
+`SyntheticAttacker` / `SyntheticSession` records.
+
+**Key contract:** deterministic given a seed. Identical YAML + identical
+seed → identical attacker IDs and session IDs across runs. This is
+load-bearing for fixture stability and is checked by an explicit test.
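+
+One way to meet that contract, sketched under assumptions: derive every
+ID from a hash of the seed plus stable spec fields, and draw randomness
+from a local `random.Random` instead of the module-level generator. The
+names `stable_id` and `session_offsets` are hypothetical and do not come
+from the factory.
+
+```python
+import hashlib
+import random
+
+
+def stable_id(prefix: str, seed: int, *parts: str) -> str:
+    """Hypothetical helper: same seed + same parts gives the same ID."""
+    material = "|".join([str(seed), *parts]).encode()
+    return f"{prefix}-{hashlib.sha256(material).hexdigest()[:16]}"
+
+
+def session_offsets(seed: int, actor_id: str, count: int, jitter_seconds: int) -> list[int]:
+    """Hypothetical helper: per-actor jitter drawn from a private RNG."""
+    # A dedicated Random instance keeps other code that touches the
+    # global generator from shifting the sequence between runs.
+    rng = random.Random(f"{seed}:{actor_id}")
+    return [rng.randint(0, jitter_seconds) for _ in range(count)]
+
+
+first = stable_id("att", 0, "c-apt-fauxbear-01", "a-001")
+again = stable_id("att", 0, "c-apt-fauxbear-01", "a-001")
+print(first == again)  # True
+```
+
+Seeding from a string avoids Python's per-process `hash()` salt, which
+would otherwise make IDs differ from run to run.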
+
+### Quick poke from the shell
+
+The factory is a library, not a test module — pytest does not collect
+it. Drive it directly:
+
+```bash
+source .311/bin/activate
+python -c "
+from tests.factories.campaign_factory import generate, load_yaml
+spec = load_yaml('tests/fixtures/campaigns/lone_wolf.yaml')
+corpus = generate(spec, seed=0)
+print(f'{len(corpus.attackers)} attackers, {len(corpus.sessions)} sessions')
+for a in corpus.attackers[:3]:
+    print(f'  {a.attacker_id[:24]} → campaign={a.truth_campaign_id}')
+print('truth labels:', corpus.truth_labels())
+"
+```
+
+### DSL shape
+
+```yaml
+corpus:
+  campaigns:
+    - campaign:
+        id: c-apt-fauxbear-01
+        actors:
+          - id: a-001
+            asn: 14061
+            ip_pool: rotating         # rotating | sticky | tor
+            ja3: "769,4865-..."
+            hassh: "aae6b9..."
+            hours_active_utc: [22, 23, 0, 1, 2, 3]
+            jitter_seconds: 90
+        phases:                       # any UKCPhase value
+          - name: delivery
+            actor: a-001
+            tool_signature: { user_agent: "Nmap" }
+            target_selector: { count: 50 }
+            dwell_seconds: 1
+          - name: persistence
+            actor: a-001
+            target_selector: { decky: previous_success }
+        duration_days: 7
+        pause_windows: []             # [[start_day, end_day], ...]
+  noise:
+    scanner_count: 8                  # opportunistic singletons
+```
+
+Single-campaign specs may omit the `corpus:` wrapper and provide
+`campaign:` at the top level.
+
+### Hard validation errors
+
+The DSL parser (`_validate_campaign_spec`) raises `DSLValidationError`
+on:
+
+- missing `campaign`, `id`, `actors`, or `phases` keys
+- empty actor list
+- unknown UKC phase names
+- a phase referencing an actor not declared in `actors:`
+
+Unobservable phases produce a *warning*, not an error — the spec is
+allowed to describe pre-target activity, the generator just emits
+nothing for it.
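+
+The rules above can be sketched as a standalone function. This is an
+illustration only, not the body of `_validate_campaign_spec`: the
+exception base class, the message strings, and the use of `warnings.warn`
+are assumptions.
+
+```python
+import warnings
+
+# Phase names as listed in the vocabulary table.
+KNOWN_PHASES = {
+    "reconnaissance", "resource_development", "weaponization", "delivery",
+    "social_engineering", "exploitation", "persistence", "defense_evasion",
+    "command_and_control", "pivoting", "discovery", "privilege_escalation",
+    "execution", "credential_access", "lateral_movement", "collection",
+    "exfiltration", "impact", "objectives",
+}
+UNOBSERVABLE = {
+    "reconnaissance", "resource_development", "weaponization",
+    "social_engineering",
+}
+
+
+class DSLValidationError(ValueError):
+    """Raised for hard spec errors (base class is an assumption)."""
+
+
+def validate_campaign_spec(doc: dict) -> None:
+    if "campaign" not in doc:
+        raise DSLValidationError("missing key: campaign")
+    campaign = doc["campaign"]
+    for key in ("id", "actors", "phases"):
+        if key not in campaign:
+            raise DSLValidationError(f"missing key: {key}")
+    if not campaign["actors"]:
+        raise DSLValidationError("actors list is empty")
+
+    actor_ids = {actor["id"] for actor in campaign["actors"]}
+    for phase in campaign["phases"]:
+        name = phase.get("name")
+        if name not in KNOWN_PHASES:
+            raise DSLValidationError(f"unknown UKC phase: {name!r}")
+        if phase.get("actor") not in actor_ids:
+            raise DSLValidationError(f"phase {name!r} references undeclared actor")
+        if name in UNOBSERVABLE:
+            # Soft failure: the spec stays valid, no events are generated.
+            warnings.warn(f"phase {name!r} is unobservable; no events emitted")
+```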
+
+---
+
+## Metric harness
+
+`tests/clustering/metrics.py`. Pure-Python, no sklearn/numpy dependency.
+Decided **before** any clustering algorithm exists, on purpose: pick the
+metric after seeing results and you'll pick the one that flatters the
+algorithm.
+
+Four metrics, none individually sufficient:
+
+| Metric | Catches | Range |
+|---|---|---|
+| `adjusted_rand_index` | overall partition agreement (chance-corrected) | ≤ 1; ≈ 0 at chance, negative when worse than chance |
+| `homogeneity` | **false merges** — distinct campaigns wrongly fused | [0, 1] |
+| `completeness` | **false splits** — one campaign torn across clusters | [0, 1] |
+| `singleton_recall` | noise absorption — lone wolves swallowed by real campaigns | [0, 1] |
+
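+Homogeneity and completeness trade off: putting every attacker in its
+own cluster scores homogeneity 1.0, and merging everything into one
+cluster scores completeness 1.0, so both must be reported. Singleton
+recall exists because ARI/homogeneity/completeness all dilute the cost
+of absorbing background scanners, and that absorption is the failure
+mode that makes attribution useless in practice.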
+
+```python
+from tests.clustering.metrics import score
+
+truth = {"a": "C1", "b": "C1", "c": "C2"}
+pred  = {"a": "X",  "b": "X",  "c": "Y"}
+score(truth, pred)
+# {
+#   'adjusted_rand_index': 1.0,
+#   'homogeneity': 1.0,
+#   'completeness': 1.0,
+#   'singleton_recall': 1.0,
+# }
+```
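+
+To make the dilution concrete, here is a pure-Python sketch of two of
+the metrics. It is not the harness code: the ARI is the standard
+pair-counting form, while the `singleton_recall` definition (share of
+truth singletons that stay alone in the prediction) is an assumption
+about what the harness measures.
+
+```python
+from collections import Counter
+from math import comb
+
+
+def adjusted_rand_index(truth: dict, pred: dict) -> float:
+    """Chance-corrected pair-counting agreement between two partitions."""
+    keys = list(truth)
+    cells = Counter((truth[k], pred[k]) for k in keys)
+    rows = Counter(truth[k] for k in keys)
+    cols = Counter(pred[k] for k in keys)
+    same_both = sum(comb(c, 2) for c in cells.values())
+    same_truth = sum(comb(c, 2) for c in rows.values())
+    same_pred = sum(comb(c, 2) for c in cols.values())
+    total = comb(len(keys), 2)
+    expected = same_truth * same_pred / total if total else 0.0
+    ceiling = (same_truth + same_pred) / 2
+    if ceiling == expected:  # degenerate partitions: treat as agreement
+        return 1.0
+    return (same_both - expected) / (ceiling - expected)
+
+
+def singleton_recall(truth: dict, pred: dict) -> float:
+    """ASSUMED definition: share of truth singletons left alone in pred."""
+    truth_sizes = Counter(truth.values())
+    pred_sizes = Counter(pred.values())
+    loners = [k for k, c in truth.items() if truth_sizes[c] == 1]
+    if not loners:
+        return 1.0  # assumption: nothing to absorb counts as full recall
+    kept = sum(1 for k in loners if pred_sizes[pred[k]] == 1)
+    return kept / len(loners)
+
+
+# Two 10-attacker campaigns plus one scanner that gets absorbed into C1.
+truth = {f"c1-{i}": "C1" for i in range(10)}
+truth.update({f"c2-{i}": "C2" for i in range(10)})
+truth["scanner"] = "noise-0"
+pred = {k: ("X" if v in ("C1", "noise-0") else "Y") for k, v in truth.items()}
+
+print(round(adjusted_rand_index(truth, pred), 3))  # 0.904
+print(singleton_recall(truth, pred))               # 0.0
+```
+
+The absorbed scanner barely moves ARI, yet singleton recall drops to
+zero, which is why the fourth metric is reported separately.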
+
+---
+
+## Fixtures
+
+`tests/fixtures/campaigns/` — YAML scenarios with paired
+`*.expected.yaml` bounds files. Six fixtures planned (see the design
+doc); fixture 3 (`lone_wolf`) ships first because it exercises the full
+DSL → factory → metrics pipeline against the simplest ground truth
+(every actor is a singleton).
+
+| # | Fixture | Property under test |
+|---|---|---|
+| 1 | `shared_wordlist` *(planned)* | credential overlap alone must not merge campaigns |
+| 2 | `vpn_hopping` *(planned)* | actor identity survives IP/ASN churn |
+| 3 | `lone_wolf` ✓ | opportunistic scanners stay singleton |
+| 4 | `paused_campaign` *(planned)* | temporal gaps must not split a campaign |
+| 5 | `multi_operator` *(planned)* | UKC phase handoff merges across operators with diverged infra |
+| 6 | `noise_floor` *(planned)* | all of the above survive 10× background scanner pollution |
+
+Each fixture's bounds (`adjusted_rand_index.min`, `homogeneity.min`,
+etc.) are loose at v1 and ratchet up as the clusterer matures.
+Loosening a bound to make CI pass requires PR-comment justification.
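+
+A bounds check of this kind might look like the sketch below. The nested
+dict shape mirrors keys such as `adjusted_rand_index.min`, but the real
+`*.expected.yaml` layout is not shown on this page, so treat the shape
+and the `check_bounds` name as assumptions. Only `min` bounds are handled.
+
+```python
+def check_bounds(scores: dict, bounds: dict) -> list[str]:
+    """Return one message per metric that falls below its minimum."""
+    failures = []
+    for metric, limits in bounds.items():
+        value = scores[metric]
+        minimum = limits.get("min")
+        if minimum is not None and value < minimum:
+            failures.append(f"{metric}={value:.3f} is below min {minimum}")
+    return failures
+
+
+# Hypothetical v1 bounds: deliberately loose, to be ratcheted up later.
+bounds = {"adjusted_rand_index": {"min": 0.5}, "singleton_recall": {"min": 0.9}}
+scores = {"adjusted_rand_index": 0.72, "singleton_recall": 0.80}
+print(check_bounds(scores, bounds))
+```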
+
+---
+
+## Running the tests
+
+```bash
+source .311/bin/activate
+
+# all 18 tests
+pytest tests/clustering/ -v
+
+# scoped runs
+pytest tests/clustering/test_metrics.py -v               # metric sanity
+pytest tests/clustering/test_campaign_factory.py -v      # factory determinism + DSL validation
+pytest tests/clustering/test_lone_wolf_fixture.py -v     # end-to-end pipeline
+```
+
+`tests/factories/campaign_factory.py` is a library, not a test module.
+Pytest will not collect it directly — invoke the test files in
+`tests/clustering/` instead.
+
+---
+
+## What hasn't been built yet
+
+- **Clusterer worker** (`decnet clusterer`). Connected components (CC)
+  on a similarity graph is the planned v1 algorithm; ML stays out until
+  a fixture proves CC inadequate.
+- **`campaigns` table** + `attackers.campaign_id` FK.
+- **Bus signals** `campaign.{id}.formed` / `campaign.{id}.updated`.
+- **Dashboard surface** — Campaigns list page + CampaignDetail with
+  UKC phase timeline.
+- **Fixtures 1, 2, 4, 5, 6** — each property the algorithm must
+  satisfy gets its own scenario.
+- **Replay tier** — public-dataset replay (Honeynet SSH corpora,
+  DShield) through the live collector. Reality check on whether our
+  DSL captures the right dimensions. Post-v1.
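+
+For readers unfamiliar with the connected-components approach named in
+the clusterer bullet, a minimal union-find sketch follows. It is not the
+planned implementation: the `similar` predicate and the toy HASSH
+features are hypothetical stand-ins for whatever similarity graph the
+design doc specifies.
+
+```python
+from itertools import combinations
+
+
+def connected_components(nodes: list, similar) -> dict:
+    """Map each node to a cluster label (the root of its component)."""
+    parent = {n: n for n in nodes}
+
+    def find(x):
+        # Walk to the root, halving the path as we go.
+        while parent[x] != x:
+            parent[x] = parent[parent[x]]
+            x = parent[x]
+        return x
+
+    # Every similar pair is an edge; merging endpoints joins components.
+    for a, b in combinations(nodes, 2):
+        if similar(a, b):
+            parent[find(a)] = find(b)
+    return {n: find(n) for n in nodes}
+
+
+# Toy features: attackers sharing a HASSH value are treated as linked.
+hassh = {"att-1": "aae6", "att-2": "aae6", "att-3": "ffff"}
+pred = connected_components(list(hassh), lambda a, b: hassh[a] == hassh[b])
+print(pred["att-1"] == pred["att-2"], pred["att-1"] == pred["att-3"])  # True False
+```
+
+The returned mapping has the same attacker-to-label shape as the `pred`
+argument shown in the metric harness example.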
+
+DSL evolution (concurrent phases, multi-actor per phase, probabilistic
+ordering) is documented as a deferred extension in
+`development/DEVELOPMENT_V2.md` — the design doesn't block it; we just
+don't need it before fixtures 1–6 ship.
+
+---
+
+See also: [Service-Bus](Service-Bus) (where future
+`campaign.{id}.formed` signals will live), [Testing-and-CI](Testing-and-CI),
+[Module-Reference-Workers](Module-Reference-Workers).
--- a/_Sidebar.md
+++ b/_Sidebar.md
@@ -46,6 +46,7 @@
 - [Module-Reference-Workers](Module-Reference-Workers)
 - [PKI-and-mTLS](PKI-and-mTLS)
 - [Testing-and-CI](Testing-and-CI)
+- [Campaign-Clustering](Campaign-Clustering)
 - [Performance-Story](Performance-Story)
 - [Tracing-and-Profiling](Tracing-and-Profiling)