docs(wiki): Campaign-Clustering page + sidebar link

Documents the pre-implementation test infrastructure: UKC vocabulary,
synthetic campaign factory + DSL, metric harness, fixture layout, and
how to run the suite. Algorithm itself isn't built yet — the simulator
ships first per the design doc.
2026-04-26 06:34:45 -04:00
parent b0de3cd182
commit ba1862f380
2 changed files with 229 additions and 0 deletions

Campaign-Clustering.md

@@ -0,0 +1,228 @@
# Campaign Clustering
Pre-implementation feature. Goal: graduate per-attacker attribution into
**campaign-level grouping**, that is, recognizing that a set of distinct
attacker rows is in fact one coordinated operation, even when IPs,
ASNs, and toolchains diverge.
The full design lives in the repo at
[`development/CAMPAIGN_CLUSTERING.md`](https://github.com/dec-net/decnet/blob/main/development/CAMPAIGN_CLUSTERING.md).
This page documents the **test infrastructure** that ships ahead of
the algorithm.
The order is deliberate: simulator first, algorithm second. If we cannot
write down what a campaign *is* in code that produces ground-truth
labels, we cannot validate any clusterer we build. The simulator is the
specification.
---
## Vocabulary — Unified Kill Chain
`decnet/clustering/ukc.py` defines `UKCPhase`, the canonical phase enum.
Pols' Unified Kill Chain (2017): 19 phases across three stages.
| Stage | Phases | Honeypot-observable? |
|---|---|---|
| **In** (initial foothold) | reconnaissance, resource_development, weaponization, delivery, social_engineering, exploitation, persistence, defense_evasion, command_and_control | partial — pre-target phases (recon/resource_dev/weaponization/social_eng) happen before any decky is touched |
| **Through** (network propagation) | pivoting, discovery, privilege_escalation, execution, credential_access, lateral_movement | yes — MazeNET-segmented topologies make this a strength, not a gap |
| **Out** (action on objectives) | collection, exfiltration, impact, objectives | yes |
`OBSERVABLE_PHASES` is the frozenset (15 of 19) the synthetic generator
will emit events for. The DSL accepts the full enum so a campaign spec
can describe an end-to-end story; unobservable phases parse and validate
but produce no synthetic events. UKC vocabulary aligns with MITRE ATT&CK
tactics, so the same labels will be produced by the future TTP-tagging
worker — fixtures don't need renaming when that lands.
```python
from decnet.clustering.ukc import UKCPhase, OBSERVABLE_PHASES, stage_of
stage_of(UKCPhase.LATERAL_MOVEMENT) # → "through"
UKCPhase.RECONNAISSANCE in OBSERVABLE_PHASES # → False
```
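The module itself is not reproduced on this page. The following is a minimal sketch of one way `ukc.py` could be laid out so that the calls above behave as shown. The member order, the string values, `_PRE_TARGET`, and the stage lookup are assumptions drawn from the table, not the shipped code.

```python
from enum import Enum


class UKCPhase(str, Enum):
    # Stage "in": initial foothold (9 phases)
    RECONNAISSANCE = "reconnaissance"
    RESOURCE_DEVELOPMENT = "resource_development"
    WEAPONIZATION = "weaponization"
    DELIVERY = "delivery"
    SOCIAL_ENGINEERING = "social_engineering"
    EXPLOITATION = "exploitation"
    PERSISTENCE = "persistence"
    DEFENSE_EVASION = "defense_evasion"
    COMMAND_AND_CONTROL = "command_and_control"
    # Stage "through": network propagation (6 phases)
    PIVOTING = "pivoting"
    DISCOVERY = "discovery"
    PRIVILEGE_ESCALATION = "privilege_escalation"
    EXECUTION = "execution"
    CREDENTIAL_ACCESS = "credential_access"
    LATERAL_MOVEMENT = "lateral_movement"
    # Stage "out": action on objectives (4 phases)
    COLLECTION = "collection"
    EXFILTRATION = "exfiltration"
    IMPACT = "impact"
    OBJECTIVES = "objectives"


# Assumption: stages are contiguous in declaration order, sized 9 / 6 / 4.
_STAGE_BY_PHASE: dict[UKCPhase, str] = {}
_phases = iter(UKCPhase)
for _stage, _size in (("in", 9), ("through", 6), ("out", 4)):
    for _ in range(_size):
        _STAGE_BY_PHASE[next(_phases)] = _stage

# Pre-target phases happen before any decky is touched.
_PRE_TARGET = frozenset({
    UKCPhase.RECONNAISSANCE,
    UKCPhase.RESOURCE_DEVELOPMENT,
    UKCPhase.WEAPONIZATION,
    UKCPhase.SOCIAL_ENGINEERING,
})
OBSERVABLE_PHASES = frozenset(UKCPhase) - _PRE_TARGET


def stage_of(phase: UKCPhase) -> str:
    """Return "in", "through", or "out" for a phase."""
    return _STAGE_BY_PHASE[phase]
```

Subclassing `str` keeps the YAML phase names and the enum values interchangeable, which is one plausible reason to pick this shape.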
---
## The synthetic campaign factory
`tests/factories/campaign_factory.py` parses a YAML DSL describing
actors, UKC phases, and tool signatures, and emits truth-labeled
`SyntheticAttacker` / `SyntheticSession` records.
**Key contract:** deterministic given a seed. Identical YAML + identical
seed → identical attacker IDs and session IDs across runs. This is
load-bearing for fixture stability and is checked by an explicit test.
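One way to meet the determinism contract is to derive every ID and every random draw from the seed plus fields of the spec, never from the clock or `uuid4()`. The sketch below illustrates that idea only. `stable_id` and `session_offsets` are hypothetical helper names, not functions from the factory.

```python
import hashlib
import random


def stable_id(prefix: str, seed: int, *parts: str) -> str:
    """Hash the seed and spec fields into a reproducible identifier."""
    material = "|".join([str(seed), *parts]).encode("utf-8")
    return f"{prefix}-{hashlib.sha256(material).hexdigest()[:16]}"


def session_offsets(seed: int, campaign_id: str, actor_id: str, count: int) -> list[int]:
    """Draw per-session start offsets (seconds into a day) from a keyed RNG.

    Seeding random.Random with a string is reproducible across processes,
    unlike the built-in hash() of a str.
    """
    rng = random.Random(f"{seed}|{campaign_id}|{actor_id}")
    return [rng.randrange(0, 86_400) for _ in range(count)]


first = stable_id("att", 0, "c-apt-fauxbear-01", "a-001")
again = stable_id("att", 0, "c-apt-fauxbear-01", "a-001")
other = stable_id("att", 1, "c-apt-fauxbear-01", "a-001")
print(first == again, first == other)
```

Keying the RNG per actor, rather than sharing one global generator, means adding an actor to a spec does not shift the draws of the actors already present.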
### Quick poke from a Python REPL
The factory is a library, not a test module — pytest does not collect
it. Drive it directly:
```bash
source .311/bin/activate
python -c "
from tests.factories.campaign_factory import generate, load_yaml
spec = load_yaml('tests/fixtures/campaigns/lone_wolf.yaml')
corpus = generate(spec, seed=0)
print(f'{len(corpus.attackers)} attackers, {len(corpus.sessions)} sessions')
for a in corpus.attackers[:3]:
    print(f'  {a.attacker_id[:24]} → campaign={a.truth_campaign_id}')
print('truth labels:', corpus.truth_labels())
"
```
### DSL shape
```yaml
corpus:
  campaigns:
    - campaign:
        id: c-apt-fauxbear-01
        actors:
          - id: a-001
            asn: 14061
            ip_pool: rotating              # rotating | sticky | tor
            ja3: "769,4865-..."
            hassh: "aae6b9..."
            hours_active_utc: [22, 23, 0, 1, 2, 3]
            jitter_seconds: 90
        phases:                            # any UKCPhase value
          - name: delivery
            actor: a-001
            tool_signature: { user_agent: "Nmap" }
            target_selector: { count: 50 }
            dwell_seconds: 1
          - name: persistence
            actor: a-001
            target_selector: { decky: previous_success }
        duration_days: 7
        pause_windows: []                  # [[start_day, end_day], ...]
  noise:
    scanner_count: 8                       # opportunistic singletons
Single-campaign specs may omit the `corpus:` wrapper and provide
`campaign:` at the top level.
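The two accepted layouts can be folded into one before validation. This is a sketch of that normalization under the assumption that the wrapped form is the canonical one. `normalize_spec` is a hypothetical name, and how the real loader treats other top-level keys is not specified here.

```python
def normalize_spec(doc: dict) -> dict:
    """Return the corpus mapping, wrapping a bare single-campaign spec."""
    if "corpus" in doc:
        return doc["corpus"]
    if "campaign" in doc:
        # Single-campaign shorthand: wrap it as a one-element corpus.
        return {"campaigns": [{"campaign": doc["campaign"]}]}
    raise ValueError("expected a top-level 'corpus' or 'campaign' key")


short_form = {"campaign": {"id": "c-1", "actors": [], "phases": []}}
print(normalize_spec(short_form))
```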
### Hard validation errors
The DSL parser (`_validate_campaign_spec`) raises `DSLValidationError`
on:
- missing `campaign`, `id`, `actors`, or `phases` keys
- empty actor list
- unknown UKC phase names
- a phase referencing an actor not declared in `actors:`
Unobservable phases produce a *warning*, not an error — the spec is
allowed to describe pre-target activity, the generator just emits
nothing for it.
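The rules above can be sketched as a single pass over the parsed mapping. This is an illustration, not the body of `_validate_campaign_spec`: the function name, the message wording, and the use of `warnings.warn` for unobservable phases are assumptions.

```python
import warnings

KNOWN_PHASES = {
    "reconnaissance", "resource_development", "weaponization", "delivery",
    "social_engineering", "exploitation", "persistence", "defense_evasion",
    "command_and_control", "pivoting", "discovery", "privilege_escalation",
    "execution", "credential_access", "lateral_movement", "collection",
    "exfiltration", "impact", "objectives",
}
UNOBSERVABLE = {
    "reconnaissance", "resource_development", "weaponization", "social_engineering",
}


class DSLValidationError(ValueError):
    """Raised when a campaign spec breaks a hard rule."""


def validate_campaign_spec(doc: dict) -> None:
    if "campaign" not in doc:
        raise DSLValidationError("missing 'campaign' key")
    campaign = doc["campaign"]
    for key in ("id", "actors", "phases"):
        if key not in campaign:
            raise DSLValidationError(f"campaign is missing '{key}'")
    if not campaign["actors"]:
        raise DSLValidationError("actor list is empty")

    declared = {actor["id"] for actor in campaign["actors"]}
    for phase in campaign["phases"]:
        name = phase["name"]
        if name not in KNOWN_PHASES:
            raise DSLValidationError(f"unknown UKC phase '{name}'")
        if phase["actor"] not in declared:
            raise DSLValidationError(f"phase '{name}' references undeclared actor")
        if name in UNOBSERVABLE:
            # Soft failure: the spec is valid, the generator just emits nothing.
            warnings.warn(f"phase '{name}' is unobservable; no events will be emitted")
```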
---
## Metric harness
`tests/clustering/metrics.py`. Pure-Python, no sklearn/numpy dependency.
Decided **before** any clustering algorithm exists, on purpose: pick the
metric after seeing results and you'll pick the one that flatters the
algorithm.
Four metrics, none individually sufficient:
| Metric | Catches | Range |
|---|---|---|
| `adjusted_rand_index` | overall partition agreement (chance-corrected) | 1 = identical partitions, ≈0 = chance-level agreement; can go negative |
| `homogeneity` | **false merges** — distinct campaigns wrongly fused | [0, 1] |
| `completeness` | **false splits** — one campaign torn across clusters | [0, 1] |
| `singleton_recall` | noise absorption — lone wolves swallowed by real campaigns | [0, 1] |
Homogeneity and completeness trade off; both must be reported. Singleton
recall exists because ARI/homogeneity/completeness all dilute the cost
of absorbing background scanners — and that absorption is the failure
mode that makes attribution useless in practice.
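The exact definition used by the harness is not spelled out on this page. A minimal sketch, assuming singleton recall means "the fraction of true singletons that the prediction also leaves alone in their own cluster", and assuming a corpus with no true singletons scores 1.0:

```python
from collections import Counter


def singleton_recall(truth: dict[str, str], pred: dict[str, str]) -> float:
    """Fraction of true singletons that are still singletons in `pred`."""
    truth_sizes = Counter(truth.values())
    pred_sizes = Counter(pred[k] for k in truth)
    singles = [k for k, label in truth.items() if truth_sizes[label] == 1]
    if not singles:
        return 1.0  # assumption: nothing to absorb counts as a perfect score
    kept = sum(1 for k in singles if pred_sizes[pred[k]] == 1)
    return kept / len(singles)


# Two campaign members plus two lone scanners; one scanner gets absorbed.
truth = {"a": "C1", "b": "C1", "s1": "N1", "s2": "N2"}
pred = {"a": "X", "b": "X", "s1": "X", "s2": "Y"}
print(singleton_recall(truth, pred))  # 0.5
```

Under this definition one absorbed scanner out of two halves the score, whatever the size of the campaign that swallowed it.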
```python
from tests.clustering.metrics import score
truth = {"a": "C1", "b": "C1", "c": "C2"}
pred = {"a": "X", "b": "X", "c": "Y"}
score(truth, pred)
# {
# 'adjusted_rand_index': 1.0,
# 'homogeneity': 1.0,
# 'completeness': 1.0,
# 'singleton_recall': 1.0,
# }
```
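For readers who want to see what a dependency-free version of the first three metrics involves, here is a sketch built from the standard pair-counting formula for ARI and the conditional-entropy definitions of homogeneity and completeness. It is not the contents of `metrics.py`. The helper names and the choice to return 1.0 in degenerate cases are assumptions.

```python
from collections import Counter
from math import comb, log


def _tables(truth, pred):
    """Joint and marginal counts over the items labeled in `truth`."""
    joint = Counter((truth[k], pred[k]) for k in truth)
    rows = Counter(truth.values())            # true campaign sizes
    cols = Counter(pred[k] for k in truth)    # predicted cluster sizes
    return joint, rows, cols, len(truth)


def adjusted_rand_index(truth, pred):
    joint, rows, cols, n = _tables(truth, pred)
    pairs_joint = sum(comb(c, 2) for c in joint.values())
    pairs_rows = sum(comb(c, 2) for c in rows.values())
    pairs_cols = sum(comb(c, 2) for c in cols.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0
    expected = pairs_rows * pairs_cols / total_pairs
    maximum = (pairs_rows + pairs_cols) / 2
    if maximum == expected:  # both partitions all-singleton or all-one-cluster
        return 1.0
    return (pairs_joint - expected) / (maximum - expected)


def _entropy(counts, n):
    return -sum((c / n) * log(c / n) for c in counts)


def _conditional_entropy(joint, marginal, axis, n):
    """H(other | axis): `axis` picks which half of each joint key conditions."""
    return -sum((c / n) * log(c / marginal[pair[axis]]) for pair, c in joint.items())


def homogeneity(truth, pred):
    joint, rows, cols, n = _tables(truth, pred)
    h_truth = _entropy(rows.values(), n)
    if h_truth == 0:
        return 1.0
    return 1.0 - _conditional_entropy(joint, cols, 1, n) / h_truth


def completeness(truth, pred):
    joint, rows, cols, n = _tables(truth, pred)
    h_pred = _entropy(cols.values(), n)
    if h_pred == 0:
        return 1.0
    return 1.0 - _conditional_entropy(joint, rows, 0, n) / h_pred


# A false merge: two campaigns fused into one cluster.
truth = {"a": "C1", "b": "C2"}
pred = {"a": "X", "b": "X"}
print(homogeneity(truth, pred), completeness(truth, pred))
```

The false-merge example shows why both numbers are reported: completeness alone would score the fused cluster as perfect.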
---
## Fixtures
`tests/fixtures/campaigns/` holds YAML scenarios, each paired with an
`*.expected.yaml` bounds file. Six fixtures are planned (see the design
doc); fixture 3 (`lone_wolf`) ships first because it exercises the full
DSL → factory → metrics pipeline against the simplest ground truth
(every actor is a singleton).
| # | Fixture | Property under test |
|---|---|---|
| 1 | `shared_wordlist` *(planned)* | credential overlap alone must not merge campaigns |
| 2 | `vpn_hopping` *(planned)* | actor identity survives IP/ASN churn |
| 3 | `lone_wolf` ✓ | opportunistic scanners stay singleton |
| 4 | `paused_campaign` *(planned)* | temporal gaps must not split a campaign |
| 5 | `multi_operator` *(planned)* | UKC phase handoff merges across operators with diverged infra |
| 6 | `noise_floor` *(planned)* | all of the above survive 10× background scanner pollution |
Each fixture's bounds (`adjusted_rand_index.min`, `homogeneity.min`,
etc.) are loose at v1 and ratchet up as the clusterer matures.
Loosening a bound to make CI pass requires PR-comment justification.
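Checking a score dictionary against a bounds file can be as small as the sketch below. It assumes the `*.expected.yaml` file parses into a nested mapping of the form `{metric: {"min": value}}`, which is a guess based on the `adjusted_rand_index.min` notation above. `check_bounds` is a hypothetical helper.

```python
def check_bounds(scores: dict[str, float], bounds: dict[str, dict[str, float]]) -> list[str]:
    """Return one message per metric that falls below its minimum."""
    failures = []
    for metric, limits in bounds.items():
        value = scores[metric]
        if "min" in limits and value < limits["min"]:
            failures.append(f"{metric}={value:.3f} is below min {limits['min']}")
    return failures


scores = {"adjusted_rand_index": 0.82, "homogeneity": 0.95}
bounds = {"adjusted_rand_index": {"min": 0.90}, "homogeneity": {"min": 0.90}}
print(check_bounds(scores, bounds))
```

Returning every violation, rather than stopping at the first, keeps a CI failure informative when several bounds regress at once.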
---
## Running the tests
```bash
source .311/bin/activate
# all 18 tests
pytest tests/clustering/ -v
# scoped runs
pytest tests/clustering/test_metrics.py -v # metric sanity
pytest tests/clustering/test_campaign_factory.py -v # factory determinism + DSL validation
pytest tests/clustering/test_lone_wolf_fixture.py -v # end-to-end pipeline
```
`tests/factories/campaign_factory.py` is a library, not a test module.
Pytest will not collect it directly — invoke the test files in
`tests/clustering/` instead.
---
## What hasn't been built yet
- **Clusterer worker** (`decnet clusterer`). Connected-components on a
similarity graph is the planned v1 algorithm; ML stays out until a
fixture proves CC inadequate.
- **`campaigns` table** + `attackers.campaign_id` FK.
- **Bus signals** `campaign.{id}.formed` / `campaign.{id}.updated`.
- **Dashboard surface** — Campaigns list page + CampaignDetail with
UKC phase timeline.
- **Fixtures 1, 2, 4, 5, 6** — each property the algorithm must
satisfy gets its own scenario.
- **Replay tier** — public-dataset replay (Honeynet SSH corpora,
DShield) through the live collector. Reality check on whether our
DSL captures the right dimensions. Post-v1.
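The connected-components approach named in the first bullet can be sketched with a union-find over pairwise similarity. Nothing here reflects the eventual worker: the Jaccard similarity, the feature strings, the threshold, and the function names are all placeholders chosen for illustration.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(features: dict[str, set], threshold: float) -> dict[str, str]:
    """Link attackers whose similarity meets the threshold; label by component."""
    parent = {node: node for node in features}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)  # union the two components

    return {node: find(node) for node in features}


attackers = {
    "att-1": {"hassh:aa", "ja3:x"},
    "att-2": {"hassh:aa", "ja3:y"},  # shares a HASSH with att-1
    "att-3": {"hassh:bb", "ja3:z"},  # unrelated scanner
}
labels = cluster(attackers, threshold=0.3)
print(labels["att-1"] == labels["att-2"], labels["att-1"] == labels["att-3"])
```

The returned mapping has the same shape as the `pred` argument to `score`, so a clusterer of this form could be dropped straight into the metric harness. Connected components merge transitively, which is the property fixture 1 (`shared_wordlist`) is meant to stress.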
DSL evolution (concurrent phases, multi-actor per phase, probabilistic
ordering) is documented as a deferred extension in
`development/DEVELOPMENT_V2.md` — the design doesn't block it; we just
don't need it before fixtures 1–6 ship.
---
See also: [Service-Bus](Service-Bus) (where future
`campaign.{id}.formed` signals will live), [Testing-and-CI](Testing-and-CI),
[Module-Reference-Workers](Module-Reference-Workers).

@@ -46,6 +46,7 @@
- [Module-Reference-Workers](Module-Reference-Workers)
- [PKI-and-mTLS](PKI-and-mTLS)
- [Testing-and-CI](Testing-and-CI)
- [Campaign-Clustering](Campaign-Clustering)
- [Performance-Story](Performance-Story)
- [Tracing-and-Profiling](Tracing-and-Profiling)