Three producer-side regression guards. Each drives the worker's run
loop with a fake bus + stubbed repo and asserts the documented topic
fires when the producer has data:
- reuse correlator → credential.reuse.detected (one finding row)
- clusterer → identity.formed + identity.merged (one ClusterResult)
- intel worker → attacker.intel.enriched (one unenriched attacker
+ a fake provider returning a "malicious" verdict)
These complement commit 1's attacker.session.ended producer test —
together the four cover every TTP-relevant publisher in the tree
(modulo email.received, which has no producer yet; tracked in
DEBT.md).
Add BaseRepository.list_ttp_decky_phases(identity_uuid) returning
per-decky tag observations as (decky_id, tactic, created_at_ts) rows
ordered by creation time. Rewrite from_identity_row() to project
tactic → UKCPhase via tactic_to_ukc_phase and populate the four
phase-handoff maps (first/last_phase_per_decky,
first/last_seen_per_decky) so combined_campaign_weight finally lights
up on real DB rows — not just synthetic fixtures.
ConnectedComponentsCampaignClusterer.tick() pulls each active
identity's per-decky phase observations before projecting features.
Repo failures are non-fatal: a partial repo falls back to the empty
phase-handoff signal (legacy behavior) so the worker stays up.
tests/clustering/test_ttp_phase_handoff.py pins the production-row
pair clearing CAMPAIGN_EDGE_THRESHOLD on a C2 → DISCOVERY hand-off —
the trip-wire that says the whole project paid off.
commands_by_phase_on_decky itself stays empty on the production path:
it is consumed only by the synthetic-fixture similarity surface, and
the phase-handoff edge does not use it. Synthetic fixtures still
populate it directly via from_synthetic_identity.
Pre-target phases (RECONNAISSANCE/RESOURCE_DEVELOPMENT/WEAPONIZATION/
SOCIAL_ENGINEERING) and observable-but-unmappable phases (EXPLOITATION/
PIVOTING/OBJECTIVES, UKC-only concepts ATT&CK lacks tactics for) are
pinned as lossy via _LOSSY_INVERSE_REFERENCE so a future contributor
cannot 'fix' the asymmetry without tripping the suite.
Brings the federation-gossip columns on AttackerIdentity to life —
ja3_hashes, hassh_hashes, and the new tls_cert_sha256 — by projecting
the union of every member observation's fingerprints JSON onto the
identity at clusterer create / link / merge time.
- decnet/profiler/identity_rollup.py: pure extract_fp_summaries()
reads the production bounty shape (payload.fingerprint_type +
payload.{ja3,hash,cert_sha256}) and returns deduped+sorted JSON
list[str] per family, or None when a family has no signal so the
column stays NULL instead of '[]'.
- BaseRepository.update_identity_fingerprints + SQLModel impl: one
idempotent write that overwrites the three summary columns and
bumps updated_at.
- ConnectedComponentsClusterer: after every per-component
reconciliation (fresh-create OR existing-merge+link), recomputes
and writes the rollup for the target identity. Wrapped in a
best-effort helper so a write failure logs but never breaks the
tick.
- Tests: extract_fp_summaries unit (dedup, sort determinism,
unknown types ignored, malformed JSON, nested-stringified
payloads, non-string values); end-to-end clusterer ticks
populate the columns on create + on later observation links;
no-fingerprint clusters keep the columns NULL.
Runs the chained identity + campaign clustering pipeline against all
seven fixtures via from_synthetic / from_synthetic_identity adapters
and ratchets every YAML floor to 1.0 — the production clusterer
(and the reference clusterers used in the per-fixture tests) all
score perfectly across ARI / homogeneity / completeness /
singleton_recall on each fixture.
Three substrate fixes surfaced by the ratchet:
- Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved
into cohort_weight to prevent fleet-scarcity false-merges (F1's
shared_wordlist failure mode). Tier weight raised to 1.0 so
shared payload+C2 alone crosses threshold (F5's intended pass).
- Adapter: from_synthetic_identity now reads SyntheticSession
started_at + duration_s for session_windows and per-decky
timestamps (the production-row adapter still uses start_ts/end_ts
when available).
- Fixture data: paused_campaign.yaml's JA3 collided exactly with
vpn_hopping.yaml's (same TLS extension list). The collision
fused two unrelated campaigns under the chained identity layer
in the noise_floor composite. Made paused's JA3 distinct.
Also wires Campaign / CampaignsResponse into models/__init__.py's
__all__ that was missed in the schema commit.
The campaign clusterer worker mirrors the identity-side worker shell
(bus connect, heartbeat, control listener, slow-tick fallback) but
wakes on identity.> instead of attacker.> — campaign-level work is
gated on identity-layer changes, not raw observations.
The connected-components implementation reads identities via
list_identities_for_clustering, projects them with from_identity_row,
runs union-find over combined_campaign_weight, writes campaigns rows,
sets attacker_identities.campaign_id, and runs the same revocable-
merge pass as the identity layer (a merged-out campaign whose
identities no longer co-cluster with the winner gets revoked).
Bus: adds campaign.> family (formed / identity.assigned / merged /
unmerged) plus the cross-family identity.campaign.assigned so
existing identity-stream subscribers see the badge update without
having to subscribe to campaign.>. Wiki Service-Bus.md updated in
wiki-checkout in the same wave per the project's bus-signals
discipline.
CLI: decnet campaign-clusterer registered as master-only via
MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity
clusterer command surface.
The signal taxonomy for the campaign clusterer (next commit). Mirror
of the identity-layer module but with edge families that don't
translate 1:1: phase-handoff (load-bearing for F5 multi_operator —
the signal the identity-side fingerprint-disagreement veto deliberately
isn't), shared-infra (vetoed at identity level, primary positive
signal here), temporal-overlap (pairwise-relative — F7 invariance
preserved), cohort (weak supporting weight only).
Tier weights tuned so phase-handoff alone crosses threshold (F5),
shared-infra + temporal-overlap together cross (canonical co-op
pattern), and shared-infra + cohort together do NOT (F1
shared_wordlist's failure mode). The F7 time-shift invariant is
explicitly tested on every time-bearing edge and on the combined
weight.
Reworks the clusterer's tick to handle multi-identity components and
re-evaluate prior merges. Two passes per tick:
Pass 1 — per-component reconciliation:
* Fresh component → mint identity (commit 4 path).
* Single-identity component → link unassigned observations.
* Multi-identity component → soft-merge: pick the smallest-uuid
winner deterministically, set merged_into_uuid on each loser,
link unassigned observations to the winner. Observations stay
FK'd to their original identity row — the merge is a soft
pointer, not a re-point. Audit trail preserved; cached
subscribers resolve through the chain.
Pass 2 — revocable-merge undo:
* For each merged-out identity, check whether its observations
still cluster with its winner's. If not, the merge is
contradicted by new evidence — clear merged_into_uuid and emit
identities_unmerged. The resurrected identity keeps its original
uuid, so subscribers that cached it during the merged interval
re-attach without a new lookup.
A pre-built merge-chain dict feeds Pass 1 so the effective-identity
lookup is O(1) per observation. The chain has a hop cap (paranoia
against accidental cycles in the underlying state).
Repo additions on BaseRepository + SQLModelRepository:
* list_all_identities() — includes merged-out rows.
* update_identity_merged_into(uuid, winner_or_None) — single
setter for both merge and unmerge.
DummyRepo coverage stub updated.
Tests:
* Two distinct identities bridged by a new observation merge with
the smaller uuid as winner.
* A pre-seeded soft-merge whose underlying observations diverge
gets revoked; resurrected uuid emerges with merged_into_uuid
cleared.
* Tick is idempotent under no state changes.
Two targeted invariants instead of a wholesale YAML-bounds re-use,
because the existing F6 bounds were tuned for the reference
composite_signals_clusterer (fingerprint OR C2). The production
clusterer trades that aggregation for tier discipline + the
fingerprint-disagreement veto, so its score profile differs even
when its judgments are correct — multi_operator stays as 2 truth
identities, paused_campaign's two DSL actors remain a single cluster
because they share fingerprints, etc. Wholesale bounds re-use would
fight the design.
The two production-side ratchets:
1. singleton_recall ≥ 0.95 at campaign-level scoring — truth-
singleton noise scanners must not be absorbed into real campaigns.
This is the F6 failure mode that motivates the fixture.
2. Intra-campaign recovery under cross-corpus interference:
* vpn_hopping's 5 rotations consolidate to one cluster.
* shared_wordlist A and B stay in disjoint clusters despite
sharing credentials with each other (and with the noise floor).
A future commit can revisit when the production clusterer's identity-
level truth alignment improves (e.g. when paused_campaign's DSL is
extended to mark its two actors as one truth identity).
Fixture 7 ratchet: one campaign across 3 multi-week operational
windows with stable JA3 + HASSH + C2. The production clusterer must
fold all 3 into one cluster despite multi-week silence between
windows; completeness = 1.0.
Time-shift invariance test: applying a +90 day delta to every
session start (and the per-attacker first/last seen) must produce
the same cluster membership as the baseline. This is the runtime
counterpart of the static no-time-fields check on Observation. If
either check ever fails, the clusterer has accidentally grown a
recency-aware edge — fixture 7's whole reason for existing.
Pins down the tier-discipline contract end-to-end:
- Credentials-only overlap doesn't fuse observations (F1 in
miniature).
- ASN-only overlap doesn't fuse observations (F2 in miniature).
- All three weak tiers (medium + low + very-low) stacked still
don't fuse — only a high-tier signal does.
- F1 (shared_wordlist) at identity-level: no false merges, every
row is its own predicted cluster, homogeneity = 1.0.
- F2 (vpn_hopping): 5 distinct ASNs collapse into 1 predicted
cluster, proving JA3 / HASSH dominate ASN as the design
requires.
The combination math itself was wired in commit 5; this commit is
the failure-mode regression suite that gates future tuning of the
tier weights.
Two operators cooperating on one campaign can share C2 endpoints +
stage-1 payloads while running distinct tooling — fixture 5
(multi_operator) is the canonical demonstration. The identity
clusterer must NOT fuse them: shared infra is a campaign-level
signal, not an identity-level one. The campaign clusterer (downstream
work) handles that grouping over identities.
Mechanism: when two observations have non-null fingerprints AND the
fingerprints fully disagree, the high-weight tier drops the payload
and C2 contributions to zero. JA3 / HASSH agreement still returns
1.0 directly — no veto applies when something agrees. Partial
agreement (one slot agrees, another disagrees) is treated as
agreement, since stable-tool partial overlap is more consistent
with one identity than two.
The veto only triggers when there is actual disagreement evidence —
two un-fingerprinted observations sharing a C2 still cluster, since
the absence of fingerprints is not the same as disagreement on them.
Fixture 5 production-clusterer assertion added at identity level:
ARI = 1.0, homogeneity = 1.0, exactly 2 predicted clusters from
2 truth identities. Phase-handoff edges (from the TODO) belong to
the downstream campaign clusterer, not this identity clusterer.
The clusterer now drops a single high-tier function call in favor of
a tier-weighted sum. Tier multipliers (high=1.0, medium=0.6, low=0.2,
very_low=0.05) are tuned so the threshold (1.0) admits high-tier
agreement alone while leaving every weaker tier — and every
combination of weaker tiers — under threshold.
Per-tier discipline tested:
- high alone clusters
- medium alone does NOT cluster (supporting signal only)
- low alone does NOT cluster (fixture 1's failure mode)
- very-low alone does NOT cluster (fixture 2's failure mode)
- all three weak tiers stacked still don't reach threshold
- high + medium clusters (high already saturates)
The combination is forward-compatible: low + very-low contributions
are computed today but always project to 0.0 because the production
adapter doesn't populate credentials / ASN-edge inputs into the
fixture path yet. Their contribution becomes load-bearing in commit 7
when the low-tier landing tightens the F1 / F2 bounds.
Fixture 4 (paused_campaign) ratchet added: high-tier signal carries
the multi-day-silence campaign into one identity. Time-agnostic
invariant — silence is irrelevant to the edge weight.
The connected-components clusterer now writes attacker_identities
rows + sets attackers.identity_id when high-weight signals (JA3 /
HASSH / payload-hash / C2-endpoint exact match) agree across
observations. Singletons stay un-fingerprinted and un-clustered.
Algorithm split:
- cluster_observations(observations) — pure union-find over the
high-weight edge function. Same code path for fixture validation
and production tick.
- from_attacker_row(row) — production-row adapter; recovers JA3 +
HASSH from Attacker.fingerprints JSON. Payload + C2 join from
logs in later commits; the function shape doesn't change.
Repo additions on BaseRepository + SQLModelRepository:
- list_attackers_for_clustering(limit=None)
- create_attacker_identity(row)
- set_attacker_identity_id(attacker_uuid, identity_uuid)
DummyRepo coverage stub updated.
v1 behavior is conservative: only assigns identities to observations
whose identity_id is currently NULL. Multi-identity components are
skipped this pass — merge / re-assign lands in commit 10 with
revocable merges.
Fixture bounds tightened against the production clusterer:
- lone_wolf (F3) — singletons stay singletons
- shared_wordlist (F1) — credential-only overlap doesn't cluster
(high-weight tier doesn't include credentials)
- vpn_hopping (F2, identity-level) — 5 rotated IPs with stable JA3
+ HASSH fold into one identity, ARI = 1.0, completeness = 1.0
Adds the four weight-tier edge functions as pure, time-agnostic
scoring primitives over an Observation projection. Each returns a
score in [0, 1]; the connected-components impl will combine + threshold
in subsequent commits.
Tier semantics (from IDENTITY_RESOLUTION.md):
- high — JA3/HASSH/payload-hash/C2-endpoint exact match
- medium — phase-bucketed command-sequence Jaccard
- low — credential-attempt-set Jaccard (defeated alone by F1)
- very low — ASN equality (defeated alone by F2)
Time-agnostic invariant is a static test: Observation has no time
fields, so no edge function can silently start using them. Fixture 7
forbids recency-decay clustering on multi-month APT campaigns.
A from_synthetic() adapter projects SyntheticAttacker corpora into
Observation; the production-row adapter lands when the clusterer
starts reading the attackers table.
Revocable merges (a contradiction-driven undo of identity.merged) ship
in the clusterer work; this reserves the topic up-front so identity.>
subscribers receive it day one without a re-subscribe.
The clusterer worker's ClusterResult fan-out now publishes on
identity.unmerged when populated. The skeleton clusterer never
populates it; the revocable-merge commit will.
Wiki update lives in wiki-checkout/Service-Bus.md (separate repo).
Adds the decnet clusterer master-only command + provider-subpackage
shape (base.py + factory.py + impl/connected_components.py) so
subsequent commits can land similarity-graph features without
churning callers.
The skeleton ConnectedComponentsClusterer.tick is a no-op; the
worker shell is fully wired (bus consumer on attacker.observed +
attacker.scored, slow-tick fallback, health heartbeat, control
listener, ClusterResult fan-out to identity.formed/observation.linked
/merged). Subscribers on identity.> see no events from this clusterer
until edge functions land, but the lifecycle is in place.
Multi-month APT campaign modeling real APT operational tempo: recon
over weeks, exploitation later, action-on-objectives later still.
The unique signal this fixture stresses is TIME-AGNOSTIC IDENTITY
across multi-week silences — a clusterer that silently expires old
edges fragments any campaign that operates over months.
Three DSL actors represent the operator's three operational windows
(week 2, month 2, month 3 of a 90-day campaign), all sharing JA3 +
HASSH + payload + C2 callback. Campaign-level fixture only — the
three actors mint distinct truth_identity_id rows by design (same
modeling caveat as fixtures 4 and 5).
The fixture's narrative mirrors how an APT works a deep nested
topology (DECNET MazeNET mode): map decoy networks for weeks, only
then commit to exploitation. Slow-and-low pacing is the signal.
recency_decay_clusterer added to fixture_harness — same edge
construction as composite_signals_clusterer, but each edge weighted
by exp(-time_distance / half_life_days) and dropped below a
threshold. Adversarial reference for slow_burn: with 14-day half-
life and 0.5 threshold, edges between operational windows (24+ days
apart) decay below threshold and drop. The campaign fragments into
three clusters; completeness collapses.
This is the canonical production failure mode for graph clusterers
that bound memory or bias toward "what's hot" by silently expiring
old edges. Catching it in synthetic data is what fixture 7 exists
for; the replay tier will surface real-world drift / dwell patterns
that calibrate the half-life threshold the real algorithm should
tolerate.
Four tests: corpus shape (window-isolated sessions, stable
fingerprint), pipeline pass via composite_signals_clusterer (time-
agnostic — folds all three windows), adversarial fragmentation
(3 clusters at 14-day half-life), long-half-life sanity (gentle
decay unions everything; confirms behavior depends on the half-life
parameter, not on something unrelated).
Bundles all five prior fixtures' campaigns into one corpus alongside
10 fresh Delivery-only noise scanners (on top of lone_wolf's 8
inherited). The fixture covers cross-corpus interference — signal
collisions across fixtures' JA3/HASSH/C2 strings, factory ID re-use,
clusterer ambiguity that only manifests when multiple campaigns
score together. Each constituent fixture already ships its own
in-fixture adversarial test; this one is the control for the class
of failures that single-corpus fixtures cannot catch.
Composition is declared via a fixture-6-specific include_fixtures
block in noise_floor.yaml. The test file's loader expands it into
a full corpus.campaigns spec at runtime so the factory itself stays
unaware — no factory primitive added for what only this fixture
needs. The 8 noise scanners declared by lone_wolf flow through
naturally; the extra_noise_scanners count adds 10 more.
composite_signals_clusterer (added in the fixture-5 commit) is the
pass clusterer — union-find combining (ja3, hassh) match OR
overlapping C2 callback. Approximates the planned similarity graph
well enough that every campaign resolves and every singleton stays
singleton in the merged corpus.
Three tests: corpus integrity (every campaign id present, 12
campaign-driven attackers + 18 noise = 30 total), pipeline pass
against the global bounds, and an explicit singleton-recall
assertion (21 truth-singletons — 1 lone wolf, 18 noise, 2
shared_wordlist actors whose campaigns are size 1 — all kept
singleton by the composite clusterer). Singleton recall is the
load-bearing metric here: noise absorption is the failure mode
that makes campaign attribution useless in practice.
Three new reference clusterers in fixture_harness:
* c2_callback_clusterer — union-find on overlapping C2 callback
sets across an attacker's sessions. Pass-clusterer for fixture 5
where two operators with distinct tooling share a C2 endpoint as
the campaign signal.
* shift_clusterer — deliberately-bad reference that buckets
attackers by majority session-start hour into night/day/swing.
Adversarial reference for fixture 5; proves operational schedule
is NOT a campaign signal.
* composite_signals_clusterer — union-find combining (ja3, hassh)
match OR overlapping C2 callback. Will serve as the pass-
clusterer for fixture 6 (noise_floor) where multiple campaigns
with heterogeneous signal types are scored together.
Also factored a small _union_find helper for the new clusterers
(existing time_window/credential_jaccard left untouched to avoid
mixing refactor with feature work).
Fixture 5 (multi_operator): one campaign, two operators with
distinct UKC roles. Actor A (broker, night shift): Delivery →
Exploitation → Persistence → C2. Actor B (post-ex, day shift):
Discovery → Lateral Movement → Collection → Exfiltration.
Distinct JA3/HASSH/ASN/IPs; shared C2 + payload hash.
Four tests: corpus shape (distinct fingerprints, shared C2,
disjoint shifts), pipeline pass via c2_callback_clusterer,
explicit harness sanity that fingerprint_clusterer cannot resolve
this fixture (documents which signal carries the campaign), and
adversarial shift_clusterer fragmentation.
Phase-handoff edges (the real load-bearing signal per the design
doc) wait for the production clusterer; this fixture will prove
they're needed when it ships.
Adds the actor.active_days primitive to the campaign factory so a
DSL actor can be bound to specific day indexes. Falls back to the
non-paused day pool when absent (existing fixtures unchanged).
Intersects with pause_windows so the campaign-wide silence still
wins if both are set.
Adds time_window_clusterer reference to fixture_harness — union-find
over attackers, edge if their session time-ranges are within
gap_days of each other. Deliberately-bad reference for fixture 4:
multi-day silent stretches fragment a single campaign because the
clusterer has no signal that bridges the gap.
Fixture 4 (paused_campaign): one campaign modeled as two DSL actors
representing the operator's two operational windows (active days
1-2 and 6-7), separated by a silent stretch (days 3-5). Both share
JA3 + HASSH + payload + C2 callback; only their active_days differ.
Five tests: corpus shape (rows in their windows, shared signals),
pipeline pass via fingerprint_clusterer at level=campaign,
adversarial fragmentation via time_window_clusterer (1-day union
threshold cannot bridge the 4-day silence → completeness collapses),
huge-gap sanity (gap_days=10 unions both halves), silent-stretch
invariant (no session leaks into the configured pause window).
Identity-level scoring is fixture 2's job; this fixture is
campaign-level only — modeling caveat documented in the YAML.
One campaign, one DSL actor, ip_pool: rotating + rotation_count: 5
across 5 synthetic private-use ASNs (RFC 6996 64512-64516). Stable
JA3, HASSH, and payload_hash across every rotation — these are the
"signals the attacker can't cheaply rotate" per IDENTITY_RESOLUTION.md
and the load-bearing reason all 5 observation rows must resolve to
one identity / one campaign.
Two new reference clusterers in fixture_harness.py:
* fingerprint_clusterer — groups by (ja3, hassh). Un-fingerprinted
rows stay singleton so it doesn't trivially fuse all noise into one
mega-cluster. Approximates the stable-signal arm of the planned
similarity graph.
* asn_clusterer — deliberately-bad reference for fixture 2's
adversarial test. Group-by-ASN shatters the campaign into 5
singletons; completeness collapses to 0.
Four tests in test_vpn_hopping_fixture.py: corpus shape (5 rows, 1
identity, 1 campaign, 5 distinct ASNs/IPs, stable fingerprints),
pass at campaign level, pass at identity level (asserts ARI exactly
1.0), asn_clusterer breaches the completeness floor.
Fifth and final commit of the identity-resolution substrate. Unblocks
fixture 2 (vpn_hopping) by making the synthetic factory match
production shape: an actor rotating across N IPs produces N
SyntheticAttacker rows that share fingerprints + truth_identity_id but
differ on ip / asn — exactly the shape the future clusterer needs to
recover via JA3/HASSH match.
Factory:
* SyntheticSession + SyntheticAttacker gain truth_identity_id field.
* DSL: ip_pool: rotating + rotation_count: N produces N observation
rows per actor. Optional rotation_asns: [...] cycles ASN per row;
defaults to the actor's primary asn.
* Sessions distribute round-robin across the actor's rotated rows.
* Noise scanners get truth_identity_id == truth_actor_id ==
truth_campaign_id (each is its own singleton at every level).
* GeneratedCorpus.truth_labels(level=) accepts "campaign" (default,
back-compat), "identity", or "actor" — picks the oracle the
metric harness scores against.
Harness:
* assert_fixture_bounds gains truth_level kwarg (default "campaign")
so identity-resolution fixtures can score against truth_identity_id
without churning the campaign-clustering test files.
Tests: 9 new (rotation_count emits N rows, shared identity +
fingerprints, distinct IPs, rotation_asns distribution + cycling,
round-robin session distribution, identity-level truth labels,
sticky default unchanged, sessions inherit identity label).
598 tests green across clustering / factories / db / web / bus /
profiler / correlation.
Two campaigns sharing a credential wordlist; everything else (ASN, IPs,
JA3, HASSH, active hours) divergent. Pass condition: clusterer must NOT
merge. Protects against the "credential overlap is identity" failure
mode that commodity wordlists invite.
* tests/clustering/fixture_harness.py — shared assert_fixture_bounds
helper + identity_clusterer (placeholder, trivially correct on
all-singleton fixtures) + credential_jaccard_clusterer (deliberately-
bad reference used to PROVE the fixture catches what it should).
* tests/clustering/test_shared_wordlist_fixture.py — bounds pass with
identity, bounds FAIL (homogeneity → 0) with the bad credential
clusterer. The latter is the proof the fixture earns its keep.
* tests/fixtures/campaigns/shared_wordlist.{yaml,expected.yaml}.
* tests/clustering/test_lone_wolf_fixture.py — refactored onto the
shared harness. No behavior change.
Pre-implementation scaffolding for campaign clustering. The simulator is
the spec — algorithm code follows once fixtures + metrics are stable.
* decnet/clustering/ukc.py — UKCPhase enum (19 phases across In/Through/Out
stages), OBSERVABLE_PHASES set, stage_of() helper. Vocabulary aligns
with future MITRE ATT&CK tagging so synthetic data and runtime phase
inference don't need renaming when TTP-tagging lands.
* tests/factories/campaign_factory.py — YAML DSL parser + deterministic
generator emitting truth-labeled SyntheticAttacker / SyntheticSession
records. Validates phase names, warns on unobservable phases, supports
multi-campaign + noise corpora.
* tests/clustering/metrics.py — pure-Python ARI / homogeneity /
completeness / singleton_recall (no sklearn dep). Decided before any
algorithm exists, on purpose.
* tests/fixtures/campaigns/lone_wolf.{yaml,expected.yaml} — fixture 3
from the design doc; simplest of the six, exercises the full pipeline
with an identity-clusterer placeholder.
* development/CAMPAIGN_CLUSTERING.md — design spec for the feature.
* development/DEVELOPMENT_V2.md — note on DSL evolution path
(concurrent phases, multi-actor per phase) deferred post-v1.