DECNET

Author	SHA1	Message	Date
anti	304592abfe	test(clustering): fixture 4 paused_campaign + active_days/time_window Adds the actor.active_days primitive to the campaign factory so a DSL actor can be bound to specific day indexes. Falls back to the non-paused day pool when absent (existing fixtures unchanged). Intersects with pause_windows so the campaign-wide silence still wins if both are set. Adds time_window_clusterer reference to fixture_harness — union-find over attackers, edge if their session time-ranges are within gap_days of each other. Deliberately-bad reference for fixture 4: multi-day silent stretches fragment a single campaign because the clusterer has no signal that bridges the gap. Fixture 4 (paused_campaign): one campaign modeled as two DSL actors representing the operator's two operational windows (active days 1-2 and 6-7), separated by a silent stretch (days 3-5). Both share JA3 + HASSH + payload + C2 callback; only their active_days differ. Five tests: corpus shape (rows in their windows, shared signals), pipeline pass via fingerprint_clusterer at level=campaign, adversarial fragmentation via time_window_clusterer (1-day union threshold cannot bridge the 4-day silence → completeness collapses), huge-gap sanity (gap_days=10 unions both halves), silent-stretch invariant (no session leaks into the configured pause window). Identity-level scoring is fixture 2's job; this fixture is campaign-level only — modeling caveat documented in the YAML.	2026-04-26 07:39:46 -04:00
anti	0def6f7e37	test(clustering): fixture 2 vpn_hopping + fingerprint/asn references One campaign, one DSL actor, ip_pool: rotating + rotation_count: 5 across 5 synthetic private-use ASNs (RFC 6996 64512-64516). Stable JA3, HASSH, and payload_hash across every rotation — these are the "signals the attacker can't cheaply rotate" per IDENTITY_RESOLUTION.md and the load-bearing reason all 5 observation rows must resolve to one identity / one campaign. Two new reference clusterers in fixture_harness.py: * fingerprint_clusterer — groups by (ja3, hassh). Un-fingerprinted rows stay singleton so it doesn't trivially fuse all noise into one mega-cluster. Approximates the stable-signal arm of the planned similarity graph. * asn_clusterer — deliberately-bad reference for fixture 2's adversarial test. Group-by-ASN shatters the campaign into 5 singletons; completeness collapses to 0. Four tests in test_vpn_hopping_fixture.py: corpus shape (5 rows, 1 identity, 1 campaign, 5 distinct ASNs/IPs, stable fingerprints), pass at campaign level, pass at identity level (asserts ARI exactly 1.0), asn_clusterer breaches the completeness floor.	2026-04-26 07:34:18 -04:00
anti	e80f3eec54	test(clustering): fixture 1 (shared_wordlist) + fixture-harness extraction Two campaigns sharing a credential wordlist; everything else (ASN, IPs, JA3, HASSH, active hours) divergent. Pass condition: clusterer must NOT merge. Protects against the "credential overlap is identity" failure mode that commodity wordlists invite. * tests/clustering/fixture_harness.py — shared assert_fixture_bounds helper + identity_clusterer (placeholder, trivially correct on all-singleton fixtures) + credential_jaccard_clusterer (deliberately-bad reference used to PROVE the fixture catches what it should). * tests/clustering/test_shared_wordlist_fixture.py — bounds pass with identity, bounds FAIL (homogeneity → 0) with the bad credential clusterer. The latter is the proof the fixture earns its keep. * tests/fixtures/campaigns/shared_wordlist.{yaml,expected.yaml}. * tests/clustering/test_lone_wolf_fixture.py — refactored onto the shared harness. No behavior change.	2026-04-26 06:38:17 -04:00
anti	00254629f8	feat(clustering): UKC phase enum + synthetic campaign factory + metric harness Pre-implementation scaffolding for campaign clustering. The simulator is the spec — algorithm code follows once fixtures + metrics are stable. * decnet/clustering/ukc.py — UKCPhase enum (19 phases across In/Through/Out stages), OBSERVABLE_PHASES set, stage_of() helper. Vocabulary aligns with future MITRE ATT&CK tagging so synthetic data and runtime phase inference don't need renaming when TTP-tagging lands. * tests/factories/campaign_factory.py — YAML DSL parser + deterministic generator emitting truth-labeled SyntheticAttacker / SyntheticSession records. Validates phase names, warns on unobservable phases, supports multi-campaign + noise corpora. * tests/clustering/metrics.py — pure-Python ARI / homogeneity / completeness / singleton_recall (no sklearn dep). Decided before any algorithm exists, on purpose. * tests/fixtures/campaigns/lone_wolf.{yaml,expected.yaml} — fixture 3 from the design doc; simplest of the six, exercises the full pipeline with an identity-clusterer placeholder. * development/CAMPAIGN_CLUSTERING.md — design spec for the feature. * development/DEVELOPMENT_V2.md — note on DSL evolution path (concurrent phases, multi-actor per phase) deferred post-v1.	2026-04-26 06:29:10 -04:00
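
The time_window_clusterer described in commit 304592abfe (union-find over attackers, with an edge when two session time-ranges are within gap_days of each other) might look roughly like the sketch below. This is not the code from fixture_harness.py, which the log does not show. The `Attacker` record, its `first_day`/`last_day` fields, and the function name `time_window_clusters` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attacker:
    # Hypothetical record shape; the real SyntheticAttacker fields are not shown in the log.
    attacker_id: str
    first_day: int  # first day index with a session
    last_day: int   # last day index with a session


def time_window_clusters(attackers, gap_days):
    """Union attackers whose active day ranges lie within gap_days of each other."""
    parent = {a.attacker_id: a.attacker_id for a in attackers}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x

    for i, a in enumerate(attackers):
        for b in attackers[i + 1:]:
            # Distance between the two ranges; zero or negative when they overlap.
            gap = max(a.first_day, b.first_day) - min(a.last_day, b.last_day)
            if gap <= gap_days:
                union(a.attacker_id, b.attacker_id)

    groups = {}
    for a in attackers:
        groups.setdefault(find(a.attacker_id), []).append(a.attacker_id)
    return sorted(sorted(members) for members in groups.values())


# Fixture 4 shape: one campaign, two operational windows (days 1-2 and 6-7).
halves = [Attacker("window-a", 1, 2), Attacker("window-b", 6, 7)]
fragmented = time_window_clusters(halves, gap_days=1)
merged = time_window_clusters(halves, gap_days=10)
```

Under these assumptions the distance between the two windows is 4 days, so a 1-day threshold leaves the campaign split into two clusters, while gap_days=10 unions both halves, which matches the adversarial and huge-gap tests named in the commit message.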

4 Commits
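
The fixture 2 claim in commit 0def6f7e37 (grouping by ASN shatters one campaign into 5 singletons and completeness collapses to 0, while grouping by (ja3, hassh) keeps it whole) can be checked with a minimal sketch. It uses the standard entropy-based definition of completeness; the repository's tests/clustering/metrics.py is not shown in the log, so the function names, the row layout, and the placeholder fingerprint values here are assumptions.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def completeness(truth, pred):
    """1 - H(pred | truth) / H(pred); defined as 1.0 when pred is a single cluster."""
    h_pred = entropy(pred)
    if h_pred == 0:
        return 1.0
    by_truth = defaultdict(list)
    for t, p in zip(truth, pred):
        by_truth[t].append(p)
    # Conditional entropy: entropy of predicted labels inside each true class,
    # weighted by the size of that class.
    h_cond = sum(len(ps) / len(pred) * entropy(ps) for ps in by_truth.values())
    return 1.0 - h_cond / h_pred


# Hypothetical corpus: one campaign seen from 5 private-use ASNs (64512-64516)
# with the same JA3 and HASSH on every row.
rows = [{"asn": 64512 + i, "ja3": "ja3-a", "hassh": "hassh-a"} for i in range(5)]
truth = ["campaign-1"] * len(rows)

by_fingerprint = [(r["ja3"], r["hassh"]) for r in rows]  # one cluster
by_asn = [r["asn"] for r in rows]                        # five singletons

fingerprint_score = completeness(truth, by_fingerprint)
asn_score = completeness(truth, by_asn)
```

With a single true campaign, every predicted split adds entropy that the truth labels cannot explain, so the ASN grouping scores 0 and the fingerprint grouping scores 1. The sketch omits the rule that un-fingerprinted rows stay singleton.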