Pre-implementation scaffolding for campaign clustering. The simulator is
the spec — algorithm code follows once fixtures + metrics are stable.
* decnet/clustering/ukc.py — UKCPhase enum (19 phases across In/Through/Out
stages), OBSERVABLE_PHASES set, stage_of() helper. Vocabulary aligns
with future MITRE ATT&CK tagging so synthetic data and runtime phase
inference don't need renaming when TTP-tagging lands.
* tests/factories/campaign_factory.py — YAML DSL parser + deterministic
generator emitting truth-labeled SyntheticAttacker / SyntheticSession
records. Validates phase names, warns on unobservable phases, supports
multi-campaign + noise corpora.
* tests/clustering/metrics.py — pure-Python ARI / homogeneity /
completeness / singleton_recall (no sklearn dep). Decided before any
algorithm exists, on purpose.
* tests/fixtures/campaigns/lone_wolf.{yaml,expected.yaml} — fixture 3
from the design doc; simplest of the six, exercises the full pipeline
with an identity-clusterer placeholder.
* development/CAMPAIGN_CLUSTERING.md — design spec for the feature.
* development/DEVELOPMENT_V2.md — note on DSL evolution path
(concurrent phases, multi-actor per phase) deferred post-v1.
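
The pure-Python ARI mentioned for tests/clustering/metrics.py could look roughly like the sketch below. It is not the code from this commit: the function name adjusted_rand_index and its signature are assumptions for illustration. It uses the standard pair-counting formulation and treats the degenerate case where both labelings are all singletons (or both a single cluster) as perfect agreement.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand Index of two flat labelings (hypothetical helper name)."""
    if len(truth) != len(predicted):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(truth), 2)
    if total_pairs == 0:
        return 1.0  # zero or one item: nothing to disagree about

    # Pairs placed together in both labelings, in truth only, in predicted only.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, predicted)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(predicted).values())

    expected = together_truth * together_pred / total_pairs
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        # Only reachable when both labelings are all singletons
        # or both are one big cluster, i.e. they are identical.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Identical partitions under different label names.
same = adjusted_rand_index(["a", "a", "b", "b"], [0, 0, 1, 1])
# Every pair the truth groups together is split by the prediction.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
# All-singleton corpus, as in a lone-wolf style fixture.
singletons = adjusted_rand_index(list(range(9)), list("abcdefghi"))
```

The all-singleton branch matters here because a corpus with no multi-actor campaign has no co-clustered pairs at all, so the usual formula would divide by zero.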
# Fixture 3 (lone_wolf) — see development/CAMPAIGN_CLUSTERING.md §2.
#
# One opportunistic scanner, Delivery phase only, no follow-up, no shared
# signals with anyone else. Surrounded by background noise. The clusterer
# must keep the wolf and every noise scanner as their own singleton —
# none should be absorbed into anyone else.
#
# This is the simplest of the six fixtures and exists primarily to prove
# the end-to-end pipeline (DSL → factory → clusterer → metrics) before
# we invest in the harder scenarios.

corpus:
  campaigns:
    - campaign:
        id: lone-wolf-001
        actors:
          - id: wolf-a
            asn: 14061
            ip_pool: sticky
            ja3: null
            hassh: null
            hours_active_utc: [3, 4, 5]
            jitter_seconds: 30
        phases:
          - name: delivery
            actor: wolf-a
            target_selector:
              service: any
              count: 1
            dwell_seconds: 1
        duration_days: 1
  noise:
    scanner_count: 8
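
As a rough illustration of what this fixture asks of a clusterer, the sketch below runs an identity-clusterer placeholder over a corpus shaped like the one above (one wolf plus eight noise scanners) and scores it with a singleton-recall check. The function names, the noise-N attacker ids, and the exact definition of singleton recall (the share of truth singletons that remain alone in their predicted cluster) are assumptions for illustration, not the contents of campaign_factory.py or metrics.py.

```python
from collections import Counter


def identity_clusterer(attacker_ids):
    """Placeholder clusterer: every attacker gets its own cluster."""
    return {attacker_id: index for index, attacker_id in enumerate(attacker_ids)}


def singleton_recall(truth, predicted):
    """Fraction of truth singletons that are also alone in the prediction.

    Both arguments map attacker id -> cluster label (assumed shape).
    """
    truth_sizes = Counter(truth.values())
    pred_sizes = Counter(predicted.values())
    truth_singletons = [a for a, label in truth.items() if truth_sizes[label] == 1]
    if not truth_singletons:
        return 1.0  # nothing to recall
    kept = sum(1 for a in truth_singletons if pred_sizes[predicted[a]] == 1)
    return kept / len(truth_singletons)


# Truth labels shaped like the fixture: wolf-a belongs to lone-wolf-001,
# and each of the 8 noise scanners is its own group (ids are hypothetical).
truth = {"wolf-a": "lone-wolf-001"}
truth.update({f"noise-{i}": f"noise-{i}" for i in range(8)})

predicted = identity_clusterer(list(truth))
score = singleton_recall(truth, predicted)

# A clusterer that absorbs everyone into one cluster fails the fixture.
merged = {attacker_id: 0 for attacker_id in truth}
merged_score = singleton_recall(truth, merged)
```

Under these assumptions the identity placeholder scores perfectly on this fixture, which fits its role as a pipeline smoke test rather than a test of clustering quality.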