DECNET

Author	SHA1	Message	Date
anti	10fa8a84d1	docs(roadmap): mark TTL + TCP/IP stack fingerprinting complete TTL extraction was already wired in the active prober and passive sniffer plus profiler rollup; the checkbox was just stale. TCP/IP stack now includes ToS/DSCP/ECN, IP-ID sequence classification, and ISN sequence classification as of the previous three commits.	2026-04-26 20:30:46 -04:00
anti	c595d039bd	feat(sniffer): ISN sequence classifier (reuses seq_class helper) Mirrors the IP-ID classifier for TCP ISN values: per-source-IP rolling deque (maxlen=8) populated from each inbound SYN's tcp.seq, classified on every emission. A 'random' verdict is the modern norm; 'incremental', 'zero', or 'constant' indicates legacy stacks or hand-rolled raw-socket tooling — a strong fingerprint signal. Active prober now also captures server_isn (single sample, not classified in-flight; downstream consumers correlating multi-probe results can apply seq_class.classify_sequence themselves). Profiler rollup carries the latest non-'unknown' label into attacker.tcp_fingerprint. Dedup key already covers isn_class from the previous commit, so transitions emit cleanly. UI surfaces ISN class as a colour-coded tag with a ⚠ glyph for non-random verdicts, since they're the genuinely interesting case.	2026-04-26 20:30:24 -04:00
anti	0e40cc8ae1	feat(sniffer): IP-ID sequence classifier (random/incremental/zero/constant) Adds a per-source-IP rolling sample buffer (deque, maxlen=8) for IP-ID values seen on attacker SYNs and a stdlib-only classifier in decnet/sniffer/seq_class.py. Each new SYN appends ip.id and re-classifies the buffer; the result is logged on tcp_syn_fingerprint events alongside sample count. The dedup key now folds in ipid_class so a transition from 'unknown' to a definitive verdict emits exactly one fresh event instead of being suppressed by the old (os|options) key. Profiler rollup carries the latest non-'unknown' label into attacker.tcp_fingerprint. UI surfaces it as a colour-coded tag in the TCP STACK panel: random neutral, incremental amber, zero/constant green (the strong signal).	2026-04-26 20:28:32 -04:00
anti	b0b08754d0	feat(fingerprint): ToS/DSCP/ECN extraction in active + passive TCP fingerprint Active prober now reads ip.tos from the SYN-ACK and emits tos/dscp/ecn alongside the existing TTL/window/options fields. dscp is folded into the fingerprint hash so different DSCP markings produce distinct signatures. Passive sniffer logs the same three fields on tcp_syn_fingerprint events; profiler rollup carries them into the attacker tcp_fingerprint snapshot; AttackerDetail's TCP STACK panel now surfaces DSCP and ECN cells.	2026-04-26 20:25:37 -04:00
anti	453ab177b4	style(web): scoped Orchestrator.css mirroring Bounty/DeckyFleet pattern Replaces inline styles + .bounty-root reuse with a dedicated .orchestrator-root scope. Adds animated status pill (live/connecting/error), bordered seg-group kind filter that matches DeckyFleet's fleet-filter-group, dedicated kind chips (matrix-green for traffic, violet for file), failure-row tint, and a brief 'fresh' tint for just-prepended live rows that fades after 5s.	2026-04-26 20:09:33 -04:00
anti	8d1c449173	docs(debt): log DEBT-042 + DEBT-043 from orchestrator UI scope DEBT-042 — orchestrator failure-count badge is computed from the in-memory SSE window; remediation is a dedicated stats endpoint. DEBT-043 — no frontend test framework configured; the planned Orchestrator.tsx component test couldn't be written without first adding vitest + RTL.	2026-04-26 20:01:58 -04:00
anti	c5ad04620b	feat(web): Orchestrator page + SSE hook + AUTOMATION nav group New /orchestrator route. Paginated read-only event list with kind filter (all|traffic|file), pause-stream toggle, in-window failure badge ('X failures / 1h'), and an SSE-driven 'live' status pill. Streamed rows prepend on top up to a 500-row in-memory cap. Sidebar gains an AUTOMATION nav group; Orchestrator is the first child. Future workers (mutator/prober activity) plug in as siblings.	2026-04-26 20:01:02 -04:00
anti	3de19eb102	feat(orchestrator): periodic prune of orchestrator_events Every 100 ticks, trim per-dst_decky_uuid history down to 10000 rows (oldest first). Keeps the events table bounded on long-running fleets without paying the cost on every write.	2026-04-26 19:58:43 -04:00
anti	5b5ff54fa2	feat(web): orchestrator events read API + SSE stream GET /api/v1/orchestrator/events — paginated list with optional kind=traffic|file filter. GET /api/v1/orchestrator/events/stream — SSE: snapshot on connect, live forward of orchestrator.> bus events mapped to 'traffic' / 'file' SSE event names. Repo gains list_orchestrator_events(limit, offset, kind?, since_ts?), count_orchestrator_events(kind?), and prune_orchestrator_events (per_dst_cap=10000) for periodic worker-side trimming.	2026-04-26 19:58:12 -04:00
anti	900c0c3ef5	refactor(bus): rename ORCHESTRATOR_ACTIVITY → ORCHESTRATOR_TRAFFIC Aligns the bus token with the DB column value; OrchestratorEvent.kind is 'traffic'/'file' but the topic was 'activity'/'file'. The asymmetry made consumer code (UI filter, SSE event names) need a translation layer. No external subscribers existed yet.	2026-04-26 19:53:40 -04:00
anti	4c37ece39e	feat(orchestrator): MVP synthetic life-injection worker (SSH only) Adds a new decnet orchestrate worker whose job is to keep the honeypot ecosystem from looking suspiciously static — a frozen LAN with no inter-host traffic and no filesystem aging is its own honeypot tell. MVP scope: - New OrchestratorEvent table + repo methods (purpose-built sibling to Log so synthetic events stay separable from attacker-driven ones). - New orchestrator.{activity,file}.<decky_id> bus topics + system.orchestrator.health heartbeat. - SSH-only driver. Traffic action runs python3 inside src container to TCP-connect dst:22 and read the SSH banner — real on-the-wire SSH-protocol traffic without shipping creds. File action drops or refreshes a small file via docker exec on the destination. - Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies are running). Diurnal shaping, role-aware pairing, and session-aware backoff are explicit non-goals for MVP. - CLI registration, systemd unit (SupplementaryGroups=docker), worker-registry entry so the dashboard shows orchestrator health. - 11 tests: scheduler policy, driver argv shape + injection-safety, end-to-end one-tick integration with FakeBus + SQLite.	2026-04-26 19:43:20 -04:00
anti	cc2deb73f7	feat(web): Identities + Campaigns list pages + THREAT DATA nav group Adds proper /identities and /campaigns list pages following the Bounty/Attackers convention (page-header + page-title-group + controls-row + logs-section + logs-table + EmptyState). Both pages live-update via the existing identity / campaign SSE streams. Sidebar: Attackers, Identities, Campaigns now group under a THREAT DATA NavGroup, matching the SWARM grouping pattern. CampaignDetail and IdentityDetail rewritten to use the house class system (page-header / logs-section / chip / dim-chip) instead of inline styles. The campaign chip on IdentityDetail navigates to /campaigns/:uuid; both pages share a small fp-group helper for fingerprint listings (added to Dashboard.css).	2026-04-26 09:32:00 -04:00
anti	7fafdd66de	feat(deploy): systemd units for identity + campaign clusterers decnet-clusterer.service.j2 ships the identity clusterer that landed last session (was overlooked) — bus-woken on attacker.>, publishes identity.> events. decnet-campaign-clusterer.service.j2 ships the campaign clusterer from this session — bus-woken on identity.>, publishes campaign.> events plus the cross-family identity.campaign.assigned. After= decnet-clusterer.service so the identity layer is up before the campaign layer reads its rows. decnet.target Wants both new units. Both follow the same security hardening profile as enrich + reuse-correlator.	2026-04-26 09:22:02 -04:00
anti	d531cea536	feat(web): read-only campaigns API + SSE + frontend API: /api/v1/campaigns (paginated list), /api/v1/campaigns/{uuid} (soft-merge chain follow), /api/v1/campaigns/{uuid}/identities (member identities), and /api/v1/campaigns/events (SSE under campaign.> + JWT-via-?token=, snapshot-on-connect). Mirror of the identity router; same auth, same shape, same OpenAPI tags pattern. Frontend: CampaignDetail.tsx page (same visual vocabulary as IdentityDetail), useCampaignStream hook (mirror of useIdentityStream), /campaigns/:id route, IdentityDetail's CAMPAIGN badge becomes clickable and navigates to the campaign. useIdentityStream now listens for identity.campaign.assigned so the badge appears live without a manual refresh.	2026-04-26 09:20:17 -04:00
anti	75af00c9c8	test(clustering): full-bound passes through production campaign clusterer Runs the chained identity + campaign clustering pipeline against all seven fixtures via from_synthetic / from_synthetic_identity adapters and ratchets every YAML floor to 1.0 — the production clusterer (and the reference clusterers used in the per-fixture tests) all score perfectly across ARI / homogeneity / completeness / singleton_recall on each fixture. Three substrate fixes surfaced by the ratchet: - Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved into cohort_weight to prevent fleet-scarcity false-merges (F1's shared_wordlist failure mode). Tier weight raised to 1.0 so shared payload+C2 alone crosses threshold (F5's intended pass). - Adapter: from_synthetic_identity now reads SyntheticSession started_at + duration_s for session_windows and per-decky timestamps (the production-row adapter still uses start_ts/end_ts when available). - Fixture data: paused_campaign.yaml's JA3 collided exactly with vpn_hopping.yaml's (same TLS extension list). The collision fused two unrelated campaigns under the chained identity layer in the noise_floor composite. Made paused's JA3 distinct. Also wires Campaign / CampaignsResponse into models/__init__.py's __all__ that was missed in the schema commit.	2026-04-26 09:13:59 -04:00
anti	6936a1426c	feat(clustering): campaign-clusterer worker + bus topics + CLI The campaign clusterer worker mirrors the identity-side worker shell (bus connect, heartbeat, control listener, slow-tick fallback) but wakes on identity.> instead of attacker.> — campaign-level work is gated on identity-layer changes, not raw observations. The connected-components implementation reads identities via list_identities_for_clustering, projects them with from_identity_row, runs union-find over combined_campaign_weight, writes campaigns rows, sets attacker_identities.campaign_id, and runs the same revocable-merge pass as the identity layer (a merged-out campaign whose identities no longer co-cluster with the winner gets revoked). Bus: adds campaign.> family (formed / identity.assigned / merged / unmerged) plus the cross-family identity.campaign.assigned so existing identity-stream subscribers see the badge update without having to subscribe to campaign.>. Wiki Service-Bus.md updated in wiki-checkout in the same wave per the project's bus-signals discipline. CLI: decnet campaign-clusterer registered as master-only via MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity clusterer command surface.	2026-04-26 09:04:00 -04:00
anti	0946bab424	feat(clustering): campaign-level similarity primitives The signal taxonomy for the campaign clusterer (next commit). Mirror of the identity-layer module but with edge families that don't translate 1:1: phase-handoff (load-bearing for F5 multi_operator — the signal the identity-side fingerprint-disagreement veto deliberately isn't), shared-infra (vetoed at identity level, primary positive signal here), temporal-overlap (pairwise-relative — F7 invariance preserved), cohort (weak supporting weight only). Tier weights tuned so phase-handoff alone crosses threshold (F5), shared-infra + temporal-overlap together cross (canonical co-op pattern), and shared-infra + cohort together do NOT (F1 shared_wordlist's failure mode). The F7 time-shift invariant is explicitly tested on every time-bearing edge and on the combined weight.	2026-04-26 08:57:46 -04:00
anti	0a1cf65ddb	feat(db): Campaign SQLModel + repo write/read methods Adds the campaigns table and the BaseRepository / SQLModelRepository methods that the campaign-clusterer worker (next commit) needs to populate it. Mirrors the AttackerIdentity layer: schema_version from day one for federation gossip, soft-merge via merged_into_uuid with a chain-walking get_campaign_by_uuid, list_campaigns excluding merged-out rows while list_all_campaigns returns the unfiltered set for the revoke pass. attacker_identities.campaign_id gets a real FK now that the target table exists.	2026-04-26 08:54:28 -04:00
anti	059d1dba75	feat(web): live identity-resolution updates via SSE useIdentityStream hook mirrors useTopologyStream — opens an EventSource against /api/v1/identities/events with the JWT in ?token=, dispatches the five named events (snapshot, formed, observation.linked, merged, unmerged) to the consumer, reconnects 3s after any error. AttackerDetail subscribes whenever it has an attacker id loaded. On any event whose payload references this observation's uuid OR the attacker's current identity_id, refetch /attackers/{id} so the IDENTITY badge appears (or follows through merges / unmerges) live without a tab refocus. IdentityDetail subscribes whenever it has an identity id loaded. On any event whose payload references this identity_id (formed for it, merge winner / loser, unmerge resurrected / former-winner), it refetches both the identity row and its observations list. Both consumers filter inside onEvent — the hook itself is dumb glue and stays unaware of which uuids any given component cares about.	2026-04-26 08:38:27 -04:00
anti	97aa57faed	feat(api): SSE stream for identity events at /api/v1/identities/events Mirrors GET /api/v1/topologies/{id}/events: subscribes to identity.> on the bus for the duration of the request and forwards each event as a named SSE frame (formed / observation.linked / merged / unmerged). The endpoint is broadly scoped (every identity event, not per-uuid) because both AttackerDetail and IdentityDetail need the same firehose: AttackerDetail watches for an identity.formed that finally binds its identity_id; IdentityDetail watches for observation.linked / merged / unmerged against its current row. A per-uuid filter would force the client to know its identity before subscribing, which it doesn't always. JWT via ?token= (EventSource can't set headers), require_stream_viewer gate, sse_connection_slot per-user cap, snapshot-on-connect with the first 50 identities so the client buffer renders without a separate REST call. Bus-disabled / unreachable path keeps the connection alive on keepalives so the client doesn't reconnect-storm; it can re-poll the REST API on its own timer.	2026-04-26 08:36:17 -04:00
anti	e364ef8859	feat(clustering): revocable merges (merge + unmerge) Reworks the clusterer's tick to handle multi-identity components and re-evaluate prior merges. Two passes per tick: Pass 1 — per-component reconciliation: * Fresh component → mint identity (commit 4 path). * Single-identity component → link unassigned observations. * Multi-identity component → soft-merge: pick the smallest-uuid winner deterministically, set merged_into_uuid on each loser, link unassigned observations to the winner. Observations stay FK'd to their original identity row — the merge is a soft pointer, not a re-point. Audit trail preserved; cached subscribers resolve through the chain. Pass 2 — revocable-merge undo: * For each merged-out identity, check whether its observations still cluster with its winner's. If not, the merge is contradicted by new evidence — clear merged_into_uuid and emit identities_unmerged. The resurrected identity keeps its original uuid, so subscribers that cached it during the merged interval re-attach without a new lookup. A pre-built merge-chain dict feeds Pass 1 so the effective-identity lookup is O(1) per observation. The chain has a hop cap (paranoia against accidental cycles in the underlying state). Repo additions on BaseRepository + SQLModelRepository: * list_all_identities() — includes merged-out rows. * update_identity_merged_into(uuid, winner_or_None) — single setter for both merge and unmerge. DummyRepo coverage stub updated. Tests: * Two distinct identities bridged by a new observation merge with the smaller uuid as winner. * A pre-seeded soft-merge whose underlying observations diverge gets revoked; resurrected uuid emerges with merged_into_uuid cleared. * Tick is idempotent under no state changes.	2026-04-26 08:33:32 -04:00
anti	87412da1ca	test(clustering): F6 noise-floor ratchets for production clusterer Two targeted invariants instead of a wholesale YAML-bounds re-use, because the existing F6 bounds were tuned for the reference composite_signals_clusterer (fingerprint OR C2). The production clusterer trades that aggregation for tier discipline + the fingerprint-disagreement veto, so its score profile differs even when its judgments are correct — multi_operator stays as 2 truth identities, paused_campaign's two DSL actors remain a single cluster because they share fingerprints, etc. Wholesale bounds re-use would fight the design. The two production-side ratchets: 1. singleton_recall ≥ 0.95 at campaign-level scoring — truth-singleton noise scanners must not be absorbed into real campaigns. This is the F6 failure mode that motivates the fixture. 2. Intra-campaign recovery under cross-corpus interference: * vpn_hopping's 5 rotations consolidate to one cluster. * shared_wordlist A and B stay in disjoint clusters despite sharing credentials with each other (and with the noise floor). A future commit can revisit when the production clusterer's identity-level truth alignment improves (e.g. when paused_campaign's DSL is extended to mark its two actors as one truth identity).	2026-04-26 08:28:31 -04:00
anti	7923006203	test(clustering): F7 slow-burn time-agnostic invariant Fixture 7 ratchet: one campaign across 3 multi-week operational windows with stable JA3 + HASSH + C2. The production clusterer must fold all 3 into one cluster despite multi-week silence between windows; completeness = 1.0. Time-shift invariance test: applying a +90 day delta to every session start (and the per-attacker first/last seen) must produce the same cluster membership as the baseline. This is the runtime counterpart of the static no-time-fields check on Observation. If either check ever fails, the clusterer has accidentally grown a recency-aware edge — fixture 7's whole reason for existing.	2026-04-26 08:26:23 -04:00
anti	6a4592a8f5	test(clustering): low/very-low tier safety + F1/F2 ratchets Pins down the tier-discipline contract end-to-end: - Credentials-only overlap doesn't fuse observations (F1 in miniature). - ASN-only overlap doesn't fuse observations (F2 in miniature). - All three weak tiers (medium + low + very-low) stacked still don't fuse — only a high-tier signal does. - F1 (shared_wordlist) at identity-level: no false merges, every row is its own predicted cluster, homogeneity = 1.0. - F2 (vpn_hopping): 5 distinct ASNs collapse into 1 predicted cluster, proving JA3 / HASSH dominate ASN as the design requires. The combination math itself was wired in commit 5; this commit is the failure-mode regression suite that gates future tuning of the tier weights.	2026-04-26 08:25:23 -04:00
anti	ed323581fe	feat(clustering): fingerprint-disagreement veto for fixture 5 Two operators cooperating on one campaign can share C2 endpoints + stage-1 payloads while running distinct tooling — fixture 5 (multi_operator) is the canonical demonstration. The identity clusterer must NOT fuse them: shared infra is a campaign-level signal, not an identity-level one. The campaign clusterer (downstream work) handles that grouping over identities. Mechanism: when two observations have non-null fingerprints AND the fingerprints fully disagree, the high-weight tier drops the payload and C2 contributions to zero. JA3 / HASSH agreement still returns 1.0 directly — no veto applies when something agrees. Partial agreement (one slot agrees, another disagrees) is treated as agreement, since stable-tool partial overlap is more consistent with one identity than two. The veto only triggers when there is actual disagreement evidence — two un-fingerprinted observations sharing a C2 still cluster, since the absence of fingerprints is not the same as disagreement on them. Fixture 5 production-clusterer assertion added at identity level: ARI = 1.0, homogeneity = 1.0, exactly 2 predicted clusters from 2 truth identities. Phase-handoff edges (from the TODO) belong to the downstream campaign clusterer, not this identity clusterer.	2026-04-26 08:24:22 -04:00
anti	f7da33726c	feat(clustering): combined edge weight + medium-tier wiring The clusterer now drops a single high-tier function call in favor of a tier-weighted sum. Tier multipliers (high=1.0, medium=0.6, low=0.2, very_low=0.05) are tuned so the threshold (1.0) admits high-tier agreement alone while leaving every weaker tier — and every combination of weaker tiers — under threshold. Per-tier discipline tested: - high alone clusters - medium alone does NOT cluster (supporting signal only) - low alone does NOT cluster (fixture 1's failure mode) - very-low alone does NOT cluster (fixture 2's failure mode) - all three weak tiers stacked still don't reach threshold - high + medium clusters (high already saturates) The combination is forward-compatible: low + very-low contributions are computed today but always project to 0.0 because the production adapter doesn't populate credentials / ASN-edge inputs into the fixture path yet. Their contribution becomes load-bearing in commit 7 when the low-tier landing tightens the F1 / F2 bounds. Fixture 4 (paused_campaign) ratchet added: high-tier signal carries the multi-day-silence campaign into one identity. Time-agnostic invariant — silence is irrelevant to the edge weight.	2026-04-26 08:22:10 -04:00
anti	de2f4c3a62	feat(clustering): wire high-weight edges end-to-end The connected-components clusterer now writes attacker_identities rows + sets attackers.identity_id when high-weight signals (JA3 / HASSH / payload-hash / C2-endpoint exact match) agree across observations. Singletons stay un-fingerprinted and un-clustered. Algorithm split: - cluster_observations(observations) — pure union-find over the high-weight edge function. Same code path for fixture validation and production tick. - from_attacker_row(row) — production-row adapter; recovers JA3 + HASSH from Attacker.fingerprints JSON. Payload + C2 join from logs in later commits; the function shape doesn't change. Repo additions on BaseRepository + SQLModelRepository: - list_attackers_for_clustering(limit=None) - create_attacker_identity(row) - set_attacker_identity_id(attacker_uuid, identity_uuid) DummyRepo coverage stub updated. v1 behavior is conservative: only assigns identities to observations whose identity_id is currently NULL. Multi-identity components are skipped this pass — merge / re-assign lands in commit 10 with revocable merges. Fixture bounds tightened against the production clusterer: - lone_wolf (F3) — singletons stay singletons - shared_wordlist (F1) — credential-only overlap doesn't cluster (high-weight tier doesn't include credentials) - vpn_hopping (F2, identity-level) — 5 rotated IPs with stable JA3 + HASSH fold into one identity, ARI = 1.0, completeness = 1.0	2026-04-26 08:19:56 -04:00
anti	a9775c4000	feat(clustering): similarity-graph primitives Adds the four weight-tier edge functions as pure, time-agnostic scoring primitives over an Observation projection. Each returns a score in [0, 1]; the connected-components impl will combine + threshold in subsequent commits. Tier semantics (from IDENTITY_RESOLUTION.md): - high — JA3/HASSH/payload-hash/C2-endpoint exact match - medium — phase-bucketed command-sequence Jaccard - low — credential-attempt-set Jaccard (defeated alone by F1) - very low — ASN equality (defeated alone by F2) Time-agnostic invariant is a static test: Observation has no time fields, so no edge function can silently start using them. Fixture 7 forbids recency-decay clustering on multi-month APT campaigns. A from_synthetic() adapter projects SyntheticAttacker corpora into Observation; the production-row adapter lands when the clusterer starts reading the attackers table.	2026-04-26 08:13:29 -04:00
anti	fb522af107	feat(bus): reserve identity.unmerged topic Revocable merges (a contradiction-driven undo of identity.merged) ship in the clusterer work; this reserves the topic up-front so identity.> subscribers receive it day one without a re-subscribe. The clusterer worker's ClusterResult fan-out now publishes on identity.unmerged when populated. The skeleton clusterer never populates it; the revocable-merge commit will. Wiki update lives in wiki-checkout/Service-Bus.md (separate repo).	2026-04-26 08:10:56 -04:00
anti	e545f7d8d3	feat(clustering): identity clusterer worker skeleton Adds the decnet clusterer master-only command + provider-subpackage shape (base.py + factory.py + impl/connected_components.py) so subsequent commits can land similarity-graph features without churning callers. The skeleton ConnectedComponentsClusterer.tick is a no-op; the worker shell is fully wired (bus consumer on attacker.observed + attacker.scored, slow-tick fallback, health heartbeat, control listener, ClusterResult fan-out to identity.formed/observation.linked /merged). Subscribers on identity.> see no events from this clusterer until edge functions land, but the lifecycle is in place.	2026-04-26 08:09:11 -04:00
anti	6b6a808a4a	test(clustering): fixture 7 slow_burn + recency_decay reference Multi-month APT campaign modeling real APT operational tempo: recon over weeks, exploitation later, action-on-objectives later still. The unique signal this fixture stresses is TIME-AGNOSTIC IDENTITY across multi-week silences — a clusterer that silently expires old edges fragments any campaign that operates over months. Three DSL actors represent the operator's three operational windows (week 2, month 2, month 3 of a 90-day campaign), all sharing JA3 + HASSH + payload + C2 callback. Campaign-level fixture only — the three actors mint distinct truth_identity_id rows by design (same modeling caveat as fixtures 4 and 5). The fixture's narrative mirrors how an APT works a deep nested topology (DECNET MazeNET mode): map decoy networks for weeks, only then commit to exploitation. Slow-and-low pacing is the signal. recency_decay_clusterer added to fixture_harness — same edge construction as composite_signals_clusterer, but each edge weighted by exp(-time_distance / half_life_days) and dropped below a threshold. Adversarial reference for slow_burn: with 14-day half-life and 0.5 threshold, edges between operational windows (24+ days apart) decay below threshold and drop. The campaign fragments into three clusters; completeness collapses. This is the canonical production failure mode for graph clusterers that bound memory or bias toward "what's hot" by silently expiring old edges. Catching it in synthetic data is what fixture 7 exists for; the replay tier will surface real-world drift / dwell patterns that calibrate the half-life threshold the real algorithm should tolerate. Four tests: corpus shape (window-isolated sessions, stable fingerprint), pipeline pass via composite_signals_clusterer (time-agnostic — folds all three windows), adversarial fragmentation (3 clusters at 14-day half-life), long-half-life sanity (gentle decay unions everything; confirms behavior depends on the half-life parameter, not on something unrelated).	2026-04-26 07:58:23 -04:00
anti	7021fda0e6	test(clustering): fixture 6 noise_floor (composite + cross-corpus) Bundles all five prior fixtures' campaigns into one corpus alongside 10 fresh Delivery-only noise scanners (on top of lone_wolf's 8 inherited). The fixture covers cross-corpus interference — signal collisions across fixtures' JA3/HASSH/C2 strings, factory ID re-use, clusterer ambiguity that only manifests when multiple campaigns score together. Each constituent fixture already ships its own in-fixture adversarial test; this one is the control for the class of failures that single-corpus fixtures cannot catch. Composition is declared via a fixture-6-specific include_fixtures block in noise_floor.yaml. The test file's loader expands it into a full corpus.campaigns spec at runtime so the factory itself stays unaware — no factory primitive added for what only this fixture needs. The 8 noise scanners declared by lone_wolf flow through naturally; the extra_noise_scanners count adds 10 more. composite_signals_clusterer (added in the fixture-5 commit) is the pass clusterer — union-find combining (ja3, hassh) match OR overlapping C2 callback. Approximates the planned similarity graph well enough that every campaign resolves and every singleton stays singleton in the merged corpus. Three tests: corpus integrity (every campaign id present, 12 campaign-driven attackers + 18 noise = 30 total), pipeline pass against the global bounds, and an explicit singleton-recall assertion (21 truth-singletons — 1 lone wolf, 18 noise, 2 shared_wordlist actors whose campaigns are size 1 — all kept singleton by the composite clusterer). Singleton recall is the load-bearing metric here: noise absorption is the failure mode that makes campaign attribution useless in practice.	2026-04-26 07:49:36 -04:00
anti	27f7de9886	test(clustering): fixture 5 multi_operator + c2/shift/composite refs Three new reference clusterers in fixture_harness: * c2_callback_clusterer — union-find on overlapping C2 callback sets across an attacker's sessions. Pass-clusterer for fixture 5 where two operators with distinct tooling share a C2 endpoint as the campaign signal. * shift_clusterer — deliberately-bad reference that buckets attackers by majority session-start hour into night/day/swing. Adversarial reference for fixture 5; proves operational schedule is NOT a campaign signal. * composite_signals_clusterer — union-find combining (ja3, hassh) match OR overlapping C2 callback. Will serve as the pass-clusterer for fixture 6 (noise_floor) where multiple campaigns with heterogeneous signal types are scored together. Also factored a small _union_find helper for the new clusterers (existing time_window/credential_jaccard left untouched to avoid mixing refactor with feature work). Fixture 5 (multi_operator): one campaign, two operators with distinct UKC roles. Actor A (broker, night shift): Delivery → Exploitation → Persistence → C2. Actor B (post-ex, day shift): Discovery → Lateral Movement → Collection → Exfiltration. Distinct JA3/HASSH/ASN/IPs; shared C2 + payload hash. Four tests: corpus shape (distinct fingerprints, shared C2, disjoint shifts), pipeline pass via c2_callback_clusterer, explicit harness sanity that fingerprint_clusterer cannot resolve this fixture (documents which signal carries the campaign), and adversarial shift_clusterer fragmentation. Phase-handoff edges (the real load-bearing signal per the design doc) wait for the production clusterer; this fixture will prove they're needed when it ships.	2026-04-26 07:46:14 -04:00
anti	304592abfe	test(clustering): fixture 4 paused_campaign + active_days/time_window Adds the actor.active_days primitive to the campaign factory so a DSL actor can be bound to specific day indexes. Falls back to the non-paused day pool when absent (existing fixtures unchanged). Intersects with pause_windows so the campaign-wide silence still wins if both are set. Adds time_window_clusterer reference to fixture_harness — union-find over attackers, edge if their session time-ranges are within gap_days of each other. Deliberately-bad reference for fixture 4: multi-day silent stretches fragment a single campaign because the clusterer has no signal that bridges the gap. Fixture 4 (paused_campaign): one campaign modeled as two DSL actors representing the operator's two operational windows (active days 1-2 and 6-7), separated by a silent stretch (days 3-5). Both share JA3 + HASSH + payload + C2 callback; only their active_days differ. Five tests: corpus shape (rows in their windows, shared signals), pipeline pass via fingerprint_clusterer at level=campaign, adversarial fragmentation via time_window_clusterer (1-day union threshold cannot bridge the 4-day silence → completeness collapses), huge-gap sanity (gap_days=10 unions both halves), silent-stretch invariant (no session leaks into the configured pause window). Identity-level scoring is fixture 2's job; this fixture is campaign-level only — modeling caveat documented in the YAML.	2026-04-26 07:39:46 -04:00
anti	0def6f7e37	test(clustering): fixture 2 vpn_hopping + fingerprint/asn references One campaign, one DSL actor, ip_pool: rotating + rotation_count: 5 across 5 synthetic private-use ASNs (RFC 6996 64512-64516). Stable JA3, HASSH, and payload_hash across every rotation — these are the "signals the attacker can't cheaply rotate" per IDENTITY_RESOLUTION.md and the load-bearing reason all 5 observation rows must resolve to one identity / one campaign. Two new reference clusterers in fixture_harness.py: * fingerprint_clusterer — groups by (ja3, hassh). Un-fingerprinted rows stay singleton so it doesn't trivially fuse all noise into one mega-cluster. Approximates the stable-signal arm of the planned similarity graph. * asn_clusterer — deliberately-bad reference for fixture 2's adversarial test. Group-by-ASN shatters the campaign into 5 singletons; completeness collapses to 0. Four tests in test_vpn_hopping_fixture.py: corpus shape (5 rows, 1 identity, 1 campaign, 5 distinct ASNs/IPs, stable fingerprints), pass at campaign level, pass at identity level (asserts ARI exactly 1.0), asn_clusterer breaches the completeness floor.	2026-04-26 07:34:18 -04:00
anti	943bb3a39d	docs(identity): resolve merge revocability + SSE open questions Open Question 1 (merge revocability): adopted. The clusterer will clear merged_into_uuid on contradicting evidence and publish a new identity.unmerged topic alongside the existing three identity.* topics so subscribers on identity.> get it from day one. Open Question 2 (AttackerDetail UX on identity_id change): adopted SSE over refresh-on-focus. New endpoint will mirror the existing topology mutator SSE (bus subscription on identity.>, JWT via ?token=, snapshot-on-connect + live forward). Risk 2 (API URL stability for soft-merged loser UUIDs): struck — already shipped in commit `dc3d08d` (read-only API follows merged_into_uuid and surfaces the canonical winner).	2026-04-26 07:33:36 -04:00
anti	f6b83755eb	test(clustering): factory honors ip_pool: rotating + 3-level truth labels Fifth and final commit of the identity-resolution substrate. Unblocks fixture 2 (vpn_hopping) by making the synthetic factory match production shape: an actor rotating across N IPs produces N SyntheticAttacker rows that share fingerprints + truth_identity_id but differ on ip / asn — exactly the shape the future clusterer needs to recover via JA3/HASSH match. Factory: * SyntheticSession + SyntheticAttacker gain truth_identity_id field. * DSL: ip_pool: rotating + rotation_count: N produces N observation rows per actor. Optional rotation_asns: [...] cycles ASN per row; defaults to the actor's primary asn. * Sessions distribute round-robin across the actor's rotated rows. * Noise scanners get truth_identity_id == truth_actor_id == truth_campaign_id (each is its own singleton at every level). * GeneratedCorpus.truth_labels(level=) accepts "campaign" (default, back-compat), "identity", or "actor" — picks the oracle the metric harness scores against. Harness: * assert_fixture_bounds gains truth_level kwarg (default "campaign") so identity-resolution fixtures can score against truth_identity_id without churning the campaign-clustering test files. Tests: 9 new (rotation_count emits N rows, shared identity + fingerprints, distinct IPs, rotation_asns distribution + cycling, round-robin session distribution, identity-level truth labels, sticky default unchanged, sessions inherit identity label). 598 tests green across clustering / factories / db / web / bus / profiler / correlation.	2026-04-26 07:19:39 -04:00
anti	4f1077be72	feat(bus): identity.* topic family (formed / observation.linked / merged) Fourth of the five-step identity-resolution substrate. Constants and builder ship now; no publishers exist yet — they land with the clusterer worker. Subscribers (webhook worker, dashboard SSE relay) can register against identity.> from day one. * decnet/bus/topics.py — IDENTITY root + IDENTITY_FORMED / IDENTITY_OBSERVATION_LINKED / IDENTITY_MERGED leaves; identity() builder mirroring the attacker() / system() helpers. Module docstring topic-tree updated. * tests/bus/test_topics.py — assert builder produces the expected three topic strings + rejects empty event_type. Wiki Service-Bus.md and a new Identity-Resolution.md page land in the companion wiki-checkout commit.	2026-04-26 07:15:44 -04:00
anti	448212ebcd	feat(web-ui): IdentityDetail page + conditional Identity badge on AttackerDetail Third of the five-step identity-resolution substrate. Frontend hooks into the empty /api/v1/identities/* surface from commit 2; renders nothing visible when identity_id is null (which is the universal state until the clusterer ships). * decnet_web/src/components/IdentityDetail.tsx — new page. Header with uuid + optional CAMPAIGN / MERGED-INTO badges, stats row (observations / JA3 / HASSH / payloads / C2), fingerprint tag lists parsed from the JSON-in-TEXT columns, observations table that links back to AttackerDetail, conditional analyst-notes panel. * decnet_web/src/components/AttackerDetail.tsx — IDENTITY badge inserted in the header row alongside TRAVERSAL. Clicking navigates to /identities/<uuid>. AttackerData interface gains the optional identity_id field. * decnet_web/src/App.tsx — /identities/:id route + lazy-loaded chunk. Verified by `tsc --noEmit` (clean) and `vite build` (clean — produces IdentityDetail-*.js as its own lazy chunk). The repo has no JS test harness; build + type-check are the gate.	2026-04-26 07:12:37 -04:00
anti	dc3d08dd41	feat(web): read-only /api/v1/identities/* endpoints + repo methods Second of the five-step identity-resolution substrate. Ships the API surface against the empty AttackerIdentity table from commit 1 — every endpoint returns empty/404 cleanly until the clusterer populates rows. Routes (auth-gated, viewer role): * GET /api/v1/identities — paginated list, excludes merged-out rows * GET /api/v1/identities/{uuid} — detail; transparently follows merged_into_uuid to surface the canonical winner * GET /api/v1/identities/{uuid}/observations — Attacker rows FK'd to the (resolved) identity uuid Repository (BaseRepository abstract + SQLModelRepository concrete): * get_identity_by_uuid (with merge-chain following, hop-bounded) * list_identities / count_identities (excluding merged-out) * list_observations_for_identity / count_observations_for_identity Tests: 12 new (empty-table behavior, seeded data, merge-chain resolution, repo-level smoke against real SQLite). Also fixes the pre-existing test_base_repo_coverage failure (DEBT-041 added abstract methods without updating the DummyRepo stub) — included here because this PR adds 5 more abstract methods, fixing it as a bonus. 474 db/web/profiler/correlation tests green.	2026-04-26 07:08:55 -04:00
anti	84c1ca9c9b	feat(identity): AttackerIdentity table + nullable attackers.identity_id FK Schema-only commit, first of the five-step substrate for identity resolution. The clusterer that populates identities lands later; this ships the table empty and the FK uniformly NULL on existing rows. * decnet/web/db/models/attackers.py — new AttackerIdentity SQLModel (uuid PK, schema_version, fingerprint summary lists, kd_digraph_simhash, merged_into_uuid self-FK, all clusterer-populated fields nullable). Attacker grows a nullable indexed identity_id FK + docstring marking it as the per-IP observation row. * decnet/web/db/models/__init__.py — re-exports AttackerIdentity. * tests/db/test_identity_schema.py — 9 schema invariants: table exists, identity_id nullable + indexed, FK targets attacker_identities.uuid, schema_version defaults to 1, attacker rows inserted with NULL identity_id, FK constraint blocks orphans. 463 unrelated db/web/profiler/correlation tests still green. See development/IDENTITY_RESOLUTION.md for the full design.	2026-04-26 07:00:24 -04:00
anti	7904ef1308	docs(identity): IDENTITY_RESOLUTION.md design spec Pre-implementation design for the observation/identity/campaign three-level hierarchy. Sibling-add approach (not rename) — keep the attackers table name, add attacker_identities as a sibling, nullable attackers.identity_id FK. Documents the rationale, schema, bus topics, API surface, and the 5-commit implementation sequence. Companion to development/CAMPAIGN_CLUSTERING.md. Substrate for the clusterer worker designed there; ships empty so the campaign clustering fixtures can encode honest multi-row-per-actor scenarios.	2026-04-26 06:56:40 -04:00
anti	e80f3eec54	test(clustering): fixture 1 (shared_wordlist) + fixture-harness extraction Two campaigns sharing a credential wordlist; everything else (ASN, IPs, JA3, HASSH, active hours) divergent. Pass condition: clusterer must NOT merge. Protects against the "credential overlap is identity" failure mode that commodity wordlists invite. * tests/clustering/fixture_harness.py — shared assert_fixture_bounds helper + identity_clusterer (placeholder, trivially correct on all-singleton fixtures) + credential_jaccard_clusterer (deliberately-bad reference used to PROVE the fixture catches what it should). * tests/clustering/test_shared_wordlist_fixture.py — bounds pass with identity, bounds FAIL (homogeneity → 0) with the bad credential clusterer. The latter is the proof the fixture earns its keep. * tests/fixtures/campaigns/shared_wordlist.{yaml,expected.yaml}. * tests/clustering/test_lone_wolf_fixture.py — refactored onto the shared harness. No behavior change.	2026-04-26 06:38:17 -04:00
anti	00254629f8	feat(clustering): UKC phase enum + synthetic campaign factory + metric harness Pre-implementation scaffolding for campaign clustering. The simulator is the spec — algorithm code follows once fixtures + metrics are stable. * decnet/clustering/ukc.py — UKCPhase enum (19 phases across In/Through/Out stages), OBSERVABLE_PHASES set, stage_of() helper. Vocabulary aligns with future MITRE ATT&CK tagging so synthetic data and runtime phase inference don't need renaming when TTP-tagging lands. * tests/factories/campaign_factory.py — YAML DSL parser + deterministic generator emitting truth-labeled SyntheticAttacker / SyntheticSession records. Validates phase names, warns on unobservable phases, supports multi-campaign + noise corpora. * tests/clustering/metrics.py — pure-Python ARI / homogeneity / completeness / singleton_recall (no sklearn dep). Decided before any algorithm exists, on purpose. * tests/fixtures/campaigns/lone_wolf.{yaml,expected.yaml} — fixture 3 from the design doc; simplest of the six, exercises the full pipeline with an identity-clusterer placeholder. * development/CAMPAIGN_CLUSTERING.md — design spec for the feature. * development/DEVELOPMENT_V2.md — note on DSL evolution path (concurrent phases, multi-actor per phase) deferred post-v1.	2026-04-26 06:29:10 -04:00
anti	3eb67c9400	refactor(intel): re-key attacker_intel on attacker_uuid (closes DEBT-041) The threat-intel surface was IP-keyed on day one as an expedient — the worker is woken by IP-bearing bus events. ANTI's call: don't carry that debt. NO IPs as primary keys anywhere on the attacker-intel surface. Schema: - attacker_uuid is now the canonical key — UNIQUE + FK to attackers.uuid. - attacker_ip stays as a denormalised, indexed, NON-UNIQUE value column. Updated on every upsert; useful for SIEM payloads and audit lookups, but explicitly NOT a key. Model docstring says so. - Pre-v1, no Alembic migration needed. SQLModel.metadata.create_all() builds the new shape on fresh DBs. Repo: - upsert_attacker_intel now keys on attacker_uuid. - get_attacker_intel_by_ip → get_attacker_intel_by_uuid. - get_unenriched_attacker_ips → get_unenriched_attackers, returning [{uuid, ip}] tuples so the worker writes by UUID and dispatches provider calls by IP without a second round-trip. Worker: - _enrich_one(uuid, ip, ...) — UUID lands on the row, IP rides for provider egress. - attacker.intel.enriched bus payload gains attacker_uuid alongside attacker_ip — webhook → SIEM consumers benefit; no removal. API: - GET /api/v1/attackers/{ip}/intel deleted outright (rip-and-replace, never deployed beyond dev). - GET /api/v1/attackers/{uuid}/intel is the only public route, matching every other /attackers/* route. Frontend: - <IntelPanel uuid={id!} /> uses the URL param directly, fetches in parallel with the rest of AttackerDetail rather than waiting on attacker.ip. Tests: re-keyed in place, 39 passed (same coverage as before the refactor). Provider-impl tests untouched. DEBT-041: closed in DEBT.md (entry preserved as historical rationale, summary table flipped to ✅, remaining-open list shortened by one).	2026-04-26 05:35:29 -04:00
anti	a009549326	feat(web): IntelPanel on AttackerDetail + DEBT-041 entry Read-only IP-keyed intel surface on the attacker detail page. Renders the aggregate verdict (color-coded MALICIOUS/SUSPICIOUS/BENIGN/NO SIGNAL) plus a per-provider row with verdict, queried-at timestamp, and provider-specific detail (GreyNoise classification, AbuseIPDB 0-100 score, Feodo C2 listing + malware family, ThreatFox IOC match + malware family). 404 from the API renders as 'NO INTEL CACHED YET' with a hint that decnet enrich will populate it on the next pass — TTL drives the refresh, no manual button. DEBT-041 documents the API/UI IP-keying as a v1 expedient that will need a UUID-keyed sibling endpoint before federation lands. NAT collisions, attacker.uuid consistency across attacker routes, and the sequential-fetch UX are all callouts on that ticket; the migration sketch is laid out so the v1.x followup is unambiguous. Frontend build: clean (55.58 kB AttackerDetail bundle, +~5kB for the panel). Note: not browser-tested in this session — recommend a manual smoke against a deployed master before tagging.	2026-04-26 05:25:25 -04:00
anti	8a6d632ab0	feat(deploy): systemd unit for decnet-enrich + register in worker panel Mirrors decnet-reuse-correlator.service.j2: same hardening posture (NoNewPrivileges, ProtectSystem=full, etc.), same restart policy, same log file convention. The decnet init renderer picks it up automatically via the decnet-*.service.j2 glob. Also reconciles a naming inconsistency I shipped earlier: the heartbeat name was 'intel' (the package) but the CLI command and unit are 'enrich' (the action). Renamed the heartbeat to 'enrich' so the workers panel displays the same string the operator types and the same string in the systemd unit file. Convention across the project: heartbeat name = registry key = unit basename = CLI command name. Registers 'enrich' in worker_registry.KNOWN_WORKERS and in the start-all preferred order. The decnet.target Wants= list also picks up the new unit so 'systemctl start decnet.target' brings everything up together.	2026-04-26 05:20:54 -04:00
anti	4ec0dd75c8	docs(roadmap): mark threat-intel enrichment shipped Out-of-band 'decnet enrich' worker landed across commits feat(intel): attacker_intel table → factory → providers → worker → CLI → API. v1 ships GreyNoise Community + AbuseIPDB + abuse.ch (Feodo Tracker bulk feed and ThreatFox per-IP). Shodan / Censys / OTX remain in the DEVELOPMENT_V2 backlog.	2026-04-26 05:18:05 -04:00
anti	d3d9bd5aa7	feat(intel): `decnet enrich` CLI + GET /attackers/{ip}/intel endpoint CLI command mirrors the reuse-correlate shape (--poll-interval, --ttl-hours, --daemon). Run it under systemd as a sibling worker. The API endpoint returns the most recent cached row for an attacker IP or 404. Auth-gated via require_viewer like every other attacker route. Also extends the worker test with a real FakeBus so the attacker.intel.enriched publish path is exercised end-to-end (no longer a no-op against NullBus).	2026-04-26 05:17:25 -04:00
anti	cd70136d09	feat(intel): wire GreyNoise, AbuseIPDB, Feodo Tracker + ThreatFox Four concrete IntelProvider impls — three per-IP queries plus one bulk feed: * GreyNoiseProvider — community endpoint, optional API key for higher rate limit. 404 = unknown (cache the absence so we don't re-query). * AbuseIPDBProvider — score threshold mapping (>=75 malicious, >=25 suspicious, else benign). Self-disables with a clear error when no API key is configured rather than burning quota. * FeodoProvider — fetches the bulk botnet C2 IP feed once per refresh window and answers every lookup from an in-memory set. Listed = C2. * ThreatFoxProvider — POST /api/v1/ search_ioc query, optional Auth-Key header. Match in data[] = malicious; no_result = absence-not-benign. Every provider routes through decnet.net.http.stealth_client so the egress UA never leaks 'DECNET'.	2026-04-26 05:15:17 -04:00
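
The AbuseIPDB score mapping described in commit cd70136d09 (>=75 malicious, >=25 suspicious, else benign) can be sketched as below. This is a minimal illustration, not the repository's code: the `Verdict` enum, the constant names, and `verdict_from_abuse_score` are hypothetical names chosen for the example; only the two thresholds and the 0-100 score range come from the commit messages.

```python
from enum import Enum


class Verdict(str, Enum):
    """Hypothetical verdict labels; the real provider may name these differently."""

    MALICIOUS = "malicious"
    SUSPICIOUS = "suspicious"
    BENIGN = "benign"


# Thresholds as stated in the commit message for AbuseIPDBProvider.
MALICIOUS_MIN = 75
SUSPICIOUS_MIN = 25


def verdict_from_abuse_score(score: int) -> Verdict:
    """Map an AbuseIPDB 0-100 confidence score onto a coarse verdict."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= MALICIOUS_MIN:
        return Verdict.MALICIOUS
    if score >= SUSPICIOUS_MIN:
        return Verdict.SUSPICIOUS
    return Verdict.BENIGN
```

Checking the higher threshold first keeps the two boundaries independent, so a score of exactly 75 is classed as malicious and exactly 25 as suspicious.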


765 Commits