793 Commits

Author SHA1 Message Date
f11def0af1 fix(collector): strip port from remote_addr before attacker identity resolution
host:port in remote_addr was creating a distinct Attacker row per TCP
connection instead of per IP. Split on the last ':' in parse_rfc5424;
preserve the port as fields['remote_port'] so repeated source ports are
retained as fingerprint signal in bounty payloads.
2026-05-10 04:06:42 -04:00
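The split described in f11def0af1 can be sketched as follows. This is a minimal illustration, not the collector's code: the helper name `split_remote_addr` and the sample address are hypothetical, and only the "split on the last ':'" rule and the `remote_port` field name come from the commit message.

```python
from typing import Optional, Tuple


def split_remote_addr(remote_addr: str) -> Tuple[str, Optional[str]]:
    """Split 'host:port' on the last ':' and return (host, port or None)."""
    host, sep, port = remote_addr.rpartition(":")
    if not sep or not port.isdigit():
        # No numeric port found (assumed fallback: keep the value unchanged).
        return remote_addr, None
    return host, port


# Hypothetical parsed-field dict, standing in for what parse_rfc5424 produces.
fields = {"remote_addr": "203.0.113.7:51544"}
host, port = split_remote_addr(fields["remote_addr"])
fields["remote_addr"] = host          # identity is now keyed on the IP only
if port is not None:
    fields["remote_port"] = port      # source port kept as fingerprint signal
```

A bare IPv6 address without a port (for example `::1`) would be mis-split by this sketch and would need separate handling.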
6a6f5807aa fix(pr3): adapt to quic-go v0.59.0 API — drop H3App, capture h3 SETTINGS via http3.Settingser
quic-go v0.59.0 (shipped with Caddy v2.11.2) removed quic.Connection as
a public interface and quic-go/logging as a public package, breaking
H3App's connection-wrapping approach.

Resolution:
- Remove H3App (h3app.go) entirely; Caddy handles h3 natively when h3
  is in the protocols list.
- Rewrite h3conn.go to keep only tryParseH3ControlStream + varint/name
  utilities (tested, useful for future stream-level tapping if the API
  ever re-exposes it).
- FPHandler.ServeHTTP: for h3 requests, type-assert ResponseWriter to
  http3.Settingser (the public interface exposed by quic-go/http3 v0.59),
  read the peer's Settings after ReceivedSettings channel closes, emit
  h3_settings fp record.
- https/entrypoint.sh: include h3 in CADDY_PROTOCOLS (Caddy now owns
  UDP/443); remove DECNET_H3_GLOBAL block.
- Update go.mod/go.sum to caddy v2.11.2 + quic-go v0.59.0.
- Update test_https_compose_h3_app.py to expect h3 in protocols when
  http/3 is selected, and assert decnet_h3 block is absent.
- All Go tests (9) and Python tests (15) remain green.
2026-05-10 03:43:34 -04:00
5675dd8ebc feat(pr3): canonical wire-order header capture for h1/h2 + H3App for SETTINGS
- Renames caddy.listeners.decnet_h2fp → decnet_fp; adds h1 raw-byte
  header capture (plainTappingConn) and h2 continuous HPACK decode loop
  (parseH2HeadersLoop) so headers_ordered reflects actual wire order, not
  Go map iteration order.
- Adds H3App Caddy module (decnet_h3) that owns UDP/443 via quic-go,
  wraps accepted QUIC connections with h3SettingsTappingConn to intercept
  the h3 control stream and extract RFC 9114 SETTINGS in wire order.
- Wires access_log emission from FPHandler.ServeHTTP via responseCapture.
- Updates syslog_bridge.py (canonical + per-service copies) with inline
  _compute_ja4h and new fp socket record branches: http_request_headers,
  h3_settings, access_log.
- Fixes ingester proto field alias (bridge emits 'proto', ingester expected
  'protocol') and exposes _process_fingerprint_bounties test alias.
- Go tests: h1/h2/h3 golden-byte tests all green; h3_tracer_test covers
  varint parser, GREASE detection, truncated-stream safety.
- Python tests: 15/15 green across bridge JA4H hash parity, ingester
  compat (old + new event shapes), and Caddyfile h3 template assertions.
2026-05-10 03:29:00 -04:00
8d1f26c0c7 fix(https): move Flask backend to 8443 to avoid netns conflict with http service on 8080 2026-05-10 02:31:08 -04:00
44ab42d80c fix(server): add from __future__ import annotations for Python <3.9 compat 2026-05-10 02:23:13 -04:00
d09b891a55 fix(syslog_bridge): add fp socket reader to canonical template — sync was overwriting per-service copies 2026-05-10 02:17:56 -04:00
42b5d97a50 fix(syslog_bridge): rewrite both templates with from __future__ annotations, fp socket imports, and start_fp_socket_reader 2026-05-10 02:06:53 -04:00
1669f25733 fix(syslog_bridge): add from __future__ import annotations for Python <3.9 compat 2026-05-10 01:58:43 -04:00
255ccebf29 fix(entrypoint): fail-fast if Flask does not bind within timeout instead of silently starting Caddy with no backend 2026-05-10 01:51:09 -04:00
d4f391bab1 fix(caddy): remove explicit tls from listener_wrappers — Caddy applies it by default 2026-05-10 01:45:03 -04:00
38cf1e6c6d fix(caddy+syslog): add UnmarshalCaddyfile to H2FP/FP handlers; add start_fp_socket_reader to syslog_bridge 2026-05-10 01:39:04 -04:00
6618b3c2a1 fix(topology): publish UDP/443 on gateway base when https service has http/3 enabled 2026-05-10 01:33:01 -04:00
7b54944fcc fix(https): remove ports from compose fragment — MACVLAN makes port publishing incompatible with network_mode 2026-05-10 01:29:46 -04:00
46963cbeec fix(deployer): chown synced _caddy_modules back to source owner after root copy 2026-05-10 01:26:13 -04:00
f2b0d286b3 fix(caddy): correct caddyhttp import path to modules/caddyhttp 2026-05-10 01:22:00 -04:00
f1ac1b4004 fix(deploy): sync _caddy_modules into http/https build contexts before compose up 2026-05-10 01:11:44 -04:00
3154224f68 fix(docker): hoist ARG BASE_IMAGE before first FROM so it scopes to all stages 2026-05-10 01:05:00 -04:00
52a52eee78 fix(network): reload network before checking Containers on IPAM drift
networks.list() returns bare objects — Containers is always empty
without a reload(). The active-endpoint guard from the prior commit
never fired because it was checking a stale empty dict.
2026-05-10 00:56:56 -04:00
251181255b fix(network): reuse existing decnet_lan when active deckies are connected
Docker refuses network removal (403) when containers hold endpoints.
The old IPAM-drift path tried to disconnect+remove even with live
containers — disconnect silently failed, remove raised APIError.

Since DECNET assigns IPs explicitly in compose (never via Docker's
auto-assign pool), an ip_range mismatch on an existing same-driver
network is harmless. Bail out early and attach to the existing network
whenever Containers is non-empty.
2026-05-10 00:50:41 -04:00
92632d7afd feat(pr2): HTTP/2+HTTP/3 fingerprint extractors — JA4H, H2 SETTINGS, JA4-QUIC 2026-05-10 00:47:19 -04:00
0653e500b5 feat(services): HTTP/2 + HTTP/3 support via Caddy reverse-proxy
Swap Werkzeug for Caddy as the protocol layer for http and https decoy
services. Flask keeps owning app logic (fake_app, custom_body, headers,
syslog) on 127.0.0.1:8080; Caddy terminates h1/h2/h2c/h3 on the wire
with real-world TLS/QUIC fingerprints.

- Add `multi_enum` FieldType to ServiceConfigField + _coerce
- Add `http_versions` field to HTTPService (h1/h2c) and HTTPSService
  (h1/h2/h3); selecting h3 emits UDP/443 port mapping in compose
- Rewrite both Dockerfiles with multi-stage Caddy binary copy +
  setcap for port binding as the logrelay user
- Entrypoints parse HTTP_VERSIONS JSON, render a Caddyfile, start
  Flask in background, wait for it, then exec Caddy
- https/server.py drops direct TLS handling; Caddy owns the cert
- Add ProxyFix to both server.py so Flask sees real attacker IPs
- Frontend: multi_enum checkbox-group renderer in ServiceConfigFields;
  FormValue union extended to string[]; compactPayload skips []
- Fix stale test_smtp_relay_schema_matches_smtp: relay schema is a
  superset of smtp, not equal; update assertions accordingly
2026-05-10 00:04:37 -04:00
41b8e9b7b3 feat(realism/llm): GET/PUT /api/v1/realism/llm + worker hot-reload tick 2026-05-09 23:12:29 -04:00
155ab59ee8 feat(realism/llm): DB-backed LLMConfig, factory DB-first dispatch, Ollama HTTP mode 2026-05-09 23:09:36 -04:00
f10201e885 feat(secrets): Fernet encrypt/decrypt helper for DB-stored operator secrets 2026-05-09 23:07:24 -04:00
4c6b12dcf8 feat(stix_export): wire fingerprint bounties through all endpoints + tests
Remaining files from the fingerprint-bounties + characterizes-SRO commit:
misp_export, repository, bounties mixin, all 4 router endpoints, and test suite
updates. Prerequisite: previous commit added _extract_fingerprint_bounty_data
and the stix_export changes.
2026-05-09 09:14:48 -04:00
51d0fc7b6c feat(stix_export): HTTP quirks + JARM in protocol_fingerprints; characterizes SRO
Wire fingerprint bounties (JARM hashes, HTTP header quirks) from the bounties
table into the DecnetActorFingerprintExt.protocol_fingerprints group so the
sniffer/profiler-captured HTTP fingerprinting data surfaces in every STIX export.

Add a stix2.Relationship(relationship_type="characterizes") SRO linking each
x-decnet-behave-profile SDO back to its ThreatActor so graph-traversal tools
can follow the edge without relying on the bare x_decnet_behave_profile_ref
custom string property alone.

New repo surface:
- get_fingerprint_bounties_by_ip(ip) -> list[dict]
- get_all_fingerprint_bounties_for_export() -> dict[str, list[dict]]

All 4 export endpoints (per-attacker + fleet, STIX + MISP) extended with the
new gather slot. 50/50 tests green, mypy clean.
2026-05-09 09:14:29 -04:00
97c99a4e03 feat(ttp): rich ThreatActor STIX extensions via CustomExtension + CustomObject
- stix_custom.py: DecnetActorFingerprintExt (@CustomExtension) wrapping
  network_behavior (os_guess/hop_distance/tcp_fingerprint/timing_stats/
  phase_sequence/behavior_class/beacon fields/tool_guesses) and
  protocol_fingerprints (ja3_hashes/hassh_hashes/kex_order_raw/
  ssh_client_banners/tls_cert_sha256/payload_simhashes/c2_endpoints).
  XDecnetBehaveProfile (@CustomObject x-decnet-behave-profile) carrying
  full BEHAVE-SHELL observation envelopes + kd_digraph_simhash.
  FINGERPRINT_EXT_DEF singleton extension-definition SDO.
- Drop legacy flat x_decnet_ja3_hashes / x_decnet_hassh_hashes /
  x_decnet_c2_endpoints (pre-v1, no consumers).
- stix_export: _threat_actor() wired to behavior + observations;
  build_attacker_bundle/build_fleet_bundle grow observations parameter.
- Repo: list_observations_by_attacker + get_all_observations_for_export
  abstract + sqlmodel impl; all four export endpoints extended.
- 18 new tests; inter-DECNET round-trip (stix2.parse → typed objects)
  is the primary fidelity assertion.
2026-05-09 08:52:19 -04:00
1200ac9132 feat(stix): STIX→MISP download export (per-attacker + fleet)
Adds GET /api/v1/attackers/{uuid}/export/misp and
GET /api/v1/attackers/export/misp backed by misp_export.py, which
converts existing STIX bundles to MISP events via misp-stix
ExternalSTIX2toMISPParser. Fleet endpoint emits {response:[...]}
collection (one event per attacker). Frontend: STIX/MISP buttons on
AttackerDetail header and Attackers list. 13 new tests green.
2026-05-09 08:04:25 -04:00
8990d9321d fix(ttp/stix): add Sighting SRO per process execution to link commands to threat-actor 2026-05-09 07:47:44 -04:00
d6a091be75 fix(ttp/stix): extract commands from both 'command' and 'command_text' keys 2026-05-09 07:43:44 -04:00
c210a56fc8 feat(ttp/stix): fleet-wide STIX 2.1 export — GET /api/v1/attackers/export/stix 2026-05-09 07:37:41 -04:00
f827197cc8 feat(ttp/stix): add deduped process SCOs for attacker commands 2026-05-09 07:33:30 -04:00
1ee7a4a481 fix(ttp/stix_export): _aware() handles ISO string timestamps from DB 2026-05-09 07:26:48 -04:00
fe0ed4a251 feat(ttp): STIX 2.1 bundle export for individual attackers
GET /api/v1/attackers/{uuid}/export/stix returns a self-contained STIX
2.1 bundle: ip observation, threat-actor, ATT&CK attack-patterns with
canonical MITRE IDs, uses relationships, per-tag sightings, file SCOs
for artifacts, domain-name SCOs for SMTP targets, and a provider intel
note. Attack-pattern SDOs carry the MITRE bundle IDs so consumers
deduplicating against the public ATT&CK bundle get exact matches.
2026-05-09 07:21:22 -04:00
1d3086a5c7 feat(web): GET /api/v1/ttp/techniques/{id}/groups — MITRE-tracked groups using a technique
Surfaces the intrusion-set reverse index from the loaded ATT&CK
bundle: given a technique, returns the list of groups MITRE has
documented as using it. Read-only — explicitly NOT an attribution
claim about a DECNET attacker. The frontend pulls this lazily when
the operator expands a technique panel; payload-size cost on every
TTPTagDetailRow makes embedding wasteful for techniques with 50+
documented groups.

- decnet/web/router/ttp/api_get_groups_for_technique.py exposes
  GET /api/v1/ttp/techniques/{technique_id}/groups, response_model
  list[GroupRef]. Same JWT-viewer auth gating as the rest of the
  TTP router. 404 when the technique_id doesn't resolve in the
  bundle.
- Sub-techniques are queried directly (no auto-union with parent)
  to match ATT&CK Navigator semantics; callers that want a broader
  view query the parent themselves.
- tests/ttp/test_groups_for_technique.py covers happy path, 404,
  sub-technique attribution independence, empty-list-on-zero-groups,
  and that responses include mitre_url + aliases.
- tests/web/test_api_attackers.py: fix pre-existing fixture drift
  introduced by a2a61b63 — three TestGetAttackerDetail cases were
  missing AsyncMock for repo.latest_observation_per_primitive,
  causing TypeError on await of MagicMock. The new groups endpoint
  doesn't share code with attacker_detail; this is a drive-by fix
  surfaced by the same suite run.
2026-05-09 06:45:25 -04:00
84a075e405 feat(ttp): promote mitre_url to first-class TTPTag column + propagate everywhere
Phase 2 attached mitre_url to intel-emitted tags' evidence JSON;
Phase 3 promotes it to a real column populated for *every* tag —
intel, credential, behavioral, canary, identity, email, rule-engine —
from one source. Pre-v1, so the SQLModel field is added directly
without an Alembic migration.

- TTPTag gains mitre_url: Optional[str] (not indexed — derived
  deeplink, not a query target; technique_id is already indexed).
- _emit.py and rule_engine._evaluate_rules both populate mitre_url
  via attack_stix.mitre_url_for(sub_technique_id or technique_id).
  Sub-technique URL when present, else parent. The two construction
  sites stay separate because the rule_engine path carries per-emit
  span instrumentation that emit_tags() can't preserve without
  threading a span object through; minimal-change beats forced
  refactor here.
- intel_lifter strips mitre_url from evidence_extra in all four
  decision functions. The column is canonical now; duplicating in
  the JSON column would drift when the bundle moves. The unused
  TechniqueEmission import + tracking dicts removed too.
- IdentityTechniqueRow / TechniqueRollupRow / TTPTagDetailRow /
  CampaignTechniqueRow gain mitre_url: Optional[str].
- sqlmodel_repo/ttp.py:_mitre_url_for added; the 5 row-builder sites
  pass mitre_url=_mitre_url_for(sub_technique_id or technique_id)
  alongside the existing technique_name resolution.
- api_get_tag_details.py needs no change:
  list_tags_by_scope_and_technique already returns model_dump() rows
  that flow the new column through **row spread to TTPTagDetailRow.
- tests/ttp/test_emit_attaches_mitre_url.py covers both construction
  paths (top-level, sub-tech, unknown, multi-emit) and a regression
  test that intel_lifter evidence dicts no longer contain mitre_url.
2026-05-09 06:40:08 -04:00
e50474cb66 feat(ttp): add mitre_url_for + groups_using_technique helpers
Two reusable bundle-derived lookups that the next two commits build
on:

- mitre_url_for(tid) returns the canonical attack.mitre.org URL by
  reading external_references on the cached attack-pattern. Backed
  by the existing lru-cached _attack_pattern_by_id so per-call cost
  is constant. Handles top-level techniques and sub-techniques
  (T1059.004 -> .../techniques/T1059/004).
- GroupRef + groups_using_technique(tid) surface the intrusion-set
  reverse index from the loaded bundle: given a technique, return
  the MITRE-tracked groups documented as using it. Sorted by
  group_id for deterministic responses; lru-cached. Sub-technique
  semantics match ATT&CK Navigator (do NOT auto-union with parent).
- decnet/ttp/data/intel_loader._mitre_url_for collapses to a thin
  re-export of attack_stix.mitre_url_for; the loader keeps mitre_url
  on TechniqueEmission for the eventual STIX export.
- tests/ttp/test_attack_url.py covers both helpers: top-level + sub
  URLs, unknown -> None / (), GroupRef immutability + hashability,
  deterministic ordering, sub-technique distinct from parent.
2026-05-09 06:32:04 -04:00
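The URL lookup in e50474cb66 reads `external_references` on an attack-pattern object. A minimal sketch of that lookup, assuming the standard ATT&CK STIX shape (a reference with `source_name` equal to `mitre-attack`); the helper name `mitre_url_from_pattern` is hypothetical and the caching layer is omitted:

```python
from typing import Optional


def mitre_url_from_pattern(attack_pattern: dict) -> Optional[str]:
    """Return the attack.mitre.org URL carried by the pattern, or None."""
    for ref in attack_pattern.get("external_references", []):
        if ref.get("source_name") == "mitre-attack" and ref.get("url"):
            return ref["url"]
    return None


# Hand-written sample object in the shape used by the ATT&CK bundle.
pattern = {
    "type": "attack-pattern",
    "external_references": [
        {
            "source_name": "mitre-attack",
            "external_id": "T1059.004",
            "url": "https://attack.mitre.org/techniques/T1059/004",
        },
    ],
}
```

Reading the URL from the bundle rather than formatting it from the ID keeps the sub-technique path (`T1059/004`) under MITRE's control.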
d25f69ba1b feat(ttp): extract intel_lifter provider mappings to YAML data + ATT&CK external_reference enrichment
The four provider→technique tables (AbuseIPDB cat→techniques,
GreyNoise tag→techniques, ThreatFox threat_type→techniques, plus
the Feodo binary-listed signal) used to live as Final[dict] constants
in intel_lifter.py. Two real problems with that:

1. Drift between rules/ttp/R0054.yaml..R0058.yaml (which declare
   the full slate per provider) and the Python dicts (which decide
   which slate-member fires per signal). The v2 audit comment in
   intel_lifter.py documented that they had silently drifted.
2. No ATT&CK provenance on emissions — the loaded STIX bundle has
   rich external_references (canonical attack.mitre.org URLs) that
   never surfaced because the lifter had no path back to them.

Mappings now live as YAML at decnet/ttp/data/intel/{provider}.yaml,
validated at load against the loaded ATT&CK bundle, with each entry
enriched by attack_stix._attack_pattern_by_id to attach the canonical
MITRE URL to every emission.

- decnet/ttp/data/intel_loader.py: pydantic-validated schema +
  ProviderMapping/Signal/TechniqueEmission frozen dataclasses +
  load_provider_mapping(provider) lru-cached.
- Per-technique high_score_threshold inlined into YAML
  (collapses the separate _ABUSEIPDB_HIGH_SCORE_GATED dict).
- external_reference field follows the STIX 2.1 external-reference
  shape (source_name + url + optional external_id) so the future
  STIX/MISP exporter is a direct translation.
- intel_lifter.py: dicts deleted, decision functions read from
  ProviderMapping accessors. Decision-flow constants (T1071/T1595
  bare-classification fallbacks in _greynoise_decisions) stay in
  code — they're not table rows.
- Each emit slot's evidence_extra now carries mitre_url for any
  technique resolved in the bundle (every one in practice).
- tests/ttp/test_intel_mappings.py: snapshot equivalence vs the
  legacy dicts, high-score gate behavior, every-signal-has-an-
  external-reference, every-emission-has-a-mitre-url, negative
  paths (unknown technique_id raises AttackBundleError, mismatched
  provider field rejected, dir listing matches expected providers).

The YAML schema + mitre_url enrichment lays groundwork for the
future STIX exporter; this commit does NOT build that exporter.
2026-05-09 06:18:25 -04:00
a3f1cea2d6 feat(ttp): fetch + verify MITRE ATT&CK LICENSE alongside the bundle
MITRE's ATT&CK Terms of Use require reproducing their copyright +
license alongside any cached copy of ATT&CK data. Today we ship the
bundle but not the license — this commit closes that compliance gap.

- attack_version.py pins ATTACK_LICENSE_URL +
  ATTACK_LICENSE_SHA256 + ATTACK_LICENSE_FILENAME, sourced from the
  same attack-stix-data repo as the bundle.
- attack_stix.py:_fetch_license downloads LICENSE.txt next to the
  bundle. License sha mismatch is logged + refreshed (license text
  gets occasional formatting tweaks; not a security event), unlike
  the bundle which stays fail-closed.
- _ensure_license is the compliance ratchet: resolve_bundle_path
  refuses to return without LICENSE.txt on disk. Override-mode
  (DECNET_ATTACK_BUNDLE) checks for a sibling LICENSE.txt first,
  then DECNET_ATTACK_LICENSE, then the cache dir.
- python -m decnet.ttp.attack_stix license prints the cached license
  to stdout for operator audit.
- loaded_license_path() exposes the active license path read-only.
- tests/ttp/test_attack_license.py covers happy paths (sibling +
  explicit env), refusal when DECNET_ATTACK_LICENSE points at a
  missing file, the CLI subcommand, and the pinned-sha shape.
2026-05-09 06:17:46 -04:00
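The override-mode lookup order in a3f1cea2d6 (sibling LICENSE.txt, then DECNET_ATTACK_LICENSE, then the cache dir) might look like the sketch below. The function name `resolve_license_path`, its parameters, and the sample bundle filename are assumptions; the refusal on a missing explicit path follows the test description in the commit message.

```python
import tempfile
from pathlib import Path
from typing import Mapping

LICENSE_FILENAME = "LICENSE.txt"


def resolve_license_path(bundle_path: Path, env: Mapping[str, str], cache_dir: Path) -> Path:
    """Resolve the license next to an overridden bundle, failing if none exists."""
    sibling = bundle_path.parent / LICENSE_FILENAME
    if sibling.is_file():
        return sibling
    explicit = env.get("DECNET_ATTACK_LICENSE")
    if explicit:
        path = Path(explicit)
        if not path.is_file():
            # An explicit but missing path is treated as a refusal, not a fallthrough.
            raise FileNotFoundError(f"DECNET_ATTACK_LICENSE points at a missing file: {path}")
        return path
    cached = cache_dir / LICENSE_FILENAME
    if cached.is_file():
        return cached
    raise FileNotFoundError("no ATT&CK license on disk; refusing to resolve the bundle")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    bundle = root / "override" / "enterprise-attack.json"  # hypothetical filename
    bundle.parent.mkdir()
    bundle.write_text("{}")
    cache = root / "cache"
    cache.mkdir()
    (cache / LICENSE_FILENAME).write_text("license text")

    # No sibling and no env override: the cached copy is used.
    from_cache = resolve_license_path(bundle, {}, cache) == cache / LICENSE_FILENAME

    # Env override pointing at a missing file: refuse.
    try:
        resolve_license_path(bundle, {"DECNET_ATTACK_LICENSE": str(root / "missing.txt")}, cache)
        refused = False
    except FileNotFoundError:
        refused = True
```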
432057f44a feat(ttp): fail-closed validation that lifter+UKC IDs resolve in ATT&CK bundle
Drift between the technique/tactic IDs hardcoded in the lifters and
what the loaded ATT&CK STIX bundle actually contains is silent in the
status quo: a renamed-or-retired technique just stops being tagged.
Every emission point now has an explicit validator that asserts its
IDs resolve in the loaded bundle, called once at TTP-worker boot.

- intel_lifter.all_emitted_technique_ids() collects every technique
  the four provider tables (AbuseIPDB / GreyNoise / Feodo / ThreatFox)
  plus the decision-flow constants in _greynoise_decisions and
  _feodo_decisions can emit. validate_against_attack_bundle() runs it
  through attack_stix.assert_known_technique_ids().
- ukc.validate_against_attack_bundle() asserts every key in
  ATTACK_TACTIC_TO_UKC resolves, with TA0100..TA0106 documented as
  _NON_ENTERPRISE_TACTICS (they live in the ICS bundle, not the
  enterprise bundle DECNET loads).
- decnet/ttp/worker.py:run_ttp_worker_loop calls both validators
  before subscribing to the bus. A bundle-vs-code mismatch refuses
  to start the worker rather than silently mistagging events.
- tests/ttp/test_attack_bundle_validation.py covers the happy path
  for both validators, the negative path (injected bogus tactic ID
  raises AttackBundleError), the ICS exemption, and the lone T1078
  reference in credential_lifter.
2026-05-09 05:58:06 -04:00
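The boot-time check in 432057f44a amounts to a set difference that raises instead of logging. A minimal sketch: the names `AttackBundleError` and `assert_known_technique_ids` appear in the commit message, but the signature used here (two iterables of IDs) is an assumption.

```python
from typing import Iterable


class AttackBundleError(Exception):
    """Raised when code references IDs the loaded bundle does not contain."""


def assert_known_technique_ids(emitted: Iterable[str], known: Iterable[str]) -> None:
    """Fail closed: raise if any emitted ID is absent from the bundle."""
    missing = sorted(set(emitted) - set(known))
    if missing:
        raise AttackBundleError(
            "IDs not present in the loaded ATT&CK bundle: " + ", ".join(missing)
        )


# Hypothetical bundle contents and lifter output.
bundle_ids = {"T1071", "T1078", "T1595"}
assert_known_technique_ids({"T1071", "T1595"}, bundle_ids)  # passes silently

try:
    assert_known_technique_ids({"T1071", "T9999"}, bundle_ids)
    refused_to_start = False
except AttackBundleError as exc:
    refused_to_start = True
    message = str(exc)
```

Calling such a validator once before subscribing to the bus turns a silently dropped tag into a worker that refuses to start.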
d743d38cac feat(ttp): load MITRE ATT&CK from official STIX 2.1 bundle
Replace the hand-maintained TECHNIQUE_NAMES dict (pinned to v15.1) with
a runtime loader that reads the official enterprise-attack-N.json STIX
bundle. Version bumps now require only updating attack_version.py;
sub-technique parents, tactic IDs, and kill-chain phases all come from
MITRE's published data.

- decnet/ttp/attack_version.py pins version 19.0 + sha256 + URL
- decnet/ttp/attack_stix.py is the lazy STIX loader. Resolution order:
  DECNET_ATTACK_BUNDLE env -> ~/.cache/decnet/attack/ -> fetch from
  the pinned MITRE GitHub URL. SHA-256 verified before parse;
  mismatch fails closed.
- decnet/ttp/attack_catalog.py collapses to a shim re-exporting
  technique_name() so the ~9 router/repo call sites don't churn.
- python -m decnet.ttp.attack_stix fetch warms the cache and can
  print sha256 for version-bump workflows.
- test_attack_catalog.py now asserts every rule-emitted ID resolves
  in the loaded bundle (same contract, real source) and exercises
  the SHA-256-mismatch fail-closed path.
2026-05-09 05:54:36 -04:00
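The "SHA-256 verified before parse; mismatch fails closed" step in d743d38cac can be sketched with the standard library alone. The helper name `verify_sha256` and the locally defined exception are hypothetical, and the sample bytes stand in for a real bundle.

```python
import hashlib
import tempfile
from pathlib import Path


class AttackBundleError(Exception):
    """Raised when the bundle on disk does not match the pinned digest."""


def verify_sha256(path: Path, expected_hex: str) -> None:
    """Hash the file in chunks and raise on any mismatch (fail closed)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_hex.lower():
        raise AttackBundleError(f"sha256 mismatch for {path.name}")


with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "bundle.json"
    payload = b'{"type": "bundle"}'
    bundle.write_bytes(payload)

    pinned = hashlib.sha256(payload).hexdigest()
    verify_sha256(bundle, pinned)            # matches: returns None

    try:
        verify_sha256(bundle, "0" * 64)      # wrong pin: must not be parsed
        rejected = False
    except AttackBundleError:
        rejected = True
```

Verification runs before any JSON parsing, so a tampered or truncated download never reaches the loader.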
65ddaaa681 fix(behave_shell/F.0): tighten prompt detector — log lines ending in '>' no longer vote
_detect_prompt_suffix accepted ANY line ending in $#%> as a PS1 prompt,
so a single `cat /var/log/dpkg.log` (195 lines closing in `<none>`)
flooded environmental.shell_type votes and flipped a plainly-bash
session to fish.

A prompt line now requires either a trailing space after the suffix
(default PS1 shape across bash/zsh/fish/PowerShell) or a PS1-shape
token (user@host, "PS " prefix, or a Windows drive-letter prefix).

Regression tests pin the dpkg.log false-positive and a $-terminated
prose line.
2026-05-09 02:57:40 -04:00
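The tightened rule in 65ddaaa681 can be illustrated as below. This is a sketch under stated assumptions, not the body of `_detect_prompt_suffix`: the function name `looks_like_prompt` and the exact regular expressions are hypothetical, and the input is assumed to be a single line with its newline already removed.

```python
import re

_SUFFIXES = "$#%>"
_USER_AT_HOST = re.compile(r"[\w.-]+@[\w.-]+")   # assumed user@host shape
_DRIVE_PREFIX = re.compile(r"^[A-Za-z]:\\")      # e.g. C:\ at line start


def looks_like_prompt(line: str) -> bool:
    """Accept a line as a prompt only with a trailing space or a PS1-shape token."""
    stripped = line.rstrip(" ")
    if not stripped or stripped[-1] not in _SUFFIXES:
        return False
    if line.endswith(" "):
        # Default PS1 shape: suffix followed by a space.
        return True
    return bool(
        _USER_AT_HOST.search(stripped)
        or stripped.startswith("PS ")
        or _DRIVE_PREFIX.match(stripped)
    )


dpkg_line = "2024-01-01 10:00:00 status installed libc6:amd64 <none>"
votes = [looks_like_prompt(dpkg_line), looks_like_prompt("root@decky:~# ")]
```

Under this rule the log line no longer votes, while an ordinary bash prompt still does.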
0c1fc68b13 feat(deploy): wire attribution worker — CLI + systemd unit + registry
* decnet attribution — Typer command mirroring decnet reuse-correlate
  (--multi-actor-tick, --daemon flags). Calls run_attribution_loop
  with the dependency-injected repo.
* deploy/decnet-attribution.service.j2 — systemd unit mirroring
  decnet-reuse-correlator.service.j2: ExecStart=decnet attribution,
  same hardening posture (NoNewPrivileges, ProtectSystem=full,
  ProtectHome=read-only, dedicated /var/log/decnet/decnet.attribution.log).
* worker_registry.KNOWN_WORKERS += "attribution" — heartbeat already
  publishes as system.attribution.health from
  attribution_worker._WORKER_NAME, so the Workers panel surfaces the
  row the moment the unit is enabled.
* api_start_all_workers preferred-order list gains "attribution" between
  reuse-correlator and enrich so a fresh start-all brings it up
  alongside its peers.

After this commit `systemctl enable --now decnet-attribution` (or
the dashboard's start-all) actually launches the engine.
2026-05-09 02:31:59 -04:00
33f7d5a9ff feat(web): expose attribution state on AttackerDetail backend (Phase 6)
GET /api/v1/attackers/{uuid}/attribution

Returns the merger output for an attacker's identity:

    {
        "identity_uuid": "abc..." | null,
        "primitives": [
            {primitive, current_value, state, confidence,
             observation_count, last_change_ts, last_observation_ts},
            ...
        ]
    }

Pre-attribution-worker: identity_uuid=null, primitives=[]. Surfacing
identity_uuid keeps the cross-attacker rollup story visible to the
frontend ahead of v1's clusterer landing.

api_events SSE relay also subscribes to attribution.> and forwards
to the AttackerDetail page filtered on payload.identity_uuid (the
identity is resolved at stream open from the URL's attacker_uuid;
attribution payloads are identity-keyed, not attacker-keyed). New
SSE event names: attribution.state_changed,
attribution.multi_actor_suspected.

Frontend (AttackerDetail.tsx badge rendering, useAttackerStream
consumer) deferred — there's already WIP on AttackerDetail.tsx in
the working tree; merging the badge logic is a separate commit
once that lands.

Tests: 4 endpoint scenarios — 401 unauth, 404 unknown attacker,
200 empty (no stub), 200 with primitive-ordered rows.
2026-05-09 02:21:59 -04:00
e2c7e16793 feat(correlation/attribution): cross-primitive multi-actor detection (Phase 5)
Add tick_multi_actor() — periodic walk of attribution_state firing
attribution.profile.multi_actor_suspected when an identity carries
>= MULTI_ACTOR_MIN_PRIMITIVES rows in multi_actor state.

* Repo's list_multi_actor_identities() already filters to >= 2
  primitives; the correlator just dispatches.
* In-memory dedup keyed on identity_uuid -> frozenset(primitives):
  same set as last fire -> no re-emit. Set grows -> re-emit.
  Set shrinks below threshold -> evict so a future re-flap re-fires.
  Restart-resets are honest because attribution_state persists; a
  v1 multi_actor_suspect_log table can replace this if needed.
* run_attribution_loop() now supervises three concurrent tasks:
  observation handler, multi_actor tick loop, health/control. Tick
  interval comes from _thresholds.MULTI_ACTOR_TICK_SECS (60s) with
  test override.

Tests: 6 scenarios — single-primitive doesn't fire, two-primitive
co-flag fires, dedup blocks unchanged set, set growth re-fires,
threshold drop re-arms, multiple identities fire independently.
2026-05-09 02:18:42 -04:00
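The in-memory dedup described in e2c7e16793 might be sketched as follows. The class name `MultiActorDedup`, the threshold value of 2, and the primitive names are assumptions. The commit message does not say what happens when the set shrinks but stays at or above the threshold; this sketch chooses not to re-emit in that case.

```python
from typing import Dict, FrozenSet, Iterable

MULTI_ACTOR_MIN_PRIMITIVES = 2  # assumed value; the real constant lives in _thresholds


class MultiActorDedup:
    """Remember the primitive set last fired per identity."""

    def __init__(self, min_primitives: int = MULTI_ACTOR_MIN_PRIMITIVES) -> None:
        self._min = min_primitives
        self._last_fired: Dict[str, FrozenSet[str]] = {}

    def should_fire(self, identity_uuid: str, primitives: Iterable[str]) -> bool:
        current = frozenset(primitives)
        if len(current) < self._min:
            # Below threshold: evict so a later re-flap fires again.
            self._last_fired.pop(identity_uuid, None)
            return False
        previous = self._last_fired.get(identity_uuid)
        if previous is not None and current <= previous:
            # Same set (or a subset still above threshold): no re-emit.
            return False
        self._last_fired[identity_uuid] = current
        return True


dedup = MultiActorDedup()
results = [
    dedup.should_fire("id-1", {"shell_type"}),                        # single primitive
    dedup.should_fire("id-1", {"shell_type", "keymap"}),              # co-flag
    dedup.should_fire("id-1", {"shell_type", "keymap"}),              # unchanged
    dedup.should_fire("id-1", {"shell_type", "keymap", "timezone"}),  # grew
    dedup.should_fire("id-1", {"shell_type"}),                        # dropped, evicted
    dedup.should_fire("id-1", {"shell_type", "keymap"}),              # re-armed
]
```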
dd265d7520 feat(correlation/attribution): wire bus handler, persist state (Phase 4)
attribution_worker.handle_observation_event now executes the full
end-to-end path:

* ensure stub identity (Phase 1)
* observations_for_identity_primitive() — new repo helper joining
  observations through attackers.identity_id, so v1's clusterer
  gets cross-attacker rollup for free
* aggregate_observations() with ValueKind dispatched off the BEHAVE
  PRIMITIVE_REGISTRY; unknown primitives default to categorical
* upsert_attribution_state() — last_change_ts locked when state is
  unchanged so the dashboard can render "stable since X"
* publish attribution.profile.state_changed only on transition;
  idempotent re-runs over the same observation set fire nothing
  (loop-prevention invariant matching ttp.tagged)

Tests:
* 5 end-to-end attribution scenarios over in-memory SQLite + FakeBus.
* test_base_repo's DummyRepo + coverage body now stub every abstract
  surface BaseRepository declares — the 6 added by this branch plus
  the 12 left un-stubbed by earlier work (BEHAVE Phase 1, TTP
  rollups, iter helpers). The coverage test could not previously
  even instantiate.
* test_aggregate_categorical's dispatcher rejection updated for the
  Phase 3 + 4 contract — ValueError on unknown kinds, not
  NotImplementedError.
2026-05-09 02:16:12 -04:00
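Two invariants from dd265d7520, a `last_change_ts` that stays locked while the state is unchanged and a publish that happens only on transition, can be shown with a dict standing in for the table. The `StateRow` dataclass and `upsert_state` helper are hypothetical, and treating the first write as a transition is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Key = Tuple[str, str]  # (identity_uuid, primitive)


@dataclass
class StateRow:
    state: str
    last_change_ts: float
    last_observation_ts: float


def upsert_state(table: Dict[Key, StateRow], key: Key, new_state: str, now: float) -> bool:
    """Write the state; return True only when the caller should publish."""
    row = table.get(key)
    if row is None:
        table[key] = StateRow(new_state, now, now)
        return True  # assumption: the first write counts as a transition
    row.last_observation_ts = now
    if row.state == new_state:
        return False  # unchanged: last_change_ts stays locked
    row.state = new_state
    row.last_change_ts = now
    return True


table: Dict[Key, StateRow] = {}
key = ("identity-1", "shell_type")
published = [
    upsert_state(table, key, "stable", 100.0),
    upsert_state(table, key, "stable", 200.0),    # idempotent re-run
    upsert_state(table, key, "drifting", 300.0),  # transition
]
```

Because an unchanged state returns False, replaying the same observation set publishes nothing, which is the loop-prevention property the commit describes.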
c39802a4bb feat(correlation/attribution): hash + numeric merge functions (Phase 3)
aggregate_numeric(): EWMA + dispersion (CV) over numeric primitive
values. Stable when CV < 20% AND mean shift < 30%; drifting on >= 30%
mean shift; conflicted on CV > 100%. Confidence is 1 - min(CV, 1).
multi_actor is intentionally NOT a numeric state — bimodal
distributions belong to the categorical detector once the value space
is bucketed.

aggregate_hash(): counts distinct hash values within
HASH_DRIFT_WINDOW_SECS of the most recent observation. 0 rotations =
stable, 1..HASH_DRIFT_MAX = drifting, > HASH_DRIFT_MAX = conflicted.
Reads rotation events; never recomputes hashes (DEBT-032 already
produces them via decnet.correlation.fingerprint_rotation).

aggregate_observations() dispatcher now routes "categorical" |
"numeric" | "hash" | None and rejects unknown kinds with ValueError
(louder than NotImplementedError now that all three v0 mergers
exist). 17 synthetic-input tests cover both new mergers and the
dispatcher.
2026-05-09 01:59:11 -04:00
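The hash merger's thresholds in c39802a4bb can be sketched as below. The constant values are placeholders, the function name `classify_hash_state` is hypothetical, and two details are assumptions not stated in the commit message: a rotation is counted as each distinct value beyond the first, and an empty input maps to "unknown".

```python
from typing import List, Tuple

HASH_DRIFT_WINDOW_SECS = 3600.0  # placeholder value
HASH_DRIFT_MAX = 3               # placeholder value


def classify_hash_state(observations: List[Tuple[float, str]]) -> str:
    """Classify (timestamp, hash) pairs by rotations inside the drift window."""
    if not observations:
        return "unknown"
    latest = max(ts for ts, _ in observations)
    # Only values seen within the window of the most recent observation count.
    recent = {h for ts, h in observations if latest - ts <= HASH_DRIFT_WINDOW_SECS}
    rotations = len(recent) - 1
    if rotations == 0:
        return "stable"
    if rotations <= HASH_DRIFT_MAX:
        return "drifting"
    return "conflicted"


same = [(0.0, "h1"), (10.0, "h1")]
rotated = [(0.0, "h1"), (10.0, "h2")]
churn = [(float(i), f"h{i}") for i in range(5)]
stale = [(0.0, "h1"), (10_000.0, "h2")]  # older value falls outside the window
```

The sketch only compares values it is given; like the merger described above, it never recomputes a hash.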
4956977739 feat(correlation/attribution): categorical merge state machine (Phase 2)
aggregate_categorical(): pure function over a per-(identity, primitive)
observation list. Five-state vocabulary, last-N=5 window comparison
with one-outlier-tolerant majority threshold:

* unknown — < 3 observations
* stable — recent 5 agree (≥ 4 of 5 share top value), older 5 same
* drifting — recent 5 stable but disagrees with older 5, or older
  was conflicted and recent stabilised
* conflicted — recent 5 split, no two-value alternation pattern
* multi_actor — recent 5 split + alternation between exactly two
  values (operator A↔B handoff). Confidence capped at 0.6 per
  _thresholds.MULTI_ACTOR_MAX_CONFIDENCE; flapping primitives on
  flaky networks would otherwise look like two operators.

aggregate_observations() dispatcher honours value_kind="categorical"
(or None) and raises NotImplementedError for "numeric" / "hash" so
Phase 3 lands cleanly. 14 synthetic-input tests cover every state
+ boundary condition.
2026-05-08 23:18:22 -04:00
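The five-state vocabulary in 4956977739 can be approximated by the sketch below. It is not `aggregate_categorical` itself: the function names are hypothetical, confidence scoring is left out, and three choices are assumptions, namely that "alternation" means exactly two values with at least two switches, that a short recent window tolerates one outlier, and that an older window shorter than 5 is treated as no history.

```python
from collections import Counter
from typing import List, Optional, Sequence

WINDOW = 5            # last-N window from the commit message
MIN_OBSERVATIONS = 3  # fewer than this is "unknown"


def _majority(values: Sequence[str]) -> Optional[str]:
    """Top value if at most one entry disagrees with it, else None."""
    value, count = Counter(values).most_common(1)[0]
    return value if count >= len(values) - 1 else None


def _alternates(values: Sequence[str]) -> bool:
    """Exactly two values that switch back and forth at least twice."""
    if len(set(values)) != 2:
        return False
    switches = sum(1 for a, b in zip(values, values[1:]) if a != b)
    return switches >= 2


def classify_categorical(values: List[str]) -> str:
    if len(values) < MIN_OBSERVATIONS:
        return "unknown"
    recent = values[-WINDOW:]
    older = values[-2 * WINDOW:-WINDOW]
    recent_top = _majority(recent)
    if recent_top is None:
        return "multi_actor" if _alternates(recent) else "conflicted"
    if len(older) < WINDOW:
        return "stable"  # assumption: no full older window to compare against
    older_top = _majority(older)
    if older_top is None or older_top != recent_top:
        return "drifting"
    return "stable"


states = [
    classify_categorical(["bash", "bash"]),
    classify_categorical(["bash"] * 10),
    classify_categorical(["zsh"] * 5 + ["bash"] * 5),
    classify_categorical(["bash", "zsh", "bash", "zsh", "bash"]),
    classify_categorical(["bash", "zsh", "fish", "bash", "zsh"]),
]
```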
c2891d6cca feat(correlation/attribution): substrate + idle handler (Phase 1)
v0 Phase 1 of ATTRIBUTION-ENGINE.md:

* AttributionStateRow SQLModel keyed on (identity_uuid, primitive)
  per ANTI direction — re-keying state rows when the v1 clusterer
  merges attackers is the migration debt v0 should not bake in.
  ATTRIBUTION-ENGINE.md updated with the deviation note.
* AttributionMixin: ensure_stub_identity_for_attacker, idempotent
  upsert_attribution_state, get_attribution_state[_for_identity],
  list_multi_actor_identities (the Phase 5 correlator's read).
* attribution.profile.{state_changed,multi_actor_suspected} bus
  topics + builder; wiki Service-Bus.md updated separately.
* attribution_worker.py: subscribes to attacker.observation.>,
  ensures stub identity per event, logs and continues. No merger,
  no state writes, no derived events — Phase 4 wires those.
* attribution/{aggregate.py,_thresholds.py} skeletons: Phase 2
  fills _aggregate_categorical, Phase 3 adds numeric+hash+dispatcher.
2026-05-08 23:16:13 -04:00
e94ab608d9 fix(profiler/behave_shell): tolerate non-UTF-8 bytes in shard reads
Real-world bug surfaced on the first live decky run: sessrec.c's
json_escape (decnet/templates/_shared/sessrec/sessrec.c:111-141)
only escapes bytes < 0x20 + DEL — bytes >= 0x80 pass through raw.
An attacker pasting Latin-1 / GB18030 / any non-UTF-8 8-bit text
yields a shard line that chokes Python's default UTF-8 text-mode
read with 'utf-8 codec can't decode byte 0xac'.

Three changes:

1. _events_for_sid now opens with errors='surrogateescape', preserving
   byte fidelity through the JSON parse. Surrogate-half chars
   correctly fail isascii() / isalpha() so the typed-letter
   histograms filter them out automatically. Tightening sessrec.c to
   escape >= 0x80 is filed for v0.2 — that's the proper forensic-data
   fix; the surrogateescape read makes the engine robust meanwhile.

2. Regression test
   (test_handler_tolerates_non_utf8_bytes_in_shard) builds a shard
   with raw 0xAC bytes inside a JSON 'data' string and asserts the
   handler still persists observations.

3. Collector's _emit_session now logs at WARNING (was DEBUG) when
   find_shard_with_sid returns None, citing the three usual causes
   (ARTIFACTS_ROOT perms, _SERVICE_RE whitelist, sessrec/collector
   race). Surfaces the silent-skip class of bug in seconds instead of
   hours — the first live run hid a perm mismatch
   (User=anti without SupplementaryGroups=decnet) for an entire
   session window before the symptom was traced upstream.
2026-05-08 22:52:46 -04:00
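The first change in e94ab608d9 relies on Python's `surrogateescape` error handler. A minimal sketch with a temporary file standing in for a real shard: the `sid` field, the file name, and the sample bytes are hypothetical, while the raw 0xAC byte inside a JSON `data` string mirrors the regression test described above.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    shard = Path(tmp) / "shard.jsonl"
    # One shard line whose 'data' string contains a raw, non-UTF-8 byte.
    shard.write_bytes(b'{"sid": "s1", "data": "caf\xac"}\n')

    # A strict UTF-8 read would raise UnicodeDecodeError on this line;
    # surrogateescape maps the undecodable byte to a lone surrogate instead.
    with shard.open(encoding="utf-8", errors="surrogateescape") as fh:
        events = [json.loads(line) for line in fh]

data = events[0]["data"]

# The original bytes are recoverable, so no forensic detail is lost.
raw_bytes = data.encode("utf-8", errors="surrogateescape")

# Surrogate characters fail isascii()/isalpha(), so letter histograms skip them.
typed_letters = [ch for ch in data if ch.isascii() and ch.isalpha()]
```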