Commit Graph

131 Commits

Author SHA1 Message Date
ff51ce55e2 fix(tests): eliminate tarpit OOM from global asyncio.sleep mock
Two interacting bugs caused asyncio.sleep to be mocked globally,
letting tarpit_watcher_worker spin the event loop on a non-async
mock and accumulate _increment_mock_call records without bound:

1. test_ingester.py patched `decnet.web.ingester.asyncio.sleep` via
   the asyncio singleton — any code in the process using asyncio.sleep
   (including the tarpit worker) hit the fake_sleep side_effect.
   Fix: add `_sleep = asyncio.sleep` alias in ingester.py and patch
   `decnet.web.ingester._sleep` instead — scopes the mock to ingester.

2. test_api_startup_guards.py called `_run_lifespan_startup` without
   DECNET_CONTRACT_TEST=true, which started the real tarpit task in a
   manually-constructed event loop that the tests never cancelled.
   Fix: set DECNET_CONTRACT_TEST=true inside _run_lifespan_startup so
   the lifespan skips all background workers.
2026-05-10 10:06:21 -04:00
9a7b03700c refactor(intel): migrate AttackerIntel JSON-string columns to native SQLAlchemy JSON
Five list columns (greynoise_tags, abuseipdb_categories, threatfox_threat_types,
threatfox_ioc_types, threatfox_malware_families) and four dict columns
(*_raw) are now Column(JSON) with list/dict type annotations and
default_factory=list/dict. Providers return native Python objects; the
application-layer json.dumps/json.loads round-trip and _decode_json_list
helpers are gone. to_intel_event_payload() reads columns directly.

Also caps pytest xdist at -n 4 and excludes tests/api from norecursedirs
to prevent schemathesis workers from OOM-killing the dev loop.
2026-05-10 09:17:15 -04:00
6e7020f2aa feat(ttp): implement E.3.14b intel catch-up via attacker.session.ended
On every attacker.session.ended event, the TTP worker now reads the
persisted AttackerIntel row (if any) and synthesizes an intel-source
TaggerEvent so intel-derived tags emit even when attacker.intel.enriched
was dropped or arrived before the worker started.

Key changes:
- AttackerIntel.to_intel_event_payload() — single source of truth for
  the intel-row → lifter payload projection; shared by future callers
  without importing decnet.intel.* (no-SPOF contract preserved).
- BaseRepository.get_attacker_intel_row_by_uuid() — returns the live
  SQLModel instance so the catch-up path can call to_intel_event_payload().
- _build_intel_catchup_event() in ttp/worker.py — looks up the intel row,
  builds the TaggerEvent, returns None on absent row (silence, not error).
- _process_event() extended: appends the catch-up event to tagger_events
  when topic contains "session.ended". Deterministic source_id keeps
  compute_tag_uuid idempotent across replays; INSERT OR IGNORE deduplicates
  against any prior attacker.intel.enriched path.

DummyRepo stub + coverage call added per feedback_run_base_repo_test.md.
2026-05-10 08:27:22 -04:00
39518e33b4 feat(ttp): implement evidence-shape validation and confidence range constraint
- TolerantTagger.tag validates evidence keys against EVIDENCE_SCHEMA TypedDicts;
  TypeError (programmer error) propagates instead of being swallowed
- IntelEvidence and EmailEvidence expanded from stubs to full per-provider
  key sets (total=False); IntelEvidence old stub fields replaced wholesale
- EVIDENCE_SCHEMA map added to models/ttp.py and imported by base.py
- TTPTag __table_args__ gains confidence [0,1] CheckConstraint (DB-enforced)
- xfail removed from test_confidence_outside_range_rejected_at_insert and
  test_evidence_shape_violation_propagates_as_typeerror — both now pass
- TypeError removed from _SWALLOWED_EXCS fuzz list; test_intel_evidence_keys
  updated to assert the real provider key set
2026-05-10 07:56:52 -04:00
e4626879f6 perf(pytest): 194s → 4s collection — lazy heavy imports + norecursedirs
Four-part fix for the collection bottleneck that was blocking the dev loop:

1. Lazy mitreattack.stix20 import in attack_stix.py — deferred to first
   _load() call (TYPE_CHECKING guard at top level)

2. Lazy misp_stix_converter import in both MISP export routers — moved
   from module level into the route handler body

3. Lazy attack_catalog / attack_stix in ttp.py repo mixin — thin wrapper
   functions so the import chain never fires at module load time

4. tests/api/conftest.py — `from decnet.web.api import app` moved inside
   the `client()` fixture; `pytest_ignore_collect` broadened to skip all
   test_schemathesis*.py variants (not just test_schemathesis.py), which
   were launching a subprocess server at module-import time

5. pyproject.toml — `norecursedirs` for tests/live, tests/stress,
   tests/service_testing, tests/docker, tests/perf so these directories
   are never entered; `-m` filter removed from addopts (now redundant);
   `--dist loadscope` → `--dist load` to unblock workers immediately

6. behave_core / behave_shell rename — BEHAVE packages dropped the
   `decnet_` prefix; reinstalled editable installs and updated all 14
   import sites across profiler, ttp, bus, and correlation modules
2026-05-10 06:41:25 -04:00
f11def0af1 fix(collector): strip port from remote_addr before attacker identity resolution
host:port in remote_addr was creating a distinct Attacker row per TCP
connection instead of per IP. Split on the last ':' in parse_rfc5424;
preserve the port as fields['remote_port'] so repeated source ports are
retained as fingerprint signal in bounty payloads.
2026-05-10 04:06:42 -04:00
5675dd8ebc feat(pr3): canonical wire-order header capture for h1/h2 + H3App for SETTINGS
- Renames caddy.listeners.decnet_h2fp → decnet_fp; adds h1 raw-byte
  header capture (plainTappingConn) and h2 continuous HPACK decode loop
  (parseH2HeadersLoop) so headers_ordered reflects actual wire order, not
  Go map iteration order.
- Adds H3App Caddy module (decnet_h3) that owns UDP/443 via quic-go,
  wraps accepted QUIC connections with h3SettingsTappingConn to intercept
  the h3 control stream and extract RFC 9114 SETTINGS in wire order.
- Wires access_log emission from FPHandler.ServeHTTP via responseCapture.
- Updates syslog_bridge.py (canonical + per-service copies) with inline
  _compute_ja4h and new fp socket record branches: http_request_headers,
  h3_settings, access_log.
- Fixes ingester proto field alias (bridge emits 'proto', ingester expected
  'protocol') and exposes _process_fingerprint_bounties test alias.
- Go tests: h1/h2/h3 golden-byte tests all green; h3_tracer_test covers
  varint parser, GREASE detection, truncated-stream safety.
- Python tests: 15/15 green across bridge JA4H hash parity, ingester
  compat (old + new event shapes), and Caddyfile h3 template assertions.
2026-05-10 03:29:00 -04:00
92632d7afd feat(pr2): HTTP/2+HTTP/3 fingerprint extractors — JA4H, H2 SETTINGS, JA4-QUIC 2026-05-10 00:47:19 -04:00
41b8e9b7b3 feat(realism/llm): GET/PUT /api/v1/realism/llm + worker hot-reload tick 2026-05-09 23:12:29 -04:00
f10201e885 feat(secrets): Fernet encrypt/decrypt helper for DB-stored operator secrets 2026-05-09 23:07:24 -04:00
4c6b12dcf8 feat(stix_export): wire fingerprint bounties through all endpoints + tests
Remaining files from the fingerprint-bounties + characterizes-SRO commit:
misp_export, repository, bounties mixin, all 4 router endpoints, and test suite
updates. Prerequisite: previous commit added _extract_fingerprint_bounty_data
and the stix_export changes.
2026-05-09 09:14:48 -04:00
97c99a4e03 feat(ttp): rich ThreatActor STIX extensions via CustomExtension + CustomObject
- stix_custom.py: DecnetActorFingerprintExt (@CustomExtension) wrapping
  network_behavior (os_guess/hop_distance/tcp_fingerprint/timing_stats/
  phase_sequence/behavior_class/beacon fields/tool_guesses) and
  protocol_fingerprints (ja3_hashes/hassh_hashes/kex_order_raw/
  ssh_client_banners/tls_cert_sha256/payload_simhashes/c2_endpoints).
  XDecnetBehaveProfile (@CustomObject x-decnet-behave-profile) carrying
  full BEHAVE-SHELL observation envelopes + kd_digraph_simhash.
  FINGERPRINT_EXT_DEF singleton extension-definition SDO.
- Drop legacy flat x_decnet_ja3_hashes / x_decnet_hassh_hashes /
  x_decnet_c2_endpoints (pre-v1, no consumers).
- stix_export: _threat_actor() wired to behavior + observations;
  build_attacker_bundle/build_fleet_bundle grow observations parameter.
- Repo: list_observations_by_attacker + get_all_observations_for_export
  abstract + sqlmodel impl; all four export endpoints extended.
- 18 new tests; inter-DECNET round-trip (stix2.parse → typed objects)
  is the primary fidelity assertion.
2026-05-09 08:52:19 -04:00
1200ac9132 feat(stix): STIX→MISP download export (per-attacker + fleet)
Adds GET /api/v1/attackers/{uuid}/export/misp and
GET /api/v1/attackers/export/misp backed by misp_export.py, which
converts existing STIX bundles to MISP events via misp-stix
ExternalSTIX2toMISPParser. Fleet endpoint emits {response:[...]}
collection (one event per attacker). Frontend: STIX/MISP buttons on
AttackerDetail header and Attackers list. 13 new tests green.
2026-05-09 08:04:25 -04:00
d6a091be75 fix(ttp/stix): extract commands from both 'command' and 'command_text' keys 2026-05-09 07:43:44 -04:00
c210a56fc8 feat(ttp/stix): fleet-wide STIX 2.1 export — GET /api/v1/attackers/export/stix 2026-05-09 07:37:41 -04:00
f827197cc8 feat(ttp/stix): add deduped process SCOs for attacker commands 2026-05-09 07:33:30 -04:00
fe0ed4a251 feat(ttp): STIX 2.1 bundle export for individual attackers
GET /api/v1/attackers/{uuid}/export/stix returns a self-contained STIX
2.1 bundle: ip observation, threat-actor, ATT&CK attack-patterns with
canonical MITRE IDs, uses relationships, per-tag sightings, file SCOs
for artifacts, domain-name SCOs for SMTP targets, and a provider intel
note. Attack-pattern SDOs carry the MITRE bundle IDs so consumers
deduplicating against the public ATT&CK bundle get exact matches.
2026-05-09 07:21:22 -04:00
1d3086a5c7 feat(web): GET /api/v1/ttp/techniques/{id}/groups — MITRE-tracked groups using a technique
Surfaces the intrusion-set reverse index from the loaded ATT&CK
bundle: given a technique, returns the list of groups MITRE has
documented as using it. Read-only — explicitly NOT an attribution
claim about a DECNET attacker. The frontend pulls this lazily when
the operator expands a technique panel; payload-size cost on every
TTPTagDetailRow makes embedding wasteful for techniques with 50+
documented groups.

- decnet/web/router/ttp/api_get_groups_for_technique.py exposes
  GET /api/v1/ttp/techniques/{technique_id}/groups, response_model
  list[GroupRef]. Same JWT-viewer auth gating as the rest of the
  TTP router. 404 when the technique_id doesn't resolve in the
  bundle.
- Sub-techniques are queried directly (no auto-union with parent)
  to match ATT&CK Navigator semantics; callers that want a broader
  view query the parent themselves.
- tests/ttp/test_groups_for_technique.py covers happy path, 404,
  sub-technique attribution independence, empty-list-on-zero-groups,
  and that responses include mitre_url + aliases.
- tests/web/test_api_attackers.py: fix pre-existing fixture drift
  introduced by a2a61b63 — three TestGetAttackerDetail cases were
  missing AsyncMock for repo.latest_observation_per_primitive,
  causing TypeError on await of MagicMock. The new groups endpoint
  doesn't share code with attacker_detail; this is a drive-by fix
  surfaced by the same suite run.
2026-05-09 06:45:25 -04:00
84a075e405 feat(ttp): promote mitre_url to first-class TTPTag column + propagate everywhere
Phase 2 attached mitre_url to intel-emitted tags' evidence JSON;
Phase 3 promotes it to a real column populated for *every* tag —
intel, credential, behavioral, canary, identity, email, rule-engine —
from one source. Pre-v1, so the SQLModel field is added directly
without an Alembic migration.

- TTPTag gains mitre_url: Optional[str] (not indexed — derived
  deeplink, not a query target; technique_id is already indexed).
- _emit.py and rule_engine._evaluate_rules both populate mitre_url
  via attack_stix.mitre_url_for(sub_technique_id or technique_id).
  Sub-technique URL when present, else parent. The two construction
  sites stay separate because the rule_engine path carries per-emit
  span instrumentation that emit_tags() can't preserve without
  threading a span object through; minimal-change beats forced
  refactor here.
- intel_lifter strips mitre_url from evidence_extra in all four
  decision functions. The column is canonical now; duplicating in
  the JSON column would drift when the bundle moves. The unused
  TechniqueEmission import + tracking dicts removed too.
- IdentityTechniqueRow / TechniqueRollupRow / TTPTagDetailRow /
  CampaignTechniqueRow gain mitre_url: Optional[str].
- sqlmodel_repo/ttp.py:_mitre_url_for added; the 5 row-builder sites
  pass mitre_url=_mitre_url_for(sub_technique_id or technique_id)
  alongside the existing technique_name resolution.
- api_get_tag_details.py needs no change — list_tags_by_scope_and
  _technique already returns model_dump() rows that flow the new
  column through **row spread to TTPTagDetailRow.
- tests/ttp/test_emit_attaches_mitre_url.py covers both construction
  paths (top-level, sub-tech, unknown, multi-emit) and a regression
  test that intel_lifter evidence dicts no longer contain mitre_url.
2026-05-09 06:40:08 -04:00
0c1fc68b13 feat(deploy): wire attribution worker — CLI + systemd unit + registry
* decnet attribution — Typer command mirroring decnet reuse-correlate
  (--multi-actor-tick, --daemon flags). Calls run_attribution_loop
  with the dependency-injected repo.
* deploy/decnet-attribution.service.j2 — systemd unit mirroring
  decnet-reuse-correlator.service.j2: ExecStart=decnet attribution,
  same hardening posture (NoNewPrivileges, ProtectSystem=full,
  ProtectHome=read-only, dedicated /var/log/decnet/decnet.attribution.log).
* worker_registry.KNOWN_WORKERS += "attribution" — heartbeat already
  publishes as system.attribution.health from
  attribution_worker._WORKER_NAME, so the Workers panel surfaces the
  row the moment the unit is enabled.
* api_start_all_workers preferred-order list + "attribution" between
  reuse-correlator and enrich so a fresh start-all brings it up
  alongside its peers.

After this commit `systemctl enable --now decnet-attribution` (or
the dashboard's start-all) actually launches the engine.
2026-05-09 02:31:59 -04:00
33f7d5a9ff feat(web): expose attribution state on AttackerDetail backend (Phase 6)
GET /api/v1/attackers/{uuid}/attribution

Returns the merger output for an attacker's identity:

    {
        "identity_uuid": "abc..." | null,
        "primitives": [
            {primitive, current_value, state, confidence,
             observation_count, last_change_ts, last_observation_ts},
            ...
        ]
    }

Pre-attribution-worker: identity_uuid=null, primitives=[]. Surfacing
identity_uuid keeps the cross-attacker rollup story visible to the
frontend ahead of v1's clusterer landing.

api_events SSE relay also subscribes to attribution.> and forwards
to the AttackerDetail page filtered on payload.identity_uuid (the
identity is resolved at stream open from the URL's attacker_uuid;
attribution payloads are identity-keyed, not attacker-keyed). New
SSE event names: attribution.state_changed,
attribution.multi_actor_suspected.

Frontend (AttackerDetail.tsx badge rendering, useAttackerStream
consumer) deferred — there's already WIP on AttackerDetail.tsx in
the working tree; merging the badge logic is a separate commit
once that lands.

Tests: 4 endpoint scenarios — 401 unauth, 404 unknown attacker,
200 empty (no stub), 200 with primitive-ordered rows.
2026-05-09 02:21:59 -04:00
dd265d7520 feat(correlation/attribution): wire bus handler, persist state (Phase 4)
attribution_worker.handle_observation_event now executes the full
end-to-end path:

* ensure stub identity (Phase 1)
* observations_for_identity_primitive() — new repo helper joining
  observations through attackers.identity_id, so v1's clusterer
  gets cross-attacker rollup for free
* aggregate_observations() with ValueKind dispatched off the BEHAVE
  PRIMITIVE_REGISTRY; unknown primitives default to categorical
* upsert_attribution_state() — last_change_ts locked when state is
  unchanged so the dashboard can render "stable since X"
* publish attribution.profile.state_changed only on transition;
  idempotent re-runs over the same observation set fire nothing
  (loop-prevention invariant matching ttp.tagged)

Tests:
* 5 end-to-end attribution scenarios over in-memory SQLite + FakeBus.
* test_base_repo's DummyRepo + coverage body now stub every abstract
  surface BaseRepository declares — the 6 added by this branch plus
  the 12 left un-stubbed by earlier work (BEHAVE Phase 1, TTP
  rollups, iter helpers). The coverage test could not previously
  even instantiate.
* test_aggregate_categorical's dispatcher rejection updated for the
  Phase 3 + 4 contract — ValueError on unknown kinds, not
  NotImplementedError.
2026-05-09 02:16:12 -04:00
c2891d6cca feat(correlation/attribution): substrate + idle handler (Phase 1)
v0 Phase 1 of ATTRIBUTION-ENGINE.md:

* AttributionStateRow SQLModel keyed on (identity_uuid, primitive)
  per ANTI direction — re-keying state rows when the v1 clusterer
  merges attackers is the migration debt v0 should not bake in.
  ATTRIBUTION-ENGINE.md updated with the deviation note.
* AttributionMixin: ensure_stub_identity_for_attacker, idempotent
  upsert_attribution_state, get_attribution_state[_for_identity],
  list_multi_actor_identities (the Phase 5 correlator's read).
* attribution.profile.{state_changed,multi_actor_suspected} bus
  topics + builder; wiki Service-Bus.md updated separately.
* attribution_worker.py: subscribes to attacker.observation.>,
  ensures stub identity per event, logs and continues. No merger,
  no state writes, no derived events — Phase 4 wires those.
* attribution/{aggregate.py,_thresholds.py} skeletons: Phase 2
  fills _aggregate_categorical, Phase 3 adds numeric+hash+dispatcher.
2026-05-08 23:16:13 -04:00
bb77d13f9a feat(api/attackers): per-attacker SSE events stream
GET /api/v1/attackers/{uuid}/events streams behavioural events for
one attacker. Mirrors decnet/web/router/topology/api_events.py
end-to-end: ?token= auth, require_stream_viewer gate,
sse_connection_slot per-user cap, snapshot-on-connect, three bus
subscriptions (attacker.observation.>, attacker.fingerprint_rotated,
attacker.scored) merged through asyncio.Queue, 15s keepalive,
request.is_disconnected() exit, finally task cancellation.

Per-attacker filter keys on payload['attacker_uuid'] which the
profiler worker stamps onto every published payload (Phase 5 P5.0
amendment) — O(1) drop without a repo round-trip per event.

_sse_name_for derives SSE event names:
  attacker.observation.<primitive> → observation.<primitive>
  attacker.fingerprint_rotated     → fingerprint.rotated
  attacker.scored                  → attacker.scored

10 tests cover snapshot, live forward, per-attacker filter (drops
other attackers' events), fingerprint.rotated forward, 404, 401, and
the sse-name derivation across all four cases. Topology events
regression green.
2026-05-08 20:23:29 -04:00
5ff89eefe7 feat(profiler): wire BEHAVE-SHELL extraction onto attacker.session.ended
The profiler worker now consumes attacker.session.ended on the bus
AND walks unprofiled session_recorded log rows on every tick. Both
paths converge on a single handler that:

1. Validates required payload fields (session_id, decky_id, service,
   attacker_ip, shard_path).
2. Builds evidence_ref shard:{decky}/{service}/{shard_basename}#{sid}
   and skips when has_observations_for_evidence is True (idempotent
   re-runs).
3. Resolves attacker_uuid via get_attacker_uuid_by_ip; defers if the
   profiler tick hasn't materialised the row yet.
4. Reads the asciinema shard, slices events for the sid, calls
   extract_session, persists each Observation via upsert_observation
   (per-row; batch transaction filed as follow-up), then publishes
   each on the bus best-effort (fire-and-forget per DEBT-029 §6).

Architecture:
* Handler lives in decnet/profiler/behave_shell/_handler.py — pure
  function, unit-tested in isolation.
* Worker.py adds _behave_pump (queue feed), _drain_behave_queue
  (per-tick drain), _behave_poll_tick (cursor scan over
  session_recorded logs), and _payload_from_log_row (Log → bus-shape
  payload projection).
* Poll cursor uses a separate state key
  (attacker_worker_session_cursor) so the correlation tick's cursor
  doesn't conflate.
* has_observations_for_evidence promoted to BaseRepository abstract.

22 new tests across handler / drain / poll layers covering happy
path, all skip paths, isolation against handler exceptions,
idempotency on re-run, and cursor key separation. TTP worker bus
tests still green — payload field is purely additive.

Closes BEHAVE-INTEGRATION.md Phase 4.
2026-05-08 18:57:45 -04:00
588ea4e411 refactor(artifacts): extract shard-finder out of transcripts router
Move `_find_shard_with_sid`, `_resolve_shard`, `_validate_names`,
`_get_index`, and the index cache from
`decnet/web/router/transcripts/api_get_transcript.py` into
`decnet/artifacts/shards.py`. The shared module speaks
`ValueError`; the router keeps thin wrappers that translate to
`HTTPException(400)` so the route's error UX is unchanged.

This unblocks the BEHAVE-INTEGRATION Phase 4 worker wiring — the
profiler worker (and the collector's session aggregator) need to
disk-reach asciinema shards but must not import from a FastAPI
router.

11 new unit tests for the shared helper. Existing transcript router
tests pass (the shard fixture's monkeypatch points at the shared
module's ARTIFACTS_ROOT now).
2026-05-08 18:49:11 -04:00
09f598ce47 feat(profiler/behave_shell): G.2 operational.opsec_discipline
* careful — operator hits OPSEC_HISTORY_TOKENS AND tail-K commands
  include _CLEANUP_TOKEN_HASHES (re-imported from temporal.py).
* learning — history hit without cleanup-tail follow-through.
* careless — no history-clearing vocabulary at all.

Confidence 0.45 (small lexicon, soft); 0.30 below
MIN_COMMANDS_FOR_FULL_CONFIDENCE.
2026-05-08 16:29:48 -04:00
a2a61b636e feat(web): drop SessionProfile, wire observations into AttackerDetail (DEBT-050 / DEBT-036 closure)
Destructive half of BEHAVE-INTEGRATION.md Phase 1. SessionProfile +
its kd_* columns + the dialect ALTER TABLE migration helpers are
deleted outright; pre-v1, the table shipped empty, no migration
ceremony required (per the no-new-_migrate_-pre-v1 memory rule).
DEBT-036 closes via DEBT-050 supersedure. AttackerDetail's
``observations`` field is wired to the new ``observations`` table
and returns an empty list until the BEHAVE-SHELL extractor (DEBT-050
Phase 2) starts emitting.

decnet/web/db/models/attackers.py — SessionProfile class deleted
(~135 lines), KD_PAUSE_*/KD_START_OF_ACTION_IDLE_S module constants
deleted, module docstring updated to point at the observations
table. AttackerIdentity.kd_digraph_simhash is KEPT — it's the v2
federation centroid hook, not a SessionProfile field; docstring
repointed to the BEHAVE primitive that will populate it.

decnet/web/db/sqlmodel_repo/attackers/sessions.py — DELETED.
SessionProfilesMixin dropped from the AttackersMixin MRO.

decnet/web/db/repository.py — abstract upsert_session_profile +
get_session_profile removed.

decnet/web/db/sqlite/repository.py + mysql/repository.py —
_migrate_session_profile_table helpers and their initialize() calls
removed. mysql initialize() now goes attackers → column_types →
admin (no session_profile step).

decnet/web/db/models/__init__.py — SessionProfile re-export gone.

decnet/web/db/models/attacker_intel.py — docstring cross-reference
to SessionProfile.schema_version retargeted to AttackerIdentity.

decnet/web/router/attackers/api_get_attacker_detail.py — adds
``observations: []`` to the response by calling
``repo.latest_observation_per_primitive(uuid)`` and projecting to a
list sorted by primitive path. Empty until the extractor lands;
shape matches BEHAVE-INTEGRATION.md §"AttackerDetail consumer".

tests/profiler/test_session_profile.py — DELETED (56 lines).
tests/db/test_base_repo.py — DummyRepo loses upsert_session_profile
and get_session_profile overrides.
tests/db/mysql/test_mysql_migration.py — initialize-call-order
assertion updated; session_profile step removed from the expected
sequence; docstring records why.
tests/ttp/test_lifter_absence.py — docstring "no SessionProfile" →
"no ObservationRow".
2026-05-03 07:33:37 -04:00
0972325527 feat(web/db): observations table + repo + bus prefix (BEHAVE-INTEGRATION Phase 1)
Additive Phase 1 of BEHAVE-INTEGRATION.md. Lays the storage layer
the BEHAVE-SHELL extractor (DEBT-050) will write into. Nothing
breaks; SessionProfile coexists for now and is dropped in the
follow-up commit.

decnet/web/db/models/observations.py — new ObservationRow SQLModel
mirroring the BEHAVE Observation envelope field-for-field
(core/decnet_behave_core/spec/envelope.py). ``id`` is a hex-string
UUID (matching BEHAVE), not a typed UUID column. ``identity_ref``
is str | None — written by the future attribution engine, NULL
until then. ``attacker_uuid`` is the one DECNET-side
denormalisation; FK'd to attackers.uuid for cheap AttackerDetail
joins. ``evidence_ref`` is NOT NULL for DECNET emissions even
though the upstream envelope makes it optional — the worker's
"already profiled?" check keys on it. UniqueConstraint(evidence_ref,
primitive) enforces idempotency at the schema level so re-running
the extractor on the same shard+sid produces a DB-side conflict
the upsert path resolves deterministically. Class is named
``ObservationRow`` (not ``Observation``) to avoid colliding with
the BEHAVE Pydantic envelope at sites that import both.

decnet/web/db/sqlmodel_repo/observations.py — ObservationsMixin.
Three public methods backing the canonical queries from
BEHAVE-INTEGRATION.md §"Storage": ``upsert_observation`` (idempotent
on the natural key), ``latest_observation_per_primitive`` (per-
primitive MAX(ts) subquery, portable across SQLite and MySQL — no
DISTINCT ON), ``observations_time_series`` (asc-by-ts). Plus
``has_observations_for_evidence`` for the worker's session-already-
profiled check.

decnet/bus/topics.py — ATTACKER_OBSERVATION_PREFIX = "observation"
constant + ``attacker_observation(primitive)`` builder. Full topic
shape ``attacker.observation.<primitive>`` matches what BEHAVE's
spec.event_adapter.event_topic_for produces upstream. Documentation
+ pattern matching only — bus auth is socket file perms (DEBT-029
§2), not topic-level.

decnet/web/db/repository.py — abstract ``upsert_observation``,
``latest_observation_per_primitive``, ``observations_time_series``
on BaseRepository.

tests/db/test_observations.py — 11 tests covering upsert round-trip,
idempotency under the unique constraint, latest-per-primitive
ordering across multiple sessions, time-series asc-ordering, empty-
attacker contract, every BEHAVE ValueKind round-tripping through
the JSON column, and the has_observations_for_evidence check.

tests/db/test_base_repo.py — DummyRepo gains the three new abstract
overrides so its coverage suite still instantiates.
2026-05-03 07:25:10 -04:00
3f080f601d feat(intel,ingester): mal_hash feed + observed_attachments table (DEBT-046)
New MalHashProvider sibling ABC (decnet/intel/base.py) since SHA-256
is a different keyspace from IntelProvider's IPs. MalwareBazaarProvider
mirrors FeodoProvider's bulk-feed shape: 24h refresh via _ensure_fresh
/ _refresh, in-memory set[str] of hex-lowercased hashes, set-membership
lookup. Auth-keyed via DECNET_MALWAREBAZAAR_AUTH_KEY; absent key
silent-no-ops the lane (single warning, no HTTP traffic).

Per-hash observations persist to a new observed_attachments table.
DECNET is a honeypot platform — every attachment hash an attacker
delivers is intel, regardless of whether anyone classified it. Verdict
is sticky: True never downgrades to False/None on subsequent
observations. Out of scope: API surface, federation export, retention.

Ingester _publish_email_received calls the provider for each attachment
sha256, sets mal_hash_match on the bus payload (omitted entirely when
the message had no attachments — keeps R0046's `is True` predicate
silent on hash-less mail, matching pre-paydown behavior), and upserts
the row regardless of provider availability.
2026-05-03 05:56:46 -04:00
03beff3840 feat(orchestrator): authoritative failure-count badge endpoint (DEBT-042)
New GET /api/v1/orchestrator/events/stats?since=1h&success=false&kind=...
backed by repo.count_orchestrator_failures(since_ts, kind), which
counts failed rows across both orchestrator_events and
orchestrator_emails since the cutoff.

Window parser accepts ^\d+[smhd]$, capped at 7d. Today only
success=false is accepted on this surface so the endpoint isn't
accidentally repurposed before the next consumer is properly
designed.

Orchestrator.tsx polls the endpoint on mount + every 30 s and
renders the authoritative DB-derived count instead of deriving from
the in-memory SSE buffer + one paginated page (which silently
excluded failures older than the local window).
2026-05-03 05:26:45 -04:00
6c6f97e840 feat(prober,correlation): attacker fingerprint rotation detection (DEBT-032)
When the prober observes a NEW hash for an
(attacker_uuid, port, probe_type) triple it has seen before — VPS
rotation, SSH server rebuild, TLS cert swap — emit a derived
attacker.fingerprint_rotated event carrying both old and new hash.
Detection is a small library (decnet.correlation.fingerprint_rotation)
called inline from the prober at each of the three emit sites
(JARM/HASSH/TCPFP). No new daemon. New AttackerFingerprintState table
holds per-triple last-hash state; Attacker.rotation_count and
Attacker.last_rotation_at are stamped on every diff. Library is sync,
fully unit-tested via injected publish_fn / syslog_fn callbacks.
2026-05-03 05:12:51 -04:00
7036a86e76 refactor(artifacts): extract resolve_artifact_path to shared module
Move artifact path validation + symlink-escape check out of the
admin-gated download endpoint into decnet/artifacts/paths.py so the
TTP EmailLifter can disk-reach .eml files at tag-time without
duplicating regex/root logic (DEBT-047).

The router now catches ArtifactPathError and re-raises HTTPException(400);
behavior is unchanged.
2026-05-02 20:02:47 -04:00
c714941069 feat(bus): project EmailLifter heavyweight fields onto email.received
The decky's Layer-2 extension (commit 291b78c1) emits body_simhash /
body_base64_bytes / html_smuggling on the message_stored log and adds
macro_indicator / encrypted booleans to each attachments_json
manifest entry. Lift them all onto the email.received bus payload:

* body_simhash — passes through as-is (16 hex chars or "")
* body_base64_bytes — coerced to int (0 on absent / malformed)
* attachment_macros / attachment_password_protected — OR-reduced
  across the per-attachment manifest booleans; matches R0046's
  matched_trigger semantics where a single positive lane fires the
  rule
* html_smuggling — coerced bool from the decky's 0/1 int

Pre-Layer-2 message_stored events (older deckies, malformed log
rows) project to safe defaults: empty simhash, zero base64-bytes,
all booleans False — the EmailLifter then stays silent, never
fires a false positive on missing data.

R0042 (mass-phish) / R0046 macro / R0046 password / R0046 smuggling
/ R0048 (encoded payload) all fire end-to-end after this commit.
R0046 mal_hash_match and R0047 BEC remain deferred per their
respective DEBT entries (filed in the next commit).
2026-05-02 19:10:30 -04:00
fb85762703 feat(bus): publish email.received from ingester after SMTP artifact persist
Wires the EmailLifter (R0041–R0048) producer that DEBT.md item #3
deferred. After the existing add_bounty() call in _extract_bounty
(line 615), call _publish_email_received() which:

* resolves the attacker_uuid via repo.get_attacker_uuid_by_ip; drops
  the publish if unresolved (the TTP worker can't anchor orphan
  events)
* projects the message_stored fields onto the EmailLifter wire
  contract: from_domain / mail_from_domain / return_path_domain
  parsed via _domain_of, rcpt_count + rcpt_domains via
  _rcpt_projection, attachment_sha256s + attachment_extensions
  derived from the existing attachments_json manifest, urls from
  urls_json, dkim_signed/spf_pass coerced from 0/1 ints to bool
* mirrors _publish_probe_pending's bus-per-call pattern and
  swallows all exceptions (the bus is the notification layer, not
  the source of truth)

Fires for both relay and non-relay SMTP services. R0041 / R0043 /
R0044 / R0045 are now live end-to-end; R0046 partial (extension
lane). Heavyweight predicates (R0042 simhash, R0046-deep, R0047 /
R0048 body_text) stay deferred per the EmailLifter heavyweight
DEBT entry.
2026-05-02 18:39:13 -04:00
999d3494b4 feat(intel): persist per-provider taxonomy on AttackerIntel for TTP dispatch
The 2026-05-02 ship-time audit of the R0054-R0058 intel rule pack found
that AbuseIPDB / GreyNoise / ThreatFox stored only the aggregate verdict
(score / classification / listed-bool) plus the raw response blob. The
TTP IntelLifter expects per-provider taxonomy fields (categories, tags,
threat_types) that were never populated, so R0054 / R0055 / R0057
emitted zero tags in production despite passing unit tests.

Add typed columns: abuseipdb_categories, greynoise_tags, greynoise_name,
feodo_malware_family, threatfox_threat_types, threatfox_ioc_types,
threatfox_malware_families. Each provider now parses the relevant
taxonomy out of the upstream response and writes it through
column_updates. JSON-list columns ride as TEXT with default "[]" to
keep the SQLite/MySQL backend split honest, deserialised back to native
lists by the repo on read.
2026-05-02 18:07:57 -04:00
84699f89da feat(ttp): show canonical ATT&CK technique names in the TTPs UI
"T1595" alone is opaque; "T1595 — Active Scanning" tells you the
story at a glance. The names come from a backend-side static catalogue
pinned to the same ATT&CK release as the rule engine
(_ATTACK_RELEASE = "v15.1") — names are the canonical MITRE labels,
not author-supplied strings on rules, so a rule author can't typo a
name and the entire fleet sees the typo.

- New `decnet/ttp/attack_catalog.py` with `TECHNIQUE_NAMES` covering
  every technique_id + sub_technique_id emitted by `rules/ttp/`
  (R0001..R0058 → 69 IDs in the v0 pack).
- `IdentityTechniqueRow` / `TechniqueRollupRow` / `CampaignTechniqueRow`
  / `TTPTagDetailRow` gain optional `technique_name` /
  `sub_technique_name` fields. Repo + router populate them from the
  catalogue at row-construction time. None when an ID isn't in the
  catalogue — UI falls back to the bare ID.
- Coverage test (`tests/ttp/test_attack_catalog.py`) walks every
  YAML rule and asserts every emitted ID has a catalogue entry, so
  a future rule author who forgets to update the catalogue gets a
  loud failure rather than a silent UI fallback.

Frontend:
- `TTPsObservedSection` shows "T1595.002 — Active Scanning:
  Vulnerability Scanning" instead of just the ID, with overflow
  ellipsis + tooltip for narrow viewports. Inspector header /
  TECHNIQUE row also surface the names.
2026-05-02 03:10:07 -04:00
42e9492118 feat(ttp): inspector drawer surfaces evidence + rule_id behind each technique
The TTPsObservedSection rollup tells the operator "we saw T1059" but
not why. Click any technique row → side drawer opens listing every
ttp_tag row in scope with the persisted evidence JSON, firing
rule_id / rule_version, source_kind / source_id, confidence, and
created_at. Mirrors the CredentialReuseInspector / BountyInspector
pattern (drawer-backdrop + bd-head/bd-body + kvs grid).

Backend:
- New `GET /api/v1/ttp/tags/by-{scope}/{uuid}/{technique_id}`
  (`scope ∈ {identity, attacker, session}`, optional
  `?sub_technique_id=`, `?limit=` capped to 1000). Returns raw
  TTPTag rows newest-first.
- New `TTPTagDetailRow` Pydantic model + re-export.
- New repo method `list_tags_by_scope_and_technique` on
  TTPMixin (+ abstract on BaseRepository) — single query branched
  on scope; identity scope projects through `Attacker.identity_id`
  the same way `list_techniques_by_identity` does.
- Tests: evidence round-trips, sub_technique filter, JWT-required,
  empty scope, unknown scope rejected.

Frontend:
- New `TTPInspector.tsx` + `TTPInspector.css` (violet accent, slide
  animation, focus-trapped panel matching the existing inspector
  family).
- `TTPsObservedSection`'s TechniqueBar is now click+keyboard
  activatable; clicking opens the inspector for that
  (technique, sub_technique) tuple.

mypy clean. 532 passed in the targeted sweep.
2026-05-02 02:55:05 -04:00
c4e29e3bf9 fix(ttp): resolve attacker_uuid from attacker_ip on bus-event consume
The collector's `attacker.session.ended` envelope carries
`attacker_uuid: null` and `attacker_ip: <ip>` because the collector
doesn't talk to the DB. The TTP worker passed that null straight
through, and `TTPTag.__init__` raised the documented invariant:

    ValueError: ttp_tag requires at least one of attacker_uuid /
                identity_uuid; both NULL is not a valid anchor.

The worker now resolves `attacker_uuid` from `attacker_ip` via
`BaseRepository.get_attacker_uuid_by_ip` before fanning out the
event. When the IP isn't in the DB yet (profiler hasn't ingested
the row), the event is dropped with one log line — better than
exploding mid-tag.

- New `get_attacker_uuid_by_ip(ip) -> str | None` on the repo
  (BaseRepository abstract + AttackersCoreMixin impl).
- `_resolve_attacker_uuid` helper in `decnet/ttp/worker.py` runs
  before `_build_events`. Short-circuits when the payload already
  has either anchor; drops the event when neither anchor is
  resolvable.
- Tests pin: short-circuit on existing uuid/identity, repo lookup,
  drop on unknown IP, drop on "Unknown" sentinel, drop on
  no-anchor payload, drop on repo failure.
2026-05-02 02:44:30 -04:00
e08bfc4a73 fix(ttp): /api/v1/ttp/rules returns the live rule catalogue
The endpoint was a contract-phase stub returning `[]` even though the
RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty
table; operators couldn't tell whether anything was wired up.

- `api_list_rules` now calls `get_rule_store().load_compiled()` and
  serializes each CompiledRule + its operational state into a
  RuleCatalogueRow. Sorted by rule_id for stable golden snapshots.
- Add `description: str` to RuleSchema (pydantic) and CompiledRule
  (NamedTuple, defaulted) + propagate through `_compile_one` so the
  catalogue surfaces the human-readable YAML description, not just
  the slug-style `name`.
- Update `tests/ttp/test_rule_engine.py` _fields assertion for the
  new column; new `tests/api/ttp/test_rules_catalogue.py` pins the
  catalogue contents (R0001/R0014 presence, row shape, sort order).

Worker behaviour is unchanged: it was already loading rules
correctly. This is purely a read-side wiring fix on the operator API.
2026-05-02 01:54:06 -04:00
7ab0df3680 chore(cleaning): deleted swp vimfile 2026-05-02 01:39:17 -04:00
301d3feee9 feat(ttp): E.4.a extract decnet/cli/ttp.py with worker run + backfill CLI
The TTP worker entry moved out of decnet/cli/workers.py into its own
module so the TTP CLI surface (worker + admin verbs) is colocated,
mirroring decnet/cli/canary.py / webhook.py / swarm.py.

- New `decnet/cli/ttp.py` with `decnet ttp` (worker, ExecStart-stable
  for decnet-ttp.service) and `decnet ttp-backfill --since-days N`.
- `decnet ttp-backfill` walks Attacker.commands and CanaryTrigger
  history, dispatches each row through the live CompositeTagger,
  persists tags via repo.insert_tags (idempotent INSERT OR IGNORE).
  --dry-run / --source command|canary|all / --batch-size supported.
- Backfill deliberately bypasses bus publish — historical replay
  must not re-trigger SIEM/webhook fan-out per TTP_TAGGING.md
  §"Bus topics" loop-prevention invariant.
- Added `iter_attacker_commands_since` / `iter_canary_triggers_since`
  read-only iterators on TTPMixin + abstract bindings on
  BaseRepository.
- Master-only via gating; both `ttp` and `ttp-backfill` listed in
  MASTER_ONLY_COMMANDS.
2026-05-02 01:35:17 -04:00
403d83faba feat(ttp): E.3.15 UKC bridge — production phase-handoff edge fires
Add BaseRepository.list_ttp_decky_phases(identity_uuid) returning
per-decky tag observations as (decky_id, tactic, created_at_ts) rows
ordered by creation time. Rewrite from_identity_row() to project
tactic → UKCPhase via tactic_to_ukc_phase and populate the four
phase-handoff maps (first/last_phase_per_decky,
first/last_seen_per_decky) so combined_campaign_weight finally lights
up on real DB rows — not just synthetic fixtures.

ConnectedComponentsCampaignClusterer.tick() pulls each active
identity's per-decky phase observations before projecting features.
Repo failures are non-fatal: a partial repo falls back to the empty
phase-handoff signal (legacy behavior) so the worker stays up.

tests/clustering/test_ttp_phase_handoff.py pins the production-row
pair clearing CAMPAIGN_EDGE_THRESHOLD on a C2 → DISCOVERY hand-off —
the trip-wire that says the whole project paid off.

commands_by_phase_on_decky itself stays empty on the production path:
it is consumed only by the synthetic-fixture similarity surface, and
the phase-handoff edge does not use it. Synthetic fixtures still
populate it directly via from_synthetic_identity.
2026-05-01 21:01:58 -04:00
7a89fbb357 feat(ttp): E.3.12 EmailLifter (R0041-R0048)
SMTP message-level technique tagger per Appendix A.6: open relay abuse
(rcpt_count + foreign From), mass phishing (rcpt_count + body simhash),
phishing-kit X-Mailer, IDN/punycode URL, sender masquerade composite
(From/Return-Path/DKIM/SPF), malicious attachment (macro/.lnk/.iso/.img/
hash match), BEC subject+body composite, encoded payload in body.

PII discipline (TTP_TAGGING.md §'Hard parts §6') is enforced at the
lifter layer via _filter_evidence(): emitted TTPTag.evidence is
restricted to the EmailEvidence-allowed allowlist (body_sha256,
matched_headers — names only, rcpt_domain_set — domains only,
attachment_sha256s, rcpt_count) plus PII-safe match discriminators
(matched_kit, matched_trigger, matched_url_host, etc). Raw addresses,
raw body bytes, full URLs, and decoded base64 previews NEVER appear in
evidence — defense-in-depth over the YAML evidence_fields hint.

Tests: tests/ttp/test_email_lifter.py per-rule positive + negative +
PII allowlist guard + state modulation. tests/ttp/rule_precision/
test_email_rules.py xfail flipped to real precision (R0041-R0048
H-band ≥95%). Corpus rows updated to acknowledge that R0045 (masquerade)
co-fires with R0041 / R0047 when the sender-masquerade signals are
present alongside open-relay or BEC patterns — overlap is by design,
not a precision bug.
2026-05-01 20:31:03 -04:00
89ce893792 feat(ttp): E.3.4 API handlers wired to repo (rollups + Navigator)
Five GET rollup endpoints (techniques, by-identity, by-attacker,
by-campaign, by-session) and the Navigator export (fleet +
per-identity) now call into the TTPMixin methods. Rule catalogue
endpoint still returns [] — backed by the RuleStore which lands
at E.3.5/E.3.6.
2026-05-01 08:06:53 -04:00
fee697694d feat(ttp): E.3.3 repository — insert_tags + listing rollups (dual backend)
Dialect-split: portable rollup queries on TTPMixin; bulk insert with
ON CONFLICT DO NOTHING / INSERT IGNORE in the per-dialect repos.
Confidence-floor (< 0.3) drop applied at mixin layer before the
dialect hook. BaseRepository now declares the six TTP methods abstract.

Tests in tests/web/db/test_ttp_repo.py flipped from pytest.fail stubs
to real dual-backend behavioral tests; tests/ttp/test_confidence.py
drop-below-floor xfail removed.
2026-05-01 08:04:46 -04:00
b6e31e64e9 feat(ttp): E.1.10 repository contract — TTPMixin with insert_tags + list_techniques_by_{identity,attacker,campaign,session} + list_distinct_techniques
Empty NotImplementedError bodies; the SQL lands at E.3 implementation.
Mixin composed onto SQLModelRepository alongside the existing domain
mixins. Dialect-specific INSERT-OR-IGNORE syntax overrides land in
the per-backend subclasses at E.3 per the dual-DB-backend convention.
2026-05-01 07:21:37 -04:00
b7f206c8c5 feat(ttp): E.1.9 API contract — seven router endpoints, admin-gated state mutations, response models
Mounts /api/v1/ttp/* with empty-list / empty-Navigator responses.
GET endpoints viewer-gated; POST/DELETE /rules/{rule_id}/state
admin-gated server-side. POST parses JSON manually so a malformed
body returns the documented 400 (per feedback_schemathesis_400).

Drops xfail-strict markers from E.2.8 tests now that the router is
mounted; 26 tests pass against the contract handlers.
2026-05-01 07:20:13 -04:00
19cc8aa859 feat(ttp): E.1.7 worker contract — run_ttp_worker_loop, _TOPICS, registry entry 2026-05-01 06:33:34 -04:00
ce7efdfdd2 feat(ttp): E.1.1 schema contract — TTPTag, TTPRule, TTPRuleState, evidence TypedDicts, compute_tag_uuid
First contract commit of TTP tagging. Shapes only — no behavior.

- TTPTag SQLModel: deterministic UUIDv5 PK; (source_kind, source_id)
  discriminated provenance; nullable attacker_uuid + identity_uuid
  with ON DELETE CASCADE; native sqlalchemy.JSON evidence column;
  required attack_release; CheckConstraint('attacker_uuid IS NOT
  NULL OR identity_uuid IS NOT NULL'); composite indexes for the
  primary query patterns (identity_uuid+technique_id,
  attacker_uuid+technique_id, technique_id+created_at); __init__
  guard raising ValueError with both anchor names in the message
  (belt-and-braces for MySQL <8.0.16 where CHECK is silent).
- compute_tag_uuid(): RFC-4122 UUIDv5 over the six tag-identity
  fields under a fixed _TTP_TAG_NS. Pure, deterministic, replay-safe.
- Per-source_kind evidence TypedDicts (CommandEvidence,
  IntelEvidence, EmailEvidence, CanaryFingerprintEvidence) — PII
  rule lives in the type: EmailEvidence has no field for raw rcpt
  addresses or body bytes.
- TTPRule + TTPRuleState tables for the DatabaseRuleStore (E.1.11).
- All symbols re-exported from decnet.web.db.models per the
  package's existing convention.

Tests for invariants (CHECK behavior, evidence round-trip across
SQLite+MySQL, idempotency property, init-guard ordering) land in
E.2.1/E.2.2 with xfail-strict markers per Appendix E discipline.
2026-05-01 06:03:45 -04:00