DECNET

Author	SHA1	Message	Date
anti	869d1eabb7	feat(clustering): roll session digraph SimHashes into identity centroid The identity clusterer folds an identity's per-session motor.digraph_simhash observations into one 8-byte bitwise-majority centroid (denoises per-session jitter) and writes it to AttackerIdentity.kd_digraph_simhash via update_identity_fingerprints — the orphaned column is now populated. list_identities_for_clustering projects it so the campaign clusterer can read it. Extends the repo abstract + DummyRepo stub/coverage.	2026-06-16 17:05:34 -04:00
anti	372375194c	refactor(db): run Alembic at boot, retire ad-hoc _migrate_* helpers initialize() now delegates to _apply_schema(): real boots run 'alembic upgrade head' (schema owned by the migration history); tests (DECNET_TESTING=1) keep create_all, which is faster and needs no upgrade path. MySQL wraps the upgrade in the existing GET_LOCK advisory lock so concurrent uvicorn workers don't race on DDL. Deletes the three _migrate_* crimes (attackers-table legacy drop + GeoIP backfill, TEXT->MEDIUMTEXT widening) — all now handled by the baseline migration and the _BIG_TEXT model variants. Drops the test file that only exercised the deleted helpers; adds tests pinning the alembic-vs-create_all gate and guarding that every model table is in the migration head.	2026-06-16 16:31:10 -04:00
anti	337520c7ad	fix(security): close INFO ASVS findings — secret echo, TLS floor, mandatory tarball SHA, CORS/Content-Type guards, BUG-17 - V7.1.3: env known-insecure-default error no longer echoes the rejected secret value. - V9.1.4: syslog-over-TLS forwarder + listener pin minimum_version=TLSv1_2. - V12.1.2: updater tarball SHA-256 verification is now mandatory and fail-closed — /update and /update-self reject a missing digest (400), the executor rejects missing/mismatched digests before extract/apply. Every push path supplies it. - V13.1.4: reject a wildcard '*' in DECNET_CORS_ORIGINS at startup. - V13.1.5: enforce application/json on JSON write endpoints (415 otherwise), exempting multipart upload routes. - BUG-17: SSE error log records the user uuid, not the resume cursor. Also completes V2.1.7 consistently: the attacker-injectable PYTEST* env bypass is replaced with explicit DECNET_TESTING=1 in the three remaining sites (env.validate_public_binding, config logging, mysql url builder). Tests added for every fix; unanimous adversarial review (no update-outage risk — all push paths verified to send the digest).	2026-06-10 13:50:06 -04:00
anti	245975a6dd	fix(security): close LOW ASVS findings — env bypass, SSE/deployment authz, CN fail-close, password byte-limit, exception leaks, BUG-12..16 Auth/session (V2.1.7, V4.1.5, V4.1.6, V2.1.4/V2.1.5): - env secret validation no longer bypassed by attacker-injectable PYTEST* env; gated on explicit DECNET_TESTING=1 (set only in conftest). - must_change_password now enforced on the SSE header-JWT path, not just ticket mint. - GET /system/deployment-mode requires viewer auth (was leaking role + topology size). - CreateUser/ResetUser passwords min_length=12; passwords >72 bytes rejected explicitly instead of bcrypt silently truncating. Swarm ingestion (V9.1.3, BUG-16): - Log listener hard-rejects peers with unparseable/empty cert CN (fail closed, ingests nothing) instead of tagging 'unknown'. - Shutdown handlers no longer swallow real errors (narrowed to CancelledError). Info leakage (V7.1.2, V14.1.2): - Exception text sanitized on swarm-update, health, tarpit, realism, file-drop, blank-topology endpoints (raw tc/docker stderr, DB/Docker errors logged server-side, generic detail returned). pyproject license corrected to AGPL-3.0. Correctness (BUG-12..16): - BUG-12 atomic credential upsert (UNIQUE constraint + IntegrityError retry, consistent principal_key canonicalization). - BUG-13 rule-tail watermark uses >= with seen-id dedup (no same-second drop). - BUG-14 worker wake cleared before wait (no lost wake during tick). - BUG-15 intel gather tolerates an unexpected provider raise. - BUG-16 see above. Already-closed (verified, no change): V2.1.6, V5.1.3, V9.1.2. Accept-risk + documented: V2.1.8 cache window, V3.1.3 idle timeout. Tests added for every fix; unanimous adversarial review after two refute-fix rounds.	2026-06-10 13:27:14 -04:00
anti	6a8af315fb	fix(core): close HIGH ASVS findings V7.1.1 and correctness bugs BUG-1..6 - V7.1.1: /swarm/check no longer returns raw exception text; logs detail server-side, returns generic 'probe failed'. - BUG-1: register EditAction -> SSHDriver so edit ticks no longer crash. - BUG-2: topology reconcile matches generator-named deckies by expected-name membership instead of a hyphen heuristic. - BUG-3: intel provider lookups acquire the per-provider semaphore so declared concurrency bounds are enforced. - BUG-4: RuleIndex.install evicts a rule from kinds it no longer applies to. - BUG-5: UnixSocketBus.connect() is lock-guarded with a double-check so concurrent first-connects open exactly one socket and reader task. - BUG-6/V5.1.3: multi-token JSON-field search binds each token to a distinct parameter instead of collapsing to the last value. Regression tests added for every fix, verified red-before/green-after. V4.1.1c/V12.1.1 (updater master-CN gate) and V12.5.1 (tarball include-list) confirmed already fixed in prior commits and left untouched.	2026-06-09 23:12:49 -04:00
anti	698ecaa322	feat(auth): jti claim and token-revocation store Stateless JWTs had no revocation path: a stolen token stayed valid for its full 24h even after the victim changed their password, and there was no logout. This lays the foundation for revoking them. - User.tokens_valid_from: per-user bulk-revocation cutoff (compared against the token's iat). RevokedToken(jti PK, exp): single-token denylist, pruned opportunistically on insert so it never outgrows live-but-revoked tokens. - login() now mints a jti; create_access_token already stamps iat/exp. - repo.revoke_token / is_token_revoked / set_tokens_valid_from (abstract + shared sqlmodel impl + DummyRepo coverage stubs). - Centralized validate path in dependencies.py: every auth dependency now resolves the user and fails closed on (1) missing jti (legacy/pre-deploy token -> one forced re-login), (2) iat before the cutoff, (3) a denylisted jti. Denylist lookups ride a 10s membership cache mirroring the user cache. - Contract/fuzz harness seeds its fixed-uuid principal under DECNET_CONTRACT_TEST so its minted token resolves to a live admin user.	2026-05-30 18:18:41 -04:00
anti	f2b3393669	chore: relicense to AGPL-3.0-or-later and add SPDX headers Replaces LICENSE (GPLv3 -> AGPLv3) and prepends `SPDX-License-Identifier: AGPL-3.0-or-later` to every source file across decnet/, decnet_web/, tests/, scripts/, and tools/. Rationale: closes the GPLv3 ASP loophole so any party operating a modified DECNET as a network service must offer their modified source. Personal copyright (Samuel Paschuan) + inbound=outbound contributions make a future unilateral relicense infeasible. - LICENSE: full AGPL-3.0 text (gnu.org/licenses/agpl-3.0.txt) - COPYRIGHT: project copyright notice - tools/add_spdx_headers.py: idempotent header injector (shebang- and PEP 263-aware) Touches 1565 source files (.py, .ts, .tsx, .js, .jsx, .css, .sh). No behavior change; comments only.	2026-05-22 21:04:16 -04:00
anti	05c0721a51	feat(db): add DeckyLifecycle table for async deploy/mutate tracking One row per (decky, operation) attempt. State machine: pending -> running -> succeeded | failed (+ error text). Rows are append-only after terminal; retries write a new row. Sibling of DeckyShard rather than a rework -- DeckyShard tracks runtime container state observed via heartbeat, this tracks operation lifecycle. New table, UUID PK. Adds BaseRepository abstract methods (create_lifecycle, update_lifecycle, get_lifecycle_by_ids, find_open_lifecycle, sweep_stale_lifecycle) with SQLModelRepository mixin impl. Backbone for the upcoming 202-Accepted async API.	2026-05-22 16:20:00 -04:00
anti	3977f06374	feat(ttp/ipv6_leak): wire Ipv6LeakLifter into composite tagger and worker - Add "ipv6_leak" to KNOWN_SOURCE_KINDS in ttp/base.py - Register Ipv6LeakLifter(store) in factory.py get_tagger() - Subscribe worker to attacker.fingerprinted; route by Event.type so JARM/HASSH/ipv6_leak share the topic without source_kind collision - Add bump_attacker_ipv6_leak() to BaseRepository (abstract) + TTPMixin (implementation): increments ipv6_leak_count, sets last_ipv6_* denorm fields, appends-with-dedup to AttackerIdentity.ipv6_link_local_iids - Call bump_attacker_ipv6_leak from _process_event after insert_tags - Add DummyRepo stub + coverage call in tests/db/test_base_repo.py	2026-05-17 20:41:55 -04:00
anti	6e7020f2aa	feat(ttp): implement E.3.14b intel catch-up via attacker.session.ended On every attacker.session.ended event, the TTP worker now reads the persisted AttackerIntel row (if any) and synthesizes an intel-source TaggerEvent so intel-derived tags emit even when attacker.intel.enriched was dropped or arrived before the worker started. Key changes: - AttackerIntel.to_intel_event_payload() — single source of truth for the intel-row → lifter payload projection; shared by future callers without importing decnet.intel.* (no-SPOF contract preserved). - BaseRepository.get_attacker_intel_row_by_uuid() — returns the live SQLModel instance so the catch-up path can call to_intel_event_payload(). - _build_intel_catchup_event() in ttp/worker.py — looks up the intel row, builds the TaggerEvent, returns None on absent row (silence, not error). - _process_event() extended: appends the catch-up event to tagger_events when topic contains "session.ended". Deterministic source_id keeps compute_tag_uuid idempotent across replays; INSERT OR IGNORE deduplicates against any prior attacker.intel.enriched path. DummyRepo stub + coverage call added per feedback_run_base_repo_test.md.	2026-05-10 08:27:22 -04:00
anti	4c6b12dcf8	feat(stix_export): wire fingerprint bounties through all endpoints + tests Remaining files from the fingerprint-bounties + characterizes-SRO commit: misp_export, repository, bounties mixin, all 4 router endpoints, and test suite updates. Prerequisite: previous commit added _extract_fingerprint_bounty_data and the stix_export changes.	2026-05-09 09:14:48 -04:00
anti	97c99a4e03	feat(ttp): rich ThreatActor STIX extensions via CustomExtension + CustomObject - stix_custom.py: DecnetActorFingerprintExt (@CustomExtension) wrapping network_behavior (os_guess/hop_distance/tcp_fingerprint/timing_stats/phase_sequence/behavior_class/beacon fields/tool_guesses) and protocol_fingerprints (ja3_hashes/hassh_hashes/kex_order_raw/ssh_client_banners/tls_cert_sha256/payload_simhashes/c2_endpoints). XDecnetBehaveProfile (@CustomObject x-decnet-behave-profile) carrying full BEHAVE-SHELL observation envelopes + kd_digraph_simhash. FINGERPRINT_EXT_DEF singleton extension-definition SDO. - Drop legacy flat x_decnet_ja3_hashes / x_decnet_hassh_hashes / x_decnet_c2_endpoints (pre-v1, no consumers). - stix_export: _threat_actor() wired to behavior + observations; build_attacker_bundle/build_fleet_bundle grow observations parameter. - Repo: list_observations_by_attacker + get_all_observations_for_export abstract + sqlmodel impl; all four export endpoints extended. - 18 new tests; inter-DECNET round-trip (stix2.parse → typed objects) is the primary fidelity assertion.	2026-05-09 08:52:19 -04:00
anti	c210a56fc8	feat(ttp/stix): fleet-wide STIX 2.1 export — GET /api/v1/attackers/export/stix	2026-05-09 07:37:41 -04:00
anti	f827197cc8	feat(ttp/stix): add deduped process SCOs for attacker commands	2026-05-09 07:33:30 -04:00
anti	fe0ed4a251	feat(ttp): STIX 2.1 bundle export for individual attackers GET /api/v1/attackers/{uuid}/export/stix returns a self-contained STIX 2.1 bundle: ip observation, threat-actor, ATT&CK attack-patterns with canonical MITRE IDs, uses relationships, per-tag sightings, file SCOs for artifacts, domain-name SCOs for SMTP targets, and a provider intel note. Attack-pattern SDOs carry the MITRE bundle IDs so consumers deduplicating against the public ATT&CK bundle get exact matches.	2026-05-09 07:21:22 -04:00
anti	dd265d7520	feat(correlation/attribution): wire bus handler, persist state (Phase 4) attribution_worker.handle_observation_event now executes the full end-to-end path: * ensure stub identity (Phase 1) * observations_for_identity_primitive() — new repo helper joining observations through attackers.identity_id, so v1's clusterer gets cross-attacker rollup for free * aggregate_observations() with ValueKind dispatched off the BEHAVE PRIMITIVE_REGISTRY; unknown primitives default to categorical * upsert_attribution_state() — last_change_ts locked when state is unchanged so the dashboard can render "stable since X" * publish attribution.profile.state_changed only on transition; idempotent re-runs over the same observation set fire nothing (loop-prevention invariant matching ttp.tagged) Tests: * 5 end-to-end attribution scenarios over in-memory SQLite + FakeBus. * test_base_repo's DummyRepo + coverage body now stub every abstract surface BaseRepository declares — the 6 added by this branch plus the 12 left un-stubbed by earlier work (BEHAVE Phase 1, TTP rollups, iter helpers). The coverage test could not previously even instantiate. * test_aggregate_categorical's dispatcher rejection updated for the Phase 3 + 4 contract — ValueError on unknown kinds, not NotImplementedError.	2026-05-09 02:16:12 -04:00
anti	c2891d6cca	feat(correlation/attribution): substrate + idle handler (Phase 1) v0 Phase 1 of ATTRIBUTION-ENGINE.md: * AttributionStateRow SQLModel keyed on (identity_uuid, primitive) per ANTI direction — re-keying state rows when the v1 clusterer merges attackers is the migration debt v0 should not bake in. ATTRIBUTION-ENGINE.md updated with the deviation note. * AttributionMixin: ensure_stub_identity_for_attacker, idempotent upsert_attribution_state, get_attribution_state[_for_identity], list_multi_actor_identities (the Phase 5 correlator's read). * attribution.profile.{state_changed,multi_actor_suspected} bus topics + builder; wiki Service-Bus.md updated separately. * attribution_worker.py: subscribes to attacker.observation.>, ensures stub identity per event, logs and continues. No merger, no state writes, no derived events — Phase 4 wires those. * attribution/{aggregate.py,_thresholds.py} skeletons: Phase 2 fills _aggregate_categorical, Phase 3 adds numeric+hash+dispatcher.	2026-05-08 23:16:13 -04:00
anti	a2a61b636e	feat(web): drop SessionProfile, wire observations into AttackerDetail (DEBT-050 / DEBT-036 closure) Destructive half of BEHAVE-INTEGRATION.md Phase 1. SessionProfile + its kd_* columns + the dialect ALTER TABLE migration helpers are deleted outright; pre-v1, the table shipped empty, no migration ceremony required (per the no-new-_migrate_-pre-v1 memory rule). DEBT-036 closes via DEBT-050 supersedure. AttackerDetail's ``observations`` field is wired to the new ``observations`` table and returns an empty list until the BEHAVE-SHELL extractor (DEBT-050 Phase 2) starts emitting. decnet/web/db/models/attackers.py — SessionProfile class deleted (~135 lines), KD_PAUSE_*/KD_START_OF_ACTION_IDLE_S module constants deleted, module docstring updated to point at the observations table. AttackerIdentity.kd_digraph_simhash is KEPT — it's the v2 federation centroid hook, not a SessionProfile field; docstring repointed to the BEHAVE primitive that will populate it. decnet/web/db/sqlmodel_repo/attackers/sessions.py — DELETED. SessionProfilesMixin dropped from the AttackersMixin MRO. decnet/web/db/repository.py — abstract upsert_session_profile + get_session_profile removed. decnet/web/db/sqlite/repository.py + mysql/repository.py — _migrate_session_profile_table helpers and their initialize() calls removed. mysql initialize() now goes attackers → column_types → admin (no session_profile step). decnet/web/db/models/__init__.py — SessionProfile re-export gone. decnet/web/db/models/attacker_intel.py — docstring cross-reference to SessionProfile.schema_version retargeted to AttackerIdentity. decnet/web/router/attackers/api_get_attacker_detail.py — adds ``observations: []`` to the response by calling ``repo.latest_observation_per_primitive(uuid)`` and projecting to a list sorted by primitive path. Empty until the extractor lands; shape matches BEHAVE-INTEGRATION.md §"AttackerDetail consumer". tests/profiler/test_session_profile.py — DELETED (56 lines). 
tests/db/test_base_repo.py — DummyRepo loses upsert_session_profile and get_session_profile overrides. tests/db/mysql/test_mysql_migration.py — initialize-call-order assertion updated; session_profile step removed from the expected sequence; docstring records why. tests/ttp/test_lifter_absence.py — docstring "no SessionProfile" → "no ObservationRow".	2026-05-03 07:33:37 -04:00
anti	0972325527	feat(web/db): observations table + repo + bus prefix (BEHAVE-INTEGRATION Phase 1) Additive Phase 1 of BEHAVE-INTEGRATION.md. Lays the storage layer the BEHAVE-SHELL extractor (DEBT-050) will write into. Nothing breaks; SessionProfile coexists for now and is dropped in the follow-up commit. decnet/web/db/models/observations.py — new ObservationRow SQLModel mirroring the BEHAVE Observation envelope field-for-field (core/decnet_behave_core/spec/envelope.py). ``id`` is a hex-string UUID (matching BEHAVE), not a typed UUID column. ``identity_ref`` is str | None — written by the future attribution engine, NULL until then. ``attacker_uuid`` is the one DECNET-side denormalisation; FK'd to attackers.uuid for cheap AttackerDetail joins. ``evidence_ref`` is NOT NULL for DECNET emissions even though the upstream envelope makes it optional — the worker's "already profiled?" check keys on it. UniqueConstraint(evidence_ref, primitive) enforces idempotency at the schema level so re-running the extractor on the same shard+sid produces a DB-side conflict the upsert path resolves deterministically. Class is named ``ObservationRow`` (not ``Observation``) to avoid colliding with the BEHAVE Pydantic envelope at sites that import both. decnet/web/db/sqlmodel_repo/observations.py — ObservationsMixin. Three public methods backing the canonical queries from BEHAVE-INTEGRATION.md §"Storage": ``upsert_observation`` (idempotent on the natural key), ``latest_observation_per_primitive`` (per-primitive MAX(ts) subquery, portable across SQLite and MySQL — no DISTINCT ON), ``observations_time_series`` (asc-by-ts). Plus ``has_observations_for_evidence`` for the worker's session-already-profiled check. decnet/bus/topics.py — ATTACKER_OBSERVATION_PREFIX = "observation" constant + ``attacker_observation(primitive)`` builder. Full topic shape ``attacker.observation.<primitive>`` matches what BEHAVE's spec.event_adapter.event_topic_for produces upstream. 
Documentation + pattern matching only — bus auth is socket file perms (DEBT-029 §2), not topic-level. decnet/web/db/repository.py — abstract ``upsert_observation``, ``latest_observation_per_primitive``, ``observations_time_series`` on BaseRepository. tests/db/test_observations.py — 11 tests covering upsert round-trip, idempotency under the unique constraint, latest-per-primitive ordering across multiple sessions, time-series asc-ordering, empty-attacker contract, every BEHAVE ValueKind round-tripping through the JSON column, and the has_observations_for_evidence check. tests/db/test_base_repo.py — DummyRepo gains the three new abstract overrides so its coverage suite still instantiates.	2026-05-03 07:25:10 -04:00
anti	674ac7dd13	test(db): cover BaseRepository.update_identity_fingerprints DummyRepo couldn't instantiate — TLS-cert fingerprint rollup added a new abstract method without a stub here. Add the override and a call site so the abstract pass body is hit.	2026-04-28 13:01:37 -04:00
anti	0a1cf65ddb	feat(db): Campaign SQLModel + repo write/read methods Adds the campaigns table and the BaseRepository / SQLModelRepository methods that the campaign-clusterer worker (next commit) needs to populate it. Mirrors the AttackerIdentity layer: schema_version from day one for federation gossip, soft-merge via merged_into_uuid with a chain-walking get_campaign_by_uuid, list_campaigns excluding merged-out rows while list_all_campaigns returns the unfiltered set for the revoke pass. attacker_identities.campaign_id gets a real FK now that the target table exists.	2026-04-26 08:54:28 -04:00
anti	e364ef8859	feat(clustering): revocable merges (merge + unmerge) Reworks the clusterer's tick to handle multi-identity components and re-evaluate prior merges. Two passes per tick: Pass 1 — per-component reconciliation: * Fresh component → mint identity (commit 4 path). * Single-identity component → link unassigned observations. * Multi-identity component → soft-merge: pick the smallest-uuid winner deterministically, set merged_into_uuid on each loser, link unassigned observations to the winner. Observations stay FK'd to their original identity row — the merge is a soft pointer, not a re-point. Audit trail preserved; cached subscribers resolve through the chain. Pass 2 — revocable-merge undo: * For each merged-out identity, check whether its observations still cluster with its winner's. If not, the merge is contradicted by new evidence — clear merged_into_uuid and emit identities_unmerged. The resurrected identity keeps its original uuid, so subscribers that cached it during the merged interval re-attach without a new lookup. A pre-built merge-chain dict feeds Pass 1 so the effective-identity lookup is O(1) per observation. The chain has a hop cap (paranoia against accidental cycles in the underlying state). Repo additions on BaseRepository + SQLModelRepository: * list_all_identities() — includes merged-out rows. * update_identity_merged_into(uuid, winner_or_None) — single setter for both merge and unmerge. DummyRepo coverage stub updated. Tests: * Two distinct identities bridged by a new observation merge with the smaller uuid as winner. * A pre-seeded soft-merge whose underlying observations diverge gets revoked; resurrected uuid emerges with merged_into_uuid cleared. * Tick is idempotent under no state changes.	2026-04-26 08:33:32 -04:00
anti	de2f4c3a62	feat(clustering): wire high-weight edges end-to-end The connected-components clusterer now writes attacker_identities rows + sets attackers.identity_id when high-weight signals (JA3 / HASSH / payload-hash / C2-endpoint exact match) agree across observations. Singletons stay un-fingerprinted and un-clustered. Algorithm split: - cluster_observations(observations) — pure union-find over the high-weight edge function. Same code path for fixture validation and production tick. - from_attacker_row(row) — production-row adapter; recovers JA3 + HASSH from Attacker.fingerprints JSON. Payload + C2 join from logs in later commits; the function shape doesn't change. Repo additions on BaseRepository + SQLModelRepository: - list_attackers_for_clustering(limit=None) - create_attacker_identity(row) - set_attacker_identity_id(attacker_uuid, identity_uuid) DummyRepo coverage stub updated. v1 behavior is conservative: only assigns identities to observations whose identity_id is currently NULL. Multi-identity components are skipped this pass — merge / re-assign lands in commit 10 with revocable merges. Fixture bounds tightened against the production clusterer: - lone_wolf (F3) — singletons stay singletons - shared_wordlist (F1) — credential-only overlap doesn't cluster (high-weight tier doesn't include credentials) - vpn_hopping (F2, identity-level) — 5 rotated IPs with stable JA3 + HASSH fold into one identity, ARI = 1.0, completeness = 1.0	2026-04-26 08:19:56 -04:00
anti	dc3d08dd41	feat(web): read-only /api/v1/identities/* endpoints + repo methods Second of the five-step identity-resolution substrate. Ships the API surface against the empty AttackerIdentity table from commit 1 — every endpoint returns empty/404 cleanly until the clusterer populates rows. Routes (auth-gated, viewer role): * GET /api/v1/identities — paginated list, excludes merged-out rows * GET /api/v1/identities/{uuid} — detail; transparently follows merged_into_uuid to surface the canonical winner * GET /api/v1/identities/{uuid}/observations — Attacker rows FK'd to the (resolved) identity uuid Repository (BaseRepository abstract + SQLModelRepository concrete): * get_identity_by_uuid (with merge-chain following, hop-bounded) * list_identities / count_identities (excluding merged-out) * list_observations_for_identity / count_observations_for_identity Tests: 12 new (empty-table behavior, seeded data, merge-chain resolution, repo-level smoke against real SQLite). Also fixes the pre-existing test_base_repo_coverage failure (DEBT-041 added abstract methods without updating the DummyRepo stub) — included here because this PR adds 5 more abstract methods, fixing it as a bonus. 474 db/web/profiler/correlation tests green.	2026-04-26 07:08:55 -04:00
anti	84c1ca9c9b	feat(identity): AttackerIdentity table + nullable attackers.identity_id FK Schema-only commit, first of the five-step substrate for identity resolution. The clusterer that populates identities lands later; this ships the table empty and the FK uniformly NULL on existing rows. * decnet/web/db/models/attackers.py — new AttackerIdentity SQLModel (uuid PK, schema_version, fingerprint summary lists, kd_digraph_simhash, merged_into_uuid self-FK, all clusterer-populated fields nullable). Attacker grows a nullable indexed identity_id FK + docstring marking it as the per-IP observation row. * decnet/web/db/models/__init__.py — re-exports AttackerIdentity. * tests/db/test_identity_schema.py — 9 schema invariants: table exists, identity_id nullable + indexed, FK targets attacker_identities.uuid, schema_version defaults to 1, attacker rows inserted with NULL identity_id, FK constraint blocks orphans. 463 unrelated db/web/profiler/correlation tests still green. See development/IDENTITY_RESOLUTION.md for the full design.	2026-04-26 07:00:24 -04:00
anti	ce4be68501	feat(creds): cred-reuse foundation + vectorstore scaffold Lays the storage and bus substrate for the "credential reuse patterns" task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future substrate for statistical attacker re-identification over behavioral fingerprints. No correlator, profiler, API, or dashboard wiring in this commit — see TODO.md for the handoff. Schema: - Credential.attacker_uuid (nullable FK to attackers.uuid), backfilled by the profiler post-write to avoid coupling the capture path to the profiler's ordering. - CredentialReuse table — UUID PK, JSON list columns for the accumulating attacker_uuids/ips/deckies/services, target_count (the discriminative scalar), confidence reserved for a future fuzzy-credential pass. Repo: - upsert_credential_reuse / list_credential_reuses / get_credential_reuse_by_id / update_credential_attacker_uuid. - Renamed pre-existing get_credential_reuse(secret_sha256) to get_credential_attempts_for_secret(secret_sha256) — the new findings table needs the cleaner name. Bus topics: - credential.captured (one per Credential upsert) - credential.reuse.detected (correlator-emitted on insert/grow) Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring decnet/bus/): - BaseVectorStore ABC keyed by (kind, id) — kind discriminator means new feature families are additive, no schema migration. - FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy sqlite_vec extension load, one vec0 virtual table per kind). - get_vectorstore() env-driven dispatch with graceful fallback to FakeVectorStore when the sqlite-vec extension isn't on the host, so workers don't crash on a missing optional dep. Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing credentials and base-repo tests updated for the rename. Total: 34 passing on the touched files.	2026-04-26 03:18:34 -04:00
anti	6b16c844b6	fix(creds): MQTT regression + secret_kind for hash credentials Honest correction to the "every cred-emitting service" claim. Audit of templates/* found three gaps: 1. MQTT — was working through the legacy adapter, silently dropped when Phase 3 (`e696c2b`) deleted it. Now migrated to encode_secret() alongside the others. 2. Postgres — `auth, pw_hash=…` event captures the MD5 challenge-response the attacker sent. Plaintext irrecoverable, so it never fit the (principal, secret_b64=raw_bytes) shape. Lands in Credential as secret_kind="postgres_md5_challenge". 3. VNC — `auth_response, response=…hex` event captures the 16-byte DES-encrypted challenge. Same situation as Postgres: plaintext irrecoverable. Lands as secret_kind="vnc_des_response". Adds a `secret_kind` discriminator column to Credential (default "plaintext", indexed). The dedup tuple gains secret_kind so two credentials with the same sha256 but different kinds are fundamentally different rows — different challenges produce different bytes for the same plaintext password, so cross-kind reuse matches are meaningless and would only confuse analytics. The model now genuinely covers every cred-emitting service in the fleet: plaintext (SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP, MQTT); postgres_md5_* (Postgres); vnc_des_response (VNC). Username-only services (MySQL/MSSQL — TDS pre-encryption captures the user but never sees the password byte) intentionally don't feed Credential — they're recon signals, not cred attempts. 40 tests pass in the touched scope. New cases: secret_kind dedups independently in the repo; Postgres MD5 + VNC DES emitters thread through; MQTT round-trips through the native branch.	2026-04-25 06:16:57 -04:00
anti	2f47f67eef	feat(creds): future-proof Credential storage model Replaces the opaque Bounty.bounty_type='credential' path with a dedicated `credentials` table whose schema is forward-compatible across every auth-bearing service in the fleet. Hoisted indexed columns (secret_sha256, principal, service, attacker_ip) carry the universal reuse-analytics signal; service-specific JSON keys ride in `fields`. Cross-service reuse queries become an indexed lookup on secret_sha256 instead of JSON_EXTRACT scans. Schema decisions baked in (per ANTI): - New `Credential` table, not extension to Bounty - Hoisted `principal` column for cross-service principal-reuse - Standardized JSON keys: every payload carries secret_b64 + secret_printable + principal universally; service-specific extras (user, domain, dn, mech, …) ride alongside The auth-helper SD-block emits the new shape natively. The ingester forks at _extract_bounty: - Native shape (SSH/Telnet, future emitters): secret_b64 present → direct upsert_credential - Legacy shape (FTP/POP3/IMAP/SMTP today): username + password → adapter synthesizes secret_{b64,sha256,printable} on the fly, upserts into the same Credential table. Tracked as DEBT-039; one-shot bridge until those service templates migrate. 
Defense-in-depth across five layers (input validation): - C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424 escape rules for \\, ", ]; b64 preserves exact bytes - Ingester native branch: rejects malformed secret_b64 (regex), drops the credential row but keeps the underlying Log - Ingester legacy adapter: same printable-ASCII filter as the C code; sha256 + b64 over the original utf-8 bytes (lossless, even when secret_printable is sanitized) - DB column caps with truncation warning; sha256 always over the full pre-truncation bytes so reuse queries match across truncation - JSON serialized with ensure_ascii=True so utf8mb4 columns stay safe even with non-ASCII service-specific keys Bounty.bounty_type='credential' is no longer written. Pre-v1: no historical backfill; existing rows stay untouched but unused. 595 tests pass; new tests cover the model + repo (upsert dedup, null-principal independence, cross-service reuse, filters), both ingester branches, b64 validation, sanitization preserving the fingerprinting signal in b64.	2026-04-25 05:29:26 -04:00
anti	9232031ec7	feat(db): extend SessionProfile schema with DEBT-036 keystroke features Adds the three signal columns motivated by the manual keystroke analysis in DEBT-036 directly to the SessionProfile table. Pre-v1 so we modify the schema in place — Alembic arrives at v1. Columns: - kd_top_bigrams (TEXT) — JSON of top-N most-common digraphs with mean IAT per bigram. Complements kd_digraph_simhash ("same typist?") with "same typist in same mental state?" (tired / rested / distracted shifts bigram-specific IATs measurably). - kd_start_of_action_latency (REAL/DOUBLE) — median IAT of the first keystroke after an idle gap > 1s. Separates "initiating a command" from "executing a remembered one"; real humans have measurable start-of-action latency, bots don't. - kd_pause_hist_burst / _think / _distracted (INT) — three-bucket histogram (counts, <0.2s / 0.2-1.5s / >1.5s). More discriminating than the existing flat burst_ratio / think_ratio pair: C2 operators concentrate in burst with a thin tail; opportunistic humans have a fat think bucket and a long distracted tail. Both backends get an idempotent ADD COLUMN migration (_migrate_session_profile_table) wired into initialize() alongside the existing _migrate_attackers_table path — guards on PRAGMA table_info (SQLite) / information_schema.COLUMNS (MySQL) so reruns are safe. PII discipline comment on kd_digraph_simhash and kd_top_bigrams: both operate on bigram CHARACTERS, never on raw input stream content. Attacker passwords typed over SSH must not land here. Test updated for the MySQL initialize() migration-order contract.	2026-04-24 10:45:48 -04:00
anti	ea95a009df	refactor(tests): move flat tests/*.py into per-subsystem subfolders Groups every flat test_*.py under the module it exercises, matching the existing tests/{profiler,sniffer,prober,collector,correlation,cli,web, topology,swarm,bus,updater,api,docker,geoip,...} layout. New folders: services/, fleet/, config/, logging/, db/ (+ db/mysql/), telemetry/, mutator/, core/. Path-dependent __file__ references bumped an extra .parent in three files that moved one level deeper: - tests/sniffer/test_sniffer_ja3.py (template path) - tests/services/test_ssh_capture_emit.py (template path) - tests/cli/test_mode_gating.py (REPO root) - tests/web/test_env_lazy_jwt.py (repo var) Also drops two SQLite runtime artifacts (test_decnet.db-{shm,wal}) that were leaking into the repo from a previous test run. Fixes two test_service_isolation cases that patched asyncio.sleep (no longer on the profiler main-loop hot path — same pre-existing bug I fixed earlier in test_attacker_worker.py) by patching asyncio.wait_for and passing interval=0.	2026-04-23 21:34:25 -04:00
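
Commits de2f4c3a62 and e364ef8859 describe a union-find pass over high-weight signals (exact JA3 / HASSH / payload-hash / C2-endpoint matches) with a deterministic smallest-uuid winner. The sketch below shows that idea only and is not the repository's cluster_observations. The function name `cluster_by_shared_signal`, the `(kind, value)` signal tuples, and applying the smallest-id rule to the component root are assumptions made for the example.

```python
from collections import defaultdict


def cluster_by_shared_signal(observations):
    """Group observation ids that share at least one high-weight signal.

    `observations` maps an id (str) to a set of (kind, value) tuples,
    e.g. {("ja3", "abc"), ("hassh", "def")}. This input shape is
    hypothetical; the real adapter is not shown in the commit log.
    Returns {winner_id: set_of_member_ids}, where the winner is the
    smallest id in each connected component.
    """
    parent = {oid: oid for oid in observations}

    def find(x):
        # Path halving keeps the trees shallow without changing roots.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        # Always keep the smaller id as root so the winner is deterministic.
        if rb < ra:
            ra, rb = rb, ra
        parent[rb] = ra

    first_seen = {}  # signal -> first observation id that carried it
    for oid, signals in observations.items():
        for signal in signals:
            if signal in first_seen:
                union(oid, first_seen[signal])
            else:
                first_seen[signal] = oid

    components = defaultdict(set)
    for oid in observations:
        components[find(oid)].add(oid)
    return dict(components)
```

Observations with no signals, or with signals nobody else shares, stay in singleton components, which matches the "singletons stay singletons" fixture bound quoted in de2f4c3a62.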

30 Commits
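
Commit 869d1eabb7 folds per-session digraph SimHashes into one 8-byte bitwise-majority centroid. A minimal sketch of that operation follows, assuming each SimHash is held as a 64-bit integer. The function names and the tie-breaking rule (ties resolve to 0) are assumptions for illustration; the commit message does not specify them.

```python
from typing import Iterable

SIMHASH_BITS = 64  # an 8-byte SimHash, per the commit message


def majority_centroid(hashes: Iterable[int], bits: int = SIMHASH_BITS) -> int:
    """Return the bitwise-majority centroid of a collection of SimHashes.

    An output bit is 1 when strictly more than half of the inputs have
    that bit set. Ties resolve to 0 (an assumption, not from the source).
    """
    values = list(hashes)
    if not values:
        raise ValueError("need at least one SimHash")
    centroid = 0
    for bit in range(bits):
        ones = sum((value >> bit) & 1 for value in values)
        if 2 * ones > len(values):
            centroid |= 1 << bit
    return centroid


def centroid_bytes(hashes: Iterable[int]) -> bytes:
    """Pack the centroid into 8 big-endian bytes (byte order is assumed)."""
    return majority_centroid(hashes).to_bytes(SIMHASH_BITS // 8, "big")
```

Because each bit is voted on independently, a single noisy session can flip a bit in its own hash without moving the centroid, which is the denoising effect the commit message refers to.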