DECNET

Author	SHA1	Message	Date
anti	e364ef8859	feat(clustering): revocable merges (merge + unmerge) Reworks the clusterer's tick to handle multi-identity components and re-evaluate prior merges. Two passes per tick: Pass 1 — per-component reconciliation: * Fresh component → mint identity (commit 4 path). * Single-identity component → link unassigned observations. * Multi-identity component → soft-merge: pick the smallest-uuid winner deterministically, set merged_into_uuid on each loser, link unassigned observations to the winner. Observations stay FK'd to their original identity row — the merge is a soft pointer, not a re-point. Audit trail preserved; cached subscribers resolve through the chain. Pass 2 — revocable-merge undo: * For each merged-out identity, check whether its observations still cluster with its winner's. If not, the merge is contradicted by new evidence — clear merged_into_uuid and emit identities_unmerged. The resurrected identity keeps its original uuid, so subscribers that cached it during the merged interval re-attach without a new lookup. A pre-built merge-chain dict feeds Pass 1 so the effective-identity lookup is O(1) per observation. The chain has a hop cap (paranoia against accidental cycles in the underlying state). Repo additions on BaseRepository + SQLModelRepository: * list_all_identities() — includes merged-out rows. * update_identity_merged_into(uuid, winner_or_None) — single setter for both merge and unmerge. DummyRepo coverage stub updated. Tests: * Two distinct identities bridged by a new observation merge with the smaller uuid as winner. * A pre-seeded soft-merge whose underlying observations diverge gets revoked; resurrected uuid emerges with merged_into_uuid cleared. * Tick is idempotent under no state changes.	2026-04-26 08:33:32 -04:00
anti	de2f4c3a62	feat(clustering): wire high-weight edges end-to-end The connected-components clusterer now writes attacker_identities rows + sets attackers.identity_id when high-weight signals (JA3 / HASSH / payload-hash / C2-endpoint exact match) agree across observations. Singletons stay un-fingerprinted and un-clustered. Algorithm split: - cluster_observations(observations) — pure union-find over the high-weight edge function. Same code path for fixture validation and production tick. - from_attacker_row(row) — production-row adapter; recovers JA3 + HASSH from Attacker.fingerprints JSON. Payload + C2 join from logs in later commits; the function shape doesn't change. Repo additions on BaseRepository + SQLModelRepository: - list_attackers_for_clustering(limit=None) - create_attacker_identity(row) - set_attacker_identity_id(attacker_uuid, identity_uuid) DummyRepo coverage stub updated. v1 behavior is conservative: only assigns identities to observations whose identity_id is currently NULL. Multi-identity components are skipped this pass — merge / re-assign lands in commit 10 with revocable merges. Fixture bounds tightened against the production clusterer: - lone_wolf (F3) — singletons stay singletons - shared_wordlist (F1) — credential-only overlap doesn't cluster (high-weight tier doesn't include credentials) - vpn_hopping (F2, identity-level) — 5 rotated IPs with stable JA3 + HASSH fold into one identity, ARI = 1.0, completeness = 1.0	2026-04-26 08:19:56 -04:00
anti	dc3d08dd41	feat(web): read-only /api/v1/identities/* endpoints + repo methods Second of the five-step identity-resolution substrate. Ships the API surface against the empty AttackerIdentity table from commit 1 — every endpoint returns empty/404 cleanly until the clusterer populates rows. Routes (auth-gated, viewer role): * GET /api/v1/identities — paginated list, excludes merged-out rows * GET /api/v1/identities/{uuid} — detail; transparently follows merged_into_uuid to surface the canonical winner * GET /api/v1/identities/{uuid}/observations — Attacker rows FK'd to the (resolved) identity uuid Repository (BaseRepository abstract + SQLModelRepository concrete): * get_identity_by_uuid (with merge-chain following, hop-bounded) * list_identities / count_identities (excluding merged-out) * list_observations_for_identity / count_observations_for_identity Tests: 12 new (empty-table behavior, seeded data, merge-chain resolution, repo-level smoke against real SQLite). Also fixes the pre-existing test_base_repo_coverage failure (DEBT-041 added abstract methods without updating the DummyRepo stub) — included here because this PR adds 5 more abstract methods, fixing it as a bonus. 474 db/web/profiler/correlation tests green.	2026-04-26 07:08:55 -04:00
anti	3eb67c9400	refactor(intel): re-key attacker_intel on attacker_uuid (closes DEBT-041) The threat-intel surface was IP-keyed on day one as an expedient — the worker is woken by IP-bearing bus events. ANTI's call: don't carry that debt. NO IPs as primary keys anywhere on the attacker-intel surface. Schema: - attacker_uuid is now the canonical key — UNIQUE + FK to attackers.uuid. - attacker_ip stays as a denormalised, indexed, NON-UNIQUE value column. Updated on every upsert; useful for SIEM payloads and audit lookups, but explicitly NOT a key. Model docstring says so. - Pre-v1, no Alembic migration needed. SQLModel.metadata.create_all() builds the new shape on fresh DBs. Repo: - upsert_attacker_intel now keys on attacker_uuid. - get_attacker_intel_by_ip → get_attacker_intel_by_uuid. - get_unenriched_attacker_ips → get_unenriched_attackers, returning [{uuid, ip}] tuples so the worker writes by UUID and dispatches provider calls by IP without a second round-trip. Worker: - _enrich_one(uuid, ip, ...) — UUID lands on the row, IP rides for provider egress. - attacker.intel.enriched bus payload gains attacker_uuid alongside attacker_ip — webhook → SIEM consumers benefit; no removal. API: - GET /api/v1/attackers/{ip}/intel deleted outright (rip-and-replace, never deployed beyond dev). - GET /api/v1/attackers/{uuid}/intel is the only public route, matching every other /attackers/* route. Frontend: - <IntelPanel uuid={id!} /> uses the URL param directly, fetches in parallel with the rest of AttackerDetail rather than waiting on attacker.ip. Tests: re-keyed in place, 39 passed (same coverage as before the refactor). Provider-impl tests untouched. DEBT-041: closed in DEBT.md (entry preserved as historical rationale, summary table flipped to ✅, remaining-open list shortened by one).	2026-04-26 05:35:29 -04:00
anti	0dd3811436	feat(intel): attacker_intel table + repo helpers New TTL-cached threat-intel row keyed by attacker IP, with per-provider verdict/raw/queried_at columns for GreyNoise, AbuseIPDB, abuse.ch Feodo Tracker and ThreatFox. Carries schema_version from day one (federation wire-format precedent set by SessionProfile). Repo gains upsert_attacker_intel, get_attacker_intel_by_ip, and a get_unenriched_attacker_ips backfill primitive that picks fresh + stale rows for the forthcoming 'decnet enrich' worker. Also documents the open-source intel-source backlog in DEVELOPMENT_V2.	2026-04-26 04:56:47 -04:00
anti	50870f2e7a	feat(creds): surface plaintext/b64 secret on reuse findings The CredentialReuse table only stores the sha256+kind hash of the secret; the printable + b64 forms live on the underlying Credential rows. The dashboard drawer was therefore showing only the hash, which defeats most of the value of having a reuse view in the first place. Repo helpers list_credential_reuses + get_credential_reuse_by_id now issue one batched SELECT against credentials keyed on the sha256s in the result page and graft secret_printable + secret_b64 onto each row before returning. The drawer renders the same printable/b64 code-block the credentials inspector uses.	2026-04-26 04:34:19 -04:00
anti	590c2b0fac	feat(correlation): credential-reuse engine + reuse-correlate worker Adds CorrelationEngine.correlate_credential_reuse + the `decnet reuse-correlate` long-running worker. The worker mirrors the mutator's bus-wake + slow-tick pattern: wakes on credential.captured and attacker.observed for sub-second latency, falls back to a 60s poll if the bus is unavailable, and publishes credential.reuse.detected once per new or grown CredentialReuse row (group-deduped so a 5-cred reuse doesn't emit 5 partial events). The web ingester now publishes credential.captured after every successful Credential upsert; bus + new repo helper find_credential_reuse_candidates feed the engine pass.	2026-04-26 03:37:49 -04:00
anti	ce4be68501	feat(creds): cred-reuse foundation + vectorstore scaffold Lays the storage and bus substrate for the "credential reuse patterns" task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future substrate for statistical attacker re-identification over behavioral fingerprints. No correlator, profiler, API, or dashboard wiring in this commit — see TODO.md for the handoff. Schema: - Credential.attacker_uuid (nullable FK to attackers.uuid), backfilled by the profiler post-write to avoid coupling the capture path to the profiler's ordering. - CredentialReuse table — UUID PK, JSON list columns for the accumulating attacker_uuids/ips/deckies/services, target_count (the discriminative scalar), confidence reserved for a future fuzzy-credential pass. Repo: - upsert_credential_reuse / list_credential_reuses / get_credential_reuse_by_id / update_credential_attacker_uuid. - Renamed pre-existing get_credential_reuse(secret_sha256) to get_credential_attempts_for_secret(secret_sha256) — the new findings table needs the cleaner name. Bus topics: - credential.captured (one per Credential upsert) - credential.reuse.detected (correlator-emitted on insert/grow) Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring decnet/bus/): - BaseVectorStore ABC keyed by (kind, id) — kind discriminator means new feature families are additive, no schema migration. - FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy sqlite_vec extension load, one vec0 virtual table per kind). - get_vectorstore() env-driven dispatch with graceful fallback to FakeVectorStore when the sqlite-vec extension isn't on the host, so workers don't crash on a missing optional dep. Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing credentials and base-repo tests updated for the rename. Total: 34 passing on the touched files.	2026-04-26 03:18:34 -04:00
anti	6b16c844b6	fix(creds): MQTT regression + secret_kind for hash credentials Honest correction to the "every cred-emitting service" claim. Audit of templates/* found three gaps: 1. MQTT — was working through the legacy adapter, silently dropped when Phase 3 (`e696c2b`) deleted it. Now migrated to encode_secret() alongside the others. 2. Postgres — `auth, pw_hash=…` event captures the MD5 challenge-response the attacker sent. Plaintext irrecoverable, so it never fit the (principal, secret_b64=raw_bytes) shape. Lands in Credential as secret_kind="postgres_md5_challenge". 3. VNC — `auth_response, response=…hex` event captures the 16-byte DES-encrypted challenge. Same situation as Postgres: plaintext irrecoverable. Lands as secret_kind="vnc_des_response". Adds a `secret_kind` discriminator column to Credential (default "plaintext", indexed). The dedup tuple gains secret_kind so two credentials with the same sha256 but different kinds are fundamentally different rows — different challenges produce different bytes for the same plaintext password, so cross-kind reuse matches are meaningless and would only confuse analytics. The model now genuinely covers every cred-emitting service in the fleet: plaintext (SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP, MQTT); postgres_md5_* (Postgres); vnc_des_response (VNC). Username-only services (MySQL/MSSQL — TDS pre-encryption captures the user but never sees the password byte) intentionally don't feed Credential — they're recon signals, not cred attempts. 40 tests pass in the touched scope. New cases: secret_kind dedups independently in the repo; Postgres MD5 + VNC DES emitters thread through; MQTT round-trips through the native branch.	2026-04-25 06:16:57 -04:00
anti	2f47f67eef	feat(creds): future-proof Credential storage model Replaces the opaque Bounty.bounty_type='credential' path with a dedicated `credentials` table whose schema is forward-compatible across every auth-bearing service in the fleet. Hoisted indexed columns (secret_sha256, principal, service, attacker_ip) carry the universal reuse-analytics signal; service-specific JSON keys ride in `fields`. Cross-service reuse queries become an indexed lookup on secret_sha256 instead of JSON_EXTRACT scans. Schema decisions baked in (per ANTI): - New `Credential` table, not extension to Bounty - Hoisted `principal` column for cross-service principal-reuse - Standardized JSON keys: every payload carries secret_b64 + secret_printable + principal universally; service-specific extras (user, domain, dn, mech, …) ride alongside The auth-helper SD-block emits the new shape natively. The ingester forks at _extract_bounty: - Native shape (SSH/Telnet, future emitters): secret_b64 present → direct upsert_credential - Legacy shape (FTP/POP3/IMAP/SMTP today): username + password → adapter synthesizes secret_{b64,sha256,printable} on the fly, upserts into the same Credential table. Tracked as DEBT-039; one-shot bridge until those service templates migrate. Defense-in-depth across five layers (input validation): - C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424 escape rules for \\, ", ]; b64 preserves exact bytes - Ingester native branch: rejects malformed secret_b64 (regex), drops the credential row but keeps the underlying Log - Ingester legacy adapter: same printable-ASCII filter as the C code; sha256 + b64 over the original utf-8 bytes (lossless, even when secret_printable is sanitized) - DB column caps with truncation warning; sha256 always over the full pre-truncation bytes so reuse queries match across truncation - JSON serialized with ensure_ascii=True so utf8mb4 columns stay safe even with non-ASCII service-specific keys Bounty.bounty_type='credential' is no longer written. Pre-v1: no historical backfill; existing rows stay untouched but unused. 595 tests pass; new tests cover the model + repo (upsert dedup, null-principal independence, cross-service reuse, filters), both ingester branches, b64 validation, sanitization preserving the fingerprinting signal in b64.	2026-04-25 05:29:26 -04:00
anti	37050a4bcd	fix(db): claim_next_mutation works on MySQL — derived-table workaround MySQL ERROR 1093 forbids referencing the UPDATE target inside a subquery; the existing UPDATE ... WHERE id = (SELECT id FROM topology_mutations ...) form blew up on every mutation claim under the MySQL backend, so no mutation ever progressed past pending. Wrap the inner SELECT in a derived table (SELECT id FROM (...) AS _next). MySQL materialises the derived rowset before applying the UPDATE, sidestepping 1093. SQLite accepts both forms, so the single-statement atomic claim semantics are preserved on both backends — racing watchers still serialise correctly.	2026-04-24 22:15:23 -04:00
anti	c78ab032bd	fix(xff): truncate LEAKED IPs + ROTATION badge for rotation attacks `for i in $(seq 1 100); do curl -H "X-Forwarded-For: 191.100.20.$i" ...` was dumping 100 distinct IPs into AttackerDetail's LEAKED IPs row, drowning the rest of the ORIGIN section. The 100-IP wall is itself a signal (WAF-bypass-list probing) that deserves a short badge, not a flood. Backend: - get_attacker_ip_leaks gains `limit: int = 10` parameter — caller only ever needs a sample, not the full set. - New count_attacker_ip_leaks() returns the unbounded COUNT(*) via one cheap SQL aggregate. - Detail endpoint returns {ip_leaks: [first 10], ip_leaks_total: N} so the UI can render a rotation badge independent of list length. UI: - New LeakedIPsRow component. First 5 distinct IPs rendered inline with hover tooltips (unchanged). When > 5, a `+ N more` expand button reveals the rest of the sample; when total exceeds the 10-row cap, a subtle `(+M beyond sample)` note appears. - When total ≥ 20, a red `ROTATION · N` tag renders leading the row with a tooltip explaining the semantic: "almost certainly XFF-rotation / WAF-bypass probing, not a real attribution leak." DB churn is deliberately not capped — 100k rows × ~500 B is tolerable. If it becomes a problem we can add an ingester-side count-and-skip; for now the UX fix is the whole story. Added test_ip_leaks_total_reported_separately_from_list asserting the endpoint shape matches what the UI consumes.	2026-04-24 18:25:46 -04:00
anti	2a0c5ca410	feat(attackers): XFF mismatch detection — attacker IP leak bounties Attackers routinely front their scanners with VPNs/proxies, so the TCP source we log is the proxy egress, not the real host. But a surprising number of attacker setups are misconfigured: the proxy forwards the real IP in an X-Forwarded-For (or Forwarded / X-Real-IP / CDN-variant) header. From our side that's a free attribution leak. New _detect_ip_leak extractor in decnet/web/ingester.py fires at ingest time per HTTP request. Logic: 1. Require service=http, source_ip present, headers present. 2. If source_ip ∈ DECNET_TRUSTED_PROXIES (comma-separated IPs or CIDRs) → legitimate reverse-proxy forwarding, skip. 3. Walk proxy-family headers in priority order: Forwarded (RFC 7239) → X-Forwarded-For → X-Real-IP → True-Client-IP → CF-Connecting-IP. 4. Extract the left-most parseable IP from the winning header. 5. If that IP differs from the TCP source → emit a bounty with bounty_type="ip_leak" carrying {source_ip, real_ip_claim, source_header, headers_seen, path, method}. Storage is the existing Bounty table — no schema change; de-dup is handled by Bounty's (attacker_ip, bounty_type, payload_hash) key, so repeat requests with the same leaked IP don't spam. AttackerDetail renders a warn-accent "LEAKED IPs:" row under ORIGIN listing distinct real_ip_claim values; hover tooltip shows the source header + path of the most recent leak. Only shown when at least one ip_leak bounty exists. RFC 7239 Forwarded parser handles the full vocabulary — bare IPv4, IPv4:port, quoted, IPv6 in brackets, IPv6 with port — returning only IPs that actually parse. Closes DEVELOPMENT.md "Network Topology Leakage → X-Forwarded-For mismatches". Phase 3 of the three-phase Attacker Intelligence series (phases 1: scanned-vs-interacted, 2: PTR records already shipped). DECNET_TRUSTED_PROXIES env shape matches THREAT_MODEL DA-08's "revisit when verified-proxy config lands" note — same token set future rate-limit work will consume.	2026-04-24 17:39:03 -04:00
anti	351a8939c3	feat(attackers): scanned vs. interacted service bucketing on detail page Adds a new card on AttackerDetail: SCANNED · N services | INTERACTED WITH · M services. Distinguishes port-scanners (N high, M=0) from actual engagement (M>0) at a glance — the analyst's first question when triaging a new attacker row. Classifier lives in decnet/correlation/event_kinds.py, a single source of truth for the event-type vocabulary: - INTERACTION_EVENT_TYPES — command-family (command/exec/query/...), SMTP engagement (mail_from/rcpt_to/message_accepted), file/payload activity (file_captured/upload/download_attempt/retr), pub/sub (publish/subscribe), recorded TTY sessions. - NOISE_EVENT_TYPES — DECNET-internal (startup/shutdown/parse_error/ unknown_*). - Everything else defaults to scan. Conservative by design: new template verbs show up as "scanned" until explicitly promoted. Bucket logic: a service is "interacted" if ≥1 of its events classifies as interaction; otherwise "scanned" if ≥1 scan event; noise-only services drop. Disjoint by construction. Deliberate no-schema path: compute on-the-fly in the detail endpoint via SELECT DISTINCT service, event_type FROM logs. Small result set (tens of pairs per attacker), cost is trivial vs. the existing behavior/commands queries. Trade-off: one more DB round-trip per detail view in exchange for zero ALTER TABLE migration pain and immediate classifier-change feedback loop. Profiler's _COMMAND_EVENT_TYPES stays as-is (strict subset of interactions that carry executable text), with a comment pointing at the new canonical module. Closes DEVELOPMENT.md "Attacker Intelligence §Service-Level Behavioral Profiling — Services actively interacted with".	2026-04-24 17:12:20 -04:00
anti	2bcef50ac5	feat(webhooks): circuit breaker auto-disables misbehaving subscriptions After DECNET_WEBHOOK_CIRCUIT_THRESHOLD (default 5) consecutive failed deliveries, the worker calls trip_webhook_circuit(uuid, ts) which flips enabled=False and stamps auto_disabled_at. The worker sets its reload flag so the next dispatch epoch stops consuming events for the tripped sub entirely — one dead receiver can't poison the shared egress pool anymore. Operator clears the trip via PATCH — setting enabled=True when the sub was previously disabled clears auto_disabled_at, zeros consecutive_failures, and clears last_error. Admin-pause → re-enable hits the same path harmlessly. Three observable states now distinguishable in the UI: - Active enabled=True, auto_disabled_at=NULL - Admin-paused enabled=False, auto_disabled_at=NULL - Tripped enabled=False, auto_disabled_at=<ts> UI surfaces a TRIPPED · <ts> chip on the row (red, alert-styled) and a "N TRIPPED" count in the page header. Hover tooltip tells the operator how to reset ("Re-enable via Edit"). record_webhook_failure now returns the new consecutive_failures count so the worker can compare against the threshold without a second roundtrip. trip_webhook_circuit is idempotent — re-tripping just re-stamps auto_disabled_at. Closes THREAT_MODEL WH-02 and DEBT-037 §1.	2026-04-24 16:24:33 -04:00
anti	b70845a85d	feat(webhooks): subscription CRUD + HMAC-signed delivery client Introduces the webhook egress foundation — a new WebhookSubscription table, admin-gated CRUD under /api/v1/webhooks, and the shared delivery client that both the test-ping route and the upcoming worker will use. No worker yet; this commit is API + model + client only. Simple-mode enum (AttackerDetail / DeckyStatus / SystemStatus) expands to bus-topic patterns at the router layer; storage is always the raw pattern list. Advanced mode lets admins supply raw NATS-style patterns directly. Filter-at-subscribe: the worker (next commit) will subscribe to the union of patterns across enabled subscriptions. Delivery client handles HMAC-SHA256 signing (X-DECNET-Signature), retry on 429/5xx/network errors with jittered backoff, no-retry on 4xx. Secrets never leave the server on GET/LIST — only the create response carries the secret for copy-out. CRUD routes publish WEBHOOK_SUBSCRIPTIONS_CHANGED on the bus after every mutation so the (future) worker can hot-reload. Opens DEBT-037 for the deferred items (circuit breaker, dead-letter, batch delivery, payload templates, secret-at-rest).	2026-04-24 15:30:05 -04:00
anti	9232031ec7	feat(db): extend SessionProfile schema with DEBT-036 keystroke features Adds the three signal columns motivated by the manual keystroke analysis in DEBT-036 directly to the SessionProfile table. Pre-v1 so we modify the schema in place — Alembic arrives at v1. Columns: - kd_top_bigrams (TEXT) — JSON of top-N most-common digraphs with mean IAT per bigram. Complements kd_digraph_simhash ("same typist?") with "same typist in same mental state?" (tired / rested / distracted shifts bigram-specific IATs measurably). - kd_start_of_action_latency (REAL/DOUBLE) — median IAT of the first keystroke after an idle gap > 1s. Separates "initiating a command" from "executing a remembered one"; real humans have measurable start-of-action latency, bots don't. - kd_pause_hist_burst / _think / _distracted (INT) — three-bucket histogram (counts, <0.2s / 0.2-1.5s / >1.5s). More discriminating than the existing flat burst_ratio / think_ratio pair: C2 operators concentrate in burst with a thin tail; opportunistic humans have a fat think bucket and a long distracted tail. Both backends get an idempotent ADD COLUMN migration (_migrate_session_profile_table) wired into initialize() alongside the existing _migrate_attackers_table path — guards on PRAGMA table_info (SQLite) / information_schema.COLUMNS (MySQL) so reruns are safe. PII discipline comment on kd_digraph_simhash and kd_top_bigrams: both operate on bigram CHARACTERS, never on raw input stream content. Attacker passwords typed over SSH must not land here. Test updated for the MySQL initialize() migration-order contract.	2026-04-24 10:45:48 -04:00
anti	8cbb7834ef	feat(web): SMTP victim-domain + stored-mail panels on attacker detail Adds GET /attackers/{uuid}/smtp-targets (viewer) and GET /attackers/{uuid}/mail (admin) endpoints, plus two new sections on the attacker detail page: VICTIM DOMAINS rollup (aggregate-only, federation-gossip-safe) and STORED MAIL with a drawer that decodes headers, lists attachments, and downloads the raw .eml via the existing artifact endpoint (?service=smtp).	2026-04-22 22:33:53 -04:00
anti	d43303251d	feat(profiler): track SMTP victim domains per attacker New SmtpTarget table records each (attacker, domain) pair observed via the SMTP honeypots. Only the domain is stored — local-parts are dropped at ingestion, so this table holds no user-identifying data beyond the target organisation's identity. The profiler worker extracts domains from rcpt_to / rcpt_denied / message_accepted events, normalizes them (lowercase, strip local-part, drop blocked TLDs), and upserts one row per pair with a running count + first_seen / last_seen. Three repo methods shipped: * increment_smtp_target(attacker, domain) — upsert + bump * list_smtp_targets(attacker) — per-attacker view * smtp_target_seen(domain) — cross-attacker aggregate, shaped as the federation-gossip RPC that V2 will expose. The gossip-query shape is load-bearing: each operator can answer "have any of your attackers targeted corp1.com?" without leaking which attackers or when — the aggregate returns a bool + total count + first/last seen, nothing else.	2026-04-22 22:23:27 -04:00
anti	119b4e8724	feat(db): add session_profile table for keystroke-dynamics fingerprints New purpose-built table with schema_version column committed from day one so V2 federation gossip can cluster sessions across operators without retrofitting. Ships with the empty write path (upsert_session_profile); ingestion of keystroke features (IKI moments, control-char rates, digraph SimHash) is tracked as V2 work. Closes gap #2 from SIGNAL_CAPTURE_AUDIT.md.	2026-04-22 21:39:17 -04:00
anti	d3321324eb	feat(sniffer): capture SSH client banner from TCP stream Parse RFC 4253 §4.2 identification strings from the first attacker→decky data segment on TCP/22; emit ssh_client_banner syslog events and bus fan-out. Profiler's sniffer_rollup dedupes observed banners into a new AttackerBehavior.ssh_client_banners JSON column. Closes gap #3 from SIGNAL_CAPTURE_AUDIT.md.	2026-04-22 21:37:01 -04:00
anti	8181f39ae2	feat(profiler): persist raw SSH KEX algorithm ordering Prober already emits kex_algorithms in hassh_fingerprint syslog events, but the raw ordered list was only queryable via the generic bounty store. Add a dedicated AttackerBehavior.kex_order_raw column (TEXT, JSON list) so post-v1 KEX-order fingerprinting has a typed, indexable home. Pipeline: - sniffer_rollup() now consumes hassh_fingerprint events and collects distinct kex_algorithms strings across ports. - build_behavior_record() JSON-encodes the list (NULL when empty). - sqlmodel_repo._deserialize_behavior() parses it back into a list. Closes pre-v1 gap #1 from SIGNAL_CAPTURE_AUDIT.md.	2026-04-22 21:29:46 -04:00
anti	5704e8fcce	fix(topology): delete topology_mutations in delete-cascade delete_topology_cascade manually deletes status_events, edges, deckies and lans but overlooked topology_mutations, so deleting any topology that ever had a mutation enqueued (i.e. edits while active|degraded) failed with an FK IntegrityError. Add the missing DELETE and extend the cascade test to seed a mutation row.	2026-04-22 17:50:30 -04:00
anti	3f460bab84	feat(web): show MazeNET decky running count + roll into dashboard MazeNET header now reports '{running}/{total} DECKIES RUNNING' so operators can see per-topology runtime status at a glance. Dashboard ACTIVE DECKIES counters used to reflect only the fleet state file; TopologyDecky rows (MazeNET deployments) are now added in — deployed_deckies = fleet + all topology rows, active_deckies = fleet (no runtime field) + topology rows whose state is 'running'.	2026-04-22 17:48:04 -04:00
anti	6e522c5a55	feat(web): transcripts API + repository lookups Adds get_attacker_transcripts (mirror of artifacts for session_recorded logs) and get_session_log for sid→shard resolution. New /api/v1/transcripts/{decky}/{sid}?offset=&limit= pages asciinema events out of the shared JSONL day-shard via an mtime-keyed byte-offset index — never scans the whole shard per request. New /api/v1/attackers/{uuid}/transcripts lists sessions for drilldown. Both endpoints admin-gated.	2026-04-21 23:06:39 -04:00
anti	e8f9c955b3	feat(swarm): heartbeat-driven topology resync for agent-pinned deployments Agent heartbeats now carry an applied-topology snapshot. The master heartbeat handler compares the reported version_hash against what canonical_hash yields for the hydrated topology pinned to that host and flags Topology.needs_resync on divergence (or when the agent reports no topology at all while master expects one). The mutator watch loop gains reconcile_agent_resyncs, which re-pushes the current hydrated blob via AgentClient.apply_topology without touching status, then clears the flag on success. Push failures leave the flag set so the next tick retries.	2026-04-21 01:35:12 -04:00
anti	be4e1b1891	feat(mazenet): auto-bridge new LANs to the DMZ gateway When a non-DMZ LAN is created via POST /lans, look up the topology's gateway (decky with forwards_l3=True attached to the DMZ) and insert an edge binding it to the new LAN. The gateway becomes multi-homed to every internal LAN automatically, so DMZ_ORPHAN cannot arise from ordinary editor use. Also fixes delete_lan: the home-decky guard used scalar_one_or_none, which blew up when the gateway already had >1 'other' LAN edge. Switch to scalars().first() — we only need to know some other edge exists, not a unique one.	2026-04-20 23:07:19 -04:00
anti	f182c98ffa	feat(api): phase 3 step 2 — topology read endpoints (list/get/status/catalog) GET /api/v1/topologies — paginated list with status filter. Extends repo.list_topologies() to accept limit/offset and adds count_topologies() for the total envelope field. GET /api/v1/topologies/{id} — hydrated TopologyDetail; 404 if missing. GET /api/v1/topologies/{id}/status-events — audit trail, limit-capped. Catalog helpers for the phase-4 canvas UI: * GET /topologies/services — full service catalog. * GET /topologies/next-subnet?base=172.20 — wraps SubnetAllocator against reserved_subnets across non-torn-down topologies. * GET /topologies/{id}/lans/{lan_id}/next-ip — IPAllocator pre-seeded with existing decky IPs in that LAN. All read routes are viewer-or-admin. Sub-routers are included in an order that keeps literal catalog paths (/services, /next-subnet) from being shadowed by the /{topology_id} trie branch.	2026-04-20 18:25:33 -04:00
anti	a76b9ecdf9	feat(mazenet): step 7 — topology_mutations queue + mutator reconciler Adds the live-mutation pipeline for active/degraded topologies: * TopologyMutation table with composite index (state, topology_id) so the watch-loop guard query stays O(log n). * claim_next_mutation is a single atomic UPDATE ... WHERE state='pending' so racing reconcilers deterministically pick one winner; losers see rowcount=0 and skip. * reconcile_topologies drains pending rows per live topology, applies via decnet.mutator.ops.dispatch, and on failure marks the mutation failed + transitions topology to degraded. * run_watch_loop gains a gated branch: flat-fleet mutate_all runs every tick unchanged; the reconciler only enters when the cheap has_pending_topology_mutation guard returns True. * apply_* ops re-check hard invariants (names, IP collisions, subnet overlap, known services, service_config shape) after every mutation so the repo never lands in an invalid state. * CLI: 'decnet topology mutate' / 'mutations' subcommands.	2026-04-20 18:02:37 -04:00
anti	91df57d36b	feat(topology): pending-only mutation repo methods with cascade + guards MazeNET phase 2 step 6. Equips the repo layer with the CRUD the web editor needs before deploy. - TopologyNotEditable exception: raised when a pending-only method hits a non-pending topology. The intent is "free-form edits stop at deploy; the mutator (step 7) takes over for live topologies." - _assert_pending helper checks status inside the session. - update_lan / update_topology_decky accept enforce_pending=True for pre-deploy callers (existing internal callers default to False so behavior is unchanged). - delete_lan: cascades edges; refuses if any decky has only one edge (= this LAN is its home) to prevent orphans. - delete_topology_decky: cascades edges. - delete_topology_edge: bare-bones removal. All four mutators accept expected_version for optimistic concurrency. Existing tests continue to pass (no behavior change for persist/deploy).	2026-04-20 17:50:29 -04:00
anti	e475c0957e	feat(topology): optimistic concurrency via Topology.version + expected_version MazeNET phase 2 step 4. Readies the repo layer for concurrent editors (web canvas + CLI + mutator) without lost-write races. - Topology.version: monotonically bumped on supervised child-row writes. - VersionConflict exception carries {current, expected} for the UI. - _check_and_bump_version helper reads Topology in the same session, compares against expected_version, raises on mismatch, bumps on match. Commit happens in the caller's existing transaction so check+bump+write are atomic per mutation. - add_lan / update_lan / add_topology_decky / update_topology_decky / add_topology_edge accept expected_version=None by default, preserving every existing caller's behavior. When expected_version is None, no check runs and version stays put — internal callers (persist) that don't care about concurrency keep working unchanged.	2026-04-20 17:47:28 -04:00
anti	47cd200e1d	feat(mazenet): repo methods for topology/LAN/decky/edge/status events Adds topology CRUD to BaseRepository (NotImplementedError defaults) and implements them in SQLModelRepository: create/get/list/delete topologies, add/update/list LANs and TopologyDeckies, add/list edges, plus an atomic update_topology_status that appends a TopologyStatusEvent in the same transaction. Cascade delete sweeps children before the topology row. Covered by tests/topology/test_repo.py (roundtrip, per-topology name uniqueness, status event log, cascade delete, status filter) and an extension to tests/test_base_repo.py for the NotImplementedError surface.	2026-04-20 16:43:49 -04:00
anti	148e51011c	feat(swarm): agent→master heartbeat with per-host cert pinning New POST /swarm/heartbeat on the swarm controller. Workers post every ~30s with the output of executor.status(); the master bumps SwarmHost.last_heartbeat and re-upserts each DeckyShard with a fresh DeckyConfig snapshot and runtime-derived state (running/degraded). Security: CA-signed mTLS alone is not sufficient — a decommissioned worker's still-valid cert could resurrect ghost shards. The endpoint extracts the presented peer cert (primary: scope["extensions"]["tls"], fallback: transport.get_extra_info("ssl_object")) and SHA-256-pins it to the SwarmHost.client_cert_fingerprint stored for the claimed host_uuid. Extraction is factored into _extract_peer_fingerprint so tests can exercise both uvicorn scope shapes and the both-unavailable fail-closed path without mocking uvicorn's TLS pipeline. Adds get_swarm_host_by_fingerprint to the repo interface (SQLModel impl reuses the indexed client_cert_fingerprint column).	2026-04-19 21:37:15 -04:00
anti	3ebd206bca	feat(swarm): persist DeckyConfig snapshot per shard + enrich list API Dispatch now writes the full serialised DeckyConfig into DeckyShard.decky_config (plus decky_ip as a cheap extract), so the master can render the same rich per-decky card the local-fleet view uses — hostname, distro, archetype, service_config, mutate_interval, last_mutated — without round-tripping to the worker on every page render. DeckyShardView gains the corresponding fields; the repository flattens the snapshot at read time. Pre-migration rows keep working (fields fall through as None/defaults). Columns are additive + nullable so SQLModel.metadata.create_all handles the change on both SQLite and MySQL. Backfill happens organically on the next dispatch or (in a follow-up) agent heartbeat.	2026-04-19 21:29:45 -04:00
anti	5dad1bb315	feat(swarm): remote teardown API + UI (per-decky and per-host) Agents already exposed POST /teardown; the master was missing the plumbing to reach it. Add: - POST /api/v1/swarm/hosts/{uuid}/teardown — admin-gated. Body {decky_id: str|null}: null tears the whole host, a value tears one decky. On worker failure the master returns 502 and leaves DB shards intact so master and agent stay aligned. - BaseRepository.delete_decky_shard(name) + sqlmodel impl for per-decky cleanup after a single-decky teardown. - SwarmHosts page: "Teardown all" button (keeps host enrolled). - SwarmDeckies page: per-row "Teardown" button. Also exclude setuptools' build/ staging dir from the enrollment tarball — `pip install -e` on the master generates build/lib/decnet_web/node_modules and the bundle walker was leaking it to agents. Align pyproject's bandit exclude with the git-hook invocation so both skip decnet/templates/.	2026-04-19 19:39:28 -04:00
anti	6657d3e097	feat(swarm): add SwarmHost and DeckyShard tables + repo CRUD Introduces the master-side persistence layer for swarm mode: - SwarmHost: enrolled worker metadata, cert fingerprint, heartbeat. - DeckyShard: per-decky host assignment, state, last error. Repo methods are added as default-raising on BaseRepository so unihost deployments are untouched; SQLModelRepository implements them (shared between the sqlite and mysql subclasses per the existing pattern).	2026-04-18 07:09:29 -04:00
anti	41fd496128	feat(web): attacker artifacts endpoint + UI drawer Adds the server-side wiring and frontend UI to surface files captured by the SSH honeypot for a given attacker. - New repository method get_attacker_artifacts (abstract + SQLModel impl) that joins the attacker's IP to `file_captured` log rows. - New route GET /attackers/{uuid}/artifacts. - New router /artifacts/{decky}/{service}/{stored_as} that streams a quarantined file back to an authenticated viewer. - AttackerDetail grows an ArtifactDrawer panel with per-file metadata (sha256, size, orig_path) and a download action. - ssh service fragment now sets NODE_NAME=decky_name so logs and the host-side artifacts bind-mount share the same decky identifier.	2026-04-18 05:36:48 -04:00
anti	fb69a06ab3	fix(db): detach session cleanup onto fresh task on cancellation Previous attempt (shield + sync invalidate fallback) didn't work because shield only protects against cancellation from other tasks. When the caller task itself is cancelled mid-query, its next await re-raises CancelledError as soon as the shielded coroutine yields — rollback inside session.close() never completes, the aiomysql connection is orphaned, and the pool logs 'non-checked-in connection' when GC finally reaches it. Hand exception-path cleanup to loop.create_task() so the new task isn't subject to the caller's pending cancellation. close() (and the invalidate() fallback for a dead connection) runs to completion. Success path is unchanged — still awaits close() inline so callers see commit visibility and pool release before proceeding.	2026-04-17 21:13:43 -04:00
anti	1446f6da94	fix(db): invalidate pool connection when cancelled close fails Under high-concurrency MySQL load, uvicorn cancels request tasks when clients disconnect. If cancellation lands mid-query, session.close() tries to ROLLBACK on a connection that aiomysql has already marked as closed — raising InterfaceError("Cancelled during execution") and leaving the connection checked-out until GC, which the pool then warns about as a 'non-checked-in connection'. The old fallback tried sync.rollback() + sync.close(), but those still go through the async driver and fail the same way on a dead connection. Replace them with session.sync_session.invalidate(), which just flips the pool's internal record — no I/O, so it can't be cancelled — and tells the pool to drop the connection immediately instead of waiting for garbage collection.	2026-04-17 21:04:04 -04:00
anti	11b9e85874	feat(db): bulk add_logs for one-commit ingestion batches Adds BaseRepository.add_logs (default: loops add_log for backwards compatibility) and a real single-session/single-commit implementation on SQLModelRepository. Introduces DECNET_BATCH_SIZE (default 100) and DECNET_BATCH_MAX_WAIT_MS (default 250) so the ingester can flush on either a size or a time bound when it adopts the new method. Ingester wiring is deferred to a later pass — the single-log path was deadlocking tests when flushed during lifespan teardown, so this change ships the DB primitive alone.	2026-04-17 16:23:09 -04:00
anti	32340bea0d	perf: migrate hot-path JSON serialization to orjson stdlib json was FastAPI's default. Every response body, every SSE frame, and every add_log/state/payload write paid the stdlib encode cost. - pyproject.toml: add orjson>=3.10 as a core dep. - decnet/web/api.py: default_response_class=ORJSONResponse on the FastAPI app, so every endpoint return goes through orjson without touching call sites. Explicit JSONResponse sites in the validation exception handlers migrated to ORJSONResponse for consistency. - health endpoint's explicit JSONResponse → ORJSONResponse. - SSE stream (api_stream_events.py): 6 json.dumps call sites → orjson.dumps(...).decode() — the per-event frames that fire on every sse tick. - sqlmodel_repo.py: encode sites on the log-insert path switched to orjson (fields, payload, state value). Parser sites (json.loads) left as-is for now — not on the measured hot path.	2026-04-17 15:07:28 -04:00
anti	bd406090a7	fix: re-seed admin password when still unfinalized (must_change_password=True) _ensure_admin_user was strict insert-if-missing: once a stale hash landed in decnet.db (e.g. from a deploy that used a different DECNET_ADMIN_PASSWORD), login silently 401'd because changing the env var later had no effect. Now on startup: if the admin still has must_change_password=True (they never finalized their own password), re-sync the hash from the current env var. Once the admin sets a real password, we leave it alone. Found via locustfile.py login storm — see tests/test_admin_seed.py. Note: this commit also bundles uncommitted pool-management work already present in sqlmodel_repo.py from prior sessions.	2026-04-17 14:49:13 -04:00
anti	e9d151734d	feat: deduplicate bounties on (bounty_type, attacker_ip, payload) Before inserting a bounty, check whether an identical row already exists. Drops silent duplicates to prevent DB saturation from aggressive scanners.	2026-04-15 18:02:52 -04:00
anti	c8f05df4d9	feat: overhaul behavioral profiler — multi-tool detection, improved classification, TTL OS fallback	2026-04-15 15:47:02 -04:00
anti	314e6c6388	fix: remove event-loop-blocking cold start; unify profiler to cursor-based incremental Cold start fetched all logs in one bulk query then processed them in a tight synchronous loop with no yields, blocking the asyncio event loop for seconds on datasets of 30K+ rows. This stalled every concurrent await — including the SSE stream generator's initial DB calls — causing the dashboard to show INITIALIZING SENSORS indefinitely. Changes: - Drop _cold_start() and get_all_logs_raw(); uninitialized state now runs the same cursor loop as incremental, starting from last_log_id=0 - Yield to the event loop after every _BATCH_SIZE rows (asyncio.sleep(0)) - Add SSE keepalive comment as first yield so the connection flushes before any DB work begins - Add Cache-Control/X-Accel-Buffering headers to StreamingResponse	2026-04-15 13:46:42 -04:00
anti	c603531fd2	feat: add MySQL backend support for DECNET database - Implement MySQLRepository extending BaseRepository - Add SQLAlchemy/SQLModel ORM abstraction layer (sqlmodel_repo.py) - Support connection pooling and tuning via DECNET_DB_URL env var - Cross-compatible with SQLite backend via factory pattern - Prepared for production deployment with MySQL SIEM/ELK integration	2026-04-15 12:51:11 -04:00
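
Commit a76b9ecdf9 describes claim_next_mutation as a single atomic UPDATE ... WHERE state='pending', where racing reconcilers see rowcount=0 if they lose. The sketch below illustrates that pattern with the standard-library sqlite3 module. It is not the project's implementation: the function name claim_mutation, the 'applying' state, the claimed_by column, and the table layout are all assumptions made for illustration.

```python
import sqlite3


def claim_mutation(conn: sqlite3.Connection, mutation_id: int, worker: str) -> bool:
    """Try to claim one pending mutation; only one caller can win.

    Hypothetical helper: the WHERE clause re-checks state='pending', so a
    second caller matches zero rows once the first has flipped the state.
    """
    cur = conn.execute(
        "UPDATE topology_mutations "
        "SET state = 'applying', claimed_by = ? "
        "WHERE id = ? AND state = 'pending'",
        (worker, mutation_id),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE modified: 1 for the winner, 0 for losers.
    return cur.rowcount == 1


def demo_claim() -> tuple:
    """Seed one pending row and let two workers race for it in sequence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE topology_mutations ("
        "id INTEGER PRIMARY KEY, topology_id TEXT, state TEXT, claimed_by TEXT)"
    )
    conn.execute(
        "INSERT INTO topology_mutations (id, topology_id, state) "
        "VALUES (1, 'topo-1', 'pending')"
    )
    conn.commit()
    first = claim_mutation(conn, 1, "reconciler-a")
    second = claim_mutation(conn, 1, "reconciler-b")
    conn.close()
    return first, second
```

Because the state check and the state change happen in one statement, no separate SELECT-then-UPDATE window exists in which two reconcilers could both believe they own the row.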

46 Commits
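
Commit e475c0957e describes optimistic concurrency through Topology.version and an expected_version argument: a mismatch raises VersionConflict carrying {current, expected}, a match bumps the version, and expected_version=None skips the check and leaves the version unchanged. The following is a minimal in-memory sketch of that rule only. The class and function names mirror the commit message, but the bodies are assumptions, and the real helper runs inside a database session rather than on a plain dataclass.

```python
from dataclasses import dataclass
from typing import Optional


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""

    def __init__(self, current: int, expected: int) -> None:
        super().__init__(f"version conflict: current={current}, expected={expected}")
        self.current = current
        self.expected = expected


@dataclass
class Topology:
    # Illustrative stand-in for the persisted row; only the fields used here.
    name: str
    version: int = 0


def check_and_bump_version(topology: Topology, expected_version: Optional[int]) -> int:
    """Compare, then bump; return the version after the call."""
    if expected_version is None:
        # Callers that do not care about concurrency: no check, no bump.
        return topology.version
    if topology.version != expected_version:
        raise VersionConflict(topology.version, expected_version)
    topology.version += 1
    return topology.version
```

An editor that read the topology at version 0 would pass expected_version=0 with its write. If another writer got there first, the call raises VersionConflict and the editor can reload before retrying.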