DECNET

Author	SHA1	Message	Date
anti	d531cea536	feat(web): read-only campaigns API + SSE + frontend API: /api/v1/campaigns (paginated list), /api/v1/campaigns/{uuid} (soft-merge chain follow), /api/v1/campaigns/{uuid}/identities (member identities), and /api/v1/campaigns/events (SSE under campaign.> + JWT-via-?token=, snapshot-on-connect). Mirror of the identity router; same auth, same shape, same OpenAPI tags pattern. Frontend: CampaignDetail.tsx page (same visual vocabulary as IdentityDetail), useCampaignStream hook (mirror of useIdentityStream), /campaigns/:id route, IdentityDetail's CAMPAIGN badge becomes clickable and navigates to the campaign. useIdentityStream now listens for identity.campaign.assigned so the badge appears live without a manual refresh.	2026-04-26 09:20:17 -04:00
anti	75af00c9c8	test(clustering): full-bound passes through production campaign clusterer Runs the chained identity + campaign clustering pipeline against all seven fixtures via from_synthetic / from_synthetic_identity adapters and ratchets every YAML floor to 1.0 — the production clusterer (and the reference clusterers used in the per-fixture tests) all score perfectly across ARI / homogeneity / completeness / singleton_recall on each fixture. Three substrate fixes surfaced by the ratchet: - Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved into cohort_weight to prevent fleet-scarcity false-merges (F1's shared_wordlist failure mode). Tier weight raised to 1.0 so shared payload+C2 alone crosses threshold (F5's intended pass). - Adapter: from_synthetic_identity now reads SyntheticSession started_at + duration_s for session_windows and per-decky timestamps (the production-row adapter still uses start_ts/end_ts when available). - Fixture data: paused_campaign.yaml's JA3 collided exactly with vpn_hopping.yaml's (same TLS extension list). The collision fused two unrelated campaigns under the chained identity layer in the noise_floor composite. Made paused's JA3 distinct. Also wires Campaign / CampaignsResponse into models/__init__.py's __all__ that was missed in the schema commit.	2026-04-26 09:13:59 -04:00
anti	6936a1426c	feat(clustering): campaign-clusterer worker + bus topics + CLI The campaign clusterer worker mirrors the identity-side worker shell (bus connect, heartbeat, control listener, slow-tick fallback) but wakes on identity.> instead of attacker.> — campaign-level work is gated on identity-layer changes, not raw observations. The connected-components implementation reads identities via list_identities_for_clustering, projects them with from_identity_row, runs union-find over combined_campaign_weight, writes campaigns rows, sets attacker_identities.campaign_id, and runs the same revocable-merge pass as the identity layer (a merged-out campaign whose identities no longer co-cluster with the winner gets revoked). Bus: adds campaign.> family (formed / identity.assigned / merged / unmerged) plus the cross-family identity.campaign.assigned so existing identity-stream subscribers see the badge update without having to subscribe to campaign.>. Wiki Service-Bus.md updated in wiki-checkout in the same wave per the project's bus-signals discipline. CLI: decnet campaign-clusterer registered as master-only via MASTER_ONLY_COMMANDS; --poll-interval / --daemon mirror the identity clusterer command surface.	2026-04-26 09:04:00 -04:00
anti	0946bab424	feat(clustering): campaign-level similarity primitives The signal taxonomy for the campaign clusterer (next commit). Mirror of the identity-layer module but with edge families that don't translate 1:1: phase-handoff (load-bearing for F5 multi_operator — the signal the identity-side fingerprint-disagreement veto deliberately isn't), shared-infra (vetoed at identity level, primary positive signal here), temporal-overlap (pairwise-relative — F7 invariance preserved), cohort (weak supporting weight only). Tier weights tuned so phase-handoff alone crosses threshold (F5), shared-infra + temporal-overlap together cross (canonical co-op pattern), and shared-infra + cohort together do NOT (F1 shared_wordlist's failure mode). The F7 time-shift invariant is explicitly tested on every time-bearing edge and on the combined weight.	2026-04-26 08:57:46 -04:00
anti	0a1cf65ddb	feat(db): Campaign SQLModel + repo write/read methods Adds the campaigns table and the BaseRepository / SQLModelRepository methods that the campaign-clusterer worker (next commit) needs to populate it. Mirrors the AttackerIdentity layer: schema_version from day one for federation gossip, soft-merge via merged_into_uuid with a chain-walking get_campaign_by_uuid, list_campaigns excluding merged-out rows while list_all_campaigns returns the unfiltered set for the revoke pass. attacker_identities.campaign_id gets a real FK now that the target table exists.	2026-04-26 08:54:28 -04:00
anti	97aa57faed	feat(api): SSE stream for identity events at /api/v1/identities/events Mirrors GET /api/v1/topologies/{id}/events: subscribes to identity.> on the bus for the duration of the request and forwards each event as a named SSE frame (formed / observation.linked / merged / unmerged). The endpoint is broadly scoped (every identity event, not per-uuid) because both AttackerDetail and IdentityDetail need the same firehose: AttackerDetail watches for an identity.formed that finally binds its identity_id; IdentityDetail watches for observation.linked / merged / unmerged against its current row. A per-uuid filter would force the client to know its identity before subscribing, which it doesn't always. JWT via ?token= (EventSource can't set headers), require_stream_viewer gate, sse_connection_slot per-user cap, snapshot-on-connect with the first 50 identities so the client buffer renders without a separate REST call. Bus-disabled / unreachable path keeps the connection alive on keepalives so the client doesn't reconnect-storm; it can re-poll the REST API on its own timer.	2026-04-26 08:36:17 -04:00
anti	e364ef8859	feat(clustering): revocable merges (merge + unmerge) Reworks the clusterer's tick to handle multi-identity components and re-evaluate prior merges. Two passes per tick: Pass 1 — per-component reconciliation: * Fresh component → mint identity (commit 4 path). * Single-identity component → link unassigned observations. * Multi-identity component → soft-merge: pick the smallest-uuid winner deterministically, set merged_into_uuid on each loser, link unassigned observations to the winner. Observations stay FK'd to their original identity row — the merge is a soft pointer, not a re-point. Audit trail preserved; cached subscribers resolve through the chain. Pass 2 — revocable-merge undo: * For each merged-out identity, check whether its observations still cluster with its winner's. If not, the merge is contradicted by new evidence — clear merged_into_uuid and emit identities_unmerged. The resurrected identity keeps its original uuid, so subscribers that cached it during the merged interval re-attach without a new lookup. A pre-built merge-chain dict feeds Pass 1 so the effective-identity lookup is O(1) per observation. The chain has a hop cap (paranoia against accidental cycles in the underlying state). Repo additions on BaseRepository + SQLModelRepository: * list_all_identities() — includes merged-out rows. * update_identity_merged_into(uuid, winner_or_None) — single setter for both merge and unmerge. DummyRepo coverage stub updated. Tests: * Two distinct identities bridged by a new observation merge with the smaller uuid as winner. * A pre-seeded soft-merge whose underlying observations diverge gets revoked; resurrected uuid emerges with merged_into_uuid cleared. * Tick is idempotent under no state changes.	2026-04-26 08:33:32 -04:00
anti	ed323581fe	feat(clustering): fingerprint-disagreement veto for fixture 5 Two operators cooperating on one campaign can share C2 endpoints + stage-1 payloads while running distinct tooling — fixture 5 (multi_operator) is the canonical demonstration. The identity clusterer must NOT fuse them: shared infra is a campaign-level signal, not an identity-level one. The campaign clusterer (downstream work) handles that grouping over identities. Mechanism: when two observations have non-null fingerprints AND the fingerprints fully disagree, the high-weight tier drops the payload and C2 contributions to zero. JA3 / HASSH agreement still returns 1.0 directly — no veto applies when something agrees. Partial agreement (one slot agrees, another disagrees) is treated as agreement, since stable-tool partial overlap is more consistent with one identity than two. The veto only triggers when there is actual disagreement evidence — two un-fingerprinted observations sharing a C2 still cluster, since the absence of fingerprints is not the same as disagreement on them. Fixture 5 production-clusterer assertion added at identity level: ARI = 1.0, homogeneity = 1.0, exactly 2 predicted clusters from 2 truth identities. Phase-handoff edges (from the TODO) belong to the downstream campaign clusterer, not this identity clusterer.	2026-04-26 08:24:22 -04:00
anti	f7da33726c	feat(clustering): combined edge weight + medium-tier wiring The clusterer now drops a single high-tier function call in favor of a tier-weighted sum. Tier multipliers (high=1.0, medium=0.6, low=0.2, very_low=0.05) are tuned so the threshold (1.0) admits high-tier agreement alone while leaving every weaker tier — and every combination of weaker tiers — under threshold. Per-tier discipline tested: - high alone clusters - medium alone does NOT cluster (supporting signal only) - low alone does NOT cluster (fixture 1's failure mode) - very-low alone does NOT cluster (fixture 2's failure mode) - all three weak tiers stacked still don't reach threshold - high + medium clusters (high already saturates) The combination is forward-compatible: low + very-low contributions are computed today but always project to 0.0 because the production adapter doesn't populate credentials / ASN-edge inputs into the fixture path yet. Their contribution becomes load-bearing in commit 7 when the low-tier landing tightens the F1 / F2 bounds. Fixture 4 (paused_campaign) ratchet added: high-tier signal carries the multi-day-silence campaign into one identity. Time-agnostic invariant — silence is irrelevant to the edge weight.	2026-04-26 08:22:10 -04:00
anti	de2f4c3a62	feat(clustering): wire high-weight edges end-to-end The connected-components clusterer now writes attacker_identities rows + sets attackers.identity_id when high-weight signals (JA3 / HASSH / payload-hash / C2-endpoint exact match) agree across observations. Singletons stay un-fingerprinted and un-clustered. Algorithm split: - cluster_observations(observations) — pure union-find over the high-weight edge function. Same code path for fixture validation and production tick. - from_attacker_row(row) — production-row adapter; recovers JA3 + HASSH from Attacker.fingerprints JSON. Payload + C2 join from logs in later commits; the function shape doesn't change. Repo additions on BaseRepository + SQLModelRepository: - list_attackers_for_clustering(limit=None) - create_attacker_identity(row) - set_attacker_identity_id(attacker_uuid, identity_uuid) DummyRepo coverage stub updated. v1 behavior is conservative: only assigns identities to observations whose identity_id is currently NULL. Multi-identity components are skipped this pass — merge / re-assign lands in commit 10 with revocable merges. Fixture bounds tightened against the production clusterer: - lone_wolf (F3) — singletons stay singletons - shared_wordlist (F1) — credential-only overlap doesn't cluster (high-weight tier doesn't include credentials) - vpn_hopping (F2, identity-level) — 5 rotated IPs with stable JA3 + HASSH fold into one identity, ARI = 1.0, completeness = 1.0	2026-04-26 08:19:56 -04:00
anti	a9775c4000	feat(clustering): similarity-graph primitives Adds the four weight-tier edge functions as pure, time-agnostic scoring primitives over an Observation projection. Each returns a score in [0, 1]; the connected-components impl will combine + threshold in subsequent commits. Tier semantics (from IDENTITY_RESOLUTION.md): - high — JA3/HASSH/payload-hash/C2-endpoint exact match - medium — phase-bucketed command-sequence Jaccard - low — credential-attempt-set Jaccard (defeated alone by F1) - very low — ASN equality (defeated alone by F2) Time-agnostic invariant is a static test: Observation has no time fields, so no edge function can silently start using them. Fixture 7 forbids recency-decay clustering on multi-month APT campaigns. A from_synthetic() adapter projects SyntheticAttacker corpora into Observation; the production-row adapter lands when the clusterer starts reading the attackers table.	2026-04-26 08:13:29 -04:00
anti	fb522af107	feat(bus): reserve identity.unmerged topic Revocable merges (a contradiction-driven undo of identity.merged) ship in the clusterer work; this reserves the topic up-front so identity.> subscribers receive it day one without a re-subscribe. The clusterer worker's ClusterResult fan-out now publishes on identity.unmerged when populated. The skeleton clusterer never populates it; the revocable-merge commit will. Wiki update lives in wiki-checkout/Service-Bus.md (separate repo).	2026-04-26 08:10:56 -04:00
anti	e545f7d8d3	feat(clustering): identity clusterer worker skeleton Adds the decnet clusterer master-only command + provider-subpackage shape (base.py + factory.py + impl/connected_components.py) so subsequent commits can land similarity-graph features without churning callers. The skeleton ConnectedComponentsClusterer.tick is a no-op; the worker shell is fully wired (bus consumer on attacker.observed + attacker.scored, slow-tick fallback, health heartbeat, control listener, ClusterResult fan-out to identity.formed/observation.linked/merged). Subscribers on identity.> see no events from this clusterer until edge functions land, but the lifecycle is in place.	2026-04-26 08:09:11 -04:00
anti	4f1077be72	feat(bus): identity.* topic family (formed / observation.linked / merged) Fourth of the five-step identity-resolution substrate. Constants and builder ship now; no publishers exist yet — they land with the clusterer worker. Subscribers (webhook worker, dashboard SSE relay) can register against identity.> from day one. * decnet/bus/topics.py — IDENTITY root + IDENTITY_FORMED / IDENTITY_OBSERVATION_LINKED / IDENTITY_MERGED leaves; identity() builder mirroring the attacker() / system() helpers. Module docstring topic-tree updated. * tests/bus/test_topics.py — assert builder produces the expected three topic strings + rejects empty event_type. Wiki Service-Bus.md and a new Identity-Resolution.md page land in the companion wiki-checkout commit.	2026-04-26 07:15:44 -04:00
anti	dc3d08dd41	feat(web): read-only /api/v1/identities/* endpoints + repo methods Second of the five-step identity-resolution substrate. Ships the API surface against the empty AttackerIdentity table from commit 1 — every endpoint returns empty/404 cleanly until the clusterer populates rows. Routes (auth-gated, viewer role): * GET /api/v1/identities — paginated list, excludes merged-out rows * GET /api/v1/identities/{uuid} — detail; transparently follows merged_into_uuid to surface the canonical winner * GET /api/v1/identities/{uuid}/observations — Attacker rows FK'd to the (resolved) identity uuid Repository (BaseRepository abstract + SQLModelRepository concrete): * get_identity_by_uuid (with merge-chain following, hop-bounded) * list_identities / count_identities (excluding merged-out) * list_observations_for_identity / count_observations_for_identity Tests: 12 new (empty-table behavior, seeded data, merge-chain resolution, repo-level smoke against real SQLite). Also fixes the pre-existing test_base_repo_coverage failure (DEBT-041 added abstract methods without updating the DummyRepo stub) — included here because this PR adds 5 more abstract methods, fixing it as a bonus. 474 db/web/profiler/correlation tests green.	2026-04-26 07:08:55 -04:00
anti	84c1ca9c9b	feat(identity): AttackerIdentity table + nullable attackers.identity_id FK Schema-only commit, first of the five-step substrate for identity resolution. The clusterer that populates identities lands later; this ships the table empty and the FK uniformly NULL on existing rows. * decnet/web/db/models/attackers.py — new AttackerIdentity SQLModel (uuid PK, schema_version, fingerprint summary lists, kd_digraph_simhash, merged_into_uuid self-FK, all clusterer-populated fields nullable). Attacker grows a nullable indexed identity_id FK + docstring marking it as the per-IP observation row. * decnet/web/db/models/__init__.py — re-exports AttackerIdentity. * tests/db/test_identity_schema.py — 9 schema invariants: table exists, identity_id nullable + indexed, FK targets attacker_identities.uuid, schema_version defaults to 1, attacker rows inserted with NULL identity_id, FK constraint blocks orphans. 463 unrelated db/web/profiler/correlation tests still green. See development/IDENTITY_RESOLUTION.md for the full design.	2026-04-26 07:00:24 -04:00
anti	00254629f8	feat(clustering): UKC phase enum + synthetic campaign factory + metric harness Pre-implementation scaffolding for campaign clustering. The simulator is the spec — algorithm code follows once fixtures + metrics are stable. * decnet/clustering/ukc.py — UKCPhase enum (19 phases across In/Through/Out stages), OBSERVABLE_PHASES set, stage_of() helper. Vocabulary aligns with future MITRE ATT&CK tagging so synthetic data and runtime phase inference don't need renaming when TTP-tagging lands. * tests/factories/campaign_factory.py — YAML DSL parser + deterministic generator emitting truth-labeled SyntheticAttacker / SyntheticSession records. Validates phase names, warns on unobservable phases, supports multi-campaign + noise corpora. * tests/clustering/metrics.py — pure-Python ARI / homogeneity / completeness / singleton_recall (no sklearn dep). Decided before any algorithm exists, on purpose. * tests/fixtures/campaigns/lone_wolf.{yaml,expected.yaml} — fixture 3 from the design doc; simplest of the six, exercises the full pipeline with an identity-clusterer placeholder. * development/CAMPAIGN_CLUSTERING.md — design spec for the feature. * development/DEVELOPMENT_V2.md — note on DSL evolution path (concurrent phases, multi-actor per phase) deferred post-v1.	2026-04-26 06:29:10 -04:00
anti	3eb67c9400	refactor(intel): re-key attacker_intel on attacker_uuid (closes DEBT-041) The threat-intel surface was IP-keyed on day one as an expedient — the worker is woken by IP-bearing bus events. ANTI's call: don't carry that debt. NO IPs as primary keys anywhere on the attacker-intel surface. Schema: - attacker_uuid is now the canonical key — UNIQUE + FK to attackers.uuid. - attacker_ip stays as a denormalised, indexed, NON-UNIQUE value column. Updated on every upsert; useful for SIEM payloads and audit lookups, but explicitly NOT a key. Model docstring says so. - Pre-v1, no Alembic migration needed. SQLModel.metadata.create_all() builds the new shape on fresh DBs. Repo: - upsert_attacker_intel now keys on attacker_uuid. - get_attacker_intel_by_ip → get_attacker_intel_by_uuid. - get_unenriched_attacker_ips → get_unenriched_attackers, returning [{uuid, ip}] tuples so the worker writes by UUID and dispatches provider calls by IP without a second round-trip. Worker: - _enrich_one(uuid, ip, ...) — UUID lands on the row, IP rides for provider egress. - attacker.intel.enriched bus payload gains attacker_uuid alongside attacker_ip — webhook → SIEM consumers benefit; no removal. API: - GET /api/v1/attackers/{ip}/intel deleted outright (rip-and-replace, never deployed beyond dev). - GET /api/v1/attackers/{uuid}/intel is the only public route, matching every other /attackers/* route. Frontend: - <IntelPanel uuid={id!} /> uses the URL param directly, fetches in parallel with the rest of AttackerDetail rather than waiting on attacker.ip. Tests: re-keyed in place, 39 passed (same coverage as before the refactor). Provider-impl tests untouched. DEBT-041: closed in DEBT.md (entry preserved as historical rationale, summary table flipped to ✅, remaining-open list shortened by one).	2026-04-26 05:35:29 -04:00
anti	8a6d632ab0	feat(deploy): systemd unit for decnet-enrich + register in worker panel Mirrors decnet-reuse-correlator.service.j2: same hardening posture (NoNewPrivileges, ProtectSystem=full, etc.), same restart policy, same log file convention. The decnet init renderer picks it up automatically via the decnet-*.service.j2 glob. Also reconciles a naming inconsistency I shipped earlier: the heartbeat name was 'intel' (the package) but the CLI command and unit are 'enrich' (the action). Renamed the heartbeat to 'enrich' so the workers panel displays the same string the operator types and the same string in the systemd unit file. Convention across the project: heartbeat name = registry key = unit basename = CLI command name. Registers 'enrich' in worker_registry.KNOWN_WORKERS and in the start-all preferred order. The decnet.target Wants= list also picks up the new unit so 'systemctl start decnet.target' brings everything up together.	2026-04-26 05:20:54 -04:00
anti	d3d9bd5aa7	feat(intel): `decnet enrich` CLI + GET /attackers/{ip}/intel endpoint CLI command mirrors the reuse-correlate shape (--poll-interval, --ttl-hours, --daemon). Run it under systemd as a sibling worker. The API endpoint returns the most recent cached row for an attacker IP or 404. Auth-gated via require_viewer like every other attacker route. Also extends the worker test with a real FakeBus so the attacker.intel.enriched publish path is exercised end-to-end (no longer a no-op against NullBus).	2026-04-26 05:17:25 -04:00
anti	cd70136d09	feat(intel): wire GreyNoise, AbuseIPDB, Feodo Tracker + ThreatFox Four concrete IntelProvider impls — three per-IP queries plus one bulk feed: * GreyNoiseProvider — community endpoint, optional API key for higher rate limit. 404 = unknown (cache the absence so we don't re-query). * AbuseIPDBProvider — score threshold mapping (>=75 malicious, >=25 suspicious, else benign). Self-disables with a clear error when no API key is configured rather than burning quota. * FeodoProvider — fetches the bulk botnet C2 IP feed once per refresh window and answers every lookup from an in-memory set. Listed = C2. * ThreatFoxProvider — POST /api/v1/ search_ioc query, optional Auth-Key header. Match in data[] = malicious; no_result = absence-not-benign. Every provider routes through decnet.net.http.stealth_client so the egress UA never leaks 'DECNET'.	2026-04-26 05:15:17 -04:00
anti	f49a7db07d	feat(intel): worker shell + attacker.intel.enriched bus topic run_intel_loop fans out across configured providers per IP, writes the aggregate row, and publishes attacker.intel.enriched. Mirrors the correlation/reuse_worker.py wake-on pattern: subscribes to attacker.observed and attacker.scored for sub-second latency, falls back to a 60s poll when the bus is unavailable. Heartbeat + control-listener wired so the workers panel sees it like every other supervised worker. Aggregate verdict picks the strongest provider tier (malicious > suspicious > benign > unknown). Provider-level errors land in IntelResult.error and are logged without poisoning the row — partial success is the expected case for free-tier providers under their daily caps. Concrete provider impls land in follow-up commits; the worker is fully exercised here against fake providers so the framing is locked in.	2026-04-26 05:01:47 -04:00
anti	58ca9075db	feat(net): stealth-egress httpx client factory Outbound calls to 3rd-party services (threat-intel providers, future TI lookups) MUST NOT advertise 'DECNET' in their user-agent — operators running honeypots want their reconnaissance dependencies to look like generic infra. New decnet.net.http.stealth_client() returns a fresh httpx.AsyncClient with a curl-shaped UA (pinned to a single constant so future siblings — browser-shaped, Go-shaped — sit next to it cleanly). Internal egress (webhook → operator's own SIEM, swarm worker → master) keeps its DECNET-tagged UA; the docstring is explicit about not routing those through this client.	2026-04-26 04:59:34 -04:00
anti	023bc1993d	feat(intel): provider ABC + lazy factory IntelProvider is async-first (every concrete provider does HTTP), bounded by a per-provider asyncio.Semaphore, and contractually never raises — errors land in IntelResult.error so a single provider's outage doesn't poison the worker pass for an entire IP. Factory returns a list (not a singleton like geoip) because intel enrichment fans out across all enabled providers per IP, with row-level partial-success handling. Lazy imports keep the module dependency-free when intel is disabled. Concrete providers (greynoise/abuseipdb/feodo/threatfox) land in follow-up commits — factory references them via lazy import so tests covering the disabled and unknown-name paths pass on their own.	2026-04-26 04:58:38 -04:00
anti	0dd3811436	feat(intel): attacker_intel table + repo helpers New TTL-cached threat-intel row keyed by attacker IP, with per-provider verdict/raw/queried_at columns for GreyNoise, AbuseIPDB, abuse.ch Feodo Tracker and ThreatFox. Carries schema_version from day one (federation wire-format precedent set by SessionProfile). Repo gains upsert_attacker_intel, get_attacker_intel_by_ip, and a get_unenriched_attacker_ips backfill primitive that picks fresh + stale rows for the forthcoming 'decnet enrich' worker. Also documents the open-source intel-source backlog in DEVELOPMENT_V2.	2026-04-26 04:56:47 -04:00
anti	50870f2e7a	feat(creds): surface plaintext/b64 secret on reuse findings The CredentialReuse table only stores the sha256+kind hash of the secret; the printable + b64 forms live on the underlying Credential rows. The dashboard drawer was therefore showing only the hash, which defeats most of the value of having a reuse view in the first place. Repo helpers list_credential_reuses + get_credential_reuse_by_id now issue one batched SELECT against credentials keyed on the sha256s in the result page and graft secret_printable + secret_b64 onto each row before returning. The drawer renders the same printable/b64 code-block the credentials inspector uses.	2026-04-26 04:34:19 -04:00
anti	a455248dd9	feat(deploy): systemd unit for decnet-reuse-correlator Adds the systemd template for the credential-reuse correlator daemon and wires it into decnet.target so `decnet init` installs it automatically (the unit installer globs decnet-*.service.j2). Mirrors the mutator template: bus-woken Type=simple service with the standard hardening + on-failure restart. Also registers `reuse-correlator` in the in-process worker registry (so the dashboard panel surfaces its heartbeat instead of dropping it as unknown) and slots it into the start-all preferred order between mutator and webhook.	2026-04-26 04:29:10 -04:00
anti	0d2283e10c	chore(cli): remove dead `decnet correlate` command The CLI was a day-one debug helper that read a log file or stdin and printed a traversal table. It hadn't been wired to the live data path since the engine moved into the profiler worker (DEBT.md:218). No deploy unit, no caller, no doc relied on it. Removed the command and its two tests; `decnet/correlation/` stays as a library consumed by the profiler and the reuse correlator.	2026-04-26 04:26:15 -04:00
anti	181c792753	feat(api): GET /credential-reuse list + detail endpoints Read-only routes for the credential-reuse findings produced by the correlator. Mirrors the /credentials route shape: JWT-gated via require_viewer, paginated with optional secret_kind / min_target_count filters, and a 404-on-missing detail route. No POST/PUT/PATCH (and no body parsing) so no 400 contract is documented.	2026-04-26 03:40:08 -04:00
anti	590c2b0fac	feat(correlation): credential-reuse engine + reuse-correlate worker Adds CorrelationEngine.correlate_credential_reuse + the `decnet reuse-correlate` long-running worker. The worker mirrors the mutator's bus-wake + slow-tick pattern: wakes on credential.captured and attacker.observed for sub-second latency, falls back to a 60s poll if the bus is unavailable, and publishes credential.reuse.detected once per new or grown CredentialReuse row (group-deduped so a 5-cred reuse doesn't emit 5 partial events). The web ingester now publishes credential.captured after every successful Credential upsert; bus + new repo helper find_credential_reuse_candidates feed the engine pass.	2026-04-26 03:37:49 -04:00
anti	00ecea924a	feat(profiler): backfill Credential.attacker_uuid on attacker upsert Credential capture runs before the profiler mints an Attacker, so Credential.attacker_uuid is nullable on write. The profiler now backfills the FK after each successful upsert_attacker. Soft-fail posture matches the surrounding behavior + smtp rollups so a backfill error never blocks the next attacker.	2026-04-26 03:30:44 -04:00
anti	ce4be68501	feat(creds): cred-reuse foundation + vectorstore scaffold Lays the storage and bus substrate for the "credential reuse patterns" task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future substrate for statistical attacker re-identification over behavioral fingerprints. No correlator, profiler, API, or dashboard wiring in this commit — see TODO.md for the handoff. Schema: - Credential.attacker_uuid (nullable FK to attackers.uuid), backfilled by the profiler post-write to avoid coupling the capture path to the profiler's ordering. - CredentialReuse table — UUID PK, JSON list columns for the accumulating attacker_uuids/ips/deckies/services, target_count (the discriminative scalar), confidence reserved for a future fuzzy-credential pass. Repo: - upsert_credential_reuse / list_credential_reuses / get_credential_reuse_by_id / update_credential_attacker_uuid. - Renamed pre-existing get_credential_reuse(secret_sha256) to get_credential_attempts_for_secret(secret_sha256) — the new findings table needs the cleaner name. Bus topics: - credential.captured (one per Credential upsert) - credential.reuse.detected (correlator-emitted on insert/grow) Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring decnet/bus/): - BaseVectorStore ABC keyed by (kind, id) — kind discriminator means new feature families are additive, no schema migration. - FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy sqlite_vec extension load, one vec0 virtual table per kind). - get_vectorstore() env-driven dispatch with graceful fallback to FakeVectorStore when the sqlite-vec extension isn't on the host, so workers don't crash on a missing optional dep. Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing credentials and base-repo tests updated for the rename. Total: 34 passing on the touched files.	2026-04-26 03:18:34 -04:00
anti	817ce32e6d	fix(collector): label-based fleet container discovery The events watcher's start-event filter previously called _load_service_container_names(), which reads decnet-state.json on every event. decnet deploy writes that state file out-of-band with docker compose up, so a container's start event could arrive before the state was committed — the watcher then dropped the event silently and never tailed the container's stdout. The visible symptom was an empty Credentials view (and Logs/Bounty) after a fresh deploy until the collector was manually restarted. Fix: stamp decnet.fleet.{service,decky,service_name} labels on every fleet service container at compose-time, and let the collector recognize either the fleet or topology label without touching the state file. The state-file name match remains as a fallback for legacy containers that predate the new labels.	2026-04-25 08:11:21 -04:00
anti	4566146d50	feat(api): GET /credentials endpoint Surfaces the Credential table (deduped attacker auth attempts) via a new /api/v1/credentials route. Mirrors the Bounty cache pattern (5s TTL on the unfiltered default page) and reuses the existing get_credentials / get_total_credentials repo methods + the already defined CredentialsResponse DTO. Filters: search, service, attacker_ip.	2026-04-25 07:51:20 -04:00
anti	b3d1301925	feat(creds): DEBT-040 Phase 3 — RDP NLA / CredSSP NTLMv2 capture When RDP_ENABLE_NLA=true (service_cfg.nla=true on the topology side), confirm PROTOCOL_HYBRID on the X.224 Connection Confirm, upgrade the socket to TLS using a self-signed cert generated at first start by the entrypoint, then drive a tiny CredSSP loop: - Read inbound TSRequest DER (bounded to MAX_TSREQUEST_LEN). - Scan for the NTLMSSP signature, dispatch on message type: Type 1 -> respond with a hand-built TSRequest carrying our Type 2 challenge. Type 3 -> parse_type3() and emit auth_attempt with the universal credential SD shape (secret_kind = ntlmssp_v2). - Hand-built DER: no pyasn1 dependency. Also folds in a small fix-up to commit 1: SMB SERVER_CHALLENGE was hardcoded to 0x11..0x88 across the fleet, which would let a scanner fingerprint every DECNET decky by its NTLM challenge. Both SMB and RDP now derive the 8-byte challenge from instance_seed.random_bytes(8, "ntlm_challenge"), giving each decky a deterministic-but-distinct value. SMB Dockerfile gets the instance_seed.py copy too (was synced into the build context but not COPYed into the image). - decnet/services/rdp.py: optional service_cfg.nla bool flips RDP_ENABLE_NLA in the compose env. - decnet/templates/rdp/Dockerfile + entrypoint.sh: openssl install + per-decky cert generation gated on RDP_ENABLE_NLA. - 9 NLA unit tests cover the DER reader/builder, _handle_nla round- trip with Type 1 / Type 3, oversized-DER rejection, and per- NODE_NAME challenge divergence. - DEBT.md: DEBT-040 closed; full TS_INFO_PACKET capture documented as a follow-up if attacker telemetry justifies it.	2026-04-25 07:42:52 -04:00
anti	a8b9c82c97	feat(creds): DEBT-040 Phase 2 — RDP X.224 cookie capture Replace Twisted-based connection logger with an asyncio handler that parses the X.224 Connection Request, extracts the mstshash routing cookie (universal across mstsc / FreeRDP / Hydra / ncrack / MSF rdp_login), records the rdpNegRequest.requestedProtocols flags, and answers with a well-formed X.224 Connection Confirm selecting PROTOCOL_RDP. Scope-down vs. the original DEBT-040 plan: full TS_INFO_PACKET extraction would require either Standard-RDP-Security RC4 stream- cipher implementation (with our own RSA pair + MS-RDPBCGR signing) or a complete MCS+GCC ASN.1/BER stack for the SSL path — both far exceed the 150 LoC budget the DEBT cited. The mstshash cookie is the only piece of credential information that flows in plaintext on the wire when the attacker speaks RDP, so capturing it is the highest- value-per-byte signal available without going down either rabbit hole. Phase 3 (CredSSP/NLA, next commit) is where actual NTLMv2 hashes land. - Drops Twisted dependency from rdp/Dockerfile; adds ntlmssp.py copy ahead of the NLA path that consumes it. - 7 unit tests cover cookie capture, requestedProtocols recording, CC framing, no-cookie path, and oversized/non-TPKT drops.	2026-04-25 07:34:42 -04:00
anti	6905c88083	feat(creds): DEBT-040 Phase 1 — SMB NTLMSSP framer Replace impacket's SimpleSMBServer with a hand-rolled asyncio SMB2 framer that walks Negotiate -> SessionSetup(Type1) -> SessionSetup(Type3) just deep enough to extract the inner NTLMSSP Type 3 via the shared parse_type3() parser. Always returns STATUS_LOGON_FAILURE; the attacker's hash lands in the Credential table, the attacker doesn't land on the host. - decnet/engine/deployer.py: _sync_ntlmssp_sources() mirrors the auth-helper / sessrec sync pattern, copies _shared/ntlmssp.py into smb/ and rdp/ build contexts before docker compose up. - Dockerfile: drop impacket dep, copy ntlmssp.py. - 7 unit tests drive the asyncio handler in-process via StreamReader.feed_data; assert dialect, MORE_PROCESSING_REQUIRED on first SessionSetup, NTLMSSP Type 2 carriage in SPNEGO, credential capture with universal SD shape, STATUS_LOGON_FAILURE on Type 3, oversized-NBSS / SMB1 / short-PDU drops.	2026-04-25 07:31:41 -04:00
anti	afe02af5c2	feat(creds): NTLMSSP Type 3 parser + DEBT-040 for SMB/RDP/NLA framers Ships the load-bearing primitive both Phase 5 (SMB) and Phase 7 (RDP NLA) need: a standalone NTLMSSP Type 3 (AUTHENTICATE_MESSAGE) parser per MS-NLMP §2.2.1.3. Surface: parse_type3(blob) -> dict | None find_ntlmssp(buf) -> int # locate NTLMSSP\0 inside SPNEGO outer Returns the universal Credential SD shape: username + domain (decoded UTF-16-LE or ASCII per NEGOTIATE_UNICODE) principal = "DOMAIN\\username" secret_kind = "ntlmssp_v1" (24-byte fixed) or "ntlmssp_v2" (variable) secret_b64 = base64 of NtChallengeResponse — canonical hashcat input (-m 5500 v1, -m 5600 v2) Bounds-checked for untrusted-input safety. Anonymous binds (empty NT response) return None — no credential to record. 7 unit tests cover NTLMv1/v2 distinction, ASCII vs Unicode strings, empty-domain shape, malformed signature/type rejection, and SPNEGO- wrapped find_ntlmssp() lookup. DEBT-040 opens to track the three remaining protocol framers that will consume this parser: - SMB: hand-rolled SMB2 + Session Setup framer (~200 LoC) replacing Impacket's opaque SimpleSMBServer - RDP basic auth: TPKT/X.224/MCS framer for legacy plaintext path (~150 LoC) - RDP NLA: TLS upgrade + CredSSP TSRequest parser, reuses parse_type3 via the SPNEGO inner blob (~250 LoC) These are substantial protocol implementations each — landing them inline with Phase 1-3+6's cred coverage rollout would have inflated the session beyond reasonable scope. Cred-reuse analytics already work across the 12 services covered in this session; the deferred three just round out the fleet.	2026-04-25 07:19:30 -04:00
anti	9777aa7677	feat(creds): Phase 6 — MongoDB SCRAM credential capture Plugs the cred-coverage gap for MongoDB. The template previously parsed only the wire opcode + length and discarded the BSON body entirely, so SCRAM-SHA-{1,256} client-proofs flowed straight through without ever landing in the Credential table. Adds an inline minimal BSON walker (~100 LoC) covering the 7 type codes auth commands actually use: string, doc, array, binary, bool, int32, int64. Hand-rolled rather than pulling pymongo as a runtime dep — the parser is bounds-checked for untrusted-input safety (won't loop on malformed length fields). Wire flow MongoDB clients use for auth: - OP_MSG body section (kind=0) → BSON doc with `saslStart` field carrying mechanism + payload (SCRAM client-first-message: "n,,n=<user>,r=<nonce>"). Username extracted, pinned to the per-connection _sasl_username + _sasl_mechanism state. - Subsequent OP_MSG with `saslContinue` → SCRAM client-final-message ("c=biws,r=<combined>,p=<base64 client-proof>"). The `p=` value is the credential — emitted as secret_kind=scram_sha256 (or _sha1 / _unknown depending on the prior saslStart's mechanism), principal = the pinned username, secret_b64 = base64 of the decoded proof. Reuse semantics: same client-proof across two auth attempts only matches when both server salt and password were identical (proofs include the salt). So cross-session reuse correlates only on credential reuse against the same MongoDB account on the same decky — honest, non-misleading signal. 680 tests pass across services, service_testing, db, web/ingester, and core/fingerprinting (the broader scope my recent commits touched). Phases 4, 5, 7 still pending (RDP basic-auth, SMB NTLMSSP, RDP NLA).	2026-04-25 07:15:44 -04:00
anti	e4bf8fa012	feat(creds): Phase 3 — HTTP/HTTPS POST form body cred extraction Login forms (wp-login.php, phpMyAdmin, Joomla, etc.) ship a `Content-Type: application/x-www-form-urlencoded` body with field names like username/user/email/log/pwd/password. The HTTP/HTTPS templates already captured the body as opaque bytes; now they parse common login-form shapes into the universal credential SD shape. Adds canonical templates/syslog_bridge.py: extract_form_credentials(body, content_type) -> dict | None. Field-name matching is case-insensitive and covers: Principal: username, user, email, login, userid, account, log, user_login (WordPress), uname / pma_username (phpMyAdmin) Secret: password, pass, pwd, passwd, passwort, mot_de_passe, user_password (WordPress), pma_password (phpMyAdmin) The HTTP/HTTPS log_request handlers now call: cred = classify_authorization(...) or extract_form_credentials(...) — Authorization wins when present (current session credential beats a follow-up form change), but POSTs to /wp-login.php with no Auth header still surface their cleartext creds. Secret-without-principal is intentional: a reset-confirm or auto- fill abuse may carry a password without any field that maps to our principal list. The cred row writes with principal=None — the sha256 still correlates across services for reuse analytics. The body capture cap bumped from 512 → 4096 chars so reasonable form bodies aren't truncated before the cred extractor sees them; the body stored in fields.body stays at 512 chars (display-friendly). 36 helper + emitter tests pass. Phases 4-7 still pending.	2026-04-25 07:10:05 -04:00
anti	0c1316f74c	feat(creds): Phase 2 — MySQL handshake hash + MSSQL Login7 plaintext Closes the cred-coverage gap for two database services that had been capturing only the username: - MySQL — extends _handle_packet to read the auth-response after the null-terminated username. mysql_native_password puts a 1-byte length followed by 20 bytes: SHA1(password) XOR SHA1(salt + SHA1(SHA1(password))). Plaintext irrecoverable, lands as secret_kind="mysql_native_password" with the 20 hash bytes in secret_b64. Hash is canonical for "hashcat -m 11200" if an operator ever wants to crack offline. - MSSQL — fixes a pre-existing bug AND adds password capture. The prior _parse_login7_username read offsets 36/38, which is actually ibHostName/cchHostName in the Login7 layout — username sat at 40/42 and was never touched. Replaced with _parse_login7_creds() reading the correct offsets (40 username, 44 password). Login7 password is obfuscated with a nibble swap followed by XOR against 0xa5 (per MS-TDS); _deobfuscate_login7_password reverses it by XORing and then swapping nibbles. Plaintext-recoverable, lands as secret_kind="plaintext". The pre-existing test_login7_auth_logged_and_closes only verified the error response ships and the connection closes; it didn't validate the parsed username, so the hostname-as-username bug was silent. New tests cover both the deobfuscation algorithm directly and the full ingester round-trip for both services. Sync: copies the canonical syslog_bridge.py into mysql/ and mssql/ template build contexts so service_testing tests load the version with classify_authorization + encode_secret available. 37 tests pass in the touched scope. Phases 3-7 still pending.	2026-04-25 07:07:33 -04:00
anti	3404e3b3a6	feat(creds): Phase 1 — Authorization header + SNMP community capture Closes the cred-coverage gap for 7 services that already had the data on the wire but never landed it in the Credential table: - SNMP — community string lands as secret_kind="snmp_community", principal=None (v1/v2c has no per-user identity, the community IS the auth). - SIP — Digest response hash, previously buried in the auth= header dump, now classify_authorization()-extracted. - HTTP / HTTPS — Authorization header was in the headers JSON but never extracted. Now Basic decodes to plaintext, Bearer → http_bearer (principal=None), Digest → http_digest_md5. - K8s — already extracted Authorization but didn't normalize. Service- account JWTs flow through as Bearer. - Docker API — headers absent entirely. Adds the headers JSON dump and runs Authorization through the classifier. - Elasticsearch — five distinct request handlers; each gains a per-handler _cred_fields() helper. Adds canonical templates/syslog_bridge.py:classify_authorization(). Recognised: Basic / Bearer / Token / Digest. Unknown schemes (NTLM, AWS4-HMAC, Negotiate) return None; the header still rides in the ambient SD-block but isn't normalized as a credential. The SD shape on the wire collapses sip_digest_md5 into http_digest_md5 — same algorithm, so cross-protocol reuse correlates correctly when (rare) nonce collisions allow. Drive-by repair of tests/core/test_fingerprinting.py: - The pre-existing `test_http_useragent_extracted` asserted both that add_bounty was called exactly once AND that the UA payload carried `path` and `method` fields. Both wrong since this session opened: the http_quirks fingerprint added later fires too, and the UA payload never actually included path/method despite the assertion. - Adds `path`/`method` to the UA fingerprint payload (real operator value: "Nikto hit /admin" beats "Nikto seen on this decky"). 
- Replaces `assert_awaited_once` with a `_find_ua_bounty()` helper that filters add_bounty calls by `fingerprint_type`. New fingerprint families landing later won't retroactively break old tests. - Updates the two credential-bearing tests to use the post-DEBT-039 native shape (`secret_b64` / `principal`) and `upsert_credential`, not the deleted legacy `username+password` adapter. Also rebuilds the per-service fake `syslog_bridge` modules in tests/service_testing/{conftest,test_imap,test_pop3,test_snmp,test_mqtt,test_smtp}.py to expose `encode_secret` + `classify_authorization`. Service templates that import either now no longer fail at test collection. 173 tests pass in the touched scope. Phases 2-7 still pending.	2026-04-25 07:04:10 -04:00
anti	6b16c844b6	fix(creds): MQTT regression + secret_kind for hash credentials Honest correction to the "every cred-emitting service" claim. Audit of templates/* found three gaps: 1. MQTT — was working through the legacy adapter, silently dropped when Phase 3 (`e696c2b`) deleted it. Now migrated to encode_secret() alongside the others. 2. Postgres — `auth, pw_hash=…` event captures the MD5 challenge-response the attacker sent. Plaintext irrecoverable, so it never fit the (principal, secret_b64=raw_bytes) shape. Lands in Credential as secret_kind="postgres_md5_challenge". 3. VNC — `auth_response, response=…hex` event captures the 16-byte DES-encrypted challenge. Same situation as Postgres: plaintext irrecoverable. Lands as secret_kind="vnc_des_response". Adds a `secret_kind` discriminator column to Credential (default "plaintext", indexed). The dedup tuple gains secret_kind so two credentials with the same sha256 but different kinds are fundamentally different rows — different challenges produce different bytes for the same plaintext password, so cross-kind reuse matches are meaningless and would only confuse analytics. The model now genuinely covers every cred-emitting service in the fleet: plaintext SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP, MQTT postgres_md5_* Postgres vnc_des_response VNC Username-only services (MySQL/MSSQL — TDS pre-encryption captures the user but never sees the password byte) intentionally don't feed Credential — they're recon signals, not cred attempts. 40 tests pass in the touched scope. New cases: secret_kind dedups independently in the repo; Postgres MD5 + VNC DES emitters thread through; MQTT round-trips through the native branch.	2026-04-25 06:16:57 -04:00
anti	e696c2beb3	refactor(ingester): drop legacy cred adapter — DEBT-039 closed Phase 3/3 of DEBT-039. Now that all eight cred-emitting services (SSH, Telnet, FTP, POP3, IMAP, SMTP, Redis, LDAP) emit the universal `secret_b64`-bearing SD shape, the ingester's legacy fork has no live emitters to handle. Deletes: - `_ingest_credential_legacy()` — synthesized native fields from username+password - The `elif _fields.get("username") and _fields.get("password")` branch in `_extract_bounty` - `_printable_filter()` — only the legacy adapter called it; the native branch trusts the emitter (encode_secret() in Python or sd_escape() in C) to have already sanitized - The legacy-adapter test cases in tests/web/test_ingester.py; their coverage moved to tests/services/test_cred_emitters.py per-service in Phase 2 The cred path is now single-shape end-to-end. A pre-migration log row carrying only username+password silently produces no Credential write — by design, since no current emitter writes that shape and keeping a code path alive for theoretical legacy data risks masking emitter regressions. Pre-v1: any historical Bounty cred rows from before commit `2f47f67` stay untouched. DEBT-039 marked resolved with summary of the three commits and the silent-loss bug fix for Redis + LDAP that fell out of execution.	2026-04-25 06:04:09 -04:00
anti	abb4dd9fc0	feat(templates): migrate six cred emitters to native shape Phase 2/3 of DEBT-039. Switches FTP, POP3, IMAP, SMTP, Redis, and LDAP from the legacy `username=` + `password=` SD-block shape to the universal credential shape (`principal=` + `secret_printable=` + `secret_b64=`) the new Credential storage model expects. Pattern is uniform across all six services: _log("auth_attempt", username=u, principal=u, **encode_secret(pw)) Each service emits the canonical SD keys. The ingester's native-shape branch (introduced in `2f47f67`) now writes their cred attempts directly without going through the legacy adapter. Once Phase 3 removes the adapter the contract becomes single-shape. Per-service notes: - POP3 / IMAP — `status="success"|"failed"` renamed to `outcome="success"|"failure"` to match Credential.outcome's vocabulary; the ingester reads outcome directly. - SMTP — AUTH path migrated; in addition the existing mail_from event now exposes a parsed `domain=` field alongside the original `value=` so future "what domains do attackers spoof from" analytics have an indexed field. Not stored in Credential — regular Log row. - Redis — was silently dropped by the legacy adapter (no `username` field). Native branch handles `principal=None` correctly. BONUS FIX: the Redis 6+ ACL syntax `AUTH <user> <pw>` now captures the ACL username as principal (was previously discarded). - LDAP — was silently dropped by the legacy adapter (no `password` recognition for the `bind` event). Now lands as `principal=<dn>`. BONUS FIX. Tests (tests/services/test_cred_emitters.py, 9 cases): - per-service native-shape ingest path produces correct Credential rows; outcome maps for POP3/IMAP; principal=None for legacy Redis AUTH; principal=dn for LDAP. - mail_from event does NOT trigger a credential write (it's a Log-only observation, not auth). - 0xff/NUL/ANSI bytes in passwords survive losslessly through secret_b64 even when secret_printable is sanitized. 
Phase 3 deletes the legacy adapter once all migrations land — the adapter has no live emitters to handle anymore.	2026-04-25 05:43:51 -04:00
anti	aebb9f81c6	feat(templates): encode_secret() helper in canonical syslog_bridge Phase 1/3 of DEBT-039. Adds the Python emitter-side counterpart to auth-helper.c's sd_escape + base64 logic so service templates can emit the universal credential SD shape with a single spread: _log("auth_attempt", principal=user, **encode_secret(password)) secret_printable mirrors the C helper's [0x20, 0x7f) → '?' contract; secret_b64 preserves the ORIGINAL utf-8 bytes losslessly so non-ASCII or control-byte payloads survive as fingerprinting signal even when the printable form sanitizes them. The canonical syslog_bridge.py is what _sync_logging_helper() propagates into per-template build contexts at deploy time, so any service that imports its local syslog_bridge picks this up automatically on next rebuild. Phase 2 migrates the six cred-emitting service templates (FTP, POP3, IMAP, SMTP, Redis, LDAP) onto this helper. Phase 3 deletes the ingester's legacy adapter once nothing emits the old shape.	2026-04-25 05:37:44 -04:00
anti	2f47f67eef	feat(creds): future-proof Credential storage model Replaces the opaque Bounty.bounty_type='credential' path with a dedicated `credentials` table whose schema is forward-compatible across every auth-bearing service in the fleet. Hoisted indexed columns (secret_sha256, principal, service, attacker_ip) carry the universal reuse-analytics signal; service-specific JSON keys ride in `fields`. Cross-service reuse queries become an indexed lookup on secret_sha256 instead of JSON_EXTRACT scans. Schema decisions baked in (per ANTI): - New `Credential` table, not extension to Bounty - Hoisted `principal` column for cross-service principal-reuse - Standardized JSON keys: every payload carries secret_b64 + secret_printable + principal universally; service-specific extras (user, domain, dn, mech, …) ride alongside The auth-helper SD-block emits the new shape natively. The ingester forks at _extract_bounty: - Native shape (SSH/Telnet, future emitters): secret_b64 present → direct upsert_credential - Legacy shape (FTP/POP3/IMAP/SMTP today): username + password → adapter synthesizes secret_{b64,sha256,printable} on the fly, upserts into the same Credential table. Tracked as DEBT-039; one-shot bridge until those service templates migrate. 
Defense-in-depth across five layers (input validation): - C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424 escape rules for \, ", ]; b64 preserves exact bytes - Ingester native branch: rejects malformed secret_b64 (regex), drops the credential row but keeps the underlying Log - Ingester legacy adapter: same printable-ASCII filter as the C code; sha256 + b64 over the original utf-8 bytes (lossless, even when secret_printable is sanitized) - DB column caps with truncation warning; sha256 always over the full pre-truncation bytes so reuse queries match across truncation - JSON serialized with ensure_ascii=True so utf8mb4 columns stay safe even with non-ASCII service-specific keys Bounty.bounty_type='credential' is no longer written. Pre-v1: no historical backfill; existing rows stay untouched but unused. 595 tests pass; new tests cover the model + repo (upsert dedup, null-principal independence, cross-service reuse, filters), both ingester branches, b64 validation, sanitization preserving the fingerprinting signal in b64.	2026-04-25 05:29:26 -04:00
anti	f1026b4427	feat(telnet): same PAM cred-capture, /etc/pam.d/login Promotes auth-helper.c to decnet/templates/_shared/auth-helper/ and adds _sync_auth_helper_sources() — mirrors the existing sessrec sync pattern that keeps shared sources in step with per-template build contexts. Telnet's image grows the same multi-stage musl build, COPY of the static helper into /usr/sbin/auth-helper, and prepended pam_exec line in /etc/pam.d/login. Pulls in the `login` package (real Debian PAM-aware /bin/login, replacing busybox's PAM-less applet) and libpam-modules transitively for pam_exec.so. Verified inside the rebuilt telnet image: - /bin/login is the real 53KB Debian binary (PAM-aware) - /etc/pam.d/login top line is the auth-helper hook - pam_exec.so present at /usr/lib/x86_64-linux-gnu/security/pam_exec.so - helper smoke-run emits correct RFC 5424 line for `telnetpw` → password_b64="dGVsbmV0cHc=" SSH Dockerfile updated to read auth-helper.c from auth-helper/ subdirectory so both templates use the synced layout. The canonical source lives in _shared/; per-template copies are tracked in git AND synced at deploy time so a drift on either side rebases on the next deploy. Closes the telnet half of DEBT-038's #5 follow-up.	2026-04-25 04:52:35 -04:00
anti	d064125f61	feat(ssh): capture password attempts via pam_exec auth-helper Real OpenSSH doesn't log attempted passwords — only success/failure with username — leaving SSH the sole auth-bearing service in the fleet that contributes nothing to the cred corpus FTP/MySQL/RDP/ VNC/etc. populate. Closes that gap with a tiny pam_exec shim. A static C helper (~80 LoC, musl, ~38KB stripped) is wired into /etc/pam.d/sshd as `auth optional pam_exec.so expose_authtok stdout /usr/sbin/auth-helper`. pam_exec writes the attempted password to the helper's stdin NUL-terminated; the helper formats an RFC 5424 line in the exact shape templates/syslog_bridge.py produces (facility local0, PEN 55555, MSGID auth_attempt — same MSGID FTP uses) and writes it to /proc/1/fd/1 so the existing collector stdout-reader pipeline picks it up. Two password fields ride in the SD-block: - password= RFC 5424 escaped, ASCII-printable only, ? for non- printables. FTP-compatible — existing dashboard rendering picks up SSH attempts unchanged. - password_b64= base64 of the exact PAM_AUTHTOK bytes. Preserves NUL/0xff/control-byte fingerprinting signal that the plain field necessarily drops. Fail-open by design: the PAM line is `optional` so a malfunctioning helper never blocks sshd auth. Better to miss a cred than break the honeypot. Verified end-to-end inside the rebuilt image: - 38KB static ELF, runs without a dynamic linker - correct RFC 5424 line for `hunter2` → b64 `aHVudGVyMg==` - NUL truncation matches pam_exec's contract - 0xff bytes survive losslessly through password_b64 - empty password produces a well-formed line (e.g. pubkey auth path)	2026-04-25 04:42:50 -04:00
anti	bcf460d2a5	feat(profiler): write ASN + AS name onto attacker rows Adds asn (int), as_name (varchar 128), asn_source (varchar 16) to the Attacker SQLModel — direct columns, no _migrate_* helper per feedback_no_new_migrations_prev1. Profiler worker now calls decnet.asn.enrich_ip alongside the existing geoip enrich_ip; both feed the upsert payload. Failure is total — if either lookup throws or the IP is private/unannounced, the field stays None and the row still writes. Both lookups are independent: a CGNAT address can have a country (RIR allocation) but no ASN (no BGP origin), and vice-versa for unrouted RIR-allocated space. Storing them separately preserves that signal.	2026-04-25 04:01:28 -04:00
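
Note on 0c1316f74c (MSSQL Login7): the password decoding that commit describes can be sketched as below. This is an illustration, not the repository's `_deobfuscate_login7_password`; both function names here are hypothetical, and the sketch assumes the password field holds UTF-16-LE bytes as MS-TDS specifies. Decoding XORs each byte with 0xA5 and then swaps its nibbles; the encoder is included only so the round trip can be checked.

```python
def obfuscate_login7_password(password: str) -> bytes:
    """Client-side encoding (hypothetical helper, for round-trip checks only)."""
    out = bytearray()
    for b in password.encode("utf-16-le"):
        swapped = ((b << 4) & 0xF0) | (b >> 4)  # swap high and low nibbles
        out.append(swapped ^ 0xA5)              # then XOR with 0xA5
    return bytes(out)


def deobfuscate_login7_password(blob: bytes) -> str:
    """Reverse the encoding: XOR with 0xA5 first, then swap nibbles."""
    out = bytearray()
    for b in blob:
        x = b ^ 0xA5
        out.append(((x << 4) & 0xF0) | (x >> 4))
    # errors="replace" keeps malformed attacker input from raising
    return out.decode("utf-16-le", errors="replace")


if __name__ == "__main__":
    wire = obfuscate_login7_password("sa")
    print(wire.hex(), "->", deobfuscate_login7_password(wire))
```

Because both steps are self-inverse, applying them in the opposite order undoes the encoding; no key material is involved, which is why the commit can record the result as plaintext.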


441 Commits
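
Note on aebb9f81c6 (`encode_secret()`): a minimal sketch of the contract that commit describes, assuming the helper returns a dict with `secret_printable` and `secret_b64` keys that can be spread into `_log(...)`. The body is an assumption, not the repository's code. It does not apply the RFC 5424 escaping of `\`, `"` and `]` that the C helper performs, and accepting raw `bytes` as well as `str` is a convenience added here for illustration.

```python
import base64


def encode_secret(secret):
    """Return the two universal secret fields for an auth_attempt event (sketch)."""
    # Keep the original bytes; str input is encoded as UTF-8.
    raw = secret.encode("utf-8") if isinstance(secret, str) else bytes(secret)
    # Bytes outside [0x20, 0x7f) collapse to '?' in the display form.
    printable = "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in raw)
    return {
        "secret_printable": printable,
        # Lossless copy: control and non-ASCII bytes survive here.
        "secret_b64": base64.b64encode(raw).decode("ascii"),
    }


if __name__ == "__main__":
    print(encode_secret("hunter2"))
    print(encode_secret(b"a\x00\xffb"))
```

The two fields serve different readers: `secret_printable` is safe to render in a dashboard, while `secret_b64` is what hashing and reuse correlation should be computed from.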