Lays the storage and bus substrate for the "credential reuse patterns"
task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future
substrate for statistical attacker re-identification over behavioral
fingerprints. No correlator, profiler, API, or dashboard wiring in
this commit — see TODO.md for the handoff.
Schema:
- Credential.attacker_uuid (nullable FK to attackers.uuid),
backfilled by the profiler post-write to avoid coupling the
capture path to the profiler's ordering.
- CredentialReuse table — UUID PK, JSON list columns for the
accumulating attacker_uuids/ips/deckies/services, target_count
(the discriminative scalar), confidence reserved for a future
fuzzy-credential pass.
Repo:
- upsert_credential_reuse / list_credential_reuses /
get_credential_reuse_by_id / update_credential_attacker_uuid.
- Renamed pre-existing get_credential_reuse(secret_sha256) to
get_credential_attempts_for_secret(secret_sha256) — the new
findings table needs the cleaner name.
Bus topics:
- credential.captured (one per Credential upsert)
- credential.reuse.detected (correlator-emitted on insert/grow)
Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring
decnet/bus/):
- BaseVectorStore ABC keyed by (kind, id) — kind discriminator
means new feature families are additive, no schema migration.
- FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for
DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy
sqlite_vec extension load, one vec0 virtual table per kind).
- get_vectorstore() env-driven dispatch with graceful fallback
to FakeVectorStore when the sqlite-vec extension isn't on the
host, so workers don't crash on a missing optional dep.
Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing
credentials and base-repo tests updated for the rename. Total: 34
passing on the touched files.
TODO — credential reuse + vectorstore (handoff)
This document hands off in-progress work on the credential reuse
patterns task from development/DEVELOPMENT.md (under Service-Level
Behavioral Profiling) plus the decnet/vectorstore/ scaffolding
that prepares the substrate for a future statistical re-identification
engine over behavioral fingerprints. See
/home/anti/.claude/plans/ah-excellent-alright-claude-vivid-thimble.md
for the full approved plan and motivation.
Done in the previous session
Foundation is shipped + tested (26 new tests passing, no regressions):
- Schema: decnet/web/db/models/logs.py
  - Credential.attacker_uuid: Optional[str], nullable FK to attackers.uuid. Backfilled by the profiler post-write.
  - CredentialReuse table (UUID PK; JSON list columns for attacker_uuids, attacker_ips, deckies, services; target_count, attempt_count; confidence reserved for future fuzzy matching). Unique key: (secret_sha256, secret_kind, principal_key).
  - CredentialReuseResponse Pydantic DTO.
- Repo: decnet/web/db/sqlmodel_repo.py + repository.py
  - upsert_credential_reuse(...), list_credential_reuses(limit, offset, min_target_count, secret_kind), get_credential_reuse_by_id(id), update_credential_attacker_uuid(attacker_ip, attacker_uuid) -> int.
  - Rename: pre-existing get_credential_reuse(secret_sha256) → get_credential_attempts_for_secret(secret_sha256). All callers updated.
- Bus topics: decnet/bus/topics.py
  - CREDENTIAL_CAPTURED = "captured" (one per Credential upsert).
  - CREDENTIAL_REUSE_DETECTED = "reuse.detected" (correlator emits on insert/grow).
  - credential(event_type) builder.
- Vectorstore: decnet/vectorstore/ (NEW; flat layout mirroring decnet/bus/)
  - base.py: BaseVectorStore ABC, VectorRecord, Neighbor, VECTORSTORE_SCHEMA_VERSION. Methods: initialize, close, health, insert, get, delete, knn. Keyed by (kind, id).
  - fake.py: FakeVectorStore (in-memory, brute-force L2 KNN) + NullVectorStore (no-op when DECNET_VECTORSTORE_ENABLED=false).
  - sqlite_vec.py: SqliteVecVectorStore; lazy-loads the sqlite_vec extension; one vec0 virtual table per kind so new feature families don't require schema migration. Per-kind dim is locked on first insert.
  - factory.py: get_vectorstore() env-driven dispatch (DECNET_VECTORSTORE_TYPE ∈ {sqlite_vec, fake}; DECNET_VECTORSTORE_ENABLED; DECNET_VECTORSTORE_PATH). On missing sqlite_vec extension: logs a warning and returns FakeVectorStore so workers don't crash.
- Tests
  - tests/db/test_credential_reuse.py: 11 tests (upsert idempotency, list filters/pagination, FK backfill semantics, null-principal uniqueness, JSON-list merging).
  - tests/vectorstore/test_factory.py (6) + tests/vectorstore/test_fake.py (9): factory dispatch + fallback, round-trip, dim-mismatch raises, KNN ordering, NullStore no-op.
  - Updated tests/db/test_base_repo.py and tests/db/test_credentials.py for the rename.
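
For orientation, the (kind, id) keying, the per-kind dimension lock, and brute-force L2 KNN described above can be sketched in a few lines. This is an illustrative stand-in only: TinyVectorStore and its method signatures are hypothetical and are not the real FakeVectorStore API in decnet/vectorstore/fake.py.

```python
import math
from typing import Dict, List, Tuple


class TinyVectorStore:
    """Hypothetical in-memory store; NOT the real FakeVectorStore API."""

    def __init__(self) -> None:
        # kind -> {id -> vector}
        self._rows: Dict[str, Dict[str, List[float]]] = {}
        # kind -> dimension, fixed by the first insert for that kind
        self._dims: Dict[str, int] = {}

    def insert(self, kind: str, id: str, vector: List[float]) -> None:
        # The first insert for a kind locks its dimension; later ones must match.
        dim = self._dims.setdefault(kind, len(vector))
        if len(vector) != dim:
            raise ValueError(f"kind {kind!r} expects dim {dim}, got {len(vector)}")
        self._rows.setdefault(kind, {})[id] = list(vector)

    def knn(self, kind: str, query: List[float], k: int) -> List[Tuple[str, float]]:
        if kind in self._dims and len(query) != self._dims[kind]:
            raise ValueError(f"kind {kind!r} expects dim {self._dims[kind]}")
        rows = self._rows.get(kind, {})
        # Brute force: Euclidean (L2) distance to every stored vector of this kind.
        scored = [(rid, math.dist(query, vec)) for rid, vec in rows.items()]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]


store = TinyVectorStore()
store.insert("ssh_timing", "a", [0.0, 0.0])
store.insert("ssh_timing", "b", [3.0, 4.0])
# A new kind is purely additive and carries its own dimension.
store.insert("cmd_ngram", "a", [1.0, 0.0, 0.0])
nearest = store.knn("ssh_timing", [0.1, 0.0], k=1)
```

The same id can appear under several kinds without conflict, which is what makes a new feature family additive rather than a migration.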
Not yet done — what the next agent should pick up
Tasks below are roughly in dependency order. Backend first, dashboard last (it's the largest unknown and benefits from a fresh context).
1. Profiler backfill of Credential.attacker_uuid
Smallest task; do this first to validate the FK column end-to-end.
- File: decnet/profiler/. Find the spot where the profiler mints/updates an Attacker row from observed events. There's likely an upsert_attacker(...) call that produces the (ip, uuid) pair.
- Add immediately after a successful upsert: await repo.update_credential_attacker_uuid(ip, uuid)
- Test in tests/profiler/ (whatever the existing test file is) that after the profiler processes events for an IP, all Credential rows for that IP have their attacker_uuid populated. Use the pattern from tests/db/test_credential_reuse.py::test_update_credential_attacker_uuid_backfills_only_nulls.
2. Correlator engine + worker wiring
- File: decnet/correlation/engine.py. Add correlate_credential_reuse(min_targets: int = 2) to CorrelationEngine. Grouping query suggested in the plan:

      SELECT secret_sha256, secret_kind, principal,
             COUNT(DISTINCT decky_name||':'||service) AS target_count
      FROM credentials
      GROUP BY secret_sha256, secret_kind, principal
      HAVING target_count >= :min_targets

  For each group, fetch the underlying credential rows and call repo.upsert_credential_reuse(...) per row. The repo upsert recomputes target_count from the credentials table on each update, so you don't need to pass aggregates in.
- On insert/grow (out["inserted"] is True or out["changed"] is True), publish bus.publish(topics.credential(topics.CREDENTIAL_REUSE_DETECTED), {...}) with payload {id, secret_kind, target_count, attacker_uuids, attacker_ips, deckies, services}.
- Worker file: decnet/correlation/main.py (or wherever CorrelationEngine is loop-driven). Subscribe to:
  - attacker.observed: re-runs reuse pass for that IP.
  - credential.captured: re-runs reuse pass for that secret.
  - Heartbeat tick every 60s as a fallback (mirror the mutator's bus-wake + slow-tick pattern).
- Where is credential.captured emitted? Find the credential ingest path, probably decnet/collector/ or wherever repo.upsert_credential(...) is called. Add a bus.publish(topics.credential(topics.CREDENTIAL_CAPTURED), {secret_sha256, secret_kind, attacker_ip, decky, service}) after a successful upsert. Bus is fire-and-forget; don't block on it.
- Tests:
  - tests/correlation/test_credential_reuse.py: engine emits the right CredentialReuse rows from synthetic credentials; bus event published exactly once per insert/grow.
  - Use decnet.bus.fake.FakeBus in tests; collect published events for assertion.
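
As a rough sketch of the "exactly once per insert/grow" contract, the grouping query from the plan can be run against an in-memory SQLite table with a recording bus. Everything except the SQL is a stand-in: RecordingBus, credential_topic, the findings dict (which plays the role of the CredentialReuse table), and the trimmed payload are hypothetical and do not reflect the real engine, repo, or FakeBus interfaces.

```python
import sqlite3
from typing import Dict, List, Tuple

REUSE_SQL = """
SELECT secret_sha256, secret_kind, principal,
       COUNT(DISTINCT decky_name || ':' || service) AS target_count
FROM credentials
GROUP BY secret_sha256, secret_kind, principal
HAVING target_count >= :min_targets
"""


def credential_topic(event_type: str) -> str:
    # Assumed behaviour of the topics.credential(...) builder.
    return f"credential.{event_type}"


class RecordingBus:
    """Hypothetical fake bus that only collects what was published."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, dict]] = []

    def publish(self, topic: str, payload: dict) -> None:
        self.events.append((topic, payload))


def correlate_credential_reuse(
    conn: sqlite3.Connection,
    findings: Dict[Tuple[str, str, str], int],
    bus: RecordingBus,
    min_targets: int = 2,
) -> None:
    for sha, kind, principal, target_count in conn.execute(
        REUSE_SQL, {"min_targets": min_targets}
    ):
        key = (sha, kind, principal)
        previous = findings.get(key)
        inserted = previous is None
        changed = previous is not None and previous != target_count
        findings[key] = target_count
        # Publish only when a finding is new or its target_count moved.
        if inserted or changed:
            bus.publish(
                credential_topic("reuse.detected"),
                {"secret_kind": kind, "target_count": target_count},
            )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE credentials (secret_sha256 TEXT, secret_kind TEXT, "
    "principal TEXT, decky_name TEXT, service TEXT)"
)
conn.executemany(
    "INSERT INTO credentials VALUES (?, ?, ?, ?, ?)",
    [
        ("abc", "plaintext", "root", "decky-1", "ssh"),
        ("abc", "plaintext", "root", "decky-2", "ssh"),
        ("abc", "plaintext", "root", "decky-2", "ssh"),   # repeat on the same target
        ("zzz", "plaintext", "admin", "decky-1", "ftp"),  # single target, below threshold
    ],
)

bus = RecordingBus()
findings: Dict[Tuple[str, str, str], int] = {}
correlate_credential_reuse(conn, findings, bus)  # first pass: finding inserted
correlate_credential_reuse(conn, findings, bus)  # nothing changed: no event
conn.execute(
    "INSERT INTO credentials VALUES ('abc', 'plaintext', 'root', 'decky-3', 'telnet')"
)
correlate_credential_reuse(conn, findings, bus)  # third target: finding grew
```

Referencing the target_count alias in HAVING works in SQLite; other backends may need the aggregate repeated there.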
3. API routes — GET /api/v1/credential-reuse
- File: probably decnet/web/api/routes/. See how existing credentials routes are organized (recent commit feat(api): GET /credentials endpoint → 4566146).
- Endpoints:
  - GET /api/v1/credential-reuse?limit=50&offset=0&min_target_count=2&secret_kind=plaintext → CredentialReuseResponse (already in models).
  - GET /api/v1/credential-reuse/{id} → single row dict, 404 if missing.
- JWT-gated like all other routes. Use the existing dependency.
- No POST/PUT/PATCH: read-only this release. Per the feedback_schemathesis_400 memory there's no 400 contract to document since there's no body parsing.
- Tests: tests/api/test_credential_reuse_routes.py: JWT gate, pagination, filters, 404 for missing id.
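
The filter and pagination behaviour the route tests should pin down can be expressed framework-free. This is a sketch under assumptions: the functions below operate on plain dicts rather than the real repo, the default values are taken from the example URL above and may not be the actual defaults, and "ssh_key" is a made-up secret_kind used only to exercise the filter.

```python
from typing import Dict, List, Optional


def list_credential_reuses(
    rows: List[Dict],
    limit: int = 50,
    offset: int = 0,
    min_target_count: int = 2,
    secret_kind: Optional[str] = None,
) -> List[Dict]:
    """Hypothetical stand-in for the list query: filter first, then page."""
    matches = [
        r
        for r in rows
        if r["target_count"] >= min_target_count
        and (secret_kind is None or r["secret_kind"] == secret_kind)
    ]
    return matches[offset : offset + limit]


def get_credential_reuse_by_id(rows: List[Dict], reuse_id: str) -> Optional[Dict]:
    # None here is what the route layer would translate into a 404.
    return next((r for r in rows if r["id"] == reuse_id), None)


rows = [
    {"id": "r1", "secret_kind": "plaintext", "target_count": 5},
    {"id": "r2", "secret_kind": "plaintext", "target_count": 1},
    {"id": "r3", "secret_kind": "ssh_key", "target_count": 3},
    {"id": "r4", "secret_kind": "plaintext", "target_count": 2},
]

second_page = list_credential_reuses(rows, limit=1, offset=1, secret_kind="plaintext")
missing = get_credential_reuse_by_id(rows, "does-not-exist")
```

Applying the filters before slicing keeps offsets stable with respect to the filtered result set, which is the behaviour a pagination test would normally expect.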
4. Dashboard — Credentials Reuse tab + drawer
The big unknown. Next agent should:
- Survey decnet/web/dashboard/ (React app): how the existing Credentials view is structured (commit 4ea4b0b feat(web): Credentials view + inspector).
- Add a "Reuse" tab/filter that lists CredentialReuse rows sorted by target_count desc.
- Drawer on row-click showing decky×service breakdown, attacker_uuid list (link to /attackers/:id), timeline. Reuse the existing drawer pattern (see feedback_react_stop_propagation_native_delegation memory: backdrop click closes via target===currentTarget, never stopPropagation).
- On the existing Credentials list, add a "seen on N targets" badge when a credential has a corresponding CredentialReuse row, so the connection is bidirectional.
5. DEVELOPMENT.md
Tick [x] Credential reuse patterns under Service-Level Behavioral
Profiling. Add a one-liner under Attacker Intelligence Collection
noting decnet/vectorstore/ is scaffolded for the future statistical
re-ID engine (no behavioural change yet).
Architectural decisions worth knowing
These came out of the design conversation that produced the plan; the next agent should respect them:
- Classical statistics, not ML, for attacker re-identification. Cosine/Mahalanobis/KS-test over per-kind feature vectors, weighted voting, versioned thresholds. Reproducible, explainable, no model drift. ML is reserved for a future advisory layer behind the factory, never primary.
- Provider factory pattern is mandatory for any new pluggable backend (storage, transport, similarity). Mirror decnet/web/db/ and decnet/bus/; never let workers import concrete backends.
- kind discriminator is the extension point for new feature families. Adding kind="cmd_ngram" later does not require schema changes: the vec_<kind> table is created lazily on first insert.
- Credential.attacker_uuid is nullable on write by design: the credential capture path runs before the profiler mints Attacker, so coupling them would create a chicken-and-egg ordering bug. The profiler backfills.
- CredentialReuse.confidence is always 1.0 today (exact-secret match). The column exists so a future fuzzy-credential pass (hunter2 ≈ hunter22) can write 0.x rows without schema work.
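
To make the "classical statistics, weighted voting, versioned thresholds" decision concrete, here is one possible shape for a re-identification score using cosine similarity only. It is a sketch, not the planned engine: the kind names other than cmd_ngram, the weights, the threshold value, and the version string are all invented for illustration, and Mahalanobis or KS-test components are left out.

```python
import math
from typing import Dict, List

# All values below are hypothetical placeholders, not tuned or approved numbers.
THRESHOLDS_VERSION = "v0-example"
MATCH_THRESHOLD = 0.9
KIND_WEIGHTS: Dict[str, float] = {"ssh_timing": 2.0, "cmd_ngram": 1.0}


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def reid_score(
    known: Dict[str, List[float]], candidate: Dict[str, List[float]]
) -> float:
    """Weighted average of per-kind cosine similarity over the kinds both sides have."""
    shared = [k for k in known if k in candidate and k in KIND_WEIGHTS]
    total_weight = sum(KIND_WEIGHTS[k] for k in shared)
    if total_weight == 0.0:
        return 0.0
    weighted = sum(KIND_WEIGHTS[k] * cosine(known[k], candidate[k]) for k in shared)
    return weighted / total_weight


known = {"ssh_timing": [1.0, 0.0], "cmd_ngram": [1.0, 1.0]}
candidate = {"ssh_timing": [2.0, 0.0], "cmd_ngram": [1.0, 0.0]}

score = reid_score(known, candidate)
verdict = {
    "score": score,
    "match": score >= MATCH_THRESHOLD,
    "thresholds_version": THRESHOLDS_VERSION,
}
```

Recording the thresholds version alongside each verdict is what keeps a decision reproducible after the thresholds are later retuned.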
Verification checklist for the next agent
After finishing each chunk:
- pytest tests/<area> --timeout=30 --timeout-method=thread: must be green before moving on.
- Don't run fuzz/bench/live/stress in the dev loop (memory: feedback_skip_heavy_tests).
- Don't pre-clear with custom bandit/ruff flags (memory: feedback_trust_git_hooks); the pre-commit hook is authoritative.
- Commit per task, not batched (memory: feedback_commit_per_task). Don't add Co-Authored-By to commit messages.
Open questions to surface to ANTI before tackling §4
- Should the dashboard "Reuse" surface live as a tab on the existing Credentials page, or as a sibling page? (The plan said tab, but worth confirming once you've seen the code.)
- Pagination size for the reuse list — match the existing Credentials view default, or use a smaller page since the rows are wider?