Lays the storage and bus substrate for the "credential reuse patterns"
task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future
substrate for statistical attacker re-identification over behavioral
fingerprints. No correlator, profiler, API, or dashboard wiring in
this commit — see TODO.md for the handoff.

Schema:
- Credential.attacker_uuid (nullable FK to attackers.uuid),
  backfilled by the profiler post-write to avoid coupling the
  capture path to the profiler's ordering.
- CredentialReuse table: UUID PK, JSON list columns for the
  accumulating attacker_uuids/ips/deckies/services, target_count
  (the discriminative scalar), and confidence, reserved for a
  future fuzzy-credential pass.

Repo:
- Added upsert_credential_reuse / list_credential_reuses /
  get_credential_reuse_by_id / update_credential_attacker_uuid.
- Renamed pre-existing get_credential_reuse(secret_sha256) to
  get_credential_attempts_for_secret(secret_sha256); the new
  findings table needs the cleaner name.

Bus topics:
- credential.captured (one per Credential upsert)
- credential.reuse.detected (correlator-emitted on insert/grow)
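
As a rough illustration of the insert/grow semantics described above, the sketch below accumulates sightings of one secret into a single finding and reports whether the call inserted a new finding, grew an existing one, or changed nothing. It is not the repository code: the names CredentialReuseSketch and upsert_reuse_sketch, the in-memory dict standing in for the table, keying on secret_sha256, and treating target_count as the number of distinct deckies are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class CredentialReuseSketch:
    """Hypothetical in-memory stand-in for one CredentialReuse row."""
    secret_sha256: str
    id: str = field(default_factory=lambda: str(uuid4()))  # UUID PK
    attacker_uuids: list = field(default_factory=list)
    ips: list = field(default_factory=list)
    deckies: list = field(default_factory=list)
    services: list = field(default_factory=list)
    target_count: int = 0


def _add_unique(values, item):
    """Append item if it is new and not None; report whether the list grew."""
    if item is None or item in values:
        return False
    values.append(item)
    return True


def upsert_reuse_sketch(table, secret_sha256, *, attacker_uuid, ip, decky, service):
    """Insert or grow a finding.

    Returns (row, event) where event is "insert", "grow", or None when the
    sighting added nothing new. A correlator could publish
    credential.reuse.detected only when event is not None.
    """
    row = table.get(secret_sha256)
    event = None
    if row is None:
        row = table[secret_sha256] = CredentialReuseSketch(secret_sha256)
        event = "insert"

    grew = False
    for values, item in (
        (row.attacker_uuids, attacker_uuid),  # may be None until backfilled
        (row.ips, ip),
        (row.deckies, decky),
        (row.services, service),
    ):
        # Call _add_unique first so every list is updated regardless of `grew`.
        grew = _add_unique(values, item) or grew

    # Assumption: target_count counts distinct deckies hit with this secret.
    row.target_count = len(row.deckies)

    if event is None and grew:
        event = "grow"
    return row, event
```

Passing attacker_uuid=None models a credential captured before the profiler has backfilled Credential.attacker_uuid; the finding still accumulates IPs, deckies, and services in the meantime.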

Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring
decnet/bus/):
- BaseVectorStore ABC keyed by (kind, id). The kind discriminator
  makes new feature families additive, with no schema migration.
- FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for
  DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy
  sqlite_vec extension load, one vec0 virtual table per kind).
- get_vectorstore(): env-driven dispatch with graceful fallback
  to FakeVectorStore when the sqlite-vec extension isn't on the
  host, so workers don't crash on a missing optional dep.
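
The shape described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the code in decnet/vectorstore/: the class and function names (BaseStoreSketch, FakeStoreSketch, NullStoreSketch, get_store_sketch), the upsert/query method names, the (id, distance) return shape, and the default of "enabled" when the environment variable is unset are all hypothetical. The SQLite-backed store is elided, so the factory sketch returns the in-memory store on both branches.

```python
import math
import os
from abc import ABC, abstractmethod


class BaseStoreSketch(ABC):
    """Hypothetical interface: vectors are keyed by (kind, id)."""

    @abstractmethod
    def upsert(self, kind, id, vector):
        ...

    @abstractmethod
    def query(self, kind, vector, k=5):
        ...


class FakeStoreSketch(BaseStoreSketch):
    """In-memory brute-force K-nearest-neighbour search using L2 distance."""

    def __init__(self):
        self._rows = {}  # (kind, id) -> list of floats

    def upsert(self, kind, id, vector):
        self._rows[(kind, id)] = [float(x) for x in vector]

    def query(self, kind, vector, k=5):
        # Only vectors of the same kind are comparable; other kinds are skipped.
        scored = [
            (math.dist(vector, stored), rec_id)
            for (rec_kind, rec_id), stored in self._rows.items()
            if rec_kind == kind
        ]
        scored.sort()
        return [(rec_id, distance) for distance, rec_id in scored[:k]]


class NullStoreSketch(BaseStoreSketch):
    """No-op store used when the feature is switched off."""

    def upsert(self, kind, id, vector):
        return None

    def query(self, kind, vector, k=5):
        return []


def get_store_sketch(env=None):
    """Env-driven dispatch with a fallback when the optional dep is missing."""
    env = os.environ if env is None else env
    # Assumption: the store is enabled unless the variable is explicitly "false".
    if env.get("DECNET_VECTORSTORE_ENABLED", "true").lower() == "false":
        return NullStoreSketch()
    try:
        import sqlite_vec  # noqa: F401  (optional dependency)
    except ImportError:
        return FakeStoreSketch()  # graceful fallback, no crash
    # A real factory would build the SQLite-backed store here (elided).
    return FakeStoreSketch()
```

Because kind is part of the key rather than part of the schema, storing a new feature family in this sketch is just another upsert call with a new kind string.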
Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing
credentials and base-repo tests updated for the rename. Total: 34
passing on the touched files.
28 lines
899 B
Python
"""Vector store substrate for behavioral fingerprint similarity search.

Provider-pluggable storage for ``(kind, id, vector)`` triples used by the
future statistical re-identification engine. ``kind`` discriminates
feature families (``ja3``, ``hassh``, ``keystroke``, ``cmd_ngram``, ...)
so new feature types are additive — no schema migration required when
adding a new extractor.

Use :func:`get_vectorstore` from :mod:`decnet.vectorstore.factory`; never
import concrete implementations directly. Mirrors the same factory
discipline as :mod:`decnet.bus` and :mod:`decnet.web.db`.
"""
from decnet.vectorstore.base import (
    BaseVectorStore,
    Neighbor,
    VectorRecord,
    VECTORSTORE_SCHEMA_VERSION,
)
from decnet.vectorstore.factory import get_vectorstore

__all__ = [
    "BaseVectorStore",
    "Neighbor",
    "VectorRecord",
    "VECTORSTORE_SCHEMA_VERSION",
    "get_vectorstore",
]