feat(creds): cred-reuse foundation + vectorstore scaffold

Lays the storage and bus substrate for the "credential reuse patterns"
task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future
substrate for statistical attacker re-identification over behavioral
fingerprints. No correlator, profiler, API, or dashboard wiring in
this commit; see TODO.md for the handoff.

Schema:
- Credential.attacker_uuid (nullable FK to attackers.uuid),
  backfilled by the profiler post-write to avoid coupling the
  capture path to the profiler's ordering.
- CredentialReuse table: UUID PK, JSON list columns for the
  accumulating attacker_uuids/ips/deckies/services, target_count
  (the discriminative scalar), and confidence, reserved for a future
  fuzzy-credential pass.

Repo:
- upsert_credential_reuse / list_credential_reuses /
  get_credential_reuse_by_id / update_credential_attacker_uuid.
- Renamed pre-existing get_credential_reuse(secret_sha256) to
  get_credential_attempts_for_secret(secret_sha256), because the new
  findings table needs the cleaner name.
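
For reviewers, the accumulate-on-upsert behavior described for the new table can be sketched in memory as below. This is an illustration only, not the repo code: the `ReuseFinding` dataclass, keying by `secret_sha256`, the function signature, and the reading of `target_count` as the number of distinct (decky, service) pairs are all assumptions made for the example.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class ReuseFinding:
    """Hypothetical in-memory stand-in for one CredentialReuse row."""

    secret_sha256: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    attacker_uuids: list[str] = field(default_factory=list)
    ips: list[str] = field(default_factory=list)
    deckies: list[str] = field(default_factory=list)
    services: list[str] = field(default_factory=list)
    # Assumption: target_count = number of distinct (decky, service) pairs.
    targets: set[tuple[str, str]] = field(default_factory=set)
    target_count: int = 0


def _add_unique(values: list[str], item: str | None) -> None:
    """Append item to the list column unless it is None or already present."""
    if item is not None and item not in values:
        values.append(item)


def upsert_reuse_sketch(store, secret_sha256, *, ip, decky, service,
                        attacker_uuid=None):
    """Insert or grow a finding; return (finding, grew).

    `store` is a plain dict keyed by secret hash (an assumption). `grew`
    is True when target_count increased, which is the condition a
    correlator could use to decide whether to emit an event.
    """
    finding = store.get(secret_sha256)
    if finding is None:
        finding = store[secret_sha256] = ReuseFinding(secret_sha256)
    before = finding.target_count
    _add_unique(finding.attacker_uuids, attacker_uuid)
    _add_unique(finding.ips, ip)
    _add_unique(finding.deckies, decky)
    _add_unique(finding.services, service)
    finding.targets.add((decky, service))
    finding.target_count = len(finding.targets)
    return finding, finding.target_count > before
```

Repeating the same (decky, service) sighting leaves `target_count` unchanged, so only genuinely new targets count as growth in this sketch.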

Bus topics:
- credential.captured (one per Credential upsert)
- credential.reuse.detected (correlator-emitted on insert/grow)

Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring
decnet/bus/):
- BaseVectorStore ABC keyed by (kind, id). The kind discriminator
  means new feature families are additive, with no schema migration.
- FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for
  DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy
  sqlite_vec extension load, one vec0 virtual table per kind).
- get_vectorstore() env-driven dispatch with graceful fallback
  to FakeVectorStore when the sqlite-vec extension isn't on the
  host, so workers don't crash on a missing optional dep.
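
To make the (kind, id) keying and the in-memory L2 KNN concrete, here is a minimal sketch. It is not the code in decnet/vectorstore/: the class names and the `insert`/`query` signatures are assumptions for illustration, and the real BaseVectorStore interface may differ.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from typing import Sequence


class VectorStoreSketch(ABC):
    """Hypothetical interface: every vector is addressed by (kind, id)."""

    @abstractmethod
    def insert(self, kind: str, id: str, vector: Sequence[float]) -> None:
        ...

    @abstractmethod
    def query(self, kind: str, vector: Sequence[float],
              k: int = 5) -> list[tuple[str, float]]:
        ...


class InMemoryStoreSketch(VectorStoreSketch):
    """Brute-force nearest-neighbor search using Euclidean (L2) distance."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str], tuple[float, ...]] = {}

    def insert(self, kind, id, vector):
        # Re-inserting the same (kind, id) overwrites the stored vector.
        self._rows[(kind, id)] = tuple(float(x) for x in vector)

    def query(self, kind, vector, k=5):
        q = tuple(float(x) for x in vector)
        # Only vectors of the same kind (and dimension) are candidates, so
        # a new feature family is just a new kind string.
        scored = [
            (rid, math.dist(q, v))
            for (rkind, rid), v in self._rows.items()
            if rkind == kind and len(v) == len(q)
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]
```

Because queries filter on `kind` first, vectors from different feature families never compete with each other, which is what makes adding a family additive in this sketch.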

Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing
credentials and base-repo tests updated for the rename. Total: 34
passing on the touched files.
73
decnet/vectorstore/factory.py
Normal file
@@ -0,0 +1,73 @@
"""Vectorstore factory: selects a :class:`BaseVectorStore` implementation.

Dispatch keys:

* ``DECNET_VECTORSTORE_ENABLED``: ``"false"`` short-circuits to
  :class:`~decnet.vectorstore.fake.NullVectorStore`. Default ``"true"``.
* ``DECNET_VECTORSTORE_TYPE``: ``"sqlite_vec"`` (default) or
  ``"fake"``.
* ``DECNET_VECTORSTORE_PATH``: sqlite file path. Defaults to
  ``/var/lib/decnet/vectors.sqlite`` if that directory exists and is
  writable, else ``~/.decnet/vectors.sqlite``.

Mirrors :mod:`decnet.bus.factory` and :mod:`decnet.web.db.factory`:
lazy imports inside each branch, env-driven dispatch, callers MUST go
through :func:`get_vectorstore` rather than instantiating backends.

If ``sqlite_vec`` is requested but the ``sqlite_vec`` Python package
cannot be imported on this host, the factory logs a warning and returns
the in-memory fake backend instead. The caller's code path stays valid
(vectors are kept in memory and are not persisted) without crashing the
worker on a missing optional dependency.
"""
from __future__ import annotations

import logging
import os
from typing import Any

from decnet.vectorstore.base import BaseVectorStore

LOG = logging.getLogger(__name__)


def get_vectorstore(**kwargs: Any) -> BaseVectorStore:
    """Return the backend selected by the environment (see module docstring).

    Only the ``db_path`` keyword is consumed; other keyword arguments are
    currently ignored.
    """
    if os.environ.get("DECNET_VECTORSTORE_ENABLED", "true").lower() == "false":
        from decnet.vectorstore.fake import NullVectorStore
        return NullVectorStore()

    backend = os.environ.get("DECNET_VECTORSTORE_TYPE", "sqlite_vec").lower()

    if backend == "fake":
        from decnet.vectorstore.fake import FakeVectorStore
        return FakeVectorStore()

    if backend == "sqlite_vec":
        # Probe extension availability up front so the factory can fall
        # back cleanly. Construction is cheap, but the extension load
        # only happens in initialize(); without this probe the caller
        # sees the failure too late to substitute a backend.
        try:
            import sqlite_vec  # noqa: F401
        except ImportError as e:
            LOG.warning(
                "sqlite_vec not installed (%s); falling back to FakeVectorStore. "
                "Install the sqlite-vec package or set "
                "DECNET_VECTORSTORE_TYPE=fake to silence this warning.", e,
            )
            from decnet.vectorstore.fake import FakeVectorStore
            return FakeVectorStore()
        from decnet.vectorstore.sqlite_vec import SqliteVecVectorStore
        db_path = kwargs.pop("db_path", None) or _default_db_path()
        return SqliteVecVectorStore(db_path=db_path)

    raise ValueError(f"Unsupported vectorstore type: {backend}")


def _default_db_path() -> str:
    """Resolve the sqlite path: explicit env var, then /var/lib, then home."""
    explicit = os.environ.get("DECNET_VECTORSTORE_PATH")
    if explicit:
        return explicit
    runtime_dir = "/var/lib/decnet"
    if os.path.isdir(runtime_dir) and os.access(runtime_dir, os.W_OK):
        return f"{runtime_dir}/vectors.sqlite"
    return os.path.expanduser("~/.decnet/vectors.sqlite")
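
The path precedence in `_default_db_path()` (explicit env var, then the runtime directory, then the home directory) is easier to exercise when the environment and runtime directory are passed in. The variant below is a hypothetical, test-friendly restatement of that logic, not part of this commit; the name `resolve_db_path` and its parameters are assumptions.

```python
import os
import tempfile


def resolve_db_path(env, runtime_dir="/var/lib/decnet"):
    """Same precedence as _default_db_path(), with injectable inputs."""
    explicit = env.get("DECNET_VECTORSTORE_PATH")
    if explicit:
        # 1. An explicit path always wins.
        return explicit
    if os.path.isdir(runtime_dir) and os.access(runtime_dir, os.W_OK):
        # 2. Use the runtime directory when it exists and is writable.
        return os.path.join(runtime_dir, "vectors.sqlite")
    # 3. Otherwise fall back to a per-user location.
    return os.path.expanduser("~/.decnet/vectors.sqlite")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(resolve_db_path({"DECNET_VECTORSTORE_PATH": "/data/v.sqlite"}, tmp))
        print(resolve_db_path({}, tmp))
        print(resolve_db_path({}, os.path.join(tmp, "does-not-exist")))
```

Neither this sketch nor the function above creates `~/.decnet`; whether SqliteVecVectorStore creates the parent directory is not visible in this diff.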