DECNET/decnet/vectorstore/factory.py
anti ce4be68501 feat(creds): cred-reuse foundation + vectorstore scaffold
Lays the storage and bus substrate for the "credential reuse patterns"
task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future
substrate for statistical attacker re-identification over behavioral
fingerprints. No correlator, profiler, API, or dashboard wiring in
this commit — see TODO.md for the handoff.

Schema:
  - Credential.attacker_uuid (nullable FK to attackers.uuid),
    backfilled by the profiler post-write to avoid coupling the
    capture path to the profiler's ordering.
  - CredentialReuse table — UUID PK, JSON list columns for the
    accumulating attacker_uuids/ips/deckies/services, target_count
    (the discriminative scalar), confidence reserved for a future
    fuzzy-credential pass.

Repo:
  - upsert_credential_reuse / list_credential_reuses /
    get_credential_reuse_by_id / update_credential_attacker_uuid.
  - Renamed pre-existing get_credential_reuse(secret_sha256) to
    get_credential_attempts_for_secret(secret_sha256) — the new
    findings table needs the cleaner name.

Bus topics:
  - credential.captured (one per Credential upsert)
  - credential.reuse.detected (correlator-emitted on insert/grow)

Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring
decnet/bus/):
  - BaseVectorStore ABC keyed by (kind, id) — kind discriminator
    means new feature families are additive, no schema migration.
  - FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for
    DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy
    sqlite_vec extension load, one vec0 virtual table per kind).
  - get_vectorstore() env-driven dispatch with graceful fallback
    to FakeVectorStore when the sqlite-vec extension isn't on the
    host, so workers don't crash on a missing optional dep.
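
The (kind, id) keying and the in-memory L2 KNN described above can be sketched roughly as follows. This is a minimal illustration, not the code in decnet/vectorstore/: the class names, the `search` method, its signature, and the example kind names are all assumptions; only `insert` is named in the factory docstring.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod


class SketchBaseStore(ABC):
    """Hypothetical stand-in for BaseVectorStore; method names are assumed."""

    @abstractmethod
    def insert(self, kind: str, item_id: str, vector: list[float]) -> None: ...

    @abstractmethod
    def search(
        self, kind: str, query: list[float], k: int = 5
    ) -> list[tuple[str, float]]: ...


class SketchInMemoryStore(SketchBaseStore):
    """Brute-force L2 nearest-neighbour search, one bucket per kind."""

    def __init__(self) -> None:
        # kind -> {item_id -> vector}
        self._rows: dict[str, dict[str, list[float]]] = {}

    def insert(self, kind: str, item_id: str, vector: list[float]) -> None:
        # An unseen kind simply gets a fresh bucket; nothing to migrate.
        self._rows.setdefault(kind, {})[item_id] = list(vector)

    def search(
        self, kind: str, query: list[float], k: int = 5
    ) -> list[tuple[str, float]]:
        bucket = self._rows.get(kind, {})
        # Euclidean (L2) distance from the query to every stored vector.
        scored = [(item_id, math.dist(query, vec)) for item_id, vec in bucket.items()]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]
```

Because each kind owns its own bucket, the same id can appear under several kinds without collision, which is the property that makes new feature families additive.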

Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing
credentials and base-repo tests updated for the rename. Total: 34
passing on the touched files.
2026-04-26 03:18:34 -04:00

74 lines
2.8 KiB
Python

"""Vectorstore factory: selects a :class:`BaseVectorStore` implementation.

Dispatch keys:

* ``DECNET_VECTORSTORE_ENABLED``: ``"false"`` short-circuits to
  :class:`~decnet.vectorstore.fake.NullVectorStore`. Default ``"true"``.
* ``DECNET_VECTORSTORE_TYPE``: ``"sqlite_vec"`` (default) or
  ``"fake"``.
* ``DECNET_VECTORSTORE_PATH``: sqlite file path. Defaults to
  ``/var/lib/decnet/vectors.sqlite`` if that directory exists and is
  writable, else ``~/.decnet/vectors.sqlite``.

Mirrors :mod:`decnet.bus.factory` and :mod:`decnet.web.db.factory`:
lazy imports inside each branch, env-driven dispatch, callers MUST go
through :func:`get_vectorstore` rather than instantiating backends.

If ``sqlite_vec`` is requested but the ``sqlite_vec`` package cannot be
imported on this host, the factory logs a warning and returns
:class:`~decnet.vectorstore.fake.FakeVectorStore` instead. The caller's
code path stays valid (vectors are held in memory rather than written
to the sqlite file) without crashing the worker on a missing optional
dependency.
"""
from __future__ import annotations

import logging
import os
from typing import Any

from decnet.vectorstore.base import BaseVectorStore

LOG = logging.getLogger(__name__)


def get_vectorstore(**kwargs: Any) -> BaseVectorStore:
    if os.environ.get("DECNET_VECTORSTORE_ENABLED", "true").lower() == "false":
        from decnet.vectorstore.fake import NullVectorStore
        return NullVectorStore()

    backend = os.environ.get("DECNET_VECTORSTORE_TYPE", "sqlite_vec").lower()

    if backend == "fake":
        from decnet.vectorstore.fake import FakeVectorStore
        return FakeVectorStore()

    if backend == "sqlite_vec":
        # Probe extension availability up front so the factory can fall
        # back cleanly. Construction is cheap, but the extension load
        # only happens in initialize(); without this probe the caller
        # sees the failure too late to substitute a backend.
        try:
            import sqlite_vec  # noqa: F401
        except ImportError as e:
            LOG.warning(
                "sqlite_vec not installed (%s); falling back to FakeVectorStore. "
                "Install the sqlite-vec package or set "
                "DECNET_VECTORSTORE_TYPE=fake to silence this warning.",
                e,
            )
            from decnet.vectorstore.fake import FakeVectorStore
            return FakeVectorStore()

        from decnet.vectorstore.sqlite_vec import SqliteVecVectorStore
        db_path = kwargs.pop("db_path", None) or _default_db_path()
        return SqliteVecVectorStore(db_path=db_path)

    raise ValueError(f"Unsupported vectorstore type: {backend}")


def _default_db_path() -> str:
    explicit = os.environ.get("DECNET_VECTORSTORE_PATH")
    if explicit:
        return explicit
    runtime_dir = "/var/lib/decnet"
    if os.path.isdir(runtime_dir) and os.access(runtime_dir, os.W_OK):
        return f"{runtime_dir}/vectors.sqlite"
    return os.path.expanduser("~/.decnet/vectors.sqlite")
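
The dispatch order in `get_vectorstore` can be checked without installing decnet by mirroring its decision tree in a small pure function. The sketch below is illustrative only: `pick_backend` and its `extension_available` parameter are hypothetical names that do not exist in the package, and it returns a backend label instead of constructing a store. It also probes with `importlib.util.find_spec` rather than a real import, which is a substitution made to keep the example side-effect free.

```python
from __future__ import annotations

import importlib.util
from typing import Mapping, Optional


def pick_backend(
    env: Mapping[str, str],
    extension_available: Optional[bool] = None,
) -> str:
    """Return the label of the backend the factory's branches would select.

    Hypothetical helper for illustration; not part of decnet.
    """
    # The kill switch is checked first, so it wins over any TYPE value.
    if env.get("DECNET_VECTORSTORE_ENABLED", "true").lower() == "false":
        return "null"

    backend = env.get("DECNET_VECTORSTORE_TYPE", "sqlite_vec").lower()
    if backend == "fake":
        return "fake"

    if backend == "sqlite_vec":
        if extension_available is None:
            # find_spec reports importability without importing the module.
            extension_available = importlib.util.find_spec("sqlite_vec") is not None
        # A missing optional dependency degrades to the in-memory backend.
        return "sqlite_vec" if extension_available else "fake"

    raise ValueError(f"Unsupported vectorstore type: {backend}")
```

Passing the environment as a mapping (for example `pick_backend(os.environ)`) keeps the function testable: each branch can be exercised with a plain dict instead of mutating process-wide state.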