Design Overview
anti edited this page 2026-04-18 06:08:17 -04:00

A short tour of how DECNET is split into processes and why. For knob-level detail see Environment-Variables; for storage internals see Database-Drivers.

The microservice split

DECNET runs as a small constellation of workers around a FastAPI process. Each worker is a first-class CLI subcommand; the profiler and the sniffer can also be embedded in the API process for simple single-host deploys.

Subsystem   Launch standalone               Embed in API             Primary job
Web / API   decnet web --daemon             (this is the host)       FastAPI app, dashboard, REST endpoints
Collector   decnet collect --daemon         always runs standalone   Ingest RFC 5424 syslog from deckies
Correlator  decnet correlate --daemon       always runs standalone   Session + attacker correlation
Profiler    decnet profiler --daemon        DECNET_EMBED_PROFILER=1  Attacker profiling / scoring
Sniffer     decnet sniffer --daemon         DECNET_EMBED_SNIFFER=1   Passive PCAP on the decoy bridge
Prober      decnet probe --daemon           always runs standalone   Active realism checks
Mutator     decnet mutate --daemon --watch  always runs standalone   Runtime fleet mutation

These subcommands are also how decnet deploy spawns the workers: the deploy path shells out to python -m decnet.cli <worker> --daemon, so there is exactly one code path whether you run interactively or under systemd.
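To make the single code path concrete, here is a minimal sketch of how such a command line could be assembled. It is not DECNET's actual code: WORKER_FLAGS and worker_argv are hypothetical names, and the subcommands and flags are simply copied from the table above.

```python
import sys

# Subcommand -> flags, copied from the table above.
# The dict and the helper are hypothetical names, not DECNET's own.
WORKER_FLAGS = {
    "collect": ["--daemon"],
    "correlate": ["--daemon"],
    "profiler": ["--daemon"],
    "sniffer": ["--daemon"],
    "probe": ["--daemon"],
    "mutate": ["--daemon", "--watch"],
}


def worker_argv(worker: str) -> list[str]:
    """Build the argv that launches one worker via `python -m decnet.cli`."""
    if worker not in WORKER_FLAGS:
        raise ValueError(f"unknown worker: {worker}")
    # sys.executable keeps the child on the same interpreter / virtualenv.
    return [sys.executable, "-m", "decnet.cli", worker, *WORKER_FLAGS[worker]]
```

The returned list can be passed unchanged to subprocess.Popen, or joined into an ExecStart= line for a systemd unit.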

Why split them at all

Resilience

A crashed sniffer must not take the API down. A stuck profiler must not block an attacker write from the collector. Splitting into processes gives us the usual crash-domain isolation: each unit is supervised under systemd (see Systemd-Setup) and restarts on its own schedule.

Scaling

In UNIHOST mode everything lives on one machine. In SWARM / MULTIHOST mode the heavy workers (sniffer, profiler) can move to dedicated hosts while the API stays on the public-facing bridge. Because each worker reaches the same database through get_repository(), the workers are effectively stateless with respect to each other: they coordinate through the DB, not through shared memory.

Write-load isolation

The API serves reads; the collector, correlator, and profiler are write-heavy. Under SQLite, single-writer contention was the #1 latency source when everything ran in-process. Breaking the writers out and letting them hold short transactions independently drops lock contention dramatically. If you outgrow even that, flip DECNET_DB_TYPE=mysql.
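A short transaction, in this sense, is one that takes the write lock for a single statement and releases it immediately. The sketch below uses only the standard-library sqlite3 module; the events table, its columns, and both helper names are assumptions made for illustration, and DECNET's real writers go through the repository layer instead.

```python
import sqlite3


def open_writer(db_path: str) -> sqlite3.Connection:
    """Open a connection that waits briefly for a competing writer."""
    # timeout=5.0: block up to 5 s on a locked database instead of failing at once.
    return sqlite3.connect(db_path, timeout=5.0)


def record_event(conn: sqlite3.Connection, attacker_ip: str, event: str) -> None:
    """Insert one row inside its own short transaction."""
    # `with conn` commits on success and rolls back on an exception,
    # so the write lock is held only for this one INSERT.
    with conn:
        conn.execute(
            "INSERT INTO events (attacker_ip, event) VALUES (?, ?)",
            (attacker_ip, event),
        )
```

Each writer process would hold its own connection and call record_event per event, rather than batching unrelated work inside one long-lived transaction.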

Observability

Each subsystem emits its own RFC 5424 stream tagged with its own APP-NAME (decnet.collector, decnet.sniffer, decnet.profiler, …). That makes triage in the SIEM mechanical: filter by app, not by guesswork. Embedded mode muddies this because embedded workers share the API process.
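Filtering by APP-NAME is cheap because it is the fourth space-separated field of the RFC 5424 header. The helpers below are an illustrative sketch, not part of DECNET, and the sample lines are invented for the example.

```python
def app_name(line: str) -> str:
    """Return the APP-NAME field of one RFC 5424 syslog line."""
    # Header layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID ...
    # A bounded split stops after APP-NAME, so structured data and the
    # message text (which may contain spaces) are never cut apart.
    fields = line.split(" ", 4)
    if len(fields) < 5:
        raise ValueError("not an RFC 5424 line")
    return fields[3]


def by_app(lines, wanted: str):
    """Keep only the lines emitted by one subsystem."""
    return [ln for ln in lines if app_name(ln) == wanted]


# Invented sample lines, for illustration only.
SAMPLE = [
    "<134>1 2026-04-18T10:00:00Z host1 decnet.sniffer 412 - - captured frame",
    "<134>1 2026-04-18T10:00:01Z host1 decnet.collector 388 - - ingested event",
]
```

Calling by_app(SAMPLE, "decnet.collector") keeps only the second line, which is the same filter a SIEM query on the app field would apply.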

Embed mode

For dev and for the smallest possible single-host deploy, two workers can run inside the FastAPI process:

  • DECNET_EMBED_PROFILER=1 — profiler starts in a thread on app startup.
  • DECNET_EMBED_SNIFFER=1 — sniffer starts in a thread on app startup.

These are off by default. The rest of the constellation (collector, correlator, prober, mutator) always runs as standalone processes — decnet deploy supervises them through a small process registry in decnet/cli.py::_service_registry, which respawns any unit that dies. Embed mode exists only for the profiler and the sniffer, which are the two workers cheap enough to live in-process during dev.
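A minimal sketch of how the two flags might be read at app startup. The variable names come from the list above; the helper, and the choice to treat only the literal value "1" as enabled, are assumptions rather than DECNET's documented behaviour.

```python
import os

# Flag names come from the text above; the mapping and helper are hypothetical.
EMBED_FLAGS = {
    "profiler": "DECNET_EMBED_PROFILER",
    "sniffer": "DECNET_EMBED_SNIFFER",
}


def embedded_workers(environ=None) -> list[str]:
    """Return the workers that should start inside the API process."""
    env = os.environ if environ is None else environ
    # Assumption: only "1" enables a flag. Unset or any other value leaves
    # the worker off, which matches the documented off-by-default behaviour.
    return [name for name, var in EMBED_FLAGS.items() if env.get(var) == "1"]
```

With neither variable set the result is an empty list, so the API process starts no worker threads.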

The duplication risk

Do not run embed mode and the standalone worker at the same time. That is how you get:

  • Duplicated events — both sniffer copies persist the same packet.
  • Skipped events — both profilers race on the same attacker row; one loses.

The env doc (Environment-Variables) flags this explicitly. The rule: pick one mode per host per worker. Systemd units shipped under deploy/ assume standalone.
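The one-mode-per-worker rule is easy to check mechanically before starting anything. The function below is a hypothetical pre-flight check, not something DECNET ships; it takes the workers enabled for embed mode and the workers with a standalone unit on the same host.

```python
# Only these two workers support embed mode, per the section above.
EMBEDDABLE = ("profiler", "sniffer")


def mode_conflicts(embedded, standalone) -> list[str]:
    """Return workers configured to run in both modes on one host."""
    both = set(embedded) & set(standalone)
    # Sorted so the report is stable regardless of input order.
    return sorted(w for w in both if w in EMBEDDABLE)
```

A non-empty result means at least one worker would run twice, so a deploy script could refuse to proceed until either the flag or the unit is disabled.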

Storage layer — the short version

DECNET uses a single repository pattern:

  • SQLModelRepository is the base class. It holds all the portable SQLModel / SQLAlchemy logic, queries, and transactions.
  • SQLiteRepository and MySQLRepository subclass it and override only the dialect-specific bits (pragmas, pool config, upsert flavor).
  • get_repository() in decnet/web/db/factory.py picks one based on DECNET_DB_TYPE (sqlite or mysql) and wraps it with telemetry.
  • FastAPI routes take the repo via the get_repo dependency in decnet/web/dependencies.py.

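Never import SQLiteRepository directly; obtain the repository through get_repository() or the get_repo dependency, so that the backend choice and the telemetry wrapping stay in one place. See Database-Drivers for schema, migration, and tuning.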
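The shape of that pattern, reduced to a toy, looks roughly like this. The classes are stand-ins that reuse the real names for readability only: the actual base class holds the queries, the real factory also wraps the result with telemetry, and the default of sqlite when DECNET_DB_TYPE is unset is an assumption.

```python
import os


class SQLModelRepository:
    """Stand-in for the portable base class."""

    dialect = "generic"

    def upsert_sql(self) -> str:
        # Dialect-specific hook; each subclass overrides it.
        raise NotImplementedError


class SQLiteRepository(SQLModelRepository):
    dialect = "sqlite"

    def upsert_sql(self) -> str:
        return "INSERT ... ON CONFLICT DO UPDATE"


class MySQLRepository(SQLModelRepository):
    dialect = "mysql"

    def upsert_sql(self) -> str:
        return "INSERT ... ON DUPLICATE KEY UPDATE"


_BACKENDS = {"sqlite": SQLiteRepository, "mysql": MySQLRepository}


def get_repository(environ=None) -> SQLModelRepository:
    """Pick a backend from DECNET_DB_TYPE (toy version, no telemetry)."""
    env = os.environ if environ is None else environ
    db_type = env.get("DECNET_DB_TYPE", "sqlite").lower()  # default is assumed
    if db_type not in _BACKENDS:
        raise ValueError(f"unsupported DECNET_DB_TYPE: {db_type!r}")
    return _BACKENDS[db_type]()
```

Callers only ever see the base-class interface, which is why switching backends is a configuration change rather than a code change.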

Going deeper

The development/ directory in the repo has low-level flow material that is too noisy to mirror here:

  • development/execution_graphs.md — per-command call graphs.
  • development/complete_execution_graph.md — one big graph across the whole system.
  • development/ast_graph.md — static call/symbol graph.

If you are chasing a bug across subsystem boundaries, start from those.