Add Developer Guide, Design Overview, Plugin guide (Unit 15)

2026-04-18 06:07:04 -04:00
parent 9de6ea3dff
commit e48b884970
3 changed files with 381 additions and 0 deletions

84
Design-Overview.md Normal file

@@ -0,0 +1,84 @@
# Design Overview
A short tour of how DECNET is split into processes and why. For knob-level detail see [[Environment-Variables]]; for storage internals see [[Database-Drivers]].
## The microservice split
DECNET runs as a small constellation of workers around a FastAPI process. Each worker is a first-class CLI subcommand and can also be embedded in the API process for simple single-host deploys.
| Subsystem | Launch standalone | Embed in API | Primary job |
|-----------|-------------------|--------------|-------------|
| Web / API | `decnet web --daemon` | (this is the host) | FastAPI app, dashboard, REST endpoints |
| Collector | `decnet collect --daemon` | always runs | Ingest RFC 5424 syslog from deckies |
| Correlator | `decnet correlate --daemon` | always runs | Session + attacker correlation |
| Profiler | `decnet profiler --daemon` | `DECNET_EMBED_PROFILER=1` | Attacker profiling / scoring |
| Sniffer | `decnet sniffer --daemon` | `DECNET_EMBED_SNIFFER=1` | Passive PCAP on the decoy bridge |
| Prober | `decnet probe --daemon` | always runs | Active realism checks |
| Mutator | `decnet mutate --daemon --watch` | always runs | Runtime fleet mutation |
`decnet deploy` spawns workers through the same CLI: the deploy path shells out to `python -m decnet.cli <worker> --daemon`, so there is exactly one code path whether you run interactively or under systemd.
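A minimal sketch of that single launch path, assuming a hypothetical helper named `worker_command` (it is not part of DECNET, and the real deploy code may build its argv differently). It only constructs the command line; spawning and supervision are left out.

```python
import sys


# Hypothetical helper for illustration only; not a DECNET API.
def worker_command(worker: str, *extra: str) -> list[str]:
    """Build the argv for launching one worker as a standalone daemon.

    Assumption: the current interpreter (sys.executable) stands in for
    the `python` named in the text above.
    """
    return [sys.executable, "-m", "decnet.cli", worker, "--daemon", *extra]


if __name__ == "__main__":
    # The mutator row in the table above also passes --watch.
    print(worker_command("mutate", "--watch"))
```

The resulting list can be handed to `subprocess.Popen` or written into a systemd `ExecStart=` line, which is what keeps the interactive and supervised paths identical.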
## Why split them at all
### Resilience
A crashed sniffer must not take the API down. A stuck profiler must not block an attacker write from the collector. Splitting into processes gives us the usual crash-domain isolation: each unit is supervised under systemd (see [[Systemd-Setup]]) and restarts on its own schedule.
### Scaling
In UNIHOST mode everything lives on one machine. In SWARM / MULTIHOST mode the heavy workers (sniffer, profiler) can move to dedicated hosts while the API stays on the public-facing bridge. Because each worker reads the same repository via `get_repository()`, the workers are effectively stateless with respect to each other: they coordinate through the DB, not through shared memory.
### Write-load isolation
The API serves reads; the collector, correlator, and profiler are write-heavy. Under SQLite, single-writer contention was the #1 latency source when everything ran in-process. Breaking the writers out and letting them hold short transactions independently drops lock contention dramatically. If you outgrow even that, flip `DECNET_DB_TYPE=mysql`.
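To make "short transactions" concrete, here is a rough sketch using the standard-library `sqlite3` module. DECNET itself goes through SQLModel, so the function name, the `events` table, and the batch size are all assumptions chosen for illustration.

```python
import sqlite3


# Hypothetical writer loop; the table and function names are illustrative.
def write_in_short_transactions(
    conn: sqlite3.Connection, rows: list[str], batch_size: int = 100
) -> int:
    """Insert rows in small batches, committing after each batch.

    Each `with conn:` block is one transaction, so the write lock is
    released between batches and other writers get a turn.
    Returns the number of transactions committed.
    """
    committed = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # commits on success, rolls back on exception
            conn.executemany(
                "INSERT INTO events (msg) VALUES (?)",
                [(msg,) for msg in batch],
            )
        committed += 1
    return committed
```

Holding one long transaction for the whole run would block every other writer for its full duration; many short ones trade a little commit overhead for much shorter lock hold times.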
### Observability
Each subsystem emits its own RFC 5424 stream tagged with its own APP-NAME (`decnet.collector`, `decnet.sniffer`, `decnet.profiler`, …). That makes triage in the SIEM mechanical: filter by app, not by guesswork. Embedded mode muddies this because everything shares the API process.
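As an illustration of why the APP-NAME field makes filtering mechanical, the sketch below formats and parses a bare RFC 5424 line. The helper names, the default facility and severity, and the use of `-` (NILVALUE) for PROCID, MSGID, and structured data are assumptions; DECNET's actual emitters live in `decnet/logging/` and may differ.

```python
from datetime import datetime, timezone


# Hypothetical formatter; field order follows the RFC 5424 header:
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
def rfc5424_line(app_name, msg, hostname="-", facility=16, severity=6, when=None):
    when = when or datetime.now(timezone.utc)  # expects an aware UTC datetime
    pri = facility * 8 + severity  # local0.info -> 134
    timestamp = when.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {msg}"


def app_name_of(line):
    """Return the APP-NAME field (fourth space-separated header token)."""
    return line.split(" ", 4)[3]


if __name__ == "__main__":
    line = rfc5424_line("decnet.sniffer", "capture started", hostname="host-01")
    print(line)
    print(app_name_of(line))
```

Because APP-NAME sits at a fixed position in the header, a SIEM rule can key on it without inspecting the message body.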
## Embed mode
For dev and for the smallest possible single-host deploy, two workers can run inside the FastAPI process:
- `DECNET_EMBED_PROFILER=1` — profiler starts in a thread on app startup.
- `DECNET_EMBED_SNIFFER=1` — sniffer starts in a thread on app startup.
Both are off by default in production deployments. The rest of the constellation (collector, correlator, prober, mutator) runs embedded under the API as lightweight threads unless you explicitly launch the standalone CLI command for that worker.
### The duplication risk
Do **not** run embed mode *and* the standalone worker at the same time. That is how you get:
- **Duplicated events** — both sniffer copies persist the same packet.
- **Skipped events** — both profilers race on the same attacker row; one loses.
The env doc ([[Environment-Variables]]) flags this explicitly. The rule: pick one mode per host per worker. Systemd units shipped under `deploy/` assume standalone.
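One way to make the "pick one mode" rule harder to break is a start-up guard in the standalone worker. The sketch below is hypothetical: `embed_conflict` is not a DECNET function, and treating only the literal value `1` as "enabled" is an assumption about how the flags are parsed.

```python
import os

# Env var names are taken from this page; the mapping itself is illustrative.
EMBEDDABLE = {
    "profiler": "DECNET_EMBED_PROFILER",
    "sniffer": "DECNET_EMBED_SNIFFER",
}


def embed_conflict(worker, environ=None):
    """Return True if `worker` is already configured to run embedded.

    Assumption: the standalone worker sees the same environment as the
    API process, which only holds when both read the same env file.
    """
    env = os.environ if environ is None else environ
    var = EMBEDDABLE.get(worker)
    return var is not None and env.get(var, "") == "1"


if __name__ == "__main__":
    if embed_conflict("sniffer"):
        raise SystemExit("sniffer is embedded in the API; refusing to start standalone")
```

A guard like this only catches misconfiguration on a single host with a shared environment; it is a convenience, not a substitute for the rule itself.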
## Storage layer — the short version
DECNET uses a single repository pattern:
- `SQLModelRepository` is the base class. It holds all SQLModel / SQLAlchemy logic, queries, and transactions that are portable.
- `SQLiteRepository` and `MySQLRepository` subclass it and override only the dialect-specific bits (pragmas, pool config, upsert flavor).
- `get_repository()` in `decnet/web/db/factory.py` picks one based on `DECNET_DB_TYPE` (`sqlite` or `mysql`) and wraps it with telemetry.
- FastAPI routes take the repo via the `get_repo` dependency in `decnet/web/dependencies.py`.
Never import `SQLiteRepository` directly. See [[Database-Drivers]] for schema, migration, and tuning.
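The shape of that factory can be sketched with stand-in classes. Everything below is a simplified assumption: the real classes carry SQLModel logic, the real `get_repository()` signature is not shown on this page, the `sqlite` default is a guess, and the telemetry wrapper is omitted.

```python
import os


class SQLModelRepository:
    """Stand-in base class; the real one holds the portable query logic."""
    dialect = "generic"

    def describe(self) -> str:
        return f"{type(self).__name__} ({self.dialect})"


class SQLiteRepository(SQLModelRepository):
    dialect = "sqlite"  # real subclass overrides pragmas, upsert flavor, etc.


class MySQLRepository(SQLModelRepository):
    dialect = "mysql"  # real subclass overrides pool config, upsert flavor, etc.


_BACKENDS = {"sqlite": SQLiteRepository, "mysql": MySQLRepository}


def get_repository(environ=None) -> SQLModelRepository:
    """Pick a backend from DECNET_DB_TYPE (assumed default: sqlite)."""
    env = os.environ if environ is None else environ
    db_type = env.get("DECNET_DB_TYPE", "sqlite").lower()
    try:
        return _BACKENDS[db_type]()
    except KeyError:
        raise ValueError(f"unsupported DECNET_DB_TYPE: {db_type!r}") from None


if __name__ == "__main__":
    print(get_repository({"DECNET_DB_TYPE": "mysql"}).describe())
```

Callers depend only on the base-class interface, which is why importing a concrete subclass directly defeats the point of the factory.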
## Going deeper
The `development/` directory in the repo has low-level flow material that is too noisy to mirror here:
- `development/execution_graphs.md` — per-command call graphs.
- `development/complete_execution_graph.md` — one big graph across the whole system.
- `development/ast_graph.md` — static call/symbol graph.
If you are chasing a bug across subsystem boundaries, start from those.
## Related pages
- [[Developer-Guide]] — setup, layout, conventions.
- [[Writing-a-Service-Plugin]] — add a new honeypot service.
- [[Database-Drivers]] — SQLite vs MySQL.
- [[Environment-Variables]] — the full env surface.
- [[Systemd-Setup]] — running each worker as a supervised unit.

123
Developer-Guide.md Normal file

@@ -0,0 +1,123 @@
# Developer Guide
How to hack on DECNET. If you just want to deploy it, see [[Home]] and [[INI-Config-Format]] instead.
## Environment setup
DECNET pins its runtime deps in `requirements.lock`. Always work inside the project virtualenv — do not install into the system interpreter.
```bash
cd /path/to/DECNET
python -m venv .venv
source .venv/bin/activate
pip install -e .
```
Every subsequent shell must `source .venv/bin/activate` before running `pip`, `pytest`, or `decnet`. The CLI entrypoint is registered in `pyproject.toml` and resolves to `decnet.cli:app`.
To confirm the dev install:
```bash
decnet services # list registered service plugins
decnet distros # list base-image archetypes
pytest -q # run the suite
```
## Repository layout
High-level tour. Only the directories you will touch often are listed.
| Path | What lives there |
|------|------------------|
| `decnet/cli.py` | Typer app. Every `decnet <verb>` subcommand is defined here. |
| `decnet/services/` | Service plugins. One file per honeypot service. See [[Writing-a-Service-Plugin]]. |
| `decnet/services/base.py` | `BaseService` contract. |
| `decnet/services/registry.py` | Auto-discovery of `BaseService` subclasses. |
| `decnet/composer.py` | Turns a fleet spec into a `docker-compose` file. |
| `decnet/fleet.py` | Fleet planning: which decky runs which services on which IP. |
| `decnet/archetypes.py`, `decnet/distros.py` | OS personas + base-image selection. |
| `decnet/os_fingerprint.py` | TCP/IP stack tuning to bend nmap fingerprints toward a chosen persona. |
| `decnet/env.py` | Central env-var parsing (`DECNET_DB_TYPE`, `DECNET_EMBED_*`, …). |
| `decnet/collector/` | Syslog / RFC 5424 ingest worker. |
| `decnet/correlation/` | Session and attacker correlation worker. |
| `decnet/profiler/` | Attacker profiler. Embeddable or standalone — see [[Design-Overview]]. |
| `decnet/sniffer/` | Passive PCAP sniffer worker. Same embed/standalone split. |
| `decnet/mutator/` | Runtime mutation of the decoy fleet. |
| `decnet/prober/` | Active probe / realism checker. |
| `decnet/engine/` | Deploy / teardown orchestration. |
| `decnet/web/` | FastAPI app + dashboard + repository layer. |
| `decnet/web/db/` | `SQLModelRepository` base and `sqlite/`, `mysql/` subclasses. See [[Database-Drivers]]. |
| `decnet/logging/` | RFC 5424 emitters and the syslog bridge used by service containers. |
| `templates/<slug>/` | Dockerfile + service config bundle built into the service image. |
| `tests/` | Pytest suite. Mirrors the `decnet/` tree loosely. |
| `development/` | Low-level design notes and generated graphs. Not shipped. |
## Coding conventions
### Lint and static checks
- **ruff** is the single source of truth for style. Config lives in `ruff.toml`. Run `ruff check decnet tests` before committing.
- **bandit** is used for security linting of `decnet/`. Fix findings rather than silencing them; if a silence is unavoidable, scope the `# nosec` comment to one line and explain why.
### Stealth in probes and banners
Never reveal DECNET identity in anything an attacker can see. That means:
- No `User-Agent: DECNET/...` in the prober or in any service plugin.
- No banners, MOTDs, `/etc/issue` contents, HTTP `Server:` headers, or SSH version strings that mention DECNET, honeypot, decoy, fake, or any internal codename.
- No log filenames or env var names leaking into emitted service output.
This rule is load-bearing. A single leaked banner turns the whole fleet into a well-known signature.
### Dependency injection for storage
Do not `from decnet.web.db.sqlite.repository import SQLiteRepository` in new code. Ever.
- **In workers / CLI / library code**: call `get_repository()` from `decnet/web/db/factory.py`. It reads `DECNET_DB_TYPE` and returns the right backend, already wrapped with telemetry.
- **In FastAPI route handlers**: take `repo: BaseRepository = Depends(get_repo)` — defined in `decnet/web/dependencies.py`. This keeps the test harness able to swap in an in-memory repo.
The direct-import rule is enforced by convention and by reviewer. If you find an old direct import while working on a file, fix it in the same commit.
See [[Database-Drivers]] for how SQLite and MySQL subclasses differ.
## Tests
### Layout
- `tests/` — fast unit tests. Run by default.
- `tests/api/` — FastAPI `TestClient` tests.
- `tests/docker/` — integration tests that spin real containers. Opt-in.
- `tests/live/` — full end-to-end against a live deploy. Opt-in.
- `tests/perf/`, `tests/stress/` — performance and soak. Opt-in.
- `tests/service_testing/` — per-service plugin smoke tests.
- `tests/conftest.py` — shared fixtures, including repo factories.
### Running
```bash
pytest -q # fast suite
pytest tests/api -q # just the API
pytest tests/service_testing -q # plugin smoke
pytest -k ssh # single topic
```
### Rules
- Every new feature ships with pytest coverage. No exceptions.
- Never hand off code that is not running or not 100% green. If you cannot finish the tests, say so — do not push.
- Do not use scapy's `sniff()` inside a `TestClient` lifespan test. The sniff thread hangs pytest teardown. Use static source inspection or a fake socket instead.
## Commit style
- Follow the existing log: short imperative subject, with a `type(scope):` or `type:` prefix when the scope is obvious (`feat(sniffer):`, `fix(web-ui):`, `test(ssh):`, `chore:`).
- Run the relevant `pytest` subset before committing. A broken main is worse than a late commit.
- Never add `Co-Authored-By:` or any Claude / AI attribution trailer.
- Prefer a new commit over `--amend`. Hooks that fail leave you in a half-state; amending there hides work.
## Related pages
- [[Design-Overview]] — why workers are split out and how embed mode works.
- [[Writing-a-Service-Plugin]] — step-by-step plugin authoring.
- [[Database-Drivers]] — the repository pattern in detail.
- [[Environment-Variables]] — every `DECNET_*` knob.
- [[INI-Config-Format]] — declarative deploy specs.

174
Writing-a-Service-Plugin.md Normal file

@@ -0,0 +1,174 @@
# Writing a Service Plugin
A service plugin is what makes a decky look like an SSH box, an SMB share, an MSSQL server, or whatever else. Plugins are auto-discovered from `decnet/services/`. You add a file, you get a service.
For runtime INI-driven custom services (no Python code at all), see [[Custom-Services]] — this page is for first-class plugins baked into the codebase.
## The contract
Every plugin subclasses `BaseService` from `decnet/services/base.py`:
```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseService(ABC):
    name: str                      # unique slug, e.g. "ssh"
    ports: list[int]               # in-container listen ports
    default_image: str             # Docker image tag, or "build"
    fleet_singleton: bool = False  # True = one instance fleet-wide

    @abstractmethod
    def compose_fragment(
        self,
        decky_name: str,
        log_target: str | None = None,
        service_cfg: dict | None = None,
    ) -> dict: ...

    def dockerfile_context(self) -> Path | None:
        return None
```
Rules the composer enforces so you do not have to:
- Networking keys (`networks`, `ipv4_address`, `mac_address`) are injected by `decnet/composer.py`. Do not set them in `compose_fragment`.
- If you return `"build": {"context": ...}`, make sure `dockerfile_context()` returns the same path so `decnet deploy` can pre-build the image.
- `log_target` is `"ip:port"` when log forwarding is on, else `None`. Pass it into the container as an env var and let the in-container rsyslog bridge handle the rest.
## Registration
There is no registration step. The registry in `decnet/services/registry.py` walks the `decnet/services/` package at import time, imports every module, and picks up every `BaseService` subclass via `__subclasses__()`. Your plugin appears in `decnet services` and in `all_services()` the moment its file exists in the right directory.
To verify:
```bash
decnet services | grep <your-slug>
```
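The `__subclasses__()` part of that discovery can be sketched in isolation. This is an assumption-laden miniature: the stand-in `BaseService` is trimmed down, `EchoService` is invented for the example, the package-walking import step is left out, and whether the real `all_services()` returns instances or classes is not stated on this page.

```python
import inspect
from abc import ABC, abstractmethod


class BaseService(ABC):
    """Trimmed stand-in for decnet.services.base.BaseService."""
    name: str

    @abstractmethod
    def compose_fragment(self, decky_name, log_target=None, service_cfg=None) -> dict: ...


class EchoService(BaseService):
    """Invented example plugin; defining the class is the registration."""
    name = "echo"

    def compose_fragment(self, decky_name, log_target=None, service_cfg=None) -> dict:
        return {"container_name": f"{decky_name}-echo"}


def _walk_subclasses(cls):
    """Yield every direct and indirect subclass of cls."""
    for sub in cls.__subclasses__():
        yield sub
        yield from _walk_subclasses(sub)


def all_services() -> dict[str, BaseService]:
    """Map slug -> instance for every concrete BaseService subclass."""
    return {
        cls.name: cls()
        for cls in _walk_subclasses(BaseService)
        if not inspect.isabstract(cls)
    }


if __name__ == "__main__":
    print(sorted(all_services()))
```

A subclass only shows up once its module has been imported, which is why the real registry imports every module in the package before collecting subclasses.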
## Templates
If your service needs a custom image (almost all do), drop the build context under `templates/<slug>/`:
```
templates/myservice/
    Dockerfile
    entrypoint.sh
    config/
        ...
```
Conventions the existing plugins follow:
- Base the image on `debian:bookworm-slim` unless you have a reason to diverge. Heterogeneity is good — some services use Alpine, some use CentOS-derived images.
- Bake an rsyslog or equivalent bridge into the image so the container emits RFC 5424 on stdout.
- Never write DECNET, honeypot, or decoy strings into the image, banners, MOTDs, config files, or user-agents. See the stealth rule in [[Developer-Guide]].
## A minimal plugin
The smallest real plugin is about 50 lines. This one wraps a pre-built image and needs no Dockerfile:
```python
# decnet/services/echoecho.py
from decnet.services.base import BaseService


class EchoEchoService(BaseService):
    """
    Tiny TCP echo service. Useful as a template and for testing the composer.

    service_cfg keys:
        greeting    First line sent on connect. Default: empty.
    """

    name = "echoecho"
    ports = [7]
    default_image = "ghcr.io/example/echoecho:1.0"
    fleet_singleton = False

    def compose_fragment(
        self,
        decky_name: str,
        log_target: str | None = None,
        service_cfg: dict | None = None,
    ) -> dict:
        cfg = service_cfg or {}
        env: dict = {
            "NODE_NAME": decky_name,
            "ECHO_GREETING": cfg.get("greeting", ""),
        }
        if log_target:
            env["SYSLOG_TARGET"] = log_target
        fragment: dict = {
            "image": self.default_image,
            "container_name": f"{decky_name}-echoecho",
            "restart": "unless-stopped",
            "environment": env,
        }
        return fragment
```
That is the whole plugin. Drop it in `decnet/services/echoecho.py`, run `decnet services`, and it shows up.
## Adding a build context
If you need a custom image, reference `templates/<slug>/` and implement `dockerfile_context`:
```python
from pathlib import Path

from decnet.services.base import BaseService

# Repo root is three levels up from decnet/services/echoecho.py.
TEMPLATES_DIR = Path(__file__).parent.parent.parent / "templates" / "echoecho"


class EchoEchoService(BaseService):
    name = "echoecho"
    ports = [7]
    default_image = "build"

    def compose_fragment(self, decky_name, log_target=None, service_cfg=None):
        return {
            "build": {"context": str(TEMPLATES_DIR)},
            "container_name": f"{decky_name}-echoecho",
            "restart": "unless-stopped",
            "environment": {"NODE_NAME": decky_name},
        }

    def dockerfile_context(self) -> Path:
        return TEMPLATES_DIR
```
Look at `decnet/services/ssh.py` for a fully worked, stealth-aware example including a per-decky quarantine bind-mount.
## Per-service persona config
`service_cfg` is the dict pulled from the matching `[service.<slug>]` section of the INI (see [[INI-Config-Format]]). Keep the keys documented in the class docstring — that docstring is the only user-facing reference.
## Pytest coverage
Every plugin ships with tests. Drop them under `tests/service_testing/test_<slug>.py`. Cover at minimum:
- Instantiation + registry lookup: `all_services()["echoecho"]` resolves.
- `compose_fragment` returns the expected keys for a given `decky_name` and `service_cfg`.
- Absence of DECNET / honeypot strings in rendered env, command, and template files — this is the stealth rule made executable.
- If `dockerfile_context()` is set, that the path exists and contains a `Dockerfile`.
Run `pytest tests/service_testing -q` before committing. Features without tests do not land — see [[Developer-Guide]].
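A sketch of what such a test file might look like. To stay self-contained it tests a stand-in `compose_fragment` function that mirrors the `echoecho` example rather than importing from `decnet`; the forbidden-word list and the networking-key set are taken from the rules on these pages, while the test names and sample values are invented.

```python
import json

NETWORKING_KEYS = {"networks", "ipv4_address", "mac_address"}
FORBIDDEN = ("decnet", "honeypot", "decoy")


def compose_fragment(decky_name, log_target=None, service_cfg=None):
    """Stand-in for EchoEchoService().compose_fragment (assumption)."""
    cfg = service_cfg or {}
    env = {"NODE_NAME": decky_name, "ECHO_GREETING": cfg.get("greeting", "")}
    if log_target:
        env["SYSLOG_TARGET"] = log_target
    return {
        "image": "ghcr.io/example/echoecho:1.0",
        "container_name": f"{decky_name}-echoecho",
        "restart": "unless-stopped",
        "environment": env,
    }


def test_expected_keys():
    frag = compose_fragment("web-01", "10.0.0.5:514", {"greeting": "hi"})
    assert {"image", "container_name", "restart", "environment"} <= frag.keys()
    assert frag["environment"]["SYSLOG_TARGET"] == "10.0.0.5:514"
    assert frag["environment"]["ECHO_GREETING"] == "hi"


def test_no_networking_keys():
    # The composer injects these; a plugin must not set them.
    assert NETWORKING_KEYS.isdisjoint(compose_fragment("web-01"))


def test_no_identity_leak():
    # Serialize the whole fragment so nested values are checked too.
    rendered = json.dumps(compose_fragment("web-01")).lower()
    assert not any(word in rendered for word in FORBIDDEN)
```

In a real test module the stand-in would be replaced by the plugin instance from `all_services()`, and the leak check would be extended to files under `templates/<slug>/`.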
## Checklist
- [ ] New file under `decnet/services/<slug>.py`, subclasses `BaseService`.
- [ ] `name`, `ports`, `default_image` set. `fleet_singleton` if applicable.
- [ ] `compose_fragment` returns networking-free compose dict.
- [ ] If `default_image == "build"`, `dockerfile_context()` returns the context path.
- [ ] `templates/<slug>/` exists with a Dockerfile (if building).
- [ ] No DECNET / honeypot / decoy strings anywhere the attacker can see.
- [ ] `service_cfg` keys documented in the class docstring.
- [ ] Pytest coverage under `tests/service_testing/`.
- [ ] `decnet services` lists the new slug.
- [ ] Commit follows the style in [[Developer-Guide]].
## Related pages
- [[Developer-Guide]] — conventions, DI rules, commit style.
- [[Custom-Services]] — declarative INI-only services.
- [[INI-Config-Format]] — the deploy spec format.
- [[Design-Overview]] — where plugins fit in the larger picture.