diff --git a/Design-Overview.md b/Design-Overview.md
new file mode 100644
index 0000000..1fe3831
--- /dev/null
+++ b/Design-Overview.md
@@ -0,0 +1,84 @@
+# Design Overview
+
+A short tour of how DECNET is split into processes and why. For knob-level detail see [[Environment-Variables]]; for storage internals see [[Database-Drivers]].
+
+## The microservice split
+
+DECNET runs as a small constellation of workers around a FastAPI process. Each worker is a first-class CLI subcommand and can also be embedded in the API process for simple single-host deploys.
+
+| Subsystem | Launch standalone | Embed in API | Primary job |
+|-----------|-------------------|--------------|-------------|
+| Web / API | `decnet web --daemon` | (this is the host) | FastAPI app, dashboard, REST endpoints |
+| Collector | `decnet collect --daemon` | always runs | Ingest RFC 5424 syslog from deckies |
+| Correlator | `decnet correlate --daemon` | always runs | Session + attacker correlation |
+| Profiler | `decnet profiler --daemon` | `DECNET_EMBED_PROFILER=1` | Attacker profiling / scoring |
+| Sniffer | `decnet sniffer --daemon` | `DECNET_EMBED_SNIFFER=1` | Passive PCAP on the decoy bridge |
+| Prober | `decnet probe --daemon` | always runs | Active realism checks |
+| Mutator | `decnet mutate --daemon --watch` | always runs | Runtime fleet mutation |
+
+These same subcommands are how `decnet deploy` spawns the workers: the deploy path shells out to `python -m decnet.cli --daemon`, so there is exactly one code path whether you run interactively or under systemd.
+
+## Why split them at all
+
+### Resilience
+
+A crashed sniffer must not take the API down. A stuck profiler must not block an attacker write from the collector. Splitting into processes gives us the usual crash-domain isolation: supervise each unit under systemd (see [[Systemd-Setup]]), restart on its own schedule.
+
+### Scaling
+
+In UNIHOST mode everything lives on one machine. In SWARM / MULTIHOST mode the heavy workers (sniffer, profiler) can move to dedicated hosts while the API stays on the public-facing bridge. Because each worker reads the same repository via `get_repository()`, they are effectively stateless w.r.t. each other — they coordinate through the DB, not through shared memory.
+
+### Write-load isolation
+
+The API serves reads; the collector, correlator, and profiler are write-heavy. Under SQLite, single-writer contention was the #1 latency source when everything ran in-process. Breaking the writers out and letting them hold short transactions independently drops lock contention dramatically. If you outgrow even that, flip `DECNET_DB_TYPE=mysql`.
+
+### Observability
+
+Each subsystem emits its own RFC 5424 stream tagged with its own APP-NAME (`decnet.collector`, `decnet.sniffer`, `decnet.profiler`, …). That makes triage in the SIEM mechanical: filter by app, not by guesswork. Embedded mode muddies this because everything shares the API process.
+
+## Embed mode
+
+For dev and for the smallest possible single-host deploy, two workers can run inside the FastAPI process:
+
+- `DECNET_EMBED_PROFILER=1` — profiler starts in a thread on app startup.
+- `DECNET_EMBED_SNIFFER=1` — sniffer starts in a thread on app startup.
+
+These are off by default. The rest of the constellation (collector, correlator, prober, mutator) always runs as standalone processes — `decnet deploy` supervises them through a small process registry in `decnet/cli.py::_service_registry`, which respawns any unit that dies. Embed mode exists only for the profiler and the sniffer, which are the two workers cheap enough to live in-process during dev.
+
+### The duplication risk
+
+Do **not** run embed mode *and* the standalone worker at the same time. That is how you get:
+
+- **Duplicated events** — both sniffer copies persist the same packet.
+- **Skipped events** — both profilers race on the same attacker row; one loses.
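
One way to make the two failure modes above harder to hit is a startup guard in the standalone worker. The sketch below is hypothetical and not part of DECNET: `embed_enabled`, `assert_single_mode`, and `ModeConflictError` are illustrative names, and the only facts taken from this page are the `DECNET_EMBED_PROFILER` / `DECNET_EMBED_SNIFFER` variables and the value `1`.

```python
import os
from typing import Mapping, Optional


class ModeConflictError(RuntimeError):
    """Hypothetical error: a worker would run both embedded and standalone."""


def embed_enabled(worker: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DECNET_EMBED_<WORKER> is set to "1" in the given environment."""
    env = os.environ if env is None else env
    return env.get(f"DECNET_EMBED_{worker.upper()}", "0") == "1"


def assert_single_mode(worker: str, env: Optional[Mapping[str, str]] = None) -> None:
    """Refuse a standalone start when the embed flag for the same worker is on."""
    if embed_enabled(worker, env):
        raise ModeConflictError(
            f"{worker} is embedded in the API on this host; "
            "do not start the standalone worker as well"
        )
```

A guard like this only sees the environment of the process that calls it, so it catches the same-host mistake but not a standalone worker started elsewhere with a different environment.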
+
+The env doc ([[Environment-Variables]]) flags this explicitly. The rule: pick one mode per host per worker. Systemd units shipped under `deploy/` assume standalone.
+
+## Storage layer — the short version
+
+DECNET uses a single repository pattern:
+
+- `SQLModelRepository` is the base class. It holds all SQLModel / SQLAlchemy logic, queries, and transactions that are portable.
+- `SQLiteRepository` and `MySQLRepository` subclass it and override only the dialect-specific bits (pragmas, pool config, upsert flavor).
+- `get_repository()` in `decnet/web/db/factory.py` picks one based on `DECNET_DB_TYPE` (`sqlite` or `mysql`) and wraps it with telemetry.
+- FastAPI routes take the repo via the `get_repo` dependency in `decnet/web/dependencies.py`.
+
+Never import `SQLiteRepository` directly. See [[Database-Drivers]] for schema, migration, and tuning.
+
+## Going deeper
+
+The `development/` directory in the repo has low-level flow material that is too noisy to mirror here:
+
+- `development/execution_graphs.md` — per-command call graphs.
+- `development/complete_execution_graph.md` — one big graph across the whole system.
+- `development/ast_graph.md` — static call/symbol graph.
+
+If you are chasing a bug across subsystem boundaries, start from those.
+
+## Related pages
+
+- [[Developer-Guide]] — setup, layout, conventions.
+- [[Writing-a-Service-Plugin]] — add a new honeypot service.
+- [[Database-Drivers]] — SQLite vs MySQL.
+- [[Environment-Variables]] — the full env surface.
+- [[Systemd-Setup]] — running each worker as a supervised unit.
diff --git a/Developer-Guide.md b/Developer-Guide.md
new file mode 100644
index 0000000..b48a812
--- /dev/null
+++ b/Developer-Guide.md
@@ -0,0 +1,123 @@
+# Developer Guide
+
+How to hack on DECNET. If you just want to deploy it, see [[Home]] and [[INI-Config-Format]] instead.
+
+## Environment setup
+
+DECNET pins its runtime deps in `requirements.lock`. Always work inside the project virtualenv — do not install into the system interpreter.
+
+```bash
+cd /path/to/DECNET
+python -m venv .venv
+source .venv/bin/activate
+pip install -e .
+```
+
+Every subsequent shell must `source .venv/bin/activate` before running `pip`, `pytest`, or `decnet`. The CLI entrypoint is registered in `pyproject.toml` and resolves to `decnet.cli:app`.
+
+To confirm the dev install:
+
+```bash
+decnet services # list registered service plugins
+decnet distros # list base-image archetypes
+pytest -q # run the suite
+```
+
+## Repository layout
+
+High-level tour. Only the directories you will touch often are listed.
+
+| Path | What lives there |
+|------|------------------|
+| `decnet/cli.py` | Typer app. Every `decnet ` subcommand is defined here. |
+| `decnet/services/` | Service plugins. One file per honeypot service. See [[Writing-a-Service-Plugin]]. |
+| `decnet/services/base.py` | `BaseService` contract. |
+| `decnet/services/registry.py` | Auto-discovery of `BaseService` subclasses. |
+| `decnet/composer.py` | Turns a fleet spec into a `docker-compose` file. |
+| `decnet/fleet.py` | Fleet planning: which decky runs which services on which IP. |
+| `decnet/archetypes.py`, `decnet/distros.py` | OS personas + base-image selection. |
+| `decnet/os_fingerprint.py` | TCP/IP stack tuning to bend nmap fingerprints toward a chosen persona. |
+| `decnet/env.py` | Central env-var parsing (`DECNET_DB_TYPE`, `DECNET_EMBED_*`, …). |
+| `decnet/collector/` | Syslog / RFC 5424 ingest worker. |
+| `decnet/correlation/` | Session and attacker correlation worker. |
+| `decnet/profiler/` | Attacker profiler. Embeddable or standalone — see [[Design-Overview]]. |
+| `decnet/sniffer/` | Passive PCAP sniffer worker. Same embed/standalone split. |
+| `decnet/mutator/` | Runtime mutation of the decoy fleet. |
+| `decnet/prober/` | Active probe / realism checker. |
+| `decnet/engine/` | Deploy / teardown orchestration. |
+| `decnet/web/` | FastAPI app + dashboard + repository layer. |
+| `decnet/web/db/` | `SQLModelRepository` base and `sqlite/`, `mysql/` subclasses. See [[Database-Drivers]]. |
+| `decnet/logging/` | RFC 5424 emitters and the syslog bridge used by service containers. |
+| `templates//` | Dockerfile + service config bundle built into the service image. |
+| `tests/` | Pytest suite. Mirrors the `decnet/` tree loosely. |
+| `development/` | Low-level design notes and generated graphs. Not shipped. |
+
+## Coding conventions
+
+### Lint and static checks
+
+- **ruff** is the single source of truth for style. Config lives in `ruff.toml`. Run `ruff check decnet tests` before committing.
+- **bandit** is used for security linting of `decnet/`. Fix findings rather than silencing them; if a silence is unavoidable, scope the `# nosec` comment to one line and explain why.
+
+### Stealth in probes and banners
+
+Never reveal DECNET identity in anything an attacker can see. That means:
+
+- No `User-Agent: DECNET/...` in the prober or in any service plugin.
+- No banners, MOTDs, `/etc/issue` contents, HTTP `Server:` headers, or SSH version strings that mention DECNET, honeypot, decoy, fake, or any internal codename.
+- No log filenames or env var names leaking into emitted service output.
+
+This rule is load-bearing. A single leaked banner turns the whole fleet into a well-known signature.
+
+### Dependency injection for storage
+
+Do not `from decnet.web.db.sqlite.repository import SQLiteRepository` in new code. Ever.
+
+- **In workers / CLI / library code**: call `get_repository()` from `decnet/web/db/factory.py`. It reads `DECNET_DB_TYPE` and returns the right backend, already wrapped with telemetry.
+- **In FastAPI route handlers**: take `repo: BaseRepository = Depends(get_repo)` — defined in `decnet/web/dependencies.py`. This keeps the test harness able to swap in an in-memory repo.
+
+The direct-import rule is enforced by convention and by reviewer. If you find an old direct import while working on a file, fix it in the same commit.
+
+See [[Database-Drivers]] for how SQLite and MySQL subclasses differ.
+
+## Tests
+
+### Layout
+
+- `tests/` — fast unit tests. Run by default.
+- `tests/api/` — FastAPI `TestClient` tests.
+- `tests/docker/` — integration tests that spin real containers. Opt-in.
+- `tests/live/` — full end-to-end against a live deploy. Opt-in.
+- `tests/perf/`, `tests/stress/` — performance and soak. Opt-in.
+- `tests/service_testing/` — per-service plugin smoke tests.
+- `tests/conftest.py` — shared fixtures, including repo factories.
+
+### Running
+
+```bash
+pytest -q # fast suite
+pytest tests/api -q # just the API
+pytest tests/service_testing -q # plugin smoke
+pytest -k ssh # single topic
+```
+
+### Rules
+
+- Every new feature ships with pytest coverage. No exceptions.
+- Never hand off code that is not running or not 100% green. If you cannot finish the tests, say so — do not push.
+- Do not use scapy's `sniff()` inside a `TestClient` lifespan test. The sniff thread hangs pytest teardown. Use static source inspection or a fake socket instead.
+
+## Commit style
+
+- Follow the existing log: short imperative subject, `scope:` prefix when obvious (`feat(sniffer):`, `fix(web-ui):`, `test(ssh):`, `chore:`).
+- Run the relevant `pytest` subset before committing. A broken main is worse than a late commit.
+- Never add `Co-Authored-By:` or any Claude / AI attribution trailer.
+- Prefer a new commit over `--amend`. Hooks that fail leave you in a half-state; amending there hides work.
+
+## Related pages
+
+- [[Design-Overview]] — why workers are split out and how embed mode works.
+- [[Writing-a-Service-Plugin]] — step-by-step plugin authoring.
+- [[Database-Drivers]] — the repository pattern in detail.
+- [[Environment-Variables]] — every `DECNET_*` knob.
+- [[INI-Config-Format]] — declarative deploy specs.
diff --git a/Writing-a-Service-Plugin.md b/Writing-a-Service-Plugin.md
new file mode 100644
index 0000000..cd63f28
--- /dev/null
+++ b/Writing-a-Service-Plugin.md
@@ -0,0 +1,174 @@
+# Writing a Service Plugin
+
+A service plugin is what makes a decky look like an SSH box, an SMB share, an MSSQL server, or whatever else. Plugins are auto-discovered from `decnet/services/`. You add a file, you get a service.
+
+For runtime INI-driven custom services (no Python code at all), see [[Custom-Services]] — this page is for first-class plugins baked into the codebase.
+
+## The contract
+
+Every plugin subclasses `BaseService` from `decnet/services/base.py`:
+
+```python
+class BaseService(ABC):
+    name: str                      # unique slug, e.g. "ssh"
+    ports: list[int]               # in-container listen ports
+    default_image: str             # Docker image tag, or "build"
+    fleet_singleton: bool = False  # True = one instance fleet-wide
+
+    @abstractmethod
+    def compose_fragment(
+        self,
+        decky_name: str,
+        log_target: str | None = None,
+        service_cfg: dict | None = None,
+    ) -> dict: ...
+
+    def dockerfile_context(self) -> Path | None:
+        return None
+```
+
+Rules the composer enforces so you do not have to:
+
+- Networking keys (`networks`, `ipv4_address`, `mac_address`) are injected by `decnet/composer.py`. Do not set them in `compose_fragment`.
+- If you return `"build": {"context": ...}`, make sure `dockerfile_context()` returns the same path so `decnet deploy` can pre-build the image.
+- `log_target` is `"ip:port"` when log forwarding is on, else `None`. Pass it into the container as an env var and let the in-container rsyslog bridge handle the rest.
+
+## Registration
+
+There is no registration step. The registry in `decnet/services/registry.py` walks the `decnet/services/` package at import time, imports every module, and picks up every `BaseService` subclass via `__subclasses__()`. Your plugin appears in `decnet services` and in `all_services()` the moment its file exists in the right directory.
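
The discovery mechanism described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the contents of `decnet/services/registry.py`: `import_submodules` and `collect_subclasses` are hypothetical names, and skipping underscore-prefixed modules is a choice made for the sketch.

```python
import importlib
import pkgutil
from types import ModuleType


def import_submodules(package: ModuleType) -> None:
    """Import every public module in a package so its classes get defined."""
    prefix = package.__name__ + "."
    for info in pkgutil.iter_modules(package.__path__, prefix=prefix):
        # Assumption for this sketch: modules starting with "_" are not plugins.
        if info.name.rsplit(".", 1)[-1].startswith("_"):
            continue
        importlib.import_module(info.name)


def collect_subclasses(base: type) -> dict[str, type]:
    """Map each subclass's own `name` slug to the class, including indirect subclasses."""
    found: dict[str, type] = {}
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        stack.extend(cls.__subclasses__())
        # Only count classes that declare `name` themselves, not inherited ones.
        slug = cls.__dict__.get("name")
        if isinstance(slug, str):
            found[slug] = cls
    return found
```

Importing the modules first matters because `__subclasses__()` only reports classes whose defining module has already been executed.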
+
+To verify:
+
+```bash
+decnet services | grep 
+```
+
+## Templates
+
+If your service needs a custom image (almost all do), drop the build context under `templates//`:
+
+```
+templates/myservice/
+  Dockerfile
+  entrypoint.sh
+  config/
+    ...
+```
+
+Conventions the existing plugins follow:
+
+- Base the image on `debian:bookworm-slim` unless you have a reason to diverge. Heterogeneity is good — some services use Alpine, some use CentOS-derived images.
+- Bake an rsyslog or equivalent bridge into the image so the container emits RFC 5424 on stdout.
+- Never write DECNET, honeypot, or decoy strings into the image, banners, MOTDs, config files, or user-agents. See the stealth rule in [[Developer-Guide]].
+
+## A minimal plugin
+
+The smallest real plugin is about 50 lines. This one wraps a pre-built image and needs no Dockerfile:
+
+```python
+# decnet/services/echoecho.py
+from decnet.services.base import BaseService
+
+
+class EchoEchoService(BaseService):
+    """
+    Tiny TCP echo service. Useful as a template and for testing the composer.
+
+    service_cfg keys:
+        greeting   First line sent on connect. Default: empty.
+    """
+
+    name = "echoecho"
+    ports = [7]
+    default_image = "ghcr.io/example/echoecho:1.0"
+    fleet_singleton = False
+
+    def compose_fragment(
+        self,
+        decky_name: str,
+        log_target: str | None = None,
+        service_cfg: dict | None = None,
+    ) -> dict:
+        cfg = service_cfg or {}
+        env: dict = {
+            "NODE_NAME": decky_name,
+            "ECHO_GREETING": cfg.get("greeting", ""),
+        }
+        if log_target:
+            env["SYSLOG_TARGET"] = log_target
+
+        fragment: dict = {
+            "image": self.default_image,
+            "container_name": f"{decky_name}-echoecho",
+            "restart": "unless-stopped",
+            "environment": env,
+        }
+        return fragment
+```
+
+That is the whole plugin. Drop it in `decnet/services/echoecho.py`, run `decnet services`, and it shows up.
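
To see what the plugin above hands to the composer, and how the stealth rule could be checked mechanically, here is a standalone sketch. The `BaseService` stub stands in for `decnet.services.base.BaseService` so the snippet runs without DECNET installed; `leaked_terms`, `FORBIDDEN`, the decky name `decky-01`, and the address `10.0.0.5:5140` are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class BaseService(ABC):
    """Stand-in for decnet.services.base.BaseService (assumption for this sketch)."""

    name: str
    ports: list[int]
    default_image: str

    @abstractmethod
    def compose_fragment(self, decky_name, log_target=None, service_cfg=None) -> dict: ...


class EchoEchoService(BaseService):
    name = "echoecho"
    ports = [7]
    default_image = "ghcr.io/example/echoecho:1.0"

    def compose_fragment(self, decky_name, log_target=None, service_cfg=None) -> dict:
        cfg = service_cfg or {}
        env = {"NODE_NAME": decky_name, "ECHO_GREETING": cfg.get("greeting", "")}
        if log_target:
            env["SYSLOG_TARGET"] = log_target
        return {
            "image": self.default_image,
            "container_name": f"{decky_name}-echoecho",
            "restart": "unless-stopped",
            "environment": env,
        }


# Hypothetical term list; the real stealth rule also covers internal codenames.
FORBIDDEN = ("decnet", "honeypot", "decoy")


def iter_strings(value):
    """Yield every string key and value nested inside dicts, lists, and tuples."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from iter_strings(key)
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def leaked_terms(fragment: dict) -> list[str]:
    """Return the forbidden terms that appear anywhere in the fragment, case-insensitively."""
    text = "\n".join(iter_strings(fragment)).lower()
    return [term for term in FORBIDDEN if term in text]


fragment = EchoEchoService().compose_fragment(
    "decky-01", log_target="10.0.0.5:5140", service_cfg={"greeting": "hello"}
)
```

The fragment carries no networking keys, which matches the contract: those are injected later by the composer. A check along the lines of `leaked_terms(fragment) == []` is one possible shape for the stealth assertion in a plugin's test file.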
+
+## Adding a build context
+
+If you need a custom image, reference `templates//` and implement `dockerfile_context`:
+
+```python
+from pathlib import Path
+from decnet.services.base import BaseService
+
+TEMPLATES_DIR = Path(__file__).parent.parent.parent / "templates" / "echoecho"
+
+
+class EchoEchoService(BaseService):
+    name = "echoecho"
+    ports = [7]
+    default_image = "build"
+
+    def compose_fragment(self, decky_name, log_target=None, service_cfg=None):
+        return {
+            "build": {"context": str(TEMPLATES_DIR)},
+            "container_name": f"{decky_name}-echoecho",
+            "restart": "unless-stopped",
+            "environment": {"NODE_NAME": decky_name},
+        }
+
+    def dockerfile_context(self) -> Path:
+        return TEMPLATES_DIR
+```
+
+Look at `decnet/services/ssh.py` for a fully worked, stealth-aware example including a per-decky quarantine bind-mount.
+
+## Per-service persona config
+
+`service_cfg` is the dict pulled from the matching `[service.]` section of the INI (see [[INI-Config-Format]]). Keep the keys documented in the class docstring — that docstring is the only user-facing reference.
+
+## Pytest coverage
+
+Every plugin ships with tests. Drop them under `tests/service_testing/test_.py`. Cover at minimum:
+
+- Instantiation + registry lookup: `all_services()["echoecho"]` resolves.
+- `compose_fragment` returns the expected keys for a given `decky_name` and `service_cfg`.
+- Absence of DECNET / honeypot strings in rendered env, command, and template files — this is the stealth rule made executable.
+- If `dockerfile_context()` is set, that the path exists and contains a `Dockerfile`.
+
+Run `pytest tests/service_testing -q` before committing. Features without tests do not land — see [[Developer-Guide]].
+
+## Checklist
+
+- [ ] New file under `decnet/services/.py`, subclasses `BaseService`.
+- [ ] `name`, `ports`, `default_image` set. `fleet_singleton` if applicable.
+- [ ] `compose_fragment` returns networking-free compose dict.
+- [ ] If `default_image == "build"`, `dockerfile_context()` returns the context path.
+- [ ] `templates//` exists with a Dockerfile (if building).
+- [ ] No DECNET / honeypot / decoy strings anywhere the attacker can see.
+- [ ] `service_cfg` keys documented in the class docstring.
+- [ ] Pytest coverage under `tests/service_testing/`.
+- [ ] `decnet services` lists the new slug.
+- [ ] Commit follows the style in [[Developer-Guide]].
+
+## Related pages
+
+- [[Developer-Guide]] — conventions, DI rules, commit style.
+- [[Custom-Services]] — declarative INI-only services.
+- [[INI-Config-Format]] — the deploy spec format.
+- [[Design-Overview]] — where plugins fit in the larger picture.