wiki: merge env-state into main
190
Environment-Variables.md
Normal file
@@ -0,0 +1,190 @@
# Environment Variables

DECNET reads configuration from the process environment. On import, `decnet/env.py`
loads `.env.local` first (preferred, git-ignored), then `.env` from the project
root. Any variable already present in the shell environment wins over both
files.

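The precedence described above can be sketched with the standard library alone. This is an illustration, not the code in `decnet/env.py`: the helper names `parse_env_file` and `load_env` are hypothetical, the parser handles only plain `KEY=VALUE` lines, and it assumes that the first value seen for a key wins (shell, then `.env.local`, then `.env`).

```python
from pathlib import Path
from typing import Dict, MutableMapping


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse plain KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_env(root: Path, environ: MutableMapping[str, str]) -> None:
    """Apply .env.local, then .env, never overwriting an existing key."""
    for name in (".env.local", ".env"):
        for key, value in parse_env_file(root / name).items():
            # setdefault keeps the earliest value: the shell's, then .env.local's.
            environ.setdefault(key, value)
```

Calling `load_env(Path("."), os.environ)` after `import os` would apply the same ordering to the real process environment.
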
Only the variables listed below are recognised. Anything else is noise.

- Source of truth: [`decnet/env.py`](https://git.resacachile.cl/anti/DECNET/src/branch/main/decnet/env.py)
- Starter template: [`env.config.example`](https://git.resacachile.cl/anti/DECNET/src/branch/main/env.config.example)

See also: [DB drivers](Database-Drivers), [Logging](Logging-and-Syslog),
[Systemd](Systemd-Setup), [Tracing](Tracing-and-Profiling).

## Validation rules

Two validators live in `decnet/env.py`:

- `_port(name, default)`: integer in `[1, 65535]`. Applies to
  `DECNET_API_PORT`, `DECNET_WEB_PORT`, `DECNET_DB_PORT`.
- `_require_env(name)`: variable must be set, and must not be a known-bad
  default. Under pytest (`PYTEST*` env var present) the bad-value check is
  skipped so test fixtures can use sentinel values.

### Known-bad-values block list

`_require_env` rejects these case-insensitive literals:

- `admin`
- `secret`
- `password`
- `changeme`
- `fallback-secret-key-change-me`

### JWT secret length rule

When `name == "DECNET_JWT_SECRET"`, the value must be at least **32 bytes**.
This matches HS256's minimum key length of 256 bits, the size of the SHA-256
output (RFC 7518 §3.2: "A key of the same size as the hash output [...] or
larger MUST be used"). The length check is relaxed when `DECNET_DEVELOPER=true`.

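As a rough sketch of how these rules fit together, the two validators might look like the following. The function names `check_port` and `check_required`, the explicit `environ` parameter, and the error messages are assumptions made for illustration; the real `_port` and `_require_env` in `decnet/env.py` may differ in signature and behaviour.

```python
import os
from typing import Mapping

# Block list copied from the section above; compared case-insensitively.
KNOWN_BAD = {"admin", "secret", "password", "changeme", "fallback-secret-key-change-me"}


def check_port(name: str, default: int, environ: Mapping[str, str] = os.environ) -> int:
    """Return the port from the environment, or the default when unset."""
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if not 1 <= value <= 65535:
        raise ValueError(f"{name} must be in [1, 65535], got {value}")
    return value


def check_required(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Require a value, reject known-bad defaults, enforce the JWT length rule."""
    value = environ.get(name)
    if not value:
        raise ValueError(f"{name} must be set")
    # Any PYTEST* variable counts as "running under pytest".
    under_pytest = any(key.startswith("PYTEST") for key in environ)
    if not under_pytest and value.lower() in KNOWN_BAD:
        raise ValueError(f"{name} is set to a known-bad default")
    developer = environ.get("DECNET_DEVELOPER", "").lower() == "true"
    if name == "DECNET_JWT_SECRET" and not developer and len(value.encode("utf-8")) < 32:
        raise ValueError(f"{name} must be at least 32 bytes")
    return value
```

Passing the environment as a parameter is a choice made for this sketch so the rules can be exercised with plain dictionaries.
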
## System logging

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_SYSTEM_LOGS` | path | `decnet.system.log` | No | Destination for the RFC 5424 `RotatingFileHandler` installed by `decnet/config.py`. All microservice daemons (api, sniffer, profiler, collector) append here. Skipped under pytest. |

## Embedded workers

These are escape hatches; leave them unset in normal deployments. `decnet
deploy` always spawns standalone daemons, and embedding the same worker inside
the API duplicates DB writes and sniffer packets.

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_EMBED_PROFILER` | bool (`true`/other) | `false` | No | Embed profiler in API process. Do not combine with `decnet profiler --daemon`. |
| `DECNET_EMBED_SNIFFER` | bool | `false` | No | Embed MACVLAN sniffer in API process. Do not combine with `decnet sniffer --daemon`. |

## Request profiling (Pyinstrument)

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_PROFILE_REQUESTS` | bool | `false` | No | Mount Pyinstrument ASGI middleware on the FastAPI app. Writes per-request HTML flamegraphs. |
| `DECNET_PROFILE_DIR` | path | `profiles` | No | Output directory for flamegraphs. Relative paths are relative to `$PWD`. |

## API server

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_API_HOST` | str | `127.0.0.1` | No | Bind address for the FastAPI server. |
| `DECNET_API_PORT` | int (1–65535) | `8000` | No | TCP port for the API. |
| `DECNET_JWT_SECRET` | str (≥32 chars) | none | **Yes** | HS256 signing secret. Missing, known-bad, or short values abort startup. `DECNET_DEVELOPER=true` relaxes only the length check; known-bad values are still rejected. |
| `DECNET_INGEST_LOG_FILE` | path | `/var/log/decnet/decnet.log` | No | File the ingester tails for honeypot events. |

## Ingester batching

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_BATCH_SIZE` | int | `100` | No | Rows accumulated per DB commit. Larger batches reduce SQLite write-lock contention. |
| `DECNET_BATCH_MAX_WAIT_MS` | int | `250` | No | Maximum milliseconds to wait before flushing a partial batch. Bounds latency during idle periods. |

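The interaction between the two batching knobs can be illustrated with a small, self-contained sketch. The `Batcher` class below is hypothetical and is not the ingester's implementation; it only shows the usual pattern where a batch is flushed either when it reaches the size limit or when its oldest row has waited longer than the maximum wait.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class Batcher:
    """Collect rows and hand them back as a batch on size or on timeout."""

    def __init__(self, batch_size: int = 100, max_wait_ms: int = 250,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.batch_size = batch_size
        self.max_wait_s = max_wait_ms / 1000
        self.clock = clock
        self.rows: list[Any] = []
        self.first_row_at = 0.0

    def add(self, row: Any) -> list[Any] | None:
        """Queue a row; return a full batch once batch_size is reached."""
        if not self.rows:
            self.first_row_at = self.clock()  # start the wait timer
        self.rows.append(row)
        return self.flush() if len(self.rows) >= self.batch_size else None

    def poll(self) -> list[Any] | None:
        """Return a partial batch once its oldest row has waited max_wait_ms."""
        if self.rows and self.clock() - self.first_row_at >= self.max_wait_s:
            return self.flush()
        return None

    def flush(self) -> list[Any]:
        """Hand back whatever is queued and reset the buffer."""
        batch, self.rows = self.rows, []
        return batch
```

A caller would commit each returned batch in one transaction and call `poll()` periodically so that a quiet log still gets flushed.
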
## Web dashboard

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_WEB_HOST` | str | `127.0.0.1` | No | Bind address for the web dashboard. |
| `DECNET_WEB_PORT` | int (1–65535) | `8080` | No | Web dashboard port. |
| `DECNET_ADMIN_USER` | str | `admin` | No* | Admin login. `admin` is a known-bad default and is rejected at startup outside pytest. |
| `DECNET_ADMIN_PASSWORD` | str | `admin` | No* | Admin password. Rejected if set to a known-bad value. Change both. |
| `DECNET_DEVELOPER` | bool | `false` | No | `true` enables DEBUG logging and relaxes the JWT length check. Does not enable tracing. |

*The defaults exist so imports do not crash, but the web API refuses to start
with them in non-pytest environments.

## Tracing (OpenTelemetry)

Tracing is controlled independently of `DECNET_DEVELOPER`, so it can be toggled on its own.

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_DEVELOPER_TRACING` | bool | `false` | No | Enable OpenTelemetry tracing for the API and workers. |
| `DECNET_OTEL_ENDPOINT` | URL | `http://localhost:4317` | No | OTLP gRPC collector endpoint. |

See [Tracing and Profiling](Tracing-and-Profiling).

## Database

See [Database Drivers](Database-Drivers) for the full driver matrix.

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_DB_TYPE` | `sqlite` \| `mysql` | `sqlite` | No | Selects the repository subclass. Lower-cased automatically. |
| `DECNET_DB_URL` | SQLAlchemy URL | unset | No | Full URL, e.g. `mysql+asyncmy://user:pass@host:3306/decnet`. **When set, all component vars below are ignored.** |
| `DECNET_DB_HOST` | str | `localhost` | No | MySQL host. |
| `DECNET_DB_PORT` | int (1–65535) | `3306` | No | MySQL port. Validated only when explicitly set. |
| `DECNET_DB_NAME` | str | `decnet` | No | Database name. |
| `DECNET_DB_USER` | str | `decnet` | No | DB user. |
| `DECNET_DB_PASSWORD` | str | unset | No | DB password. `None` when unset. |

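The rule that `DECNET_DB_URL` overrides the component variables can be sketched as follows. The function name `mysql_url` is hypothetical, the `mysql+asyncmy` scheme is taken from the example URL in the table, and percent-encoding the credentials is an assumption of this sketch rather than documented DECNET behaviour.

```python
from typing import Mapping
from urllib.parse import quote


def mysql_url(environ: Mapping[str, str]) -> str:
    """Return DECNET_DB_URL when set, else build a URL from the components."""
    explicit = environ.get("DECNET_DB_URL")
    if explicit:
        return explicit  # component variables are ignored entirely

    host = environ.get("DECNET_DB_HOST", "localhost")
    port = int(environ.get("DECNET_DB_PORT", "3306"))
    name = environ.get("DECNET_DB_NAME", "decnet")
    user = environ.get("DECNET_DB_USER", "decnet")
    password = environ.get("DECNET_DB_PASSWORD")  # None when unset

    # Percent-encode credentials so characters like '@' or '/' stay unambiguous.
    auth = quote(user, safe="")
    if password is not None:
        auth += ":" + quote(password, safe="")
    return f"mysql+asyncmy://{auth}@{host}:{port}/{name}"
```
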
## CORS

| Name | Type | Default | Required | Consequence |
|------|------|---------|----------|-------------|
| `DECNET_CORS_ORIGINS` | CSV of URLs | `http://<web_host>:<web_port>` | No | Allowed origins for the dashboard API. Wildcard and loopback bind addresses (`0.0.0.0`, `127.0.0.1`, `::`) resolve to `localhost` in the default. |

Example override:

```bash
DECNET_CORS_ORIGINS=http://192.168.1.50:9090,https://dashboard.example.com
```

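A minimal sketch of how the default and the override might be resolved, assuming the behaviour described in the table. The function names are hypothetical, and the real logic in `decnet/env.py` may differ.

```python
from typing import List, Mapping

# Bind addresses the table lists as resolving to localhost in the default.
LOCAL_BIND_ADDRESSES = {"0.0.0.0", "127.0.0.1", "::"}


def default_cors_origins(web_host: str, web_port: int) -> List[str]:
    """Build the single default origin from the dashboard bind address."""
    host = "localhost" if web_host in LOCAL_BIND_ADDRESSES else web_host
    return [f"http://{host}:{web_port}"]


def cors_origins(environ: Mapping[str, str]) -> List[str]:
    """Use the CSV override when present, else fall back to the default."""
    raw = environ.get("DECNET_CORS_ORIGINS")
    if raw:
        return [origin.strip() for origin in raw.split(",") if origin.strip()]
    return default_cors_origins(
        environ.get("DECNET_WEB_HOST", "127.0.0.1"),
        int(environ.get("DECNET_WEB_PORT", "8080")),
    )
```
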
## Starter `.env.local`

Copy this to the project root as `.env.local`, change every placeholder, and
keep it out of git.

```bash
# System logging
DECNET_SYSTEM_LOGS=decnet.system.log

# Embedded workers (leave off unless you know why)
DECNET_EMBED_PROFILER=false
DECNET_EMBED_SNIFFER=false

# Request profiling
DECNET_PROFILE_REQUESTS=false
DECNET_PROFILE_DIR=profiles

# API
DECNET_API_HOST=127.0.0.1
DECNET_API_PORT=8000
# Generate with: python -c 'import secrets; print(secrets.token_urlsafe(48))'
DECNET_JWT_SECRET=REPLACE_WITH_A_64_BYTE_URLSAFE_TOKEN_NOT_IN_THE_BAD_LIST
DECNET_INGEST_LOG_FILE=/var/log/decnet/decnet.log

# Ingester batching
DECNET_BATCH_SIZE=100
DECNET_BATCH_MAX_WAIT_MS=250

# Web dashboard
DECNET_WEB_HOST=127.0.0.1
DECNET_WEB_PORT=8080
DECNET_ADMIN_USER=anti
DECNET_ADMIN_PASSWORD=REPLACE_ME_WITH_A_LONG_PASSPHRASE
DECNET_DEVELOPER=false

# Tracing
DECNET_DEVELOPER_TRACING=false
DECNET_OTEL_ENDPOINT=http://localhost:4317

# Database (sqlite is the default; uncomment the mysql block to switch)
DECNET_DB_TYPE=sqlite
# DECNET_DB_TYPE=mysql
# DECNET_DB_URL=mysql+asyncmy://decnet:REPLACE_ME@db.internal:3306/decnet
# DECNET_DB_HOST=db.internal
# DECNET_DB_PORT=3306
# DECNET_DB_NAME=decnet
# DECNET_DB_USER=decnet
# DECNET_DB_PASSWORD=REPLACE_ME

# CORS (only needed when the browser is not on the same host:port as the API)
# DECNET_CORS_ORIGINS=http://192.168.1.50:9090,https://dashboard.example.com
```

## Notes

`decnet/config.py` re-reads `DECNET_DEVELOPER` and `DECNET_SYSTEM_LOGS` during
logging setup. Those are the same variables documented above; there are no
others.
177
Teardown-and-State.md
Normal file
@@ -0,0 +1,177 @@
# Teardown and State

DECNET keeps the whole fleet picture in a single file, `decnet-state.json`,
at the project root. Every command that touches a running deployment
(`decnet status`, `decnet teardown`, the web dashboard, the sniffer, the
collector) loads it; `decnet deploy` writes it.

Without this file, teardown cannot find the compose project, the sniffer
cannot map IPs to deckies, and the collector does not know which containers
to tail.

See also: [Environment Variables](Environment-Variables),
[Database Drivers](Database-Drivers), [Systemd](Systemd-Setup).

## Layout

`decnet-state.json` has exactly two top-level keys:

```json
{
  "config": { ... DecnetConfig.model_dump() ... },
  "compose_path": "/absolute/path/to/decnet-compose.yml"
}
```

- `config`: the serialised `DecnetConfig` pydantic model
  (`decnet/models.py`): `mode`, `interface`, `subnet`, `gateway`, `ipvlan`,
  `mutate_interval`, `log_file`, and the full `deckies[]` list. Each
  `DeckyConfig` entry carries name, IP, services, distro, base image,
  hostname, archetype, per-service config, `nmap_os`, and rotation timestamps.
- `compose_path`: absolute path to the generated
  `decnet-compose.yml`. Teardown uses it as the `-f` argument to
  `docker compose`.

### Example `decnet-state.json`

```json
{
  "config": {
    "mode": "unihost",
    "interface": "eth0",
    "subnet": "192.168.1.0/24",
    "gateway": "192.168.1.1",
    "ipvlan": false,
    "mutate_interval": 30,
    "log_file": "/var/log/decnet/decnet.log",
    "deckies": [
      {
        "name": "decky-01",
        "ip": "192.168.1.201",
        "services": ["ssh", "smb"],
        "distro": "debian",
        "base_image": "debian:bookworm-slim",
        "build_base": "debian:bookworm-slim",
        "hostname": "fileserver-02",
        "archetype": "office-fileshare",
        "service_config": {},
        "nmap_os": "linux",
        "mutate_interval": null,
        "last_mutated": 0.0,
        "last_login_attempt": 0.0
      },
      {
        "name": "decky-02",
        "ip": "192.168.1.202",
        "services": ["rdp"],
        "distro": "ubuntu22",
        "base_image": "ubuntu:22.04",
        "build_base": "debian:bookworm-slim",
        "hostname": "WIN-DESK01",
        "archetype": null,
        "service_config": {},
        "nmap_os": "windows",
        "mutate_interval": null,
        "last_mutated": 0.0,
        "last_login_attempt": 0.0
      }
    ]
  },
  "compose_path": "/home/anti/Tools/DECNET/decnet-compose.yml"
}
```

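To show how a consumer such as the sniffer could use this layout, here is a small sketch that builds an IP-to-decky-name map from the raw JSON text. The function name `ip_to_decky` is hypothetical, and the sketch reads plain dictionaries rather than re-hydrating `DecnetConfig`.

```python
import json
from typing import Dict


def ip_to_decky(state_text: str) -> Dict[str, str]:
    """Map each decky IP to its name using the "config.deckies" list."""
    state = json.loads(state_text)
    return {decky["ip"]: decky["name"] for decky in state["config"]["deckies"]}
```

With the example file above, the result would map `192.168.1.201` to `decky-01` and `192.168.1.202` to `decky-02`.
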
## API

All three helpers live in `decnet/config.py`:

### `save_state(config: DecnetConfig, compose_path: Path) -> None`

Dumps `{"config": config.model_dump(), "compose_path": str(compose_path)}`
as pretty-printed JSON (`indent=2`) to `STATE_FILE`
(`<project root>/decnet-state.json`). Overwrites any existing file.

Called by `decnet/engine/deployer.py::deploy` after the compose file is
written and before `docker compose up`.

### `load_state() -> tuple[DecnetConfig, Path] | None`

Returns `None` when the file does not exist. Otherwise parses the JSON,
re-hydrates `DecnetConfig`, and returns `(config, Path(compose_path))`.

Callers:

- `decnet/engine/deployer.py`: `teardown()` and `status()`.
- `decnet/sniffer/worker.py`: builds the IP-to-decky-name map.
- `decnet/collector/worker.py`: resolves the exact set of service container
  names to tail. Wrapped in `asyncio.to_thread()` to keep the event loop clean.
- `decnet/web/db/sqlmodel_repo.py`: uses `asyncio.to_thread(load_state)` to
  surface deployment metadata through the dashboard API.

### `clear_state() -> None`

`unlink()`s the state file if present. A no-op otherwise. Called once by
`teardown()` after `docker compose down` and host-interface cleanup succeed.

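The three helpers can be approximated with the standard library. This is a simplified sketch under stated assumptions: it takes a plain `dict` where the real code takes a `DecnetConfig`, and it passes the state file path explicitly (the `state_file` parameter is hypothetical) instead of using the module-level `STATE_FILE`.

```python
from __future__ import annotations

import json
from pathlib import Path


def save_state(config: dict, compose_path: Path, state_file: Path) -> None:
    """Write both top-level keys as pretty-printed JSON, replacing any old file."""
    payload = {"config": config, "compose_path": str(compose_path)}
    state_file.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def load_state(state_file: Path) -> tuple[dict, Path] | None:
    """Return (config, compose_path), or None when no deployment is recorded."""
    if not state_file.exists():
        return None
    data = json.loads(state_file.read_text(encoding="utf-8"))
    return data["config"], Path(data["compose_path"])


def clear_state(state_file: Path) -> None:
    """Delete the state file; do nothing when it is already gone."""
    state_file.unlink(missing_ok=True)
```

Async callers could wrap the blocking read the same way the section describes, for example `await asyncio.to_thread(load_state, state_file)`.
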
## How teardown cleans host interfaces

`decnet/engine/deployer.py::teardown(decky_id=None)` runs, in order:

1. `load_state()`. If it returns `None`, prints
   `No active deployment found (no decnet-state.json).` and exits.
2. If `decky_id` is given, `docker compose stop <decky>-<svc>...` then
   `docker compose rm -f ...` for that decky only. **No host-interface
   cleanup and no state clear**, because the rest of the fleet is still alive.
3. If no `decky_id` (full teardown):
   1. `docker compose down` with the `compose_path` from state.
   2. Compute the decky IP range with `ips_to_range([d.ip for d in config.deckies])`.
   3. Remove the host-side L2 interface:
      - `teardown_host_ipvlan(decky_range)` when `config.ipvlan` is `true`, or
      - `teardown_host_macvlan(decky_range)` otherwise.
   4. `remove_macvlan_network(client)` drops the docker network.
   5. `clear_state()` deletes `decnet-state.json`.
   6. Logs `teardown complete` and prints the driver that was removed.

If step 3 does not run to completion (you pressed Ctrl-C, or one of the
subprocess calls errored), `decnet-state.json` stays on disk and the host
interfaces may too. Re-running `sudo decnet teardown --all` is idempotent and safe.

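The ordering of the full-teardown path explains why a failed run leaves the state file behind. The sketch below is illustrative only: `full_teardown` and its callable parameters are hypothetical stand-ins for the real Docker and `ip link` operations, and it passes the raw IP list where the real code first computes a range with `ips_to_range`.

```python
from typing import Callable, List, Optional, Tuple


def full_teardown(
    state: Optional[Tuple[dict, str]],
    *,
    compose_down: Callable[[str], None],
    remove_host_iface: Callable[[str, List[str]], None],
    remove_network: Callable[[], None],
    clear_state: Callable[[], None],
) -> bool:
    """Run the full-teardown steps in order; any exception stops the sequence."""
    if state is None:
        print("No active deployment found (no decnet-state.json).")
        return False
    config, compose_path = state
    compose_down(compose_path)
    driver = "ipvlan" if config["ipvlan"] else "macvlan"
    remove_host_iface(driver, [decky["ip"] for decky in config["deckies"]])
    remove_network()
    # Reached only when every step above succeeded, so a failure keeps the
    # state file on disk and a later re-run can pick up where this one stopped.
    clear_state()
    return True
```

Clearing state last is what makes a retry possible: as long as the file exists, the next run still knows the compose path and the decky IPs.
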
## When you need `sudo`

Anything that touches host networking (creating or removing a MACVLAN /
IPvlan parent interface, opening a raw socket for the sniffer) needs
`CAP_NET_ADMIN` (or `CAP_NET_RAW` for the raw socket), which in practice means `sudo`:

- `sudo decnet deploy ...`: creates the host interface, writes
  `decnet-state.json`, brings up the compose project.
- `sudo decnet teardown` / `sudo decnet teardown --all`: removes host
  interfaces, clears state. Without `sudo` the ip-link calls fail and the
  state file is left behind.
- `sudo decnet teardown --id decky-01`: still needs `sudo` if the compose
  project was created by root.
- `sudo decnet sniffer --daemon`: raw packet capture on the parent iface.

Read-only commands that only consult `decnet-state.json` and the dashboard
DB do not need root:

- `decnet status`
- `decnet services`
- `decnet deploy --dry-run` (generates the compose file only)
- `decnet api` / `decnet web` once the deployment is up, as long as the
  state file and `DECNET_SYSTEM_LOGS` are readable by the invoking user.
  `decnet/config.py` drops root ownership of the system log when invoked via
  `sudo` precisely so the follow-up non-root commands can append to it.

## Troubleshooting

- **`No active deployment found`**: `decnet-state.json` is missing.
  Either the deploy never completed, or a previous teardown already ran.
- **Orphan host interfaces after a crash**: re-run
  `sudo decnet teardown --all`. If state is gone, remove them manually with
  `ip link del decnet-mv0` (or the ipvlan equivalent) and delete the docker
  network.
- **`PermissionError` writing the state file**: you ran `decnet deploy`
  without `sudo` on a fresh checkout; the project root is not writable by
  the current user. Either `chmod` the directory or run as root.
- **Stale `compose_path`**: moving the project directory after deploy
  breaks teardown. Tear down first, move, redeploy.