wiki: merge logging-syslog into main

2026-04-18 06:09:32 -04:00
2 changed files with 254 additions and 0 deletions

Logging-and-Syslog.md Normal file

@@ -0,0 +1,152 @@
# Logging and Syslog
DECNET speaks RFC 5424 everywhere. Every control-plane log line, every decky
honeypot event, and every forwarded message uses the same wire format so a
single parser (Logstash, rsyslog, the bundled ingester) can consume it end to
end.
## RFC 5424 formatter
The control-plane formatter lives in `decnet/config.py`
(`Rfc5424Formatter`). Its output is:
```
<PRIVAL>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
```
Field rules:
- `PRIVAL` = `facility * 8 + severity`. Facility is fixed at **local0**
(16), so every line begins with `<13x>`, where `x` follows from the
severity: `<130>` for CRITICAL through `<135>` for DEBUG.
- Version is always `1`.
- Timestamp is the record's `created` time, rendered as an ISO-8601 UTC
string with **microsecond** precision
(`.isoformat(timespec="microseconds")`).
- `HOSTNAME` is captured once at import time via `socket.gethostname()`.
- `APP-NAME` defaults to `decnet`, overridable per record via the
`decnet_component` attribute (set by the `ComponentAdapter` in
`decnet/logging/__init__.py`).
- `PROCID` is the live `os.getpid()`.
- `MSGID` is the Python logger name (e.g. `decnet.config`).
- `STRUCTURED-DATA` is the NILVALUE `-` for control-plane logs.
- `MSG` is the formatted record, with exception tracebacks appended on a
newline when present.
### Severity map
| Python level | RFC 5424 severity |
|--------------|-------------------|
| CRITICAL | 2 (Critical) |
| ERROR | 3 (Error) |
| WARNING | 4 (Warning) |
| INFO | 6 (Informational) |
| DEBUG | 7 (Debug) |
### Example line
```
<134>1 2026-04-12T21:48:03.123456+00:00 host decnet 1234 decnet.config - Dev mode active
```
`134` decodes as facility 16 (local0) × 8 + severity 6 (INFO).
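The PRIVAL arithmetic can be checked with a few lines of Python. This is a minimal sketch built only from the field rules and the severity map above; the names `prival` and `decode_prival` are illustrative and are not taken from `decnet/config.py`.

```python
import logging

LOCAL0 = 16  # syslog facility code for local0

# Python level -> RFC 5424 severity, mirroring the severity map above.
SEVERITY = {
    logging.CRITICAL: 2,
    logging.ERROR: 3,
    logging.WARNING: 4,
    logging.INFO: 6,
    logging.DEBUG: 7,
}


def prival(levelno, facility=LOCAL0):
    """Return facility * 8 + severity for a Python log level."""
    return facility * 8 + SEVERITY[levelno]


def decode_prival(value):
    """Split a PRIVAL back into (facility, severity)."""
    return divmod(value, 8)
```

`prival(logging.INFO)` gives 134, and `decode_prival(134)` gives `(16, 6)`, matching the example line.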
## Handlers installed by `_configure_logging`
`decnet/config.py::_configure_logging(dev)` runs at import time, gated on
`DECNET_DEVELOPER`. It is idempotent — if an RFC 5424 `StreamHandler` is
already on the root logger, it returns.
Installed handlers:
1. A stderr `StreamHandler` with `Rfc5424Formatter`. Root level is
`DEBUG` when `dev=True`, otherwise `INFO`.
2. An `InodeAwareRotatingFileHandler`
(`decnet/logging/inode_aware_handler.py`) pointed at
`DECNET_SYSTEM_LOGS` (default `decnet.system.log` in `$PWD`),
`maxBytes=10 MB`, `backupCount=5`, `encoding="utf-8"`. Skipped when any
`PYTEST*` environment variable is set.
`InodeAwareRotatingFileHandler` extends the stdlib `RotatingFileHandler`
with a cheap `os.stat` on every emit: if the file's `(st_ino, st_dev)`
differ from the held fd, the handler closes and reopens. This survives
`logrotate` (without copytruncate), `rm`, and sudo-induced ownership
flips without losing lines, and it falls back to `handleError` rather
than crashing if it cannot reopen.
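The inode comparison can be sketched as follows. This is an illustration of the technique on a POSIX filesystem, not the handler's actual code; `file_was_replaced` is a hypothetical helper name.

```python
import os


def file_was_replaced(stream, path):
    """Return True when `path` no longer refers to the file behind `stream`."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # The path is gone (for example after `rm`), so a reopen is needed.
        return True
    held = os.fstat(stream.fileno())
    # Compare identity of the file on disk with the file behind the open fd.
    return (on_disk.st_ino, on_disk.st_dev) != (held.st_ino, held.st_dev)
```

A handler built on this check would close and reopen its stream before writing whenever the function returns `True`.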
## Root-chown under sudo
When deploy runs as root (required for MACVLAN/IPVLAN), the log file is
created root-owned. `decnet/privdrop.py::chown_to_invoking_user` is
called right after the file handler is wired up in `_configure_logging`
— it honours `SUDO_UID` / `SUDO_GID` and hands the file back to the
invoking user so a subsequent non-root `decnet api` or `decnet status`
can still append. The honeypot log file handler
(`decnet/logging/file_handler.py`) additionally calls
`chown_tree_to_invoking_user` on the log directory.
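A hedged sketch of the ownership hand-back, assuming only what is described above (read `SUDO_UID` / `SUDO_GID`, then chown). The helper names `sudo_invoker_ids` and `chown_to_invoker` are hypothetical and the real `decnet/privdrop.py` may differ in its checks and error handling.

```python
import os


def sudo_invoker_ids(environ=os.environ):
    """Return (uid, gid) of the user who ran sudo, or None when not under sudo."""
    try:
        return int(environ["SUDO_UID"]), int(environ["SUDO_GID"])
    except (KeyError, ValueError):
        return None


def chown_to_invoker(path, environ=os.environ):
    """Hand `path` back to the sudo invoker; return True if a chown happened."""
    ids = sudo_invoker_ids(environ)
    # Only root can give a file away, so do nothing in any other case.
    if ids is None or os.geteuid() != 0:
        return False
    os.chown(path, ids[0], ids[1])
    return True
```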
## `--log-target HOST:PORT` forwarding
Service plugins emit directly to the aggregator; DECNET itself stays
agnostic about what listens. `decnet/logging/forwarder.py` exposes:
- `parse_log_target("ip:port") -> (host, port)` — rejects anything that
doesn't split on a trailing `:port` with a digit-only port.
- `probe_log_target(log_target, timeout=2.0)` — a non-fatal TCP connect
used at deploy time to warn the operator if the target is unreachable.
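The parsing rule can be illustrated like this. It is a sketch of the stated behaviour (split on the trailing `:port`, require a digit-only port), not a copy of `forwarder.py`; the name `parse_target_sketch` and the error messages are assumptions.

```python
def parse_target_sketch(value):
    """Split 'host:port' on the last colon and validate the port."""
    host, sep, port = value.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected HOST:PORT, got {value!r}")
    # isascii() guards against non-ASCII digits that int() would reject.
    if not (port.isascii() and port.isdigit()):
        raise ValueError(f"port must be digits only, got {port!r}")
    return host, int(port)
```

`parse_target_sketch("10.0.0.5:5140")` returns `("10.0.0.5", 5140)`, while `"10.0.0.5"`, `"host:"`, and `"host:abc"` all raise `ValueError`.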
The CLI accepts the value and injects it into every service plugin's
`compose_fragment(...)` as the `log_target` kwarg. Plugins (see
`decnet/services/base.py` and concrete services like `ssh.py`,
`smtp.py`, `sniffer.py`, `mysql.py`, `redis.py`,
`elasticsearch.py`, `https.py`) add it to the container's environment as
`LOG_TARGET=ip:port`. The in-container emit helpers (for example
`templates/ssh/emit_capture.py`) read `LOG_TARGET` and write a
structured-data RFC 5424 line per event. The per-service formatter lives
in `decnet/logging/syslog_formatter.py::format_rfc5424`, which uses
facility `local0`, PEN `relay@55555` for structured data, and escapes
SD-PARAM-VALUE per RFC 5424 §6.3.3.
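RFC 5424 §6.3.3 requires `"`, `\`, and `]` inside an SD-PARAM-VALUE to be escaped with a backslash. The sketch below shows that rule in isolation; `escape_sd_value` and `sd_element` are illustrative names, and the parameter names passed in are made up for the example rather than taken from `syslog_formatter.py`.

```python
def escape_sd_value(value):
    """Escape backslash, double quote, and ']' per RFC 5424 section 6.3.3."""
    # Backslash must be handled first so later escapes are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")


def sd_element(sd_id, params):
    """Render one structured-data element such as [relay@55555 k="v"]."""
    if not params:
        return f"[{sd_id}]"
    body = " ".join(f'{k}="{escape_sd_value(str(v))}"' for k, v in params.items())
    return f"[{sd_id} {body}]"
```

For example, `sd_element("relay@55555", {"user": 'ro"ot'})` renders as `[relay@55555 user="ro\"ot"]`.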
Deckies typically emit to two destinations simultaneously: a local
rotating file (`decnet/logging/file_handler.py`, default
`/var/log/decnet/decnet.log`, same 10 MB × 5 rotation, controlled by
`DECNET_LOG_FILE`) and the remote `LOG_TARGET` when set.
## Ingestion pipeline
`decnet/web/ingester.py::log_ingestion_worker` is a FastAPI background
task that tails the JSON sidecar of the honeypot log
(`<DECNET_INGEST_LOG_FILE>.json`) and bulk-inserts rows into the
repository.
Batching is governed by two env vars (see
[Environment variables](Environment-Variables)):
- `DECNET_BATCH_SIZE` (default `100`) — flush when the accumulated
batch hits this row count.
- `DECNET_BATCH_MAX_WAIT_MS` (default `250`) — flush when this many
milliseconds have passed since the batch started, even if smaller.
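The two flush conditions combine as "whichever comes first". A minimal sketch, assuming an in-memory list as the batch and an injectable clock for testing; the `Batcher` class is hypothetical and is not how `ingester.py` is necessarily structured.

```python
import time


class Batcher:
    """Accumulate rows and report when a size or age limit is reached."""

    def __init__(self, max_rows=100, max_wait_ms=250, clock=time.monotonic):
        self.max_rows = max_rows
        self.max_wait_ms = max_wait_ms
        self.clock = clock
        self.rows = []
        self.started = 0.0

    def add(self, row):
        if not self.rows:
            # The wait window starts with the first row of a batch.
            self.started = self.clock()
        self.rows.append(row)

    def should_flush(self):
        if not self.rows:
            return False
        if len(self.rows) >= self.max_rows:
            return True
        return (self.clock() - self.started) * 1000 >= self.max_wait_ms

    def drain(self):
        """Return the accumulated rows and start a fresh batch."""
        rows, self.rows = self.rows, []
        return rows
```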
The worker persists its byte-offset in the repository under
`ingest_worker_position`, so restarts resume where they left off.
Partial trailing lines are deferred to the next iteration. When truncation
is detected (`st_size < position`), the offset resets to 0. Each record
gets an OpenTelemetry child span chained off the collector's span via
`extract_context`, so the full journey from packet capture to DB insert is
visible in Jaeger; see [Tracing and profiling](Tracing-and-Profiling).
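The offset handling (resume from a saved position, reset on truncation, defer a partial trailing line) can be sketched as a single function. This is an illustration under the assumption of newline-delimited records; `read_complete_lines` is a hypothetical name, and persisting the returned position is left to the caller.

```python
import os


def read_complete_lines(path, position):
    """Return (complete_lines, new_position) starting at byte `position`."""
    if os.stat(path).st_size < position:
        # The file shrank, so it was truncated or replaced: start over.
        position = 0
    with open(path, "rb") as fh:
        fh.seek(position)
        chunk = fh.read()
    # Keep only bytes up to the last newline; a partial tail waits for next time.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].split(b"\n")[:-1]
    return lines, position + end
```

Because the new position only advances past complete lines, a half-written record is re-read in full on the next call.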
`_flush_batch` commits the rows via `repo.add_logs(...)`, runs
`_extract_bounty` on each entry (credentials, JA3/JA3S/JA4, JARM,
HASSHServer, TCP/IP fingerprints, TLS certificates, VNC/SSH banners,
HTTP User-Agents), and finally updates the saved position. If the task
is being cancelled during lifespan teardown, it bails out before
touching the DB so the un-committed tail is re-read next startup
instead of being lost.
## From DB to dashboard
Rows land in the repository's `logs` table and are served by the
`/api/logs` endpoints. The live-logs page streams them over
Server-Sent Events and the dashboard renders aggregates
(per-service counts, attackers, bounties). See the
[Web dashboard](Web-Dashboard) for the UI side.

@@ -0,0 +1,102 @@
# Mutation and Randomization
DECNET's value as a deception network depends on the decoy fleet looking heterogeneous at deploy time and shifting over its lifetime. This page documents the two mechanisms that deliver that: randomization at build, and mutation at runtime.
See also: [CLI reference](CLI-Reference), [Archetypes](Archetypes), [Distros](Distro-Profiles).
## Randomization at deploy time
### `--randomize-services`
When `decnet deploy` is invoked with `--randomize-services`, each decky receives a randomly drawn service set instead of the fixed list passed via `--services` or the set implied by `--archetype`.
The selection logic lives in `build_deckies()` (`decnet/fleet.py`). For each decky:
1. A count `k` is drawn uniformly from `[1, min(3, len(pool))]`.
2. `k` service names are sampled without replacement from the pool (`all_service_names()`, or the archetype service list if an archetype is set).
3. The chosen set is compared against a `used_combos: set[frozenset]` that tracks combinations already assigned in this deploy. If it collides with an existing combination, the draw is retried.
4. After 20 retries, the last draw is accepted even if duplicated. With small pools and many deckies, exact uniqueness is not always possible — the retry cap prevents an infinite loop.
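The four steps above can be sketched as follows. This is a simplified illustration, not the body of `build_deckies()`: the function name `pick_services` is hypothetical, the pool is assumed to be non-empty, and "20 retries" is read here as one initial draw plus up to 20 further attempts.

```python
import random


def pick_services(pool, used_combos, rng=None, max_retries=20):
    """Draw 1-3 services, preferring a combination not yet in `used_combos`."""
    rng = rng or random.Random()
    pool = list(pool)
    combo = frozenset()
    for _ in range(max_retries + 1):
        k = rng.randint(1, min(3, len(pool)))   # step 1: size of the set
        combo = frozenset(rng.sample(pool, k))  # step 2: sample without replacement
        if combo not in used_combos:            # step 3: retry on collision
            break
    # Step 4: after the retry cap the last draw is kept even if duplicated.
    used_combos.add(combo)
    return sorted(combo)
```

With a single-service pool every decky necessarily receives the same set, which is the case the retry cap exists for.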
### Random hostnames
Hostnames are generated by `random_hostname(distro_slug)` in `decnet/distros.py`. The style depends on the distro profile's `hostname_style` field:
| Style | Example | Distros |
|-----------|-----------------------|---------------------------------|
| `generic` | `SRV-PROD-42` | Debian, Ubuntu |
| `rhel` | `web37.localdomain` | Rocky, CentOS, Fedora |
| `minimal` | `alpha-18` | Alpine |
| `rolling` | `nova-backup` | Kali, Arch |
Word pool and numeric range are defined in `_NAME_WORDS` and the `random.randint(10, 99)` call in the same file.
### Random distros
`random_distro()` picks a uniform-random entry from the `DISTROS` dict (`decnet/distros.py`). Each entry is a `DistroProfile` with a slug, a Docker image, a display name, a hostname style, and a build base image used for service Dockerfiles (which assume `apt-get`, so non-Debian distros fall back to `debian:bookworm-slim` for builds).
The current set: `debian`, `ubuntu22`, `ubuntu20`, `rocky9`, `centos7`, `alpine`, `fedora`, `kali`, `arch`.
### MAC addresses
MAC addresses are not assigned by DECNET. The MACVLAN driver auto-generates a MAC for each container interface at container start. There is no knob to pin or rotate them from the deploy config; if you need deterministic MACs, attach them out-of-band at the Docker network layer.
## Mutation at runtime
Randomization only fires at build time. To keep the fleet moving, DECNET supports per-decky service rotation.
### Storage
Every `DeckyConfig` (`decnet/config.py`) carries two mutation-related fields:
- `mutate_interval: int | None` — minutes between rotations for this decky. `None` disables automatic rotation for that decky.
- `last_mutated: float` — Unix timestamp of the most recent successful mutation.
The top-level `DecnetConfig` also holds a fleet-wide `mutate_interval`, which defaults to `DEFAULT_MUTATE_INTERVAL = 30` (minutes) from `decnet/config.py`. Per-decky values override the fleet default.
### Engine
`decnet/mutator/engine.py` exposes three async entry points, all operating against a `BaseRepository`:
- `mutate_decky(decky_name, repo)` — Intra-archetype shuffle for one decky. Rebuilds the service list by sampling 1-3 services from the decky's archetype pool (or the full registry if no archetype is set), retrying up to 20 times to avoid picking the exact same set. Updates `last_mutated`, persists state, rewrites the compose file, and runs `docker compose up -d --remove-orphans`.
- `mutate_all(repo, force=False)` — Iterates all deckies. For each, computes `elapsed = now - last_mutated` and calls `mutate_decky` when `elapsed >= interval * 60`. `force=True` bypasses the schedule.
- `run_watch_loop(repo, poll_interval_secs=10)` — Infinite loop that calls `mutate_all` every `poll_interval_secs`. Invoked by `decnet mutate --watch`.
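The scheduling test inside `mutate_all` reduces to a small predicate. The sketch below follows the rule stated above (`elapsed >= interval * 60`, with `force` bypassing it); the name `is_due` is hypothetical, and letting `force=True` also override a `None` interval is an assumption rather than documented behaviour.

```python
def is_due(last_mutated, interval_minutes, now, force=False):
    """Decide whether a decky should be mutated at time `now` (Unix seconds)."""
    if force:
        return True  # assumption: force wins even when rotation is disabled
    if interval_minutes is None:
        return False  # automatic rotation is disabled for this decky
    elapsed = now - last_mutated
    return elapsed >= interval_minutes * 60
```

With the default 30-minute interval, a decky last mutated at `t = 0` becomes due at `t = 1800` and not one second earlier.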
### Trigger from the CLI
```bash
# Deploy a fleet that rotates every 15 minutes
decnet deploy --mode unihost --deckies 5 --interface eth0 \
--randomize-services --mutate-interval 15
# Mutate a single decky now (runs immediately, ignoring the schedule)
decnet mutate --decky decky-03
# Mutate all deckies now
decnet mutate --all
# Run the watcher in the foreground
decnet mutate --watch
# Run the watcher as a detached daemon
decnet mutate --watch --daemon
```
`decnet deploy` also starts a background watcher automatically; the standalone `decnet mutate --watch` is for operators who want to run the loop themselves.
## Operational trade-offs
Mutation is a blunt instrument. The interval you pick is a trade between deception fidelity and observability:
- **Short intervals (≤ 5 min)**: The fleet looks lively and fingerprint scans will never converge, but every rotation churns containers, rewrites compose files, and wipes short-lived attacker state — half-finished brute-force sessions, partially uploaded payloads, active TCP connections. You will lose IOCs that would otherwise have been captured.
- **Default (30 min)**: Reasonable balance. Most scan-and-go attackers see a consistent snapshot; long-dwell attackers see the fleet shift under them, which itself is a useful signal.
- **Long intervals (hours) or `None`**: The fleet looks static. An attacker who fingerprints twice gets identical results, which is not how real networks behave under patching and reconfiguration.
For research deployments where you want a specific attacker session recorded end-to-end, disable mutation on the decky under observation (`mutate_interval=None` on that decky only) while leaving the rest of the fleet rotating.
## Sources
- `decnet/distros.py`: `DISTROS`, `random_hostname`, `random_distro`
- `decnet/fleet.py`: `build_deckies`
- `decnet/mutator/engine.py`: `mutate_decky`, `mutate_all`, `run_watch_loop`
- `decnet/config.py`: `DEFAULT_MUTATE_INTERVAL`, `DeckyConfig`, `DecnetConfig`
- `decnet/cli.py`: `deploy`, `mutate` commands