diff --git a/Logging-and-Syslog.md b/Logging-and-Syslog.md
new file mode 100644
index 0000000..80edc98
--- /dev/null
+++ b/Logging-and-Syslog.md
@@ -0,0 +1,152 @@
+# Logging and Syslog
+
+DECNET speaks RFC 5424 everywhere. Every control-plane log line, every decky
+honeypot event, and every forwarded message uses the same wire format so a
+single parser (Logstash, rsyslog, the bundled ingester) can consume it end to
+end.
+
+## RFC 5424 formatter
+
+The control-plane formatter lives in `decnet/config.py`
+(`Rfc5424Formatter`). Its output is:
+
+```
+<PRIVAL>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
+```
+
+Field rules:
+
+- `PRIVAL` = `facility * 8 + severity`. Facility is fixed at **local0**
+  (16), so every line begins with `<13x>` where `x` depends on severity.
+- Version is always `1`.
+- Timestamp is the record's `created` time, rendered as an ISO-8601 UTC
+  string with **microsecond** precision
+  (`.isoformat(timespec="microseconds")`).
+- `HOSTNAME` is captured once at import time via `socket.gethostname()`.
+- `APP-NAME` defaults to `decnet`, overridable per record via the
+  `decnet_component` attribute (set by the `ComponentAdapter` in
+  `decnet/logging/__init__.py`).
+- `PROCID` is the live `os.getpid()`.
+- `MSGID` is the Python logger name (e.g. `decnet.config`).
+- `STRUCTURED-DATA` is the NILVALUE `-` for control-plane logs.
+- `MSG` is the formatted record, with exception tracebacks appended on a
+  newline when present.
+
+### Severity map
+
+| Python level | RFC 5424 severity |
+|--------------|-------------------|
+| CRITICAL     | 2 (Critical)      |
+| ERROR        | 3 (Error)         |
+| WARNING      | 4 (Warning)       |
+| INFO         | 6 (Informational) |
+| DEBUG        | 7 (Debug)         |
+
+### Example line
+
+```
+<134>1 2026-04-12T21:48:03.123456+00:00 host decnet 1234 decnet.config - Dev mode active
+```
+
+`134` decodes as facility 16 (local0) × 8 + severity 6 (INFO).
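As a rough illustration of the field rules above, the following Python sketch assembles a line in the same shape. It is not the project's `Rfc5424Formatter`: the class name `SketchRfc5424Formatter`, the `prival` helper, and the fallback to severity 6 for unlisted levels are assumptions made for this example.

```python
import logging
import os
import socket
from datetime import datetime, timezone

FACILITY_LOCAL0 = 16

# Python level -> RFC 5424 severity, copied from the severity map above.
SEVERITY = {
    logging.CRITICAL: 2,
    logging.ERROR: 3,
    logging.WARNING: 4,
    logging.INFO: 6,
    logging.DEBUG: 7,
}


def prival(levelno: int, facility: int = FACILITY_LOCAL0) -> int:
    """PRIVAL = facility * 8 + severity.

    Falling back to severity 6 for unlisted levels is an assumption.
    """
    return facility * 8 + SEVERITY.get(levelno, 6)


class SketchRfc5424Formatter(logging.Formatter):
    """Illustrative stand-in; the real formatter may differ in detail."""

    # Evaluated once when the class body runs, i.e. at import time.
    _hostname = socket.gethostname()

    def format(self, record: logging.LogRecord) -> str:
        ts = datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(
            timespec="microseconds"
        )
        # APP-NAME: per-record override, otherwise the default "decnet".
        app = getattr(record, "decnet_component", "decnet")
        msg = record.getMessage()
        if record.exc_info:
            # Tracebacks are appended on a new line when present.
            msg += "\n" + self.formatException(record.exc_info)
        # STRUCTURED-DATA is the NILVALUE "-" for control-plane logs.
        return (
            f"<{prival(record.levelno)}>1 {ts} {self._hostname} {app} "
            f"{os.getpid()} {record.name} - {msg}"
        )
```

Attaching it to a `logging.StreamHandler` with `handler.setFormatter(SketchRfc5424Formatter())` yields lines with the same field order as the example line above.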
+
+## Handlers installed by `_configure_logging`
+
+`decnet/config.py::_configure_logging(dev)` runs at import time, gated on
+`DECNET_DEVELOPER`. It is idempotent: if an RFC 5424 `StreamHandler` is
+already on the root logger, it returns.
+
+Installed handlers:
+
+1. A stderr `StreamHandler` with `Rfc5424Formatter`. Root level is
+   `DEBUG` when `dev=True`, otherwise `INFO`.
+2. An `InodeAwareRotatingFileHandler`
+   (`decnet/logging/inode_aware_handler.py`) pointed at
+   `DECNET_SYSTEM_LOGS` (default `decnet.system.log` in `$PWD`),
+   `maxBytes=10 MB`, `backupCount=5`, `encoding="utf-8"`. Skipped when any
+   `PYTEST*` environment variable is set.
+
+`InodeAwareRotatingFileHandler` extends the stdlib `RotatingFileHandler`
+with a cheap `os.stat` on every emit: if the file's `(st_ino, st_dev)`
+differ from the held fd, the handler closes and reopens. This survives
+`logrotate` (without copytruncate), `rm`, and sudo-induced ownership
+flips without losing lines, and it falls back to `handleError` rather
+than crashing if it cannot reopen.
+
+## Root-chown under sudo
+
+When deploy runs as root (required for MACVLAN/IPVLAN), the log file is
+created root-owned. `decnet/privdrop.py::chown_to_invoking_user` is
+called right after the file handler is wired up in `_configure_logging`;
+it honours `SUDO_UID` / `SUDO_GID` and hands the file back to the
+invoking user so a subsequent non-root `decnet api` or `decnet status`
+can still append. The honeypot log file handler
+(`decnet/logging/file_handler.py`) additionally calls
+`chown_tree_to_invoking_user` on the log directory.
+
+## `--log-target HOST:PORT` forwarding
+
+Service plugins emit directly to the aggregator; DECNET itself stays
+agnostic about what listens. `decnet/logging/forwarder.py` exposes:
+
+- `parse_log_target("ip:port") -> (host, port)`: rejects anything that
+  doesn't split on a trailing `:port` with a digit-only port.
+- `probe_log_target(log_target, timeout=2.0)`: a non-fatal TCP connect
+  used at deploy time to warn the operator if the target is unreachable.
+
+The CLI accepts the value and injects it into every service plugin's
+`compose_fragment(...)` as the `log_target` kwarg. Plugins (see
+`decnet/services/base.py` and concrete services like `ssh.py`,
+`smtp.py`, `sniffer.py`, `mysql.py`, `redis.py`,
+`elasticsearch.py`, `https.py`) add it to the container's environment as
+`LOG_TARGET=ip:port`. The in-container emit helpers (for example
+`templates/ssh/emit_capture.py`) read `LOG_TARGET` and write a
+structured-data RFC 5424 line per event. The per-service formatter lives
+in `decnet/logging/syslog_formatter.py::format_rfc5424`, which uses
+facility `local0`, PEN `relay@55555` for structured data, and escapes
+SD-PARAM-VALUE per RFC 5424 §6.3.3.
+
+Deckies typically emit to two destinations simultaneously: a local
+rotating file (`decnet/logging/file_handler.py`, default
+`/var/log/decnet/decnet.log`, same 10 MB × 5 rotation, controlled by
+`DECNET_LOG_FILE`) and the remote `LOG_TARGET` when set.
+
+## Ingestion pipeline
+
+`decnet/web/ingester.py::log_ingestion_worker` is a FastAPI background
+task that tails the JSON sidecar of the honeypot log
+(`.json`) and bulk-inserts rows into the
+repository.
+
+Batching is governed by two env vars (see
+[Environment variables](Environment-Variables)):
+
+- `DECNET_BATCH_SIZE` (default `100`): flush when the accumulated
+  batch hits this row count.
+- `DECNET_BATCH_MAX_WAIT_MS` (default `250`): flush when this many
+  milliseconds have passed since the batch started, even if smaller.
+
+The worker persists its byte-offset in the repository under
+`ingest_worker_position`, so restarts resume where they left off.
+Partial trailing lines are deferred to the next iteration, and
+truncation is detected (`st_size < position`) and resets the offset to
+0. Each record gets an OpenTelemetry child span chained off the
+collector's via `extract_context`, so the full journey from packet
+capture to DB insert is visible in Jaeger; see
+[Tracing and profiling](Tracing-and-Profiling).
+
+`_flush_batch` commits the rows via `repo.add_logs(...)`, runs
+`_extract_bounty` on each entry (credentials, JA3/JA3S/JA4, JARM,
+HASSHServer, TCP/IP fingerprints, TLS certificates, VNC/SSH banners,
+HTTP User-Agents), and finally updates the saved position. If the task
+is being cancelled during lifespan teardown, it bails out before
+touching the DB so the un-committed tail is re-read next startup
+instead of being lost.
+
+## From DB to dashboard
+
+Rows land in the repository's `logs` table and are served by the
+`/api/logs` endpoints. The live-logs page streams them over
+Server-Sent Events and the dashboard renders aggregates
+(per-service counts, attackers, bounties). See the
+[Web dashboard](Web-Dashboard) for the UI side.
diff --git a/Mutation-and-Randomization.md b/Mutation-and-Randomization.md
new file mode 100644
index 0000000..24bebbc
--- /dev/null
+++ b/Mutation-and-Randomization.md
@@ -0,0 +1,102 @@
+# Mutation and Randomization
+
+DECNET's value as a deception network depends on the decoy fleet looking heterogeneous at deploy time and shifting over its lifetime. This page documents the two mechanisms that deliver that: randomization at build, and mutation at runtime.
+
+See also: [CLI reference](CLI-Reference), [Archetypes](Archetypes), [Distros](Distro-Profiles).
+
+## Randomization at deploy time
+
+### `--randomize-services`
+
+When `decnet deploy` is invoked with `--randomize-services`, each decky receives a randomly drawn service set instead of the fixed list passed via `--services` or the set implied by `--archetype`.
+
+The selection logic lives in `build_deckies()` (`decnet/fleet.py`). For each decky:
+
+1. A count `k` is drawn uniformly from `[1, min(3, len(pool))]`.
+2. `k` service names are sampled without replacement from the pool (`all_service_names()`, or the archetype service list if an archetype is set).
+3. The chosen set is compared against a `used_combos: set[frozenset]` that tracks combinations already assigned in this deploy. If it collides with an existing combination, the draw is retried.
+4. After 20 retries, the last draw is accepted even if duplicated. With small pools and many deckies, exact uniqueness is not always possible; the retry cap prevents an infinite loop.
+
+### Random hostnames
+
+Hostnames are generated by `random_hostname(distro_slug)` in `decnet/distros.py`. The style depends on the distro profile's `hostname_style` field:
+
+| Style     | Example               | Distros                         |
+|-----------|-----------------------|---------------------------------|
+| `generic` | `SRV-PROD-42`         | Debian, Ubuntu                  |
+| `rhel`    | `web37.localdomain`   | Rocky, CentOS, Fedora           |
+| `minimal` | `alpha-18`            | Alpine                          |
+| `rolling` | `nova-backup`         | Kali, Arch                      |
+
+Word pool and numeric range are defined in `_NAME_WORDS` and the `random.randint(10, 99)` call in the same file.
+
+### Random distros
+
+`random_distro()` picks a uniform-random entry from the `DISTROS` dict (`decnet/distros.py`). Each entry is a `DistroProfile` with a slug, a Docker image, a display name, a hostname style, and a build base image used for service Dockerfiles (which assume `apt-get`, so non-Debian distros fall back to `debian:bookworm-slim` for builds).
+
+The current set: `debian`, `ubuntu22`, `ubuntu20`, `rocky9`, `centos7`, `alpine`, `fedora`, `kali`, `arch`.
+
+### MAC addresses
+
+MAC addresses are not assigned by DECNET. The MACVLAN driver auto-generates a MAC for each container interface at container start. There is no knob to pin or rotate them from the deploy config; if you need deterministic MACs, attach them out-of-band at the Docker network layer.
+
+## Mutation at runtime
+
+Randomization only fires at build time. To keep the fleet moving, DECNET supports per-decky service rotation.
+
+### Storage
+
+Every `DeckyConfig` (`decnet/config.py`) carries two mutation-related fields:
+
+- `mutate_interval: int | None`: minutes between rotations for this decky. `None` disables automatic rotation for that decky.
+- `last_mutated: float`: Unix timestamp of the most recent successful mutation.
+
+The top-level `DecnetConfig` also holds a fleet-wide `mutate_interval`, which defaults to `DEFAULT_MUTATE_INTERVAL = 30` (minutes) from `decnet/config.py`. Per-decky values override the fleet default.
+
+### Engine
+
+`decnet/mutator/engine.py` exposes three async entry points, all operating against a `BaseRepository`:
+
+- `mutate_decky(decky_name, repo)`: Intra-archetype shuffle for one decky. Rebuilds the service list by sampling 1-3 services from the decky's archetype pool (or the full registry if no archetype is set), retrying up to 20 times to avoid picking the exact same set. Updates `last_mutated`, persists state, rewrites the compose file, and runs `docker compose up -d --remove-orphans`.
+- `mutate_all(repo, force=False)`: Iterates all deckies. For each, computes `elapsed = now - last_mutated` and calls `mutate_decky` when `elapsed >= interval * 60`. `force=True` bypasses the schedule.
+- `run_watch_loop(repo, poll_interval_secs=10)`: Infinite loop that calls `mutate_all` every `poll_interval_secs`. Invoked by `decnet mutate --watch`.
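The sampling and scheduling rules in the engine list above can be sketched as follows. This is a minimal illustration, not the code in `decnet/mutator/engine.py`: `pick_new_services` and `is_due` are hypothetical names, the cap of 20 attempts is one reading of "retrying up to 20 times", and `is_due` assumes the caller has already resolved the per-decky versus fleet-wide interval.

```python
import random
import time
from typing import List, Optional, Sequence

MAX_ATTEMPTS = 20  # assumption: the prose says "retrying up to 20 times"


def pick_new_services(
    pool: Sequence[str],
    current: Sequence[str],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Sample 1-3 services from the pool, avoiding the exact current set."""
    if not pool:
        raise ValueError("service pool is empty")
    rng = rng or random.Random()
    current_set = frozenset(current)
    chosen: List[str] = []
    for _ in range(MAX_ATTEMPTS):
        k = rng.randint(1, min(3, len(pool)))
        chosen = rng.sample(list(pool), k)  # without replacement
        if frozenset(chosen) != current_set:
            break
    # If every attempt collided (e.g. a one-service pool), the last draw
    # is returned anyway so the loop always terminates.
    return chosen


def is_due(
    last_mutated: float,
    interval_minutes: Optional[int],
    now: Optional[float] = None,
    force: bool = False,
) -> bool:
    """Schedule check: elapsed >= interval * 60, unless forced."""
    if force:
        return True
    if interval_minutes is None:
        return False  # None disables automatic rotation
    now = time.time() if now is None else now
    return (now - last_mutated) >= interval_minutes * 60
```

A watcher in the spirit of `run_watch_loop` would call `is_due` for each decky on every poll and rotate only those that return `True`.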
+
+### Trigger from the CLI
+
+```bash
+# Deploy a fleet that rotates every 15 minutes
+decnet deploy --mode unihost --deckies 5 --interface eth0 \
+    --randomize-services --mutate-interval 15
+
+# Mutate a single decky now (forces immediately, ignores schedule)
+decnet mutate --decky decky-03
+
+# Mutate all deckies now
+decnet mutate --all
+
+# Run the watcher in the foreground
+decnet mutate --watch
+
+# Run the watcher as a detached daemon
+decnet mutate --watch --daemon
+```
+
+`decnet deploy` also starts a background watcher automatically; the standalone `decnet mutate --watch` is for operators who want to run the loop themselves.
+
+## Operational trade-offs
+
+Mutation is a blunt instrument. The interval you pick is a trade between deception fidelity and observability:
+
+- **Short intervals (≤ 5 min)**: The fleet looks lively and fingerprint scans will never converge, but every rotation churns containers, rewrites compose files, and wipes short-lived attacker state: half-finished brute-force sessions, partially uploaded payloads, active TCP connections. You will lose IOCs that would otherwise have been captured.
+- **Default (30 min)**: Reasonable balance. Most scan-and-go attackers see a consistent snapshot; long-dwell attackers see the fleet shift under them, which itself is a useful signal.
+- **Long intervals (hours) or `None`**: The fleet looks static. An attacker who fingerprints twice gets identical results, which is not how real networks behave under patching and reconfiguration.
+
+For research deployments where you want a specific attacker session recorded end-to-end, disable mutation on the decky under observation (`mutate_interval=None` on that decky only) while leaving the rest of the fleet rotating.
+
+## Sources
+
+- `decnet/distros.py`: `DISTROS`, `random_hostname`, `random_distro`
+- `decnet/fleet.py`: `build_deckies`
+- `decnet/mutator/engine.py`: `mutate_decky`, `mutate_all`, `run_watch_loop`
+- `decnet/config.py`: `DEFAULT_MUTATE_INTERVAL`, `DeckyConfig`, `DecnetConfig`
+- `decnet/cli.py`: `deploy`, `mutate` commands