diff --git a/Logging-and-Syslog.md b/Logging-and-Syslog.md new file mode 100644 index 0000000..80edc98 --- /dev/null +++ b/Logging-and-Syslog.md @@ -0,0 +1,152 @@ +# Logging and Syslog + +DECNET speaks RFC 5424 everywhere. Every control-plane log line, every decky +honeypot event, and every forwarded message uses the same wire format so a +single parser (Logstash, rsyslog, the bundled ingester) can consume it end to +end. + +## RFC 5424 formatter + +The control-plane formatter lives in `decnet/config.py` +(`Rfc5424Formatter`). Its output is: + +``` +1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG +``` + +Field rules: + +- `PRIVAL` = `facility * 8 + severity`. Facility is fixed at **local0** + (16), so every line begins with `<13x>` where `x` depends on severity. +- Version is always `1`. +- Timestamp is the record's `created` time, rendered as an ISO-8601 UTC + string with **microsecond** precision + (`.isoformat(timespec="microseconds")`). +- `HOSTNAME` is captured once at import time via `socket.gethostname()`. +- `APP-NAME` defaults to `decnet`, overridable per record via the + `decnet_component` attribute (set by the `ComponentAdapter` in + `decnet/logging/__init__.py`). +- `PROCID` is the live `os.getpid()`. +- `MSGID` is the Python logger name (e.g. `decnet.config`). +- `STRUCTURED-DATA` is the NILVALUE `-` for control-plane logs. +- `MSG` is the formatted record, with exception tracebacks appended on a + newline when present. + +### Severity map + +| Python level | RFC 5424 severity | +|--------------|-------------------| +| CRITICAL | 2 (Critical) | +| ERROR | 3 (Error) | +| WARNING | 4 (Warning) | +| INFO | 6 (Informational) | +| DEBUG | 7 (Debug) | + +### Example line + +``` +<134>1 2026-04-12T21:48:03.123456+00:00 host decnet 1234 decnet.config - Dev mode active +``` + +`134` decodes as facility 16 (local0) × 8 + severity 6 (INFO). + +## Handlers installed by `_configure_logging` + +`decnet/config.py::_configure_logging(dev)` runs at import time, gated on +`DECNET_DEVELOPER`. It is idempotent — if an RFC 5424 `StreamHandler` is +already on the root logger, it returns. + +Installed handlers: + +1. A stderr `StreamHandler` with `Rfc5424Formatter`. Root level is + `DEBUG` when `dev=True`, otherwise `INFO`. +2. An `InodeAwareRotatingFileHandler` + (`decnet/logging/inode_aware_handler.py`) pointed at + `DECNET_SYSTEM_LOGS` (default `decnet.system.log` in `$PWD`), + `maxBytes=10 MB`, `backupCount=5`, `encoding="utf-8"`. Skipped when any + `PYTEST*` environment variable is set. + +`InodeAwareRotatingFileHandler` extends the stdlib `RotatingFileHandler` +with a cheap `os.stat` on every emit: if the file's `(st_ino, st_dev)` +differ from the held fd, the handler closes and reopens. This survives +`logrotate` (without copytruncate), `rm`, and sudo-induced ownership +flips without losing lines, and it falls back to `handleError` rather +than crashing if it cannot reopen. + +## Root-chown under sudo + +When deploy runs as root (required for MACVLAN/IPVLAN), the log file is +created root-owned. `decnet/privdrop.py::chown_to_invoking_user` is +called right after the file handler is wired up in `_configure_logging` +— it honours `SUDO_UID` / `SUDO_GID` and hands the file back to the +invoking user so a subsequent non-root `decnet api` or `decnet status` +can still append. The honeypot log file handler +(`decnet/logging/file_handler.py`) additionally calls +`chown_tree_to_invoking_user` on the log directory. + +## `--log-target HOST:PORT` forwarding + +Service plugins emit directly to the aggregator; DECNET itself stays +agnostic about what listens. `decnet/logging/forwarder.py` exposes: + +- `parse_log_target("ip:port") -> (host, port)` — rejects anything that + doesn't split on a trailing `:port` with a digit-only port. +- `probe_log_target(log_target, timeout=2.0)` — a non-fatal TCP connect + used at deploy time to warn the operator if the target is unreachable. + +The CLI accepts the value and injects it into every service plugin's +`compose_fragment(...)` as the `log_target` kwarg. Plugins (see +`decnet/services/base.py` and concrete services like `ssh.py`, +`smtp.py`, `sniffer.py`, `mysql.py`, `redis.py`, +`elasticsearch.py`, `https.py`) add it to the container's environment as +`LOG_TARGET=ip:port`. The in-container emit helpers (for example +`templates/ssh/emit_capture.py`) read `LOG_TARGET` and write a +structured-data RFC 5424 line per event. The per-service formatter lives +in `decnet/logging/syslog_formatter.py::format_rfc5424`, which uses +facility `local0`, PEN `relay@55555` for structured data, and escapes +SD-PARAM-VALUE per RFC 5424 §6.3.3. + +Deckies typically emit to two destinations simultaneously: a local +rotating file (`decnet/logging/file_handler.py`, default +`/var/log/decnet/decnet.log`, same 10 MB × 5 rotation, controlled by +`DECNET_LOG_FILE`) and the remote `LOG_TARGET` when set. + +## Ingestion pipeline + +`decnet/web/ingester.py::log_ingestion_worker` is a FastAPI background +task that tails the JSON sidecar of the honeypot log +(`.json`) and bulk-inserts rows into the +repository. + +Batching is governed by two env vars (see +[Environment variables](Environment-Variables)): + +- `DECNET_BATCH_SIZE` (default `100`) — flush when the accumulated + batch hits this row count. +- `DECNET_BATCH_MAX_WAIT_MS` (default `250`) — flush when this many + milliseconds have passed since the batch started, even if smaller. + +The worker persists its byte-offset in the repository under +`ingest_worker_position`, so restarts resume where they left off. +Partial trailing lines are deferred to the next iteration, and +truncation is detected (`st_size < position`) and resets the offset to +0. Each record gets an OpenTelemetry child span chained off the +collector's via `extract_context`, so the full journey from packet +capture to DB insert is visible in Jaeger — see +[Tracing and profiling](Tracing-and-Profiling). + +`_flush_batch` commits the rows via `repo.add_logs(...)`, runs +`_extract_bounty` on each entry (credentials, JA3/JA3S/JA4, JARM, +HASSHServer, TCP/IP fingerprints, TLS certificates, VNC/SSH banners, +HTTP User-Agents), and finally updates the saved position. If the task +is being cancelled during lifespan teardown, it bails out before +touching the DB so the un-committed tail is re-read next startup +instead of being lost. + +## From DB to dashboard + +Rows land in the repository's `logs` table and are served by the +`/api/logs` endpoints. The live-logs page streams them over +Server-Sent Events and the dashboard renders aggregates +(per-service counts, attackers, bounties). See the +[Web dashboard](Web-Dashboard) for the UI side.