Add Unit 11: Logging and Syslog wiki page
152
Logging-and-Syslog.md
Normal file
152
Logging-and-Syslog.md
Normal file
@@ -0,0 +1,152 @@
|
||||
# Logging and Syslog
|
||||
|
||||
DECNET speaks RFC 5424 everywhere. Every control-plane log line, every decky
|
||||
honeypot event, and every forwarded message uses the same wire format so a
|
||||
single parser (Logstash, rsyslog, the bundled ingester) can consume it end to
|
||||
end.
|
||||
|
||||
## RFC 5424 formatter
|
||||
|
||||
The control-plane formatter lives in `decnet/config.py`
|
||||
(`Rfc5424Formatter`). Its output is:
|
||||
|
||||
```
|
||||
<PRIVAL>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
|
||||
```
|
||||
|
||||
Field rules:
|
||||
|
||||
- `PRIVAL` = `facility * 8 + severity`. Facility is fixed at **local0**
|
||||
(16), so every line begins with `<13x>` where `x` depends on severity.
|
||||
- Version is always `1`.
|
||||
- Timestamp is the record's `created` time, rendered as an ISO-8601 UTC
|
||||
string with **microsecond** precision
|
||||
(`.isoformat(timespec="microseconds")`).
|
||||
- `HOSTNAME` is captured once at import time via `socket.gethostname()`.
|
||||
- `APP-NAME` defaults to `decnet`, overridable per record via the
|
||||
`decnet_component` attribute (set by the `ComponentAdapter` in
|
||||
`decnet/logging/__init__.py`).
|
||||
- `PROCID` is the live `os.getpid()`.
|
||||
- `MSGID` is the Python logger name (e.g. `decnet.config`).
|
||||
- `STRUCTURED-DATA` is the NILVALUE `-` for control-plane logs.
|
||||
- `MSG` is the formatted record, with exception tracebacks appended on a
|
||||
newline when present.
|
||||
|
||||
### Severity map
|
||||
|
||||
| Python level | RFC 5424 severity |
|
||||
|--------------|-------------------|
|
||||
| CRITICAL | 2 (Critical) |
|
||||
| ERROR | 3 (Error) |
|
||||
| WARNING | 4 (Warning) |
|
||||
| INFO | 6 (Informational) |
|
||||
| DEBUG | 7 (Debug) |
|
||||
|
||||
### Example line
|
||||
|
||||
```
|
||||
<134>1 2026-04-12T21:48:03.123456+00:00 host decnet 1234 decnet.config - Dev mode active
|
||||
```
|
||||
|
||||
`134` decodes as facility 16 (local0) × 8 + severity 6 (INFO).
|
||||
|
||||
## Handlers installed by `_configure_logging`
|
||||
|
||||
`decnet/config.py::_configure_logging(dev)` runs at import time, gated on
|
||||
`DECNET_DEVELOPER`. It is idempotent — if an RFC 5424 `StreamHandler` is
|
||||
already on the root logger, it returns.
|
||||
|
||||
Installed handlers:
|
||||
|
||||
1. A stderr `StreamHandler` with `Rfc5424Formatter`. Root level is
|
||||
`DEBUG` when `dev=True`, otherwise `INFO`.
|
||||
2. An `InodeAwareRotatingFileHandler`
|
||||
(`decnet/logging/inode_aware_handler.py`) pointed at
|
||||
`DECNET_SYSTEM_LOGS` (default `decnet.system.log` in `$PWD`),
|
||||
`maxBytes=10 MB`, `backupCount=5`, `encoding="utf-8"`. Skipped when any
|
||||
`PYTEST*` environment variable is set.
|
||||
|
||||
`InodeAwareRotatingFileHandler` extends the stdlib `RotatingFileHandler`
|
||||
with a cheap `os.stat` on every emit: if the file's `(st_ino, st_dev)`
|
||||
differ from the held fd, the handler closes and reopens. This survives
|
||||
`logrotate` (without copytruncate), `rm`, and sudo-induced ownership
|
||||
flips without losing lines, and it falls back to `handleError` rather
|
||||
than crashing if it cannot reopen.
|
||||
|
||||
## Root-chown under sudo
|
||||
|
||||
When deploy runs as root (required for MACVLAN/IPVLAN), the log file is
|
||||
created root-owned. `decnet/privdrop.py::chown_to_invoking_user` is
|
||||
called right after the file handler is wired up in `_configure_logging`
|
||||
— it honours `SUDO_UID` / `SUDO_GID` and hands the file back to the
|
||||
invoking user so a subsequent non-root `decnet api` or `decnet status`
|
||||
can still append. The honeypot log file handler
|
||||
(`decnet/logging/file_handler.py`) additionally calls
|
||||
`chown_tree_to_invoking_user` on the log directory.
|
||||
|
||||
## `--log-target HOST:PORT` forwarding
|
||||
|
||||
Service plugins emit directly to the aggregator; DECNET itself stays
|
||||
agnostic about what listens. `decnet/logging/forwarder.py` exposes:
|
||||
|
||||
- `parse_log_target("ip:port") -> (host, port)` — rejects anything that
|
||||
doesn't split on a trailing `:port` with a digit-only port.
|
||||
- `probe_log_target(log_target, timeout=2.0)` — a non-fatal TCP connect
|
||||
used at deploy time to warn the operator if the target is unreachable.
|
||||
|
||||
The CLI accepts the value and injects it into every service plugin's
|
||||
`compose_fragment(...)` as the `log_target` kwarg. Plugins (see
|
||||
`decnet/services/base.py` and concrete services like `ssh.py`,
|
||||
`smtp.py`, `sniffer.py`, `mysql.py`, `redis.py`,
|
||||
`elasticsearch.py`, `https.py`) add it to the container's environment as
|
||||
`LOG_TARGET=ip:port`. The in-container emit helpers (for example
|
||||
`templates/ssh/emit_capture.py`) read `LOG_TARGET` and write a
|
||||
structured-data RFC 5424 line per event. The per-service formatter lives
|
||||
in `decnet/logging/syslog_formatter.py::format_rfc5424`, which uses
|
||||
facility `local0`, PEN `relay@55555` for structured data, and escapes
|
||||
SD-PARAM-VALUE per RFC 5424 §6.3.3.
|
||||
|
||||
Deckies typically emit to two destinations simultaneously: a local
|
||||
rotating file (`decnet/logging/file_handler.py`, default
|
||||
`/var/log/decnet/decnet.log`, same 10 MB × 5 rotation, controlled by
|
||||
`DECNET_LOG_FILE`) and the remote `LOG_TARGET` when set.
|
||||
|
||||
## Ingestion pipeline
|
||||
|
||||
`decnet/web/ingester.py::log_ingestion_worker` is a FastAPI background
|
||||
task that tails the JSON sidecar of the honeypot log
|
||||
(`<DECNET_INGEST_LOG_FILE>.json`) and bulk-inserts rows into the
|
||||
repository.
|
||||
|
||||
Batching is governed by two env vars (see
|
||||
[Environment variables](Environment-Variables)):
|
||||
|
||||
- `DECNET_BATCH_SIZE` (default `100`) — flush when the accumulated
|
||||
batch hits this row count.
|
||||
- `DECNET_BATCH_MAX_WAIT_MS` (default `250`) — flush when this many
|
||||
milliseconds have passed since the batch started, even if smaller.
|
||||
|
||||
The worker persists its byte-offset in the repository under
|
||||
`ingest_worker_position`, so restarts resume where they left off.
|
||||
Partial trailing lines are deferred to the next iteration, and
|
||||
truncation is detected (`st_size < position`) and resets the offset to
|
||||
0. Each record gets an OpenTelemetry child span chained off the
|
||||
collector's via `extract_context`, so the full journey from packet
|
||||
capture to DB insert is visible in Jaeger — see
|
||||
[Tracing and profiling](Tracing-and-Profiling).
|
||||
|
||||
`_flush_batch` commits the rows via `repo.add_logs(...)`, runs
|
||||
`_extract_bounty` on each entry (credentials, JA3/JA3S/JA4, JARM,
|
||||
HASSHServer, TCP/IP fingerprints, TLS certificates, VNC/SSH banners,
|
||||
HTTP User-Agents), and finally updates the saved position. If the task
|
||||
is being cancelled during lifespan teardown, it bails out before
|
||||
touching the DB so the un-committed tail is re-read next startup
|
||||
instead of being lost.
|
||||
|
||||
## From DB to dashboard
|
||||
|
||||
Rows land in the repository's `logs` table and are served by the
|
||||
`/api/logs` endpoints. The live-logs page streams them over
|
||||
Server-Sent Events and the dashboard renders aggregates
|
||||
(per-service counts, attackers, bounties). See the
|
||||
[Web dashboard](Web-Dashboard) for the UI side.
|
||||
Reference in New Issue
Block a user