DECNET/development/docs/services/COLLECTOR.md
anti 8dd4c78b33 refactor: strip DECNET tokens from container-visible surface
Rename the container-side logging module decnet_logging → syslog_bridge
(canonical at templates/syslog_bridge.py, synced into each template by
the deployer). Drop the stale per-template copies; setuptools find was
picking them up anyway. Swap useradd/USER/chown "decnet" for "logrelay"
so no obvious token appears in the rendered container image.

Apply the same cloaking pattern to the telnet template that SSH got:
syslog pipe moves to /run/systemd/journal/syslog-relay and the relay
is cat'd via exec -a "systemd-journal-fwd". rsyslog.d conf rename
99-decnet.conf → 50-journal-forward.conf. SSH capture script:
/var/decnet/captured → /var/lib/systemd/coredump (real systemd path),
logger tag decnet-capture → systemd-journal. Compose volume updated
to match the new in-container quarantine path.

SD element ID shifts decnet@55555 → relay@55555; synced across
collector, parser, sniffer, prober, formatter, tests, and docs so the
host-side pipeline still matches what containers emit.
2026-04-17 22:57:53 -04:00

DECNET Collector

The decnet/collector module is responsible for the background acquisition, normalization, and filtering of logs generated by the honeypot fleet. It acts as the bridge between the transient Docker container logs and the persistent analytical database.

Architecture

The Collector runs as a host-side worker (typically managed by the CLI or a daemon). It employs a hybrid asynchronous and multi-threaded model to handle log streaming from a dynamic number of containers without blocking the main event loop.

Log Pipeline Flow

  1. Discovery: Scans decnet-state.json to identify active Decky service containers.
  2. Streaming: Spawns a dedicated thread for every active container to tail its stdout via the Docker SDK.
  3. Normalization: Parses the raw RFC 5424 Syslog lines into structured JSON.
  4. Filtering: Applies a rate-limiter to deduplicate high-frequency connection events.
  5. Storage: Appends raw lines to .log and filtered JSON to .json for database ingestion.

Core Components

worker.py

log_collector_worker(log_file: str)

The main asynchronous entry point.

  • Initial Scan: Identifies all running containers that match the DECNET service naming convention.
  • Event Loop: Subscribes to the Docker events API and listens for container start events, so Deckies deployed after the collector has started are picked up automatically.
  • Task Management: Manages a dictionary of active streaming tasks, ensuring no container is streamed more than once and cleaning up completed tasks.
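
The task bookkeeping described above can be sketched as follows. This is a minimal illustration, not the actual worker.py code: the names ensure_streaming and stream_container are hypothetical, and the per-container stream is replaced by a stub so the example runs without Docker.

```python
import asyncio


async def stream_container(name: str) -> None:
    """Hypothetical stand-in for the real per-container log stream."""
    await asyncio.sleep(0)


def ensure_streaming(tasks: dict, name: str) -> bool:
    """Start a stream for `name` unless one is already running.

    Returns True if a new task was created, False if the container
    is already being streamed.
    """
    # Drop finished tasks so a restarted container can be picked up again.
    for finished in [n for n, t in tasks.items() if t.done()]:
        del tasks[finished]
    if name in tasks:
        return False
    tasks[name] = asyncio.create_task(stream_container(name))
    return True


async def demo() -> list:
    tasks: dict = {}
    results = [
        ensure_streaming(tasks, "decky01-ssh"),   # first sighting: started
        ensure_streaming(tasks, "decky01-ssh"),   # duplicate: refused
    ]
    await asyncio.gather(*tasks.values())         # let the stub stream finish
    results.append(ensure_streaming(tasks, "decky01-ssh"))  # finished: restarted
    await asyncio.gather(*tasks.values())
    return results
```

In this sketch the same function would be called both for containers found during the initial scan and for each start event, so the two discovery paths share one deduplication point.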

Log Normalization (RFC 5424)

DECNET services emit logs using a standardized RFC 5424 format with structured data. The parse_rfc5424 function is the primary tool for extracting this information.

  • Structured Data: Extracts parameters from the relay@55555 SD-ELEMENT.
  • Field Mapping: Identifies the attacker_ip by scanning common source IP fields (src_ip, client_ip, etc.).
  • Consistency: Formats timestamps into a human-readable %Y-%m-%d %H:%M:%S format for the analytical stream.
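
A rough sketch of this normalization step, assuming the standard RFC 5424 header layout. The function below reuses the name parse_rfc5424 for readability, but its body, its return shape, and the mapping of HOSTNAME, APP-NAME, and MSGID to decky, service, and event_type are assumptions made for illustration. The list of IP fields is also limited to the two names the text mentions.

```python
import re
from datetime import datetime
from typing import Optional

# <PRI>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID (-|[relay@55555 ...]) [MSG]
_LINE = re.compile(
    r"^<(?P<pri>\d{1,3})>1 (?P<ts>\S+) (?P<host>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?:-|\[relay@55555(?P<sd>(?:[^\]\\]|\\.)*)\])(?: (?P<msg>.*))?$"
)
# SD-PARAM: name="value", where the value may contain backslash escapes.
_PARAM = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')
# Assumed candidates; the real list may be longer.
_IP_FIELDS = ("src_ip", "client_ip")


def parse_rfc5424(line: str) -> Optional[dict]:
    """Parse one syslog line into a flat dict, or return None if it does not match."""
    m = _LINE.match(line.strip())
    if m is None:
        return None
    try:
        ts = datetime.fromisoformat(m.group("ts").replace("Z", "+00:00"))
    except ValueError:
        return None
    fields = dict(_PARAM.findall(m.group("sd") or ""))
    return {
        "timestamp": ts.strftime("%Y-%m-%d %H:%M:%S"),
        "decky": m.group("host"),          # assumed mapping
        "service": m.group("app"),         # assumed mapping
        "event_type": m.group("msgid"),    # assumed mapping
        "attacker_ip": next((fields[k] for k in _IP_FIELDS if k in fields), "Unknown"),
        "fields": fields,
        "msg": m.group("msg") or "",
    }


sample = (
    '<134>1 2026-04-17T22:57:53+00:00 decky01 ssh - connect '
    '[relay@55555 src_ip="203.0.113.7" src_port="51234"] new connection'
)
parsed = parse_rfc5424(sample)
```

Lines that do not match the expected layout return None in this sketch, which lets the caller keep them in the raw log while skipping them for the structured stream.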

Ingestion Rate Limiter

To prevent the local SQLite database from being overwhelmed during credential-stuffing attacks or heavy port scanning, the Collector implements a window-based rate limiter for "lifecycle" events.

  • Scope: By default, it limits: connect, disconnect, connection, accept, and close.
  • Logic: Events are grouped by the key (attacker_ip, decky, service, event_type). If an event with the same key recurs within the window, it is still written to the raw .log file (for forensics) but is dropped from the .json stream (ingestion).
  • Configuration:
    • DECNET_COLLECTOR_RL_WINDOW_SEC: The deduplication window size (default: 1.0s).
    • DECNET_COLLECTOR_RL_EVENT_TYPES: Comma-separated list of event types to limit.
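
The limiter logic can be sketched as below. The environment variable names, the default window, and the default event types come from the description above. The class name, method names, and the choice to anchor the window at the last accepted event are assumptions for illustration.

```python
import os
import time

DEFAULT_TYPES = "connect,disconnect,connection,accept,close"


class LifecycleRateLimiter:
    """Window-based deduplication for lifecycle events (illustrative sketch)."""

    def __init__(self, window_sec: float, event_types: frozenset, clock=time.monotonic):
        self.window_sec = window_sec
        self.event_types = event_types
        self._clock = clock
        self._last_seen: dict = {}  # key -> time of the last accepted event

    @classmethod
    def from_env(cls, env=os.environ) -> "LifecycleRateLimiter":
        window = float(env.get("DECNET_COLLECTOR_RL_WINDOW_SEC", "1.0"))
        raw = env.get("DECNET_COLLECTOR_RL_EVENT_TYPES", DEFAULT_TYPES)
        types = frozenset(t.strip() for t in raw.split(",") if t.strip())
        return cls(window, types)

    def should_ingest(self, event: dict) -> bool:
        """Return True if the event belongs in the .json stream."""
        if event.get("event_type") not in self.event_types:
            return True  # only lifecycle events are limited
        key = (
            event.get("attacker_ip"),
            event.get("decky"),
            event.get("service"),
            event.get("event_type"),
        )
        now = self._clock()
        last = self._last_seen.get(key)
        if last is not None and now - last < self.window_sec:
            return False  # duplicate within the window: raw log only
        self._last_seen[key] = now
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = LifecycleRateLimiter(1.0, frozenset({"connect"}), clock=lambda: fake_now[0])
event = {"attacker_ip": "203.0.113.7", "decky": "decky01",
         "service": "ssh", "event_type": "connect"}
first = limiter.should_ingest(event)       # accepted
fake_now[0] = 0.5
second = limiter.should_ingest(event)      # inside the window: dropped
fake_now[0] = 1.5
third = limiter.should_ingest(event)       # window elapsed: accepted
```

The caller would write every line to the raw .log regardless and consult should_ingest only before appending to the .json stream. This sketch never prunes old keys; a long-running process would need some form of eviction.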

Resilience & Operational Stability

Inode Tracking (_reopen_if_needed)

Log files can be rotated by logrotate or deleted manually. The Collector tracks the inode behind each open log handle. If the inode of the path on disk no longer matches (indicating rotation or deletion), the Collector transparently closes the handle and reopens the path, so subsequent writes land in the live file rather than in a stale handle.
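
A minimal sketch of the inode check, assuming a POSIX filesystem. The function name mirrors _reopen_if_needed, but the signature and body are illustrative rather than the module's actual code.

```python
import os
import tempfile
from typing import Optional, TextIO


def reopen_if_needed(path: str, handle: Optional[TextIO]) -> TextIO:
    """Return a handle that points at the file currently on disk at `path`."""
    try:
        on_disk = os.stat(path).st_ino
    except FileNotFoundError:
        on_disk = None  # file was deleted or rotated away
    if handle is not None and not handle.closed:
        if on_disk is not None and os.fstat(handle.fileno()).st_ino == on_disk:
            return handle  # same inode: nothing to do
        handle.close()
    # Append mode creates the file if rotation removed it.
    return open(path, "a", encoding="utf-8")


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "collector.log")
        handle = reopen_if_needed(path, None)
        handle.write("before rotation\n")
        handle.flush()
        os.rename(path, path + ".1")             # simulate logrotate
        handle = reopen_if_needed(path, handle)  # inode mismatch: reopened
        handle.write("after rotation\n")
        handle.close()
        with open(path) as live, open(path + ".1") as rotated:
            return live.read(), rotated.read()
```

Calling the check before each write keeps the cost to one stat and one fstat per line. Lines already written to the old inode stay in the rotated file.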

Docker SDK Integration

The Collector uses asyncio.to_thread to run the blocking Docker SDK logs(stream=True) calls in worker threads. This keeps the blocking reads from the Docker daemon from starving the asynchronous event loop responsible for monitoring container starts.
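
The pattern can be illustrated without a Docker daemon by substituting a blocking generator for the SDK call. In the sketch below, blocking_log_stream is a hypothetical stand-in for the iterator returned by logs(stream=True), and drain and main are illustrative names.

```python
import asyncio
import time
from typing import Callable, Iterable


def blocking_log_stream() -> Iterable[bytes]:
    """Hypothetical stand-in for a blocking log iterator."""
    for i in range(3):
        time.sleep(0.01)  # simulates waiting on the daemon
        yield f"line {i}\n".encode()


def drain(stream: Iterable[bytes], sink: Callable[[str], None]) -> None:
    """Blocking loop; intended to run in a worker thread."""
    for chunk in stream:
        sink(chunk.decode("utf-8", errors="replace").rstrip("\n"))


async def main() -> tuple:
    lines: list = []
    ticks = 0
    # The blocking loop runs in a thread; the event loop stays free.
    worker = asyncio.create_task(
        asyncio.to_thread(drain, blocking_log_stream(), lines.append)
    )
    while not worker.done():
        ticks += 1                 # the event loop keeps making progress
        await asyncio.sleep(0.001)
    await worker                   # propagate any exception from the thread
    return lines, ticks
```

Because asyncio.to_thread uses the loop's default thread pool executor, the number of containers that can stream concurrently is bounded by that pool's size unless a larger executor is configured.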

Container Identification

The Collector uses two layers of verification to ensure it only collects logs from DECNET honeypots:

  1. Name Matching: Checks if the container name matches the {decky}-{service} pattern.
  2. State Verification: Cross-references container names with the current decnet-state.json.
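
The two checks can be sketched as follows. The layout of decnet-state.json is not described here, so the state dictionary below (a "deckies" list with "name" and "services" keys) is a purely hypothetical shape chosen for the example, and both function names are illustrative.

```python
def expected_containers(state: dict) -> set:
    """Build the set of {decky}-{service} names implied by the state (assumed schema)."""
    return {
        f"{decky['name']}-{service}"
        for decky in state.get("deckies", [])
        for service in decky.get("services", [])
    }


def is_service_container(name: str, state: dict) -> bool:
    """Apply both layers: name pattern first, then the state cross-check."""
    name = name.lstrip("/")  # names from the raw Docker API carry a leading slash
    # Layer 1: the name must look like {decky}-{service}.
    decky, sep, service = name.rpartition("-")
    if not (decky and sep and service):
        return False
    # Layer 2: the name must also be present in the current state.
    return name in expected_containers(state)


state = {"deckies": [{"name": "decky01", "services": ["ssh", "telnet"]}]}
```

With this state, decky01-ssh passes both layers, decky01-http passes the pattern check but fails the state check, and an unrelated container such as postgres fails the pattern check.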