feat(swarm): unbundle master-only code from agent tarball + sync systemd units on update

Agents now ship with collector/prober/sniffer as systemd services; mutator,
profiler, web, and API stay master-only (profiler rebuilds attacker profiles
against the master DB — no per-host DB exists). Expand _EXCLUDES to drop the
full decnet/web, decnet/mutator, decnet/profiler, and decnet_web trees from
the enrollment bundle.
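
As a rough illustration of the exclusion change, the sketch below filters candidate bundle paths against glob patterns. It assumes `fnmatch`-style matching; how the real tarball builder interprets `_EXCLUDES` is not shown in this commit, and `is_excluded` and `filter_bundle` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Subset of the patterns named in this commit; the real _EXCLUDES tuple is longer.
EXCLUDES: tuple[str, ...] = (
    "decnet_web", "decnet_web/*", "decnet_web/**",
    "decnet/web", "decnet/web/*", "decnet/web/**",
    "decnet/mutator", "decnet/mutator/*", "decnet/mutator/**",
    "decnet/profiler", "decnet/profiler/*", "decnet/profiler/**",
)


def is_excluded(rel_path: str) -> bool:
    """True if rel_path matches any pattern (assumed fnmatch semantics)."""
    return any(fnmatchcase(rel_path, pat) for pat in EXCLUDES)


def filter_bundle(paths: list[str]) -> list[str]:
    """Keep only the paths that would ship to an agent."""
    return [p for p in paths if not is_excluded(p)]


candidates = [
    "decnet/collector/worker.py",   # hypothetical file names
    "decnet/web/api.py",
    "decnet/web/db/models.py",
    "decnet/profiler/worker.py",
]
print(filter_bundle(candidates))
```

Under `fnmatch` rules `*` also matches `/`, so `decnet/web/*` already covers nested files; the `/**` variants only matter if the real matcher treats `*` as a single path segment.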

Updater now calls _heal_path_symlink + _sync_systemd_units after rotation so
fleets pick up new unit files and /usr/local/bin/decnet tracks the shared venv
without a manual reinstall. daemon-reload runs once per update when any unit
changed.
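
The "reload once when any unit changed" behaviour can be sketched as follows. This is a minimal sketch, not the updater's code: `sync_units` and `reload_if_changed` are hypothetical stand-ins for `_sync_systemd_units`, whose real signature is not shown here, and change detection by byte comparison is an assumption.

```python
import shutil
import subprocess
from pathlib import Path


def sync_units(src_dir: Path, dst_dir: Path) -> list[str]:
    """Copy unit files whose content differs; return the names that changed."""
    changed: list[str] = []
    for src in sorted(src_dir.glob("*.service")):
        dst = dst_dir / src.name
        if dst.exists() and dst.read_bytes() == src.read_bytes():
            continue  # identical content, nothing to install
        shutil.copyfile(src, dst)
        changed.append(src.name)
    return changed


def reload_if_changed(changed: list[str], run=subprocess.run) -> bool:
    """Run `systemctl daemon-reload` exactly once, and only if something changed."""
    if not changed:
        return False
    run(["systemctl", "daemon-reload"], check=True)
    return True
```

Passing `run` as a parameter keeps the reload step testable without touching systemd on the machine running the tests.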

Fix _service_registry matchers to accept systemd-style /usr/local/bin/decnet
cmdlines (psutil returns a list — join to string before substring-checking)
so agent-mode `decnet status` reports collector/prober/sniffer correctly.
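
To show why the old matchers missed systemd-launched services, here is a small standalone comparison. The argv lists and the interpreter path are illustrative assumptions, not captured from a real host; `matches` mirrors the idea of the new `_matches` helper rather than reproducing it.

```python
from typing import Callable, Sequence, Union

Cmdline = Union[str, Sequence[str]]


def matches(sub: str, extras: tuple[str, ...] = ()) -> Callable[[Cmdline], bool]:
    """Build a predicate over a process cmdline (argv list or plain string)."""
    def check(cmd: Cmdline) -> bool:
        joined = cmd if isinstance(cmd, str) else " ".join(cmd)
        return "decnet" in joined and sub in joined and all(e in joined for e in extras)
    return check


def old_collector_check(cmd: Cmdline) -> bool:
    """Pre-fix matcher: on a list, `in` tests whole tokens, not substrings."""
    return "decnet.cli" in cmd and "collect" in cmd


# Illustrative argv lists (assumed shapes).
systemd_cmd = ["/opt/venv/bin/python", "/usr/local/bin/decnet", "collect", "--daemon"]
module_cmd = ["/opt/venv/bin/python", "-m", "decnet.cli", "collect", "--daemon"]

print(old_collector_check(systemd_cmd), matches("collect")(systemd_cmd))
print(old_collector_check(module_cmd), matches("collect")(module_cmd))
```

Substring matching on the joined string is deliberately loose, so any process whose cmdline contains both `decnet` and the subcommand name will match.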
2026-04-19 19:19:17 -04:00
parent d2cf1e8b3a
commit 2bef3edb72
8 changed files with 56 additions and 169 deletions

.gitignore: 1 line changed

@@ -26,3 +26,4 @@ decnet.json
 .coverage
 .hypothesis/
 profiles/*
+tests/test_decnet.db*

GEMINI.md: 104 lines changed (file deleted)

@@ -1,104 +0,0 @@
# DECNET (Deception Network) Project Context
DECNET is a high-fidelity honeypot framework designed to deploy heterogeneous fleets of fake machines (called **deckies**) that appear as real hosts on a local network.
## Project Overview
- **Core Purpose:** To lure, profile, and log attacker interactions within a controlled, deceptive environment.
- **Key Technology:** Linux-native container networking (MACVLAN/IPvlan) combined with Docker to give each decoy its own MAC address, IP, and realistic TCP/IP stack behavior.
- **Main Components:**
  - **Deckies:** Group of containers sharing a network namespace (one base container + multiple service containers).
  - **Archetypes:** Pre-defined machine profiles (e.g., `windows-workstation`, `linux-server`) that bundle services and OS fingerprints.
  - **Services:** Modular honeypot plugins (SSH, SMB, RDP, etc.) built as `BaseService` subclasses.
  - **OS Fingerprinting:** Sysctl-based TCP/IP stack tuning to spoof OS detection (nmap).
  - **Logging Pipeline:** RFC 5424 syslog forwarding to an isolated SIEM/ELK stack.
## Technical Stack
- **Language:** Python 3.11+
- **CLI Framework:** [Typer](https://typer.tiangolo.com/)
- **Data Validation:** [Pydantic v2](https://docs.pydantic.dev/)
- **Orchestration:** Docker Engine 24+ (via Docker SDK for Python)
- **Networking:** MACVLAN (default) or IPvlan L2 (for WiFi/restricted environments).
- **Testing:** Pytest (100% pass requirement).
- **Formatting/Linting:** Ruff, Bandit (SAST), pip-audit.
## Architecture
```text
Host NIC (eth0)
  └── MACVLAN Bridge
        ├── Decky-01 (192.168.1.10) -> [Base] + [SSH] + [HTTP]
        ├── Decky-02 (192.168.1.11) -> [Base] + [SMB] + [RDP]
        └── ...
```
- **Base Container:** Owns the IP/MAC, sets `sysctls` for OS spoofing, and runs `sleep infinity`.
- **Service Containers:** Use `network_mode: service:<base>` to share the identity and networking of the base container.
- **Isolation:** Decoy traffic is strictly separated from the logging network.
## Key Commands
### Development & Maintenance
- **Install (Dev):**
  - `rm .venv -rf`
  - `python3 -m venv .venv`
  - `source .venv/bin/activate`
  - `pip install -e .`
- **Run Tests:** `pytest` (Run before any commit)
- **Linting:** `ruff check .`
- **Security Scan:** `bandit -r decnet/`
- **Web Git:** git.resacachile.cl (Gitea)
### CLI Usage
- **List Services:** `decnet services`
- **List Archetypes:** `decnet archetypes`
- **Dry Run (Compose Gen):** `decnet deploy --deckies 3 --randomize-services --dry-run`
- **Deploy (Full):** `sudo .venv/bin/decnet deploy --interface eth0 --deckies 5 --randomize-services`
- **Status:** `decnet status`
- **Teardown:** `sudo .venv/bin/decnet teardown --all`
## Development Conventions
- **Code Style:**
  - Strict adherence to Ruff/PEP8.
  - **Always use typed variables**. Any untyped variables that are found must be corrected.
  - The correct way is `x: int = 1`, never `x : int = 1`.
  - If assignment is present, always use a space between the type and the equal sign: `x: int = 1`.
  - **Never** use lowercase L (l), uppercase o (O) or uppercase i (I) as single-character names.
  - **Internal vars are to be declared with an underscore** (_internal_variable_name).
  - **Internal to internal vars are to be declared with double underscore** (__internal_variable_name).
  - Always use snake_case for code.
  - Always use PascalCase for classes and generics.
- **Testing:** New features MUST include a `pytest` case. 100% test pass rate is mandatory before merging.
- **Plugin System:**
  - New services go in `decnet/services/<name>.py`.
  - Subclass `decnet.services.base.BaseService`.
  - The registry uses auto-discovery; no manual registration required.
- **Configuration:**
  - Use Pydantic models in `decnet/config.py` for any new settings.
  - INI file parsing is handled in `decnet/ini_loader.py`.
- **State Management:**
  - Runtime state is persisted in `decnet-state.json`.
  - Do not modify this file manually.
- **General Development Guidelines**:
  - **Never** commit broken code, and never commit before running `pytest` and `bandit` at the project level.
  - **No matter how small** the changes, they must be committed.
  - **If new features are added**, new tests must be added, too.
  - **Never present broken code to the user**. Test, validate, then present.
  - **Extensive testing** for every function must be created.
  - **Always develop in the `dev` branch, never in `main`.**
  - **Test in the `testing` branch.**
  - **IMPORTANT**: The system now strictly enforces dependency injection for storage. Do not import `SQLiteRepository` directly in new features; instead, use `get_repository()` from the factory or the FastAPI `get_repo` dependency.
## Directory Structure
- `decnet/`: Main source code.
  - `services/`: Honeypot service implementations.
  - `logging/`: Syslog formatting and forwarding logic.
  - `correlation/`: (In Progress) Logic for grouping attacker events.
- `templates/`: Dockerfiles and entrypoint scripts for services.
- `tests/`: Pytest suite.
- `pyproject.toml`: Dependency and entry point definitions.
- `CLAUDE.md`: Claude-specific environment guidance.
- `DEVELOPMENT.md`: Roadmap and TODOs.


@@ -1163,30 +1163,45 @@ def _service_registry(log_file: str) -> list[tuple[str, callable, list[str]]]:
     import sys
     _py = sys.executable
+    # On agents these run as systemd units invoking /usr/local/bin/decnet,
+    # which doesn't include "decnet.cli" in its cmdline. On master dev boxes
+    # they're launched via `python -m decnet.cli`. Match either form: cmd
+    # is a list of argv tokens, so join it to a string before substring-checking.
+    def _matches(sub: str, extras: tuple[str, ...] = ()):
+        def _check(cmd) -> bool:
+            joined = " ".join(cmd) if not isinstance(cmd, str) else cmd
+            if "decnet" not in joined:
+                return False
+            if sub not in joined:
+                return False
+            return all(e in joined for e in extras)
+        return _check
     return [
         (
             "Collector",
-            lambda cmd: "decnet.cli" in cmd and "collect" in cmd,
+            _matches("collect"),
             [_py, "-m", "decnet.cli", "collect", "--daemon", "--log-file", log_file],
         ),
         (
             "Mutator",
-            lambda cmd: "decnet.cli" in cmd and "mutate" in cmd and "--watch" in cmd,
+            _matches("mutate", ("--watch",)),
             [_py, "-m", "decnet.cli", "mutate", "--daemon", "--watch"],
         ),
         (
             "Prober",
-            lambda cmd: "decnet.cli" in cmd and "probe" in cmd,
+            _matches("probe"),
             [_py, "-m", "decnet.cli", "probe", "--daemon", "--log-file", log_file],
         ),
         (
             "Profiler",
-            lambda cmd: "decnet.cli" in cmd and "profiler" in cmd,
+            _matches("profiler"),
             [_py, "-m", "decnet.cli", "profiler", "--daemon"],
         ),
         (
             "Sniffer",
-            lambda cmd: "decnet.cli" in cmd and "sniffer" in cmd,
+            _matches("sniffer"),
             [_py, "-m", "decnet.cli", "sniffer", "--daemon", "--log-file", log_file],
         ),
         (
@@ -1323,11 +1338,11 @@ def status() -> None:
     _status()
     registry = _service_registry(str(DECNET_INGEST_LOG_FILE))
-    # On agents, the Mutator runs master-side only (it schedules decky
-    # respawns across the swarm) and the API is never shipped. Hide those
-    # rows so operators aren't chasing permanent DOWN entries.
+    # On agents, Mutator + Profiler are master-only (they need the master
+    # DB and orchestrate across the swarm), and the API is never shipped.
+    # Hide those rows so operators aren't chasing permanent DOWN entries.
     if _agent_mode_active():
-        registry = [r for r in registry if r[0] not in {"Mutator", "API"}]
+        registry = [r for r in registry if r[0] not in {"Mutator", "Profiler", "API"}]
     svc_table = Table(title="DECNET Services", show_lines=True)
     svc_table.add_column("Service", style="bold cyan")
     svc_table.add_column("Status")
@@ -1767,15 +1782,16 @@ def db_reset(
 # MASTER_ONLY when touching command registration.
 #
 # Worker-legitimate commands (NOT in these sets): agent, updater, forwarder,
-# status, collect, probe, profiler, sniffer. Agents run deckies locally and
-# should be able to inspect them + run the per-host microservices (collector
-# streams container logs, prober/profiler characterize attackers hitting
-# this host, sniffer captures traffic). Mutator stays master-only because
-# it orchestrates respawns across the swarm.
+# status, collect, probe, sniffer. Agents run deckies locally and should be
+# able to inspect them + run the per-host microservices (collector streams
+# container logs, prober characterizes attackers hitting this host, sniffer
+# captures traffic). Mutator and Profiler stay master-only: the mutator
+# orchestrates respawns across the swarm; the profiler rebuilds attacker
+# profiles against the master DB (no per-host DB exists).
 # ───────────────────────────────────────────────────────────────────────────
 MASTER_ONLY_COMMANDS: frozenset[str] = frozenset({
     "api", "swarmctl", "deploy", "redeploy", "teardown",
-    "mutate", "listener",
+    "mutate", "listener", "profiler",
     "services", "distros", "correlate", "archetypes", "web",
     "db-reset",
 })


@@ -243,7 +243,7 @@ UPDATER_SYSTEMD_UNIT = "decnet-updater.service"
 # without these units installed shouldn't abort the update.
 AUXILIARY_SYSTEMD_UNITS = (
     "decnet-collector.service", "decnet-prober.service",
-    "decnet-profiler.service", "decnet-sniffer.service",
+    "decnet-sniffer.service",
 )


@@ -63,19 +63,15 @@ _EXCLUDES: tuple[str, ...] = (
     "wiki-checkout", "wiki-checkout/*",
     # Frontend is master-only; agents never serve UI.
     "decnet_web", "decnet_web/*", "decnet_web/**",
-    # Master API surface. Agents ship with decnet.web.db + auth + dependencies
-    # (the profiler microservice needs the repo singleton), but the FastAPI
-    # app itself (api.py, swarm_api.py, the full router tree, the ingester,
-    # and the .j2 templates that the master renders into the tarball) has no
-    # business running on a worker.
-    "decnet/web/api.py",
-    "decnet/web/swarm_api.py",
-    "decnet/web/ingester.py",
-    "decnet/web/router", "decnet/web/router/*", "decnet/web/router/**",
-    "decnet/web/templates", "decnet/web/templates/*", "decnet/web/templates/**",
-    # Mutator is master-only (it schedules decky respawns across the swarm);
-    # agents never invoke it. Keep it off the worker.
+    # Master FastAPI app and everything under decnet/web/: no agent-side
+    # code imports it. The agent/updater/forwarder/collector/prober/sniffer
+    # entrypoints are all under decnet/agent, decnet/updater, decnet/swarm,
+    # decnet/collector, decnet/prober, decnet/sniffer.
+    "decnet/web", "decnet/web/*", "decnet/web/**",
+    # Mutator + Profiler are master-only (mutator schedules respawns across
+    # the swarm; profiler rebuilds attacker profiles against the master DB).
     "decnet/mutator", "decnet/mutator/*", "decnet/mutator/**",
+    "decnet/profiler", "decnet/profiler/*", "decnet/profiler/**",
     "decnet-state.json",
     "master.log", "master.json",
     "decnet.tar",
@@ -265,8 +261,10 @@ def _build_tarball(
 _SYSTEMD_UNITS = (
     "decnet-agent", "decnet-forwarder", "decnet-engine", "decnet-updater",
-    # Per-host microservices, activated by enroll_bootstrap.sh.
-    "decnet-collector", "decnet-prober", "decnet-profiler", "decnet-sniffer",
+    # Per-host microservices, activated by enroll_bootstrap.sh. The
+    # profiler intentionally stays master-side: it rebuilds attacker
+    # profiles against the master DB, which workers don't share.
+    "decnet-collector", "decnet-prober", "decnet-sniffer",
 )


@@ -1,20 +0,0 @@
[Unit]
Description=DECNET attacker profiler — {{ agent_name }}
Documentation=https://github.com/anti/DECNET
After=network-online.target decnet-agent.service
Wants=network-online.target
PartOf=decnet-agent.service
[Service]
Type=simple
WorkingDirectory=/opt/decnet
Environment=DECNET_MODE=agent
Environment=DECNET_SYSTEM_LOGS=/var/log/decnet/decnet.profiler.log
ExecStart=/usr/local/bin/decnet profiler --interval 30
Restart=on-failure
RestartSec=5
StandardOutput=append:/var/log/decnet/decnet.profiler.log
StandardError=append:/var/log/decnet/decnet.profiler.log
[Install]
WantedBy=multi-user.target


@@ -62,7 +62,7 @@ ln -sf "$VENV_DIR/bin/decnet" /usr/local/bin/decnet
 echo "[DECNET] installing systemd units..."
 for unit in \
     decnet-agent decnet-forwarder decnet-engine \
-    decnet-collector decnet-prober decnet-profiler decnet-sniffer; do
+    decnet-collector decnet-prober decnet-sniffer; do
     install -Dm0644 "etc/systemd/system/${unit}.service" "/etc/systemd/system/${unit}.service"
 done
 if [[ "$WITH_UPDATER" == "true" ]]; then
@@ -76,7 +76,7 @@ systemctl daemon-reload
 ACTIVE_UNITS=(
     decnet-agent.service decnet-forwarder.service
     decnet-collector.service decnet-prober.service
-    decnet-profiler.service decnet-sniffer.service
+    decnet-sniffer.service
 )
 if [[ "$WITH_UPDATER" == "true" ]]; then
     ACTIVE_UNITS+=(decnet-updater.service)


@@ -185,9 +185,10 @@ async def test_systemd_units_shipped_and_installed(client, auth_token):
     assert "etc/systemd/system/decnet-forwarder.service" in names
     assert "etc/systemd/system/decnet-engine.service" in names
     # Per-host microservices get their own systemd units now.
-    for unit in ("decnet-collector", "decnet-prober",
-                 "decnet-profiler", "decnet-sniffer"):
+    # Profiler is master-only (uses the master DB) and must NOT ship.
+    for unit in ("decnet-collector", "decnet-prober", "decnet-sniffer"):
         assert f"etc/systemd/system/{unit}.service" in names, unit
+    assert "etc/systemd/system/decnet-profiler.service" not in names
     fwd = tf.extractfile("etc/systemd/system/decnet-forwarder.service").read().decode()
     assert "--master-host 10.9.8.7" in fwd
@@ -206,7 +207,7 @@ async def test_systemd_units_shipped_and_installed(client, auth_token):
     for unit in (
         "decnet-agent.service", "decnet-forwarder.service",
         "decnet-collector.service", "decnet-prober.service",
-        "decnet-profiler.service", "decnet-sniffer.service",
+        "decnet-sniffer.service",
     ):
         assert unit in sh, unit
     assert "decnet-updater.service" in sh
@@ -307,18 +308,13 @@ async def test_get_tgz_contents(client, auth_token, tmp_path):
         assert not bad.endswith(".env"), f"leaked env file: {bad}"
         assert ".env.local" not in bad, f"leaked env file: {bad}"
         assert ".env.example" not in bad, f"leaked env file: {bad}"
-        # Master-only trees: agents don't run the FastAPI master app or the
-        # React frontend, so shipping them bloats the tarball and widens the
-        # worker's attack surface for no benefit. decnet/web/db and
-        # decnet/web/dependencies.py DO ship: the profiler microservice on
-        # the agent needs the repo singleton.
+        # Master-only trees: agents don't run the FastAPI master app, the
+        # React frontend, the mutator (swarm-wide respawn scheduler), or
+        # the profiler (rebuilds profiles against the master DB).
         assert not bad.startswith("decnet_web/"), f"leaked frontend: {bad}"
-        assert bad != "decnet/web/api.py", f"leaked master API: {bad}"
-        assert bad != "decnet/web/swarm_api.py", f"leaked swarm API: {bad}"
-        assert bad != "decnet/web/ingester.py", f"leaked ingester: {bad}"
-        assert not bad.startswith("decnet/web/router/"), f"leaked router: {bad}"
-        assert not bad.startswith("decnet/web/templates/"), f"leaked tpl: {bad}"
+        assert not bad.startswith("decnet/web/"), f"leaked master-api: {bad}"
         assert not bad.startswith("decnet/mutator/"), f"leaked mutator: {bad}"
+        assert not bad.startswith("decnet/profiler/"), f"leaked profiler: {bad}"
     # INI content is correct
     ini = tf.extractfile("etc/decnet/decnet.ini").read().decode()