docs: add Resource-Footprint page with real numbers from first VPS deploy

Document the disk/RAM/CPU footprint of a live deployment so anyone sizing a VPS for DECNET can see what to expect. Numbers are from the first Contabo deploy: 4.5 GiB disk, 2.1 GiB RAM, 0.03 load average, 12 containers, 12 workers, ~80 attackers in the first hour. Adds a "what scales with topology size" breakdown so operators can project from these numbers to their own target deployment, and a sizing-floor recommendation per deployment shape (UNIHOST small, UNIHOST medium, SWARM master, SWARM agent). Linked from the User docs section of the sidebar between Tailscale-Global-Deployment and MazeNET.
2026-04-27 23:06:21 -04:00
parent 3e9173bf41
commit 1855cfe7ae
2 changed files with 176 additions and 0 deletions
--- a/Resource-Footprint.md
+++ b/Resource-Footprint.md
@@ -0,0 +1,175 @@
+# Resource Footprint
+
+What a live DECNET deployment actually costs. Real numbers from a public
+VPS hunting attackers in the wild — not synthetic benchmarks.
+
+> If you want API throughput under simulated load, see
+> [Performance-Story](Performance-Story). This page is about the
+> "is my $4/mo VPS enough?" question.
+
+---
+
+## Reference deployment
+
+| Item | Value |
+|---|---|
+| Provider tier | Contabo VPS S — 4 vCPU, 8 GiB RAM, 200 GB SSD |
+| OS | Debian 12 (bookworm) |
+| Topology | 1 DMZ gateway + 3 internal deckies, mixed SSH/HTTP/FTP/Telnet |
+| Containers | 12 total (4 decoy bases + 8 service containers) |
+| Workers running | api, web, collector, ingester, prober, profiler, correlator, mutator, sniffer, bus, updater, agent |
+| Database | SQLite (default) |
+| Network | Public IP for decoys + Tailscale for management |
+| Uptime at measurement | ~1 hour, ~80 unique attacker IPs observed |
+
+---
+
+## Steady-state numbers
+
+```
+$ df -h /
+Filesystem      Size  Used Avail Use%
+/dev/sda1       145G  4.5G  140G   4%
+
+$ free -h
+               total        used        free
+Mem:           7.8Gi       2.1Gi       5.2Gi
+Swap:             0B          0B          0B
+
+$ uptime
+load average: 0.03, 0.03, 0.03
+```
+
+- **Disk: 4.5 GiB used.** That includes the OS (~1.5 GiB), the Docker
+  engine and image layers (~2 GiB across all decoy and service images),
+  the DECNET source tree and venv, the SQLite database, and one hour of
+  accumulated logs and PCAPs.
+- **RAM: 2.1 GiB used.** Twelve workers + twelve containers + Docker
+  daemon + system. Buff/cache: 786 MiB, available: 5.7 GiB.
+- **CPU: 0.03 load average across 1/5/15 minutes.** Spread over 4 vCPUs
+  that is roughly 0.75% of available capacity: an idle machine that
+  occasionally wakes up to log something.
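+
+The 0.75% figure is the load average divided by the vCPU count. A
+minimal sketch of that arithmetic follows; the function name is
+illustrative and not part of DECNET, and load average is only a rough
+proxy for utilisation because on Linux it also counts tasks waiting on
+I/O.
+
+```python
+def load_per_vcpu_percent(load_average: float, vcpus: int) -> float:
+    """Express a load average as a percentage of total vCPU capacity."""
+    if vcpus <= 0:
+        raise ValueError("vcpus must be positive")
+    return load_average / vcpus * 100
+
+
+# Reference deployment: load average 0.03 on a 4-vCPU box.
+print(round(load_per_vcpu_percent(0.03, 4), 2))  # 0.75
+```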
+
+---
+
+## What scales with topology size
+
+These are the levers that move the needle. Everything else is roughly
+fixed cost.
+
+| Component | Cost shape | Notes |
+|---|---|---|
+| Decoy base containers | ~5–15 MiB RAM each | `debian:bookworm-slim` running `sleep infinity`. Cheap. |
+| Service containers (sshd, ftpd, http, smb, telnet, …) | ~20–60 MiB RAM each | Real daemons, not emulators. SSH is the heaviest. |
+| Bridge networks | Negligible | One Linux bridge per LAN in the topology. |
+| Docker image layers | ~50–200 MiB per unique service image | Built once, shared across deckies that use the same service. |
+| SQLite database | ~10 MiB / 100k events | Compresses well; logs themselves are larger than the indexed rows. |
+| `/var/log/decnet/` | ~10–50 MiB / hour under attack | Bounded by the logrotate config shipped by `decnet init` (7 daily rotations, 100 MiB cap). |
+| PCAP captures (sniffer) | ~10–30 MiB / hour idle, much higher under flood | Sniffer-managed rotation; not in the master logrotate scope. |
+
+**Master workers** are a fixed cost regardless of topology size: the
+API, correlator, profiler, ingester, bus, and updater all have the same
+memory footprint whether you run 4 deckies or 40.
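+
+To project the table onto a different topology, multiply the
+per-container RAM ranges by the container counts. The sketch below does
+only that arithmetic; the function and constant names are illustrative
+and not part of DECNET, and the result covers topology containers only,
+not the fixed cost of the workers, the Docker daemon, or the OS.
+
+```python
+# Per-container RAM ranges in MiB, copied from the table above.
+DECOY_BASE_MIB = (5, 15)
+SERVICE_MIB = (20, 60)
+
+
+def project_container_ram_mib(decoy_bases: int, service_containers: int) -> tuple[int, int]:
+    """Return a (low, high) RAM estimate in MiB for the topology containers."""
+    low = decoy_bases * DECOY_BASE_MIB[0] + service_containers * SERVICE_MIB[0]
+    high = decoy_bases * DECOY_BASE_MIB[1] + service_containers * SERVICE_MIB[1]
+    return low, high
+
+
+# Reference deployment: 4 decoy bases + 8 service containers.
+print(project_container_ram_mib(4, 8))  # (180, 540)
+```
+
+If the table's ranges hold, the reference deployment's containers
+account for roughly 180–540 MiB of the 2.1 GiB in use; the rest is the
+fixed cost.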
+
+---
+
+## Sizing recommendations
+
+These are floors, not ceilings. Bigger is fine; go smaller and things
+start failing under attack load.
+
+### Single-host (UNIHOST) honeypot, small topology (≤8 deckies)
+
+- 2 vCPU
+- 2 GiB RAM
+- 20 GiB disk
+- Any cheap VPS tier ($4–6/mo at most providers)
+
+### Single-host, medium topology (9–32 deckies, attacker-rich environment)
+
+- 2–4 vCPU
+- 4 GiB RAM
+- 40 GiB disk
+
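+This is where we landed for the reference deployment: it has only four
+deckies, but it sits on a public IP and its 2.1 GiB of RAM in use is
+already above the small-topology floor. The extra RAM and disk leave
+headroom for attacker bursts and a SQLite database that's growing
+visibly.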
+
+### SWARM master, multiple worker hosts
+
+- 4 vCPU
+- 8 GiB RAM (the dashboard + correlator are the heaviest)
+- 80 GiB disk if MySQL backend; 40 GiB for SQLite
+- The master's resource shape is dominated by *log ingest rate from
+  workers* and the *total attacker count being correlated*, not by its
+  own decoys (it usually has none).
+
+### Workers / agents in SWARM
+
+Same as a UNIHOST — the per-host workload is identical. The bus, agent,
+and forwarder are tiny additions on top of the topology containers.
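+
+The floors above can be captured as a small lookup for checking a
+candidate VPS plan. This is a sketch only: the shape keys and the
+`meets_floor` helper are made up for illustration and are not part of
+DECNET, and the medium tier uses the lower bound of its 2–4 vCPU range.
+SWARM agents use the UNIHOST rows.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class Floor:
+    vcpu: int
+    ram_gib: int
+    disk_gib: int
+
+
+# Values copied from the sizing recommendations above (hypothetical keys).
+FLOORS = {
+    "unihost-small": Floor(vcpu=2, ram_gib=2, disk_gib=20),
+    "unihost-medium": Floor(vcpu=2, ram_gib=4, disk_gib=40),
+    "swarm-master-sqlite": Floor(vcpu=4, ram_gib=8, disk_gib=40),
+    "swarm-master-mysql": Floor(vcpu=4, ram_gib=8, disk_gib=80),
+}
+
+
+def meets_floor(shape: str, vcpu: int, ram_gib: float, disk_gib: float) -> bool:
+    """Return True if the plan meets or exceeds every floor for the shape."""
+    floor = FLOORS[shape]
+    return (
+        vcpu >= floor.vcpu
+        and ram_gib >= floor.ram_gib
+        and disk_gib >= floor.disk_gib
+    )
+
+
+# The reference VPS (4 vCPU, 8 GiB RAM, 200 GB disk) against the medium floor.
+print(meets_floor("unihost-medium", 4, 8, 200))  # True
+```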
+
+---
+
+## Where the disk actually goes
+
+Inspect on a deployed host:
+
+```sh
+docker system df              # Docker images / containers / volumes
+du -sh /var/log/decnet/       # Logs (logrotate-bounded)
+du -sh /var/lib/decnet/       # State + artifacts (canary quarantine, sniffer pcaps)
+du -sh /opt/decnet/           # Source tree + venv
+du -sh /opt/decnet/decnet.db  # SQLite, if applicable
+```
+
+For a live deployment under attack, expect roughly:
+
+```
+/var/log/decnet/    capped by logrotate (≤700 MiB at the default cap)
+/var/lib/decnet/    grows with attacker uploads (artifacts) and PCAPs
+/opt/decnet/        ~500 MiB — source tree + venv + node_modules for the dashboard build
+docker images       ~2 GiB — depends on which service archetypes you've built
+decnet.db           grows linearly with event count; ~10 MiB per 100k events
+```
+
+If `/var/lib/decnet/` grows uncontrollably, the usual cause is an
+attacker dropping files into a service container's quarantine
+bind-mount. That's working as designed (bounty material) but worth
+watching. Sniffer PCAPs live under the same path, so check those too.
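+
+The database and log figures above reduce to two pieces of arithmetic.
+The sketch below works them through; the names are illustrative and not
+part of DECNET, and it assumes the 100 MiB cap applies to each of the 7
+rotations, which is how the 700 MiB ceiling above reads.
+
+```python
+MIB_PER_100K_EVENTS = 10   # ~10 MiB per 100k events, from the figures above
+LOGROTATE_ROTATIONS = 7    # daily rotations kept by the shipped config
+LOGROTATE_CAP_MIB = 100    # assumed to be the size cap per rotation
+
+
+def db_size_mib(event_count: int) -> float:
+    """Estimate SQLite size, assuming linear growth with event count."""
+    return event_count / 100_000 * MIB_PER_100K_EVENTS
+
+
+def log_ceiling_mib() -> int:
+    """Upper bound on rotated logs under the default logrotate settings."""
+    return LOGROTATE_ROTATIONS * LOGROTATE_CAP_MIB
+
+
+print(db_size_mib(1_000_000))  # 100.0
+print(log_ceiling_mib())       # 700
+```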
+
+---
+
+## What does *not* show up in these numbers
+
+- **Sustained heavy attack traffic.** A target attracting brute-force
+  SSH from hundreds of IPs simultaneously will push CPU and ingest rate
+  noticeably. The ingester's batch flushes and SQLite write contention
+  are the first bottlenecks; see [Performance-Story](Performance-Story)
+  for the pipeline-tuning history.
+- **MySQL backend.** Switching to MySQL (`DECNET_DB_TYPE=mysql`)
+  trades local SQLite simplicity for ~200–400 MiB of extra RAM
+  (mysqld + connection pool) and a new failure mode (network DB).
+- **The sniffer under flood.** PCAP rotation is on the sniffer worker,
+  not on logrotate. A SYN flood or scan campaign can produce
+  multi-GiB captures fast if you don't cap the retention window. Tune
+  via the sniffer's environment variables (see
+  [Environment-Variables](Environment-Variables)).
+- **The MazeNET attacker pool.** Observed-only nodes are pure
+  visualisation, no extra runtime cost.
+
+---
+
+## Takeaway
+
+DECNET is comfortably hostable on the cheapest tier most VPS providers
+sell. The decoys are containers running real services, not heavy
+emulators; the workers are Python coroutines, not enterprise Java; the
+database default is SQLite. What grows with attacker volume is the
+SQLite event table, the logs, the sniffer's PCAPs, and uploaded
+artifacts. Logs are capped by logrotate and PCAP retention is tunable
+on the sniffer; the database grows linearly with event count, so it is
+the one to watch.
+
+If you're staring at a $4/mo VPS plan wondering whether to gamble it on
+DECNET: yes. There is a lot of headroom.
+
+See also: [Tailscale-Global-Deployment](Tailscale-Global-Deployment),
+[Deployment-Modes](Deployment-Modes), [Performance-Story](Performance-Story),
+[Logging-and-Syslog](Logging-and-Syslog).
--- a/_Sidebar.md
+++ b/_Sidebar.md
@@ -19,6 +19,7 @@
 - [Deployment-Modes](Deployment-Modes)
 - [SWARM-Mode](SWARM-Mode)
 - [Tailscale-Global-Deployment](Tailscale-Global-Deployment)
+- [Resource-Footprint](Resource-Footprint)
 - [MazeNET](MazeNET)
 - [Remote-Updates](Remote-Updates)
 - [Environment-Variables](Environment-Variables)