docs: add Resource-Footprint page with real numbers from first VPS deploy

Document the disk/RAM/CPU footprint of a live deployment so anyone
sizing a VPS for DECNET can see what to expect. Numbers are from the
first Contabo deploy: 4.5 GiB disk, 2.1 GiB RAM, 0.03 load average,
12 containers, 12 workers, ~80 attackers in the first hour.

Adds a "what scales with topology size" breakdown so operators can
project from these numbers to their own target deployment, and a
sizing-floor recommendation per deployment shape (UNIHOST small,
UNIHOST medium, SWARM master, SWARM agent).

Linked from the User docs section of the sidebar between
Tailscale-Global-Deployment and MazeNET.
2026-04-27 23:06:21 -04:00
parent 3e9173bf41
commit 1855cfe7ae
2 changed files with 176 additions and 0 deletions

Resource-Footprint.md Normal file

@@ -0,0 +1,175 @@
# Resource Footprint
What a live DECNET deployment actually costs. Real numbers from a public
VPS hunting attackers in the wild — not synthetic benchmarks.
> If you want API throughput under simulated load, see
> [Performance-Story](Performance-Story). This page is about the
> "is my $4/mo VPS enough?" question.
---
## Reference deployment
| Item | Value |
|---|---|
| Provider tier | Contabo VPS S — 4 vCPU, 8 GiB RAM, 200 GB SSD |
| OS | Debian 12 (bookworm) |
| Topology | 1 DMZ gateway + 3 internal deckies, mixed SSH/HTTP/FTP/Telnet |
| Containers | 12 total (4 decoy bases + 8 service containers) |
| Workers running | api, web, collector, ingester, prober, profiler, correlator, mutator, sniffer, bus, updater, agent |
| Database | SQLite (default) |
| Network | Public IP for decoys + Tailscale for management |
| Uptime at measurement | ~1 hour, ~80 unique attacker IPs observed |
---
## Steady-state numbers
```
$ df -h /
Filesystem      Size  Used Avail Use%
/dev/sda1       145G  4.5G  140G   4%
$ free -h
        total   used   free
Mem:    7.8Gi  2.1Gi  5.2Gi
Swap:      0B     0B     0B
$ uptime
load average: 0.03, 0.03, 0.03
```
- **Disk: 4.5 GiB used.** That includes the OS (~1.5 GiB), the Docker
engine and image layers (~2 GiB across all decoy and service images),
the DECNET source tree and venv, the SQLite database, and one hour of
accumulated logs and PCAPs.
- **RAM: 2.1 GiB used.** Twelve workers + twelve containers + Docker
daemon + system. Buff/cache: 786 MiB, available: 5.7 GiB.
- **CPU: 0.03 load average across 1/5/15 minutes.** On a 4-vCPU box that
is roughly 0.75% utilisation. This is an idle machine that occasionally
wakes up to log something.
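
The 0.75% figure is just the load average divided by the core count. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of DECNET. On Linux the load average also counts tasks blocked on I/O, so treat the result as a rough upper bound on CPU use rather than a measurement.

```python
def load_to_utilisation(load_avg: float, vcpus: int) -> float:
    """Approximate CPU utilisation (%) implied by a load average.

    Assumption: load is spread evenly over `vcpus` cores and is
    CPU-bound (Linux load also includes uninterruptible I/O waits).
    """
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return load_avg / vcpus * 100.0


# Reference deployment: load average 0.03 on a 4-vCPU box.
print(f"{load_to_utilisation(0.03, 4):.2f}%")
```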
---
## What scales with topology size
These are the levers that move the needle. Everything else is roughly
fixed cost.
| Component | Cost shape | Notes |
|---|---|---|
| Decoy base containers | ~5–15 MiB RAM each | `debian:bookworm-slim` running `sleep infinity`. Cheap. |
| Service containers (sshd, ftpd, http, smb, telnet, …) | ~20–60 MiB RAM each | Real daemons, not emulators. SSH is the heaviest. |
| Bridge networks | Negligible | One Linux bridge per LAN in the topology. |
| Docker image layers | ~50–200 MiB per unique service image | Built once, shared across deckies that use the same service. |
| SQLite database | ~10 MiB / 100k events | Compresses well; logs themselves are larger than the indexed rows. |
| `/var/log/decnet/` | ~10–50 MiB / hour under attack | Bounded by the logrotate config shipped by `decnet init` (7 daily rotations, 100 MiB cap). |
| PCAP captures (sniffer) | ~10–30 MiB / hour idle, much higher under flood | Sniffer-managed rotation; not in the master logrotate scope. |
**Master workers** are fixed cost regardless of topology size: the API,
correlator, profiler, ingester, bus, updater all have the same memory
footprint whether you run 4 deckies or 40.
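
To project the topology-dependent part of RAM for a planned deployment, the per-container figures can be multiplied out. The sketch below is a rough estimator, not DECNET code: it reads the table's ranges as 5 to 15 MiB per decoy base and 20 to 60 MiB per service container, and it deliberately leaves out the fixed cost of the master workers, the Docker daemon, and the OS, which this page does not break down.

```python
# Assumed per-container RAM ranges in MiB (low, high), read from the table above.
MIB_PER_DECOY_BASE = (5, 15)
MIB_PER_SERVICE = (20, 60)


def topology_ram_mib(deckies: int, services: int) -> tuple[int, int]:
    """Return (low, high) MiB of RAM for the topology containers only.

    Fixed costs (workers, Docker daemon, OS) are NOT included.
    """
    low = deckies * MIB_PER_DECOY_BASE[0] + services * MIB_PER_SERVICE[0]
    high = deckies * MIB_PER_DECOY_BASE[1] + services * MIB_PER_SERVICE[1]
    return low, high


# Reference deployment: 4 decoy bases + 8 service containers.
print(topology_ram_mib(4, 8))  # (180, 540)
```

For the reference deployment this gives 180 to 540 MiB, which suggests that most of the observed 2.1 GiB is fixed cost rather than topology.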
---
## Sizing recommendations
These are the floors, not the ceilings. Bigger is fine. Go smaller and
things start failing under attack load.
### Single-host (UNIHOST) honeypot, small topology (≤8 deckies)
- 2 vCPU
- 2 GiB RAM
- 20 GiB disk
- Any cheap VPS tier ($4–6/mo at most providers)
### Single-host, medium topology (8–32 deckies, attacker-rich environment)
- 2–4 vCPU
- 4 GiB RAM
- 40 GiB disk
This is where we landed for the reference deployment. Headroom for
attacker bursts and a SQLite database that's growing visibly.
### SWARM master, multiple worker hosts
- 4 vCPU
- 8 GiB RAM (the dashboard + correlator are the heaviest)
- 80 GiB disk if MySQL backend; 40 GiB for SQLite
- The master's resource shape is dominated by *log ingest rate from
workers* and the *total attacker count being correlated*, not by its
own decoys (it usually has none).
### Workers / agents in SWARM
Same as a UNIHOST — the per-host workload is identical. The bus, agent,
and forwarder are tiny additions on top of the topology containers.
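
The floors above can be written down as a small lookup table and checked against a candidate VPS plan. This is a hypothetical helper, not part of the DECNET CLI; the shape names are made up for illustration, and the medium tier uses the lower end of its vCPU range as the floor. SWARM agents reuse the UNIHOST entries.

```python
from typing import NamedTuple


class Floor(NamedTuple):
    vcpu: int
    ram_gib: int
    disk_gib: int


# Sizing floors from this page; the keys are illustrative names only.
FLOORS = {
    "unihost-small": Floor(vcpu=2, ram_gib=2, disk_gib=20),
    "unihost-medium": Floor(vcpu=2, ram_gib=4, disk_gib=40),  # low end of the vCPU range
    "swarm-master-sqlite": Floor(vcpu=4, ram_gib=8, disk_gib=40),
    "swarm-master-mysql": Floor(vcpu=4, ram_gib=8, disk_gib=80),
}


def shortfalls(shape: str, vcpu: int, ram_gib: float, disk_gib: float) -> list[str]:
    """List every resource where the plan is below the floor for `shape`."""
    floor = FLOORS[shape]
    have = (vcpu, ram_gib, disk_gib)
    return [
        f"{name}: have {h}, need >= {f}"
        for name, h, f in zip(Floor._fields, have, floor)
        if h < f
    ]


# A 4 vCPU / 8 GiB / 200 GB plan checked against the medium floor.
print(shortfalls("unihost-medium", 4, 8, 200))  # []
```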
---
## Where the disk actually goes
Inspect on a deployed host:
```sh
docker system df # Docker images / containers / volumes
du -sh /var/log/decnet/ # Logs (logrotate-bounded)
du -sh /var/lib/decnet/ # State + artifacts (canary quarantine, sniffer pcaps)
du -sh /opt/decnet/ # Source tree + venv
du -sh /opt/decnet/decnet.db # SQLite, if applicable
```
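
If you would rather collect the same numbers from a script (for example to feed a metrics job), a rough Python equivalent of the `du -sh` calls might look like the sketch below. It sums apparent file sizes, so results can differ slightly from `du`, which reports allocated blocks. The helper names are illustrative; the paths are the ones listed above.

```python
import os


def path_size_bytes(path: str) -> int:
    """Sum apparent file sizes under `path` (a file or a directory tree)."""
    if os.path.isfile(path):
        return os.lstat(path).st_size
    total = 0
    # os.walk yields nothing for a missing path and skips unreadable
    # directories, so absent locations simply report 0.
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def human(n: int) -> str:
    """Format a byte count using binary units, capped at GiB."""
    size = float(n)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


for p in ("/var/log/decnet/", "/var/lib/decnet/", "/opt/decnet/"):
    print(f"{human(path_size_bytes(p)):>12}  {p}")
```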
For a live deployment under attack, expect roughly:
```
/var/log/decnet/ capped by logrotate (≤700 MiB at the default cap)
/var/lib/decnet/ grows with attacker uploads (artifacts) and PCAPs
/opt/decnet/ ~500 MiB — source tree + venv + node_modules for the dashboard build
docker images ~2 GiB — depends on which service archetypes you've built
decnet.db grows linearly with event count; ~10 MiB per 100k events
```
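
Two of these figures are simple arithmetic and can be projected ahead of time. The sketch below assumes the defaults quoted on this page (7 rotations at a 100 MiB cap each, and roughly 10 MiB of SQLite per 100k events); the function names are illustrative, and real database growth will vary with event shape.

```python
def log_cap_mib(rotations: int = 7, cap_mib_per_file: int = 100) -> int:
    """Upper bound on /var/log/decnet/ under the logrotate defaults."""
    return rotations * cap_mib_per_file


def db_size_mib(events: int, mib_per_100k: float = 10.0) -> float:
    """Linear estimate of decnet.db size for a given event count."""
    return events / 100_000 * mib_per_100k


print(log_cap_mib())           # 700
print(db_size_mib(1_000_000))  # 100.0
```

At a million events the database is still only about 100 MiB, so on the small tiers disk pressure is more likely to come from PCAPs and attacker uploads than from SQLite.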
If `/var/lib/decnet/` grows uncontrollably, an attacker is dropping
files into a service container's quarantine bind-mount. That's working
as designed — bounty material — but worth watching.
---
## What does *not* show up in these numbers
- **Sustained heavy attack traffic.** A target attracting brute-force
SSH from hundreds of IPs simultaneously will push CPU and ingest rate
noticeably. The ingester's batch flushes and SQLite write contention
are the first bottlenecks; see [Performance-Story](Performance-Story)
for the pipeline-tuning history.
- **MySQL backend.** Switching to MySQL (`DECNET_DB_TYPE=mysql`)
trades local SQLite simplicity for ~200–400 MiB of extra RAM
(mysqld + connection pool) and a new failure mode (network DB).
- **The sniffer under flood.** PCAP rotation is on the sniffer worker,
not on logrotate. A SYN flood or scan campaign can produce
multi-GiB captures fast if you don't cap the retention window. Tune
via the sniffer's environment variables (see
[Environment-Variables](Environment-Variables)).
- **The MazeNET attacker pool.** Observed-only nodes are pure
visualisation, no extra runtime cost.
---
## Takeaway
DECNET is comfortably hostable on the cheapest tier most VPS providers
sell. The decoys are containers running real services, not heavy
emulators; the workers are Python coroutines, not enterprise Java; the
database default is SQLite. The only things that grow with attacker
volume are the SQLite event table and the logs, both bounded by
configurable caps.
If you're staring at a $4/mo VPS plan wondering whether to gamble it on
DECNET: yes. There is a lot of headroom.
See also: [Tailscale-Global-Deployment](Tailscale-Global-Deployment),
[Deployment-Modes](Deployment-Modes), [Performance-Story](Performance-Story),
[Logging-and-Syslog](Logging-and-Syslog).

@@ -19,6 +19,7 @@
- [Deployment-Modes](Deployment-Modes)
- [SWARM-Mode](SWARM-Mode)
- [Tailscale-Global-Deployment](Tailscale-Global-Deployment)
- [Resource-Footprint](Resource-Footprint)
- [MazeNET](MazeNET)
- [Remote-Updates](Remote-Updates)
- [Environment-Variables](Environment-Variables)