From 189a4cd383a04998e6bf3454eaf6e854e95caacc Mon Sep 17 00:00:00 2001 From: anti Date: Mon, 20 Apr 2026 17:01:18 -0400 Subject: [PATCH] docs(mazenet): add MazeNET wiki page + sidebar entry --- MazeNET.md | 373 ++++++++++++++++++++++++++++++++++++++++++++++++++++ _Sidebar.md | 1 + 2 files changed, 374 insertions(+) create mode 100644 MazeNET.md diff --git a/MazeNET.md b/MazeNET.md new file mode 100644 index 0000000..637a3ea --- /dev/null +++ b/MazeNET.md @@ -0,0 +1,373 @@ +# MazeNET — Nested Network of Networks + +MazeNET is DECNET's recursive deception topology. Instead of a flat fleet +of deckies on one LAN, MazeNET produces a **DAG of segmented LANs** with +multi-homed "bridge deckies" forwarding L3 between them — a DMZ at the +edge, internal segments behind it, and, with cross-edges enabled, pivot +paths a patient attacker can chase deeper into the maze. + +A flat deployment burns an attacker for minutes. A nested topology burns +them for hours. + +See also: [CLI reference](CLI-Reference), +[Deployment modes](Deployment-Modes), +[Networking: MACVLAN and IPVLAN](Networking-MACVLAN-IPVLAN), +[Teardown](Teardown-and-State), +[Logging and syslog](Logging-and-Syslog). + +--- + +## When to use MazeNET + +Use MazeNET when: + +- You are deploying on a VPS or a single dedicated box and want the + appearance of a **segmented internal network** (DMZ → services → + internal) behind the public IP. +- You want attackers to **pivot** — discover one decky, enumerate it, + find a foothold into a deeper LAN, repeat. +- You want per-LAN isolation so a compromise of one decky can't reach + sibling segments without going through a bridge you control. + +Stick with flat UNIHOST mode (see [Deployment +Modes](Deployment-Modes)) when: + +- You only need a handful of deckies on a shared LAN. +- You want LAN-peer realism (deckies reachable from your existing + workstations over MACVLAN/IPVLAN). MazeNET uses plain Docker bridge + networks — it does not hand IPs out onto your real LAN. 
+ +--- + +## Concepts + +### LAN + +A MazeNET LAN is one plain Docker bridge network. Each LAN has a +`/24` subnet carved from the configured base prefix (default +`172.20.X.0/24`, one LAN per octet). LANs are arranged as a tree of +configurable depth and branching factor. + +- **LAN-00** is always the DMZ root, `is_dmz=True`, publicly routable + via the host's default bridge egress. +- Every other LAN is created with Docker's `--internal` flag — no + host-level default egress. The only way out is through a bridge + decky. + +### Decky + +Same concept as flat mode: a "base" container holds the LAN IPs and +any service containers share its network namespace via +`network_mode: service:`. One base, N service containers. + +In MazeNET a decky is identified by a UUID (table `topology_deckies`) +and is scoped to one topology. Names are unique **within** a topology; +two different topologies can both have a `decky-001`. + +### Bridge decky + +A decky multi-homed onto ≥2 LANs. Every non-DMZ LAN has **exactly one +parent bridge** — a decky on that LAN that's also given an IP in the +parent LAN. That's how packets leave a segment. + +If `bridge_forward_probability` rolls true for that bridge, the base +container gets `net.ipv4.ip_forward=1` (compose-level sysctl) and +`NET_ADMIN`, turning the bridge into an actual router. If it rolls +false, the bridge is multi-homed but will not forward — the attacker +must find a forwarder or own the bridge container itself. + +### Cross-edges — tree vs DAG + +With `cross_edge_probability=0` (default) the topology is a pure tree: +each non-DMZ LAN has exactly one parent bridge and no other inter-LAN +connections. + +With `cross_edge_probability > 0` the generator rolls per LAN; on a +hit it multi-homes a random decky to a non-parent, non-child, non-self +peer LAN. This is where the DAG comes from. The data model and the +teardown path have supported DAGs from day one — cross-edges just +exercise the code that's already there. 
### Determinism

The generator is seeded (`TopologyConfig.seed`). Same seed + same
config ⇒ bit-identical LAN layout, decky names, service assignments,
and edges. Persistence stores the full config snapshot so you can
regenerate or audit exactly what was deployed.

---

## Status lifecycle

Every topology carries a `status` column with this state machine
(each state lists its legal successors):

```
pending      ──► deploying | torn_down
deploying    ──► active | failed | tearing_down
active       ──► degraded | tearing_down
degraded     ──► active | tearing_down
failed       ──► tearing_down
tearing_down ──► torn_down
torn_down    (terminal)
```

- `pending` — persisted plan, no Docker state yet.
- `deploying` — bridge networks being created, compose coming up.
- `active` — healthy and serving.
- `failed` — deploy aborted; partial state may remain on the daemon.
  Legal successor: `tearing_down`.
- `degraded` — **schema-reserved** for the future Healer. No v1 code
  path reaches it. Treat it as read-only.
- `tearing_down` — compose down + network removal in progress.
- `torn_down` — terminal. No legal successor.

Every transition writes a row to `topology_status_events`
(from/to/when/reason) — an audit log you can query later.

Illegal transitions raise `TopologyStatusError` from
`decnet.topology.status.assert_transition`. There is no `force`
escape hatch; transitions are enforced everywhere.

---

## Schema

Five new SQLModel tables live in `decnet/web/db/models.py`. They
coexist with `DeckyShard` (SWARM mode); flat/SWARM deployments do not
touch MazeNET tables and vice versa.

| Table | Purpose |
|---|---|
| `topologies` | One row per topology. Carries `status`, `config_snapshot` (the full `TopologyConfig` including `seed`), `created_at`, `status_changed_at`. |
| `lans` | One row per LAN. `subnet`, `is_dmz`, `(topology_id, name)` unique, `docker_network_id` populated at deploy. |
| `topology_deckies` | One row per decky.
UUID PK, `decky_config` blob holds `ips_by_lan` + `forwards_l3`, `(topology_id, name)` unique. | +| `topology_edges` | `(decky_uuid, lan_id)` membership. `is_bridge=True` iff the decky appears on ≥2 LANs; `forwards_l3` flag mirrored from the decky. | +| `topology_status_events` | Audit log — one row per status transition with `reason` text. | + +Repository methods land on the shared `SQLModelRepository` base, so +both SQLite and MySQL backends get them for free. Never import a +backend directly; use `get_repository()` (see [Database +Drivers](Database-Drivers)). + +--- + +## CLI walkthrough + +MazeNET commands live under `decnet topology`. The group is +**master-only** — hidden on agents via `MASTER_ONLY_GROUPS`. + +### 1. Generate a plan + +```bash +decnet topology generate \ + --name corp-decoy \ + --depth 3 \ + --branching 2 \ + --deckies-per-lan 1-3 \ + --cross-edge-p 0.15 \ + --seed 42 +``` + +Writes a new `topologies` row in `pending` status and all the LAN / +decky / edge children. No Docker calls, no containers. Prints: + +``` +Topology persisted as pending — id=9b1e... + LANs: 8 deckies: 14 edges: 16 +``` + +Flags: + +``` +--name Topology label. Required. +--depth <1..16> Max tree depth from the DMZ. +--branching <1..8> Max child LANs per non-leaf LAN. +--deckies-per-lan MIN-MAX Range per LAN, e.g. 1-3. +--bridge-forward-p 0..1 P(bridge forwards L3). default: 1.0 +--cross-edge-p 0..1 P(non-DMZ LAN adds a DAG cross-edge). default: 0.0 +--services a,b,c Fixed service set (bypasses --randomize-services). +--randomize-services Default: true. Pick 1–3 random services per decky. +--seed Deterministic RNG. Same seed ⇒ same topology. +``` + +### 2. List + +```bash +decnet topology list +``` + +Table of id, name, mode, status, created_at for every persisted +topology. Empty when there are none. + +### 3. Show + +```bash +decnet topology show 9b1e1234-5678-... 
+``` + +Structured text rendering — LAN-by-LAN, each LAN's deckies with IP, +services, and `(bridge, L3-forward)` tags where applicable. No ASCII +art; visual DAG rendering belongs in the web dashboard (see +[Web-Dashboard](Web-Dashboard)). + +Example: + +``` +corp-decoy id=9b1e1234-... status=pending mode=unihost + +LAN LAN-00 172.20.0.0/24 (DMZ) + • decky-001 172.20.0.2 svcs=ssh,http + +LAN LAN-01 172.20.1.0/24 + • decky-002 172.20.1.2 svcs=smb (bridge, L3-forward) + • decky-003 172.20.1.3 svcs=ftp + +LAN LAN-02 172.20.2.0/24 + • decky-002 172.20.2.2 svcs=smb (bridge, L3-forward) + • decky-004 172.20.2.2 svcs=rdp +... +``` + +### 4. Deploy + +```bash +sudo decnet topology deploy 9b1e1234-5678-... +``` + +Runs the engine deployer. For a `pending` topology: + +1. Transition to `deploying`. +2. Create one plain Docker bridge network per LAN + (`decnet_t__lan-NN`). DMZ LAN is regular; internal LANs + are created with `--internal`. +3. Write a per-topology compose file (`decnet-topology--compose.yml`). + Each decky's base lists every LAN it's on with a per-LAN + `ipv4_address`. Bridge deckies with `forwards_l3=True` get + `sysctls: {net.ipv4.ip_forward: 1}` + `cap_add: [NET_ADMIN]`. +4. `docker compose up --build -d` (with retry on transient errors). +5. Transition to `active`. + +On exception the topology is transitioned to `failed` with the error +text in the status event's `reason`. Partial Docker state is left in +place so you can tear it down cleanly. + +Dry-run mode writes the compose file and exits without touching +Docker or the topology's status: + +```bash +decnet topology deploy --dry-run +``` + +Use `--dry-run` to diff the compose output against a previous deploy +or to sanity-check the plan before committing networks. + +### 5. Teardown + +```bash +sudo decnet topology teardown 9b1e1234-5678-... +``` + +Legal from any of `active`, `degraded`, `failed`, or `deploying`. Runs: + +1. Transition to `tearing_down`. +2. 
`docker compose down --remove-orphans` (best effort — continues on
   failure so a half-deployed topology can still be cleaned).
3. Remove each LAN's Docker bridge network in **leaf-first** order
   (LAN names are BFS-numbered, so reverse-name order is a valid
   topological sort).
4. Delete the per-topology compose file.
5. Transition to `torn_down`.

`torn_down` is terminal. The repo row is kept for audit; to purge it
outright, call `repo.delete_topology_cascade(topology_id)` from code
(no CLI wrapper by design — deletes are destructive).

---

## What a deployed topology looks like on the host

```bash
# One bridge network per LAN, all prefixed decnet_t_<id>_.
docker network ls --filter name=decnet_t_

# Every decky base + its services as containers. Base containers are
# named decnet_t_<id>_decky-NNN; services share the base's netns.
docker ps --filter name=decnet_t_

# Inside a bridge decky's base, two interfaces (one per LAN).
docker exec decnet_t_abcd1234_decky-002 ip -br addr

# ip_forward enabled on L3 forwarders.
docker exec decnet_t_abcd1234_decky-002 sysctl net.ipv4.ip_forward
# net.ipv4.ip_forward = 1

# Ping a deep LAN decky from the DMZ. With L3 forwarders in between,
# this succeeds — the attacker can reach it too.
docker exec decnet_t_abcd1234_decky-001 ping -c1 172.20.3.2
```

Logs follow the standard DECNET pipeline. Each decky's service
containers write RFC 5424 to stdout; the host's `decnet collect`
worker tails `docker logs` and appends to
`DECNET_INGEST_LOG_FILE` — no changes to the collector are needed
for MazeNET. See [Logging and Syslog](Logging-and-Syslog).

---

## Known limitations (v1)

- **Single-host only.** MazeNET topologies do not span SWARM workers
  — no overlay networks, no VXLAN. One box, one maze. Cross-host
  topologies are phase 2.
- **No Healer.** `degraded` is schema-reserved but unreachable. A
  container crashing leaves the topology in `active` until you notice
  and tear down.
Reconciliation worker is phase 2.
- **No mutation.** Topologies are static after deploy. You cannot
  add/remove LANs or rewire bridges without a full teardown +
  regenerate. The Mutator ([Mutation and Randomization](Mutation-and-Randomization))
  does not touch MazeNET.
- **No per-hop latency shaping.** Bridge deckies forward at wire
  speed. `tc netem` per hop (to simulate WAN links) is phase 2.
- **No web UI yet.** Generate, list, show, deploy, teardown are all
  CLI. Dashboard integration — including the visual DAG — is on the
  roadmap (see [Roadmap](Roadmap-and-Known-Debt)).
- **IP base cap.** The default `172.20.X.0/24` base prefix caps a
  topology at 256 LANs (idx > 255 raises). That covers typical
  configurations, but a maxed-out tree (`depth=16, branching=8`
  allows far more than 256 LANs) can hit the cap, and don't set
  `subnet_base_prefix` to something tighter expecting the same
  headroom.

---

## Troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| `topology deploy` immediately raises `TopologyStatusError` | Topology is already `active`/`failed`/`torn_down` — deploy is only legal from `pending` | `decnet topology list` to check the status; run `teardown` first if appropriate |
| `teardown` raises `TopologyStatusError` | Already `torn_down`, or tried to tear down from `pending` | `pending` → `torn_down` is legal, but you must be in `pending`; if the row shows `torn_down` there's nothing to do |
| Deploy fails with `create_bridge_network` errors about subnet overlap | A previous deploy of the same topology left networks behind, or another topology used the same `172.20.X.0/24` | `docker network ls --filter name=decnet_t_` and remove stragglers by name; teardown is idempotent — run it again |
| Bridge decky can't forward packets between LANs | `forwards_l3` rolled false for this bridge | By design. Check `decnet topology show <id>` — non-forwarding bridges are tagged `(bridge)` without the `L3-forward` tag.
Regenerate with `--bridge-forward-p 1.0` if you want every bridge forwarding |
| Attacker can't reach deep LANs from DMZ | Intermediate bridge is not forwarding, or a LAN in the path is `--internal` with no forwarder in its direct parent | `docker exec <container> sysctl net.ipv4.ip_forward` should print `1` on every bridge along the path |
| Two topologies clash on LAN subnets | Both were generated with the default `subnet_base_prefix=172.20` | Changing `--seed` is not enough — regenerate one with a different base prefix via INI/config. The subnet base prefix is per-topology and must not overlap with anything else on the box |

---

## Where the code lives

| Module | Role |
|---|---|
| `decnet/topology/config.py` | `TopologyConfig` Pydantic model + dataclass records for the generator. |
| `decnet/topology/generator.py` | Deterministic plan generator. Tree first, then overlay cross-edges. |
| `decnet/topology/status.py` | `TopologyStatus` constants + `assert_transition` state machine. |
| `decnet/topology/persistence.py` | `persist`, `hydrate`, `transition_status` — repo adapter. |
| `decnet/topology/compose.py` | Per-topology compose-file generator. |
| `decnet/engine/deployer.py` | `deploy_topology`, `teardown_topology`, `_teardown_order`. |
| `decnet/cli/topology.py` | `decnet topology {generate,list,show,deploy,teardown}`. |
| `decnet/web/db/models.py` | Five MazeNET SQLModel tables + request DTOs. |
| `tests/topology/` | Generator determinism, status machine, persistence roundtrip, compose generation, deploy/failure paths, live docker e2e. |

Test coverage is enforced by the repo's 91%+ floor. Run
`pytest tests/topology/ -m "not live"` for the fast suite; add
`-m live` to exercise the Docker-daemon path (skipped on CI).
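The transition rules documented in the Status lifecycle section can be sketched as a plain adjacency table. This is an illustrative re-implementation, not the actual `decnet.topology.status` code; the `LEGAL` dict name is an assumption:

```python
# Sketch of the lifecycle check described under "Status lifecycle".
class TopologyStatusError(Exception):
    pass

LEGAL = {
    "pending":      {"deploying", "torn_down"},
    "deploying":    {"active", "failed", "tearing_down"},
    "active":       {"degraded", "tearing_down"},
    "degraded":     {"active", "tearing_down"},   # schema-reserved in v1
    "failed":       {"tearing_down"},
    "tearing_down": {"torn_down"},
    "torn_down":    set(),                        # terminal: no successors
}

def assert_transition(current: str, target: str) -> None:
    """Raise unless current -> target is a legal transition."""
    if target not in LEGAL.get(current, set()):
        raise TopologyStatusError(f"illegal transition {current} -> {target}")

assert_transition("pending", "deploying")         # fine
try:
    assert_transition("torn_down", "active")      # terminal state: must raise
except TopologyStatusError:
    pass
```

Note there is deliberately no `force` parameter in the sketch either — per the lifecycle section, every caller goes through the same check.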
diff --git a/_Sidebar.md b/_Sidebar.md index eb32020..658f701 100644 --- a/_Sidebar.md +++ b/_Sidebar.md @@ -18,6 +18,7 @@ - [Networking-MACVLAN-IPVLAN](Networking-MACVLAN-IPVLAN) - [Deployment-Modes](Deployment-Modes) - [SWARM-Mode](SWARM-Mode) +- [MazeNET](MazeNET) - [Remote-Updates](Remote-Updates) - [Environment-Variables](Environment-Variables) - [Teardown-and-State](Teardown-and-State)