docs(1.1): correct strategy — consolidation is primary lever (C3)

Measurement after C2: all 25 CLI modules transitively pull the SQLModel
ORM, and the table chain is a hub (one table loads the whole registry).
Most workers genuinely use the DB, so lazy imports only help the 2-3
DB-less workers. Consolidation (pay the 86MB floor once for the herd)
is the reliable ~600MB win — promoted from fallback to primary.
This commit is contained in:
2026-06-17 16:36:04 -04:00
parent 4e2b1cdaf3
commit 87eb986467

View File

@@ -1,10 +1,23 @@
# DECNET 1.1 — RAM / Process-Footprint Release # DECNET 1.1 — RAM / Process-Footprint Release
Predecessor: `v1.0.0`. Theme: cut the fleet resident set from **2.57 GB → target ~1.3 GB** Predecessor: `v1.0.0`. Theme: cut the fleet resident set from **2.57 GB → target ~1.0 GB**
with near-zero risk, then optionally further via worker consolidation. by paying the import floor once per process-group instead of once per worker.
Analysis & measurements: see [improvements.md](improvements.md). Analysis & measurements: see [improvements.md](improvements.md).
## ⚠️ Strategy correction (after C2 measurement)
Initial assumption: lazy imports would drop idle workers below the 86 MB floor → ~1.3 GB.
**Measurement disproved this.** All 25 CLI command modules transitively pull the SQLModel
ORM, and the 38 MB table chain is a *hub*: importing any one table loads the whole registry.
Most workers legitimately touch the DB, so they can't shed the ORM. Lazy registration only
helps the 2-3 genuinely **DB-less** workers (`forwarder`, `listener`, maybe `bus`).
**Therefore consolidation is the PRIMARY lever, not the fallback.** Paying the 86 MB floor
once for the idle herd (instead of ~8×) is the reliable ~600 MB win. C2/C3 stay as cheap
hygiene and to let the DB-less workers actually skip the ORM, but they are no longer the
headline.
## Why ## Why
18 long-running workers, ~2.57 GB resident. ~1.5 GB of that is the **same 86 MB import 18 long-running workers, ~2.57 GB resident. ~1.5 GB of that is the **same 86 MB import
@@ -27,12 +40,15 @@ the ORM into every worker. No production code imports `generate` from the packag
## Commit plan (incremental, one concern each) ## Commit plan (incremental, one concern each)
- [x] **C1 — docs.** `improvements.md` (analysis) + this release plan. - [x] **C1 — docs.** `improvements.md` (analysis) + this release plan.
- [ ] **C2 — lazy topology re-export.** PEP 562 `__getattr__` in `topology/__init__.py` - [x] **C2 — lazy topology re-export.** PEP 562 `__getattr__` in `topology/__init__.py`
so `generate` loads on access, not on package import. Public API unchanged. so `generate` loads on access, not on package import. Public API unchanged.
Test: `import decnet.cli` must NOT pull `decnet.web.db.models`. Re-measure floor. Guard test added. *Finding: correct, but ORM pull is pervasive (see correction above)
- [ ] **C3 — sweep remaining eager model pulls.** After C2, re-trace `import decnet.cli`; — C2 alone does not move the floor.*
defer any other command module that drags the ORM in for registration only. - [ ] **C3 — DB-less worker lazy registration (reduced scope).** Confirm `forwarder` /
Test: assert idle-worker floor stays under target. `listener` (and audit `bus`) never touch models at runtime; make ONLY their command
modules import-clean so those specific processes skip the 38 MB ORM. Skip the
DB-touching majority — they can't shed it. Test: those modules don't pull
`decnet.web.db.models`.
- [ ] **C4 — extract idle-herd coroutines.** Hoist the inline `_run()` closures - [ ] **C4 — extract idle-herd coroutines.** Hoist the inline `_run()` closures
(`webhook`, `canary`, `listener`, `forwarder`, `mutate`, `enrich`) into reusable (`webhook`, `canary`, `listener`, `forwarder`, `mutate`, `enrich`) into reusable
`async def run(bus, cfg)` in their packages, so they're hostable by a supervisor. `async def run(bus, cfg)` in their packages, so they're hostable by a supervisor.
@@ -59,7 +75,9 @@ the ORM into every worker. No production code imports `generate` from the packag
restart + `MemoryMax`. **Medium.** Verify on the live fleet before adopting; keep the restart + `MemoryMax`. **Medium.** Verify on the live fleet before adopting; keep the
individual units as the fallback. Do C2C4 first; C5 only if RAM still bites. individual units as the fallback. Do C2C4 first; C5 only if RAM still bites.
## Projected ## Projected (revised)
- C2C3 (import floor): 2.57 GB → **~1.3 GB**. Nearly free. - C2C3 (import floor): only the 2-3 DB-less workers shed the ORM. **~100 MB.** Cheap hygiene.
- C4C6 (consolidation): **~0.9 GB**. Costs process isolation. - **C4C5 (consolidation): the main event.** Idle herd ~8 × 86 MB → 1 × 86 MB ≈ **600 MB**.
- C6 (scapy merge): 3 × 76 MB → 1 × 76 MB ≈ **150 MB**.
- Total: 2.57 GB → **~1.0 GB**. The bulk comes from consolidation, which costs isolation.