Commit Graph

16 Commits

Author SHA1 Message Date
64610bf96e fix(tests): sync 4 tests to current production contracts
- SSH schema: add user + user_password fields (service extended post-test)
- TopologySummary: repo.get_topology() returns model now, not raw dict
- health live: tarpit_watcher added to get_background_tasks(), add to expected set
2026-05-10 06:48:42 -04:00
95593cb804 fix(test): access DeckyRow.uuid as attribute, not dict key 2026-05-10 05:36:07 -04:00
16e032b7a5 fix(test): access LANRow.id as attribute, not dict key 2026-05-10 05:26:49 -04:00
32eeb0c813 refactor(orchestrator): collapse decnet-emailgen.service into orchestrator
Stage 5 of the realism migration. Email generation is no longer a
separate worker / systemd unit / CLI subcommand — the orchestrator's
single tick loop covers SSH traffic, file plants, and email drops.
Going from 21 services to 20.

Worker:
- _one_tick rolls between traffic / file / email (45/45/10 weights).
  The 10% email weight at a 60s orchestrator interval produces ~one
  email per 10 minutes, close to the pre-collapse 5-minute cadence.
- get_driver_for(action) (stage 4) handles SSH vs Email dispatch.
- Quiet branches fall through so a (decky-set, persona-pool,
  mail-decky) shape that silences one branch doesn't waste the tick.
- Periodic prune covers both orchestrator_events and
  orchestrator_emails tables.

Deletions:
- deploy/decnet-emailgen.service.j2
- decnet/orchestrator/emailgen/worker.py
- decnet/cli/emailgen.py
- tests/orchestrator/emailgen/test_worker_integration.py

Renames (history-preserving):
- decnet/web/router/emailgen/ -> decnet/web/router/realism/
- tests/api/emailgen/        -> tests/api/realism/
- tests/cli/test_emailgen_*  -> tests/cli/test_realism_*

Public surface changes (clean break, pre-v1):
- API URL /api/v1/emailgen/personas -> /api/v1/realism/personas
- CLI `decnet emailgen import-personas` -> `decnet realism
  import-personas`. `decnet emailgen run` is gone — the orchestrator
  covers it.
- gating.py: emailgen master-only group replaced by realism.
- decnet-orchestrator.service.j2: DECNET_REALISM_* env block added.
- decnet.target: decnet-emailgen.service entry removed.
- frontend: PersonaGeneration.tsx fetches /realism/personas.
2026-04-27 16:33:04 -04:00
ee176a6f79 Revert "feat(mazenet): per-LAN swarm host pin"
This reverts commit 0d92170a57.
2026-04-25 03:26:19 -04:00
0d92170a57 feat(mazenet): per-LAN swarm host pin
Adds nullable LAN.host_uuid (FK swarm_hosts.uuid). Resolution order
when deploying a LAN: lan.host_uuid → topology.target_host_uuid →
master. A LAN is one Docker bridge so the bridge cannot span hosts;
this pin forces every decky in the LAN onto the named host.

LANCreateRequest / LANUpdateRequest accept host_uuid; both validate
that the host exists, returning 400 on unknown UUIDs. PATCH still
gated by the existing pending-only guard, so reassignment of a live
LAN is not yet possible (deferred to mutator support).

LANRow surfaces the field so the frontend can render per-host badges.
2026-04-25 03:04:23 -04:00
c214cdd7bb fix(api/topology): map duplicate-name IntegrityError to 409
POST /topologies raised a 500 with a raw SQLAlchemy IntegrityError
traceback when the name collided with an existing topology. Catch
the error at the router, verify it's the ix_topologies_name
constraint (so unrelated integrity failures still surface as 500s
with their real traceback), and return 409 with a helpful detail.

Test covers the create-then-duplicate-create flow.
2026-04-24 19:06:37 -04:00
162f7c1194 feat(api/sse): per-user connection cap + viewer-safe invariant
New decnet/web/sse_limits.py provides sse_connection_slot, an async
context manager that counts live SSE connections per user UUID and
raises 429 when a per-user cap is exceeded (default 5, override via
DECNET_SSE_MAX_PER_USER). Wired into both SSE generators as their
first async with, so the cap check fires before any stream data is
yielded.

The cap must sit inside the generator — StreamingResponse returns
before the generator body runs, so a handler-level wrapper would
release the slot immediately. Put prefetch + slot + loop all under
the one async with.

Also documents F6/I (role leakage) as mitigated-by-construction via
handler docstrings: every event type on both streams wraps data
already reachable via viewer-gated REST, so no per-event filter is
needed until a new event family is introduced. The invariant is
written into the handler docstrings so a future PR can't silently
add admin-only events.

Resolves THREAT_MODEL F6/I and F6/D.
2026-04-24 15:01:20 -04:00
1968f6e741 test(mutator,web): cover bus publishes, bus-wake, and SSE events route
- tests/topology/test_mutator.py: reconcile_topologies publishes
  applying+applied on success, applying+failed+status on failure; and
  stays safe when bus=None. _wake_on_enqueue sets its asyncio.Event
  on every matching enqueue event.
- tests/api/topology/test_mutations.py: POST /mutations publishes
  mutation.enqueued after a successful DB write, via a FakeBus
  injected in place of the app-wide bus singleton.
- tests/api/topology/test_events_stream.py: SSE route returns 401
  unauthenticated, 404 for unknown topologies, and (driving the
  async generator directly) emits a snapshot on connect plus
  forwards a published mutation.applied as an `event: mutation.applied`
  SSE frame.
2026-04-21 14:39:12 -04:00
12e18b75db feat(swarm): expose needs_resync on TopologySummary + upsert record_error
Two small observability follow-ups to the phase-1 agent/topology wiring:

TopologySummary now carries needs_resync so operators can see the
heartbeat's resync flag via the topology list/detail API without
dropping into the DB.

TopologyStore.record_error becomes an upsert — when a docker/compose
failure fires during the first materialise (put() never reached), we
still land a marker row so GET /topology/state surfaces the error and
the next heartbeat carries an empty applied_version_hash. That empty
hash is what master's heartbeat check relies on to flag the topology
for resync instead of assuming the apply succeeded.
2026-04-21 01:41:30 -04:00
5a0cf5d7c8 feat(topology): add target_host_uuid to pin topologies to swarm agents
Adds the `target_host_uuid` FK on `Topology` plus wiring through the
two create endpoints (`POST /topologies`, `POST /topologies/blank`).
Validates the mode/host pair: `mode='agent'` now requires a known,
routable host; `mode='unihost'` must leave the field unset.
Surfaced on `TopologySummary` so list/detail responses expose it.
Purely additive at the schema level — existing unihost flows unchanged
(field defaults to `NULL`).

Step 1 of the agent <-> topology integration.
2026-04-21 01:19:45 -04:00
d06b04221f feat(api/topology): live mutation queue endpoints (POST/GET /mutations) 2026-04-20 19:38:55 -04:00
ff0b2efbb0 feat(api/topology): pending-only child CRUD for LANs, deckies, edges 2026-04-20 19:37:16 -04:00
999113e3c3 feat(api/topology): POST/DELETE/deploy endpoints for MazeNET topologies 2026-04-20 19:34:35 -04:00
f182c98ffa feat(api): phase 3 step 2 — topology read endpoints (list/get/status/catalog)
GET /api/v1/topologies — paginated list with status filter. Extends
repo.list_topologies() to accept limit/offset and adds count_topologies()
for the total envelope field.

GET /api/v1/topologies/{id} — hydrated TopologyDetail; 404 if missing.
GET /api/v1/topologies/{id}/status-events — audit trail, limit-capped.

Catalog helpers for the phase-4 canvas UI:
* GET /topologies/services — full service catalog.
* GET /topologies/next-subnet?base=172.20 — wraps SubnetAllocator against
  reserved_subnets across non-torn-down topologies.
* GET /topologies/{id}/lans/{lan_id}/next-ip — IPAllocator pre-seeded
  with existing decky IPs in that LAN.

All read routes are viewer-or-admin. Sub-routers are included in an
order that keeps literal catalog paths (/services, /next-subnet) from
being shadowed by the /{topology_id} trie branch.
2026-04-20 18:25:33 -04:00
2379b2aeda feat(api): phase 3 step 1 — topology request/response models + router skeleton
Add Pydantic DTOs in decnet/web/db/models.py covering every phase-3
endpoint shape: TopologyGenerateRequest, TopologySummary/Detail, child
create/update requests, MutationEnqueueRequest (Literal op guard),
MutationRow with JSON-payload decoder, validation/version/not-editable
error envelopes, and the three catalog responses.

Create decnet/web/router/topology/ as an import-safe package exporting
topology_router (prefix /topologies) — sub-routers land step-by-step in
subsequent commits. Mount under the main api router alongside swarm_mgmt.

tests/api/topology/test_models.py pins repo-dict ↔ DTO parity so future
repo-row drift breaks the contract test before the endpoints.
2026-04-20 18:16:30 -04:00