DECNET/decnet/realism/planner.py
anti cb1872c52f feat(realism): synthetic_files table + planner wiring + scheduler swap
Stage 3 of the realism migration. Replaces orchestrator/scheduler.py's
hardcoded _FILE_TEMPLATES/_USERS (3 templates emitting epoch-suffixed
filenames like notes-1777315854.txt with identical bodies per
template) with a persona-driven realism engine.

New surface:

- SyntheticFile SQLModel (synthetic_files table, UNIQUE on
  decky_uuid+path) — per-(decky, path) state for the future
  edit-in-place flow. Pre-v1, no _migrate_* helper.
- BaseRepository methods: record_synthetic_file,
  update_synthetic_file, list_synthetic_files,
  pick_random_synthetic_file_for_edit (used by stage 3b).
- realism/naming.py: per-content-class filename templates,
  persona-conditioned. /var/log/cron.log + logrotate skeleton for
  system-class; /home/<persona>/TODO.md, scratch.md, etc. for
  user-class. Anti-regression test pins "no 8+ digit decimals in
  basenames" (the realism failure today).
- realism/bodies.py: deterministic body templates per content_class.
  TODO body uses checkbox markdown, script body has a shebang, cron
  body matches syslog cron shape ("CRON[PID]: (user) CMD (...)").
- realism/planner.py: pick(deckies, now, rand=rng) returns a Plan, or
  None when no persona is inside its work window. Diurnal-gated,
  weighted user/system content split (70/30 user bias). Create-only
  in stage 3; edit branch lands in stage 3b.
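
The filename anti-regression rule above can be sketched as a small check. This is an illustrative sketch, not the project's actual test: the helper name has_epoch_like_run and the reading of "8+ digit decimals" as any run of eight or more ASCII digits are assumptions. Only the rule itself comes from the notes above.

```python
import posixpath
import re

# Assumption: "8+ digit decimals" means a run of eight or more ASCII digits.
_EPOCH_LIKE = re.compile(r"[0-9]{8,}")


def has_epoch_like_run(path: str) -> bool:
    """Return True if the basename of ``path`` contains 8+ consecutive digits.

    Only the final path component is inspected, so digits in parent
    directories do not count.
    """
    return _EPOCH_LIKE.search(posixpath.basename(path)) is not None
```

A test in this style would assert that the check is False for every path the naming module produces, and True for the old epoch-suffixed shape such as notes-1777315854.txt.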

Scheduler split:

- scheduler.pick is now traffic-only (sync).
- scheduler.pick_file is async, takes a repo, resolves personas
  (Topology.email_personas for topology-source deckies; global
  realism.personas_pool otherwise), and maps Plan -> FileAction.
- FileAction gains persona/content_class/mtime fields.
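
The persona resolution step that pick_file performs before calling the planner might look roughly like the sketch below. The function name attach_personas and the shape of topology_personas (a mapping from decky uuid to that topology's personas) are hypothetical; the only names taken from the source are the decky "uuid" key and the _realism_personas key the planner reads.

```python
from typing import Any, Mapping, Sequence


def attach_personas(
    deckies: Sequence[Mapping[str, Any]],
    topology_personas: Mapping[str, Sequence[Any]],
    global_pool: Sequence[Any],
) -> list[dict[str, Any]]:
    """Return copies of ``deckies`` with ``_realism_personas`` filled in.

    Assumption: a decky counts as topology-sourced when its uuid appears
    in ``topology_personas``; every other decky falls back to the global
    pool. The input dicts are not mutated.
    """
    resolved: list[dict[str, Any]] = []
    for decky in deckies:
        personas = topology_personas.get(decky["uuid"])
        if personas is None:
            personas = global_pool
        resolved.append({**decky, "_realism_personas": list(personas)})
    return resolved
```

Resolving personas outside the planner keeps pick free of any repository or topology dependency, which is the split the planner's module docstring describes.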

Worker:

- _one_tick rolls 50/50 between traffic and file each tick. After a
  successful FileAction plant, _record_synthetic_file persists or
  patches the synthetic_files row (catching the unique-constraint
  collision on re-plant of the same path).
- SSHDriver._run_file passes action.mtime through to plant_file so
  files don't all stamp at wall-clock-now.
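
The persist-or-patch behaviour on a re-plant can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the repository's SQLModel code: the column set (only decky_uuid, path, mtime), the helper names, and the string return values are assumptions. The UNIQUE constraint on decky_uuid+path is the one described above.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Assumed minimal schema; the real table likely carries more columns.
    conn.execute(
        "CREATE TABLE synthetic_files ("
        " decky_uuid TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " mtime TEXT NOT NULL,"
        " UNIQUE (decky_uuid, path))"
    )


def record_or_patch(
    conn: sqlite3.Connection, decky_uuid: str, path: str, mtime: str
) -> str:
    """Insert a row, or update the existing one if (decky_uuid, path) exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO synthetic_files (decky_uuid, path, mtime)"
                " VALUES (?, ?, ?)",
                (decky_uuid, path, mtime),
            )
        return "recorded"
    except sqlite3.IntegrityError:
        # Unique-constraint collision: the path was planted before.
        with conn:
            conn.execute(
                "UPDATE synthetic_files SET mtime = ?"
                " WHERE decky_uuid = ? AND path = ?",
                (mtime, decky_uuid, path),
            )
        return "patched"
```

Catching the constraint violation rather than querying first avoids a read-then-write race if two ticks ever plant the same path.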
2026-04-27 16:22:07 -04:00

133 lines
4.5 KiB
Python
"""Realism planner: picks the next ``(decky, persona, class, action)`` tuple.

Stage 3: returns ``create``-only plans (the edit branch lands in
stage 3b). Pure-function, deterministic given the same inputs:
caller passes deckies (with personas pre-resolved on each row),
``now``, and an RNG.

The persona resolution split (topology-pool vs. global-pool) is
the orchestrator's job, not the planner's. Each decky dict reaching
:func:`pick` carries a ``_realism_personas`` key with the resolved
:class:`~decnet.realism.personas.EmailPersona` list. This keeps the
planner test-isolated and avoids forcing it to know about the
:class:`~decnet.web.db.repository.BaseRepository` / topology pool /
global pool.

Diurnal gating uses :func:`decnet.realism.diurnal.in_work_hours` per
persona; we filter the (decky, persona) pairs *before* picking, so a
persona outside its window is never considered.
"""
from __future__ import annotations

import secrets
from datetime import datetime
from typing import Any, Optional, Sequence

from decnet.realism import bodies, naming
from decnet.realism.diurnal import in_work_hours, sample_mtime
from decnet.realism.personas import EmailPersona
from decnet.realism.taxonomy import ContentClass, Plan

# Stage-3 weighted sampling:
#   * User content (notes/todo/draft/script) gets the bulk; those are
#     the realism win when a persona "looks busy."
#   * System content (cron/daemon/cache) is plausible filler.
#   * Email + canary are owned by other paths and not picked here.
_USER_CLASS_WEIGHTS: tuple[tuple[ContentClass, int], ...] = (
    (ContentClass.NOTE, 30),
    (ContentClass.TODO, 20),
    (ContentClass.DRAFT, 15),
    (ContentClass.SCRIPT, 10),
)
_SYSTEM_CLASS_WEIGHTS: tuple[tuple[ContentClass, int], ...] = (
    (ContentClass.LOG_CRON, 12),
    (ContentClass.LOG_DAEMON, 8),
    (ContentClass.CACHE_TMP, 5),
)


def _weighted_pick(
    weights: tuple[tuple[ContentClass, int], ...],
    rng: secrets.SystemRandom,
) -> ContentClass:
    total = sum(w for _, w in weights)
    target = rng.randint(1, total)
    running = 0
    for cls, w in weights:
        running += w
        if target <= running:
            return cls
    return weights[-1][0]  # unreachable, satisfy mypy


def _eligible_pairs(
    deckies: Sequence[dict[str, Any]],
    now: datetime,
) -> list[tuple[dict[str, Any], EmailPersona]]:
    """Cross-product of deckies × resolved personas, diurnal-filtered.

    A decky with no personas (empty ``_realism_personas``) is skipped
    entirely; same fail-quiet semantics as the emailgen scheduler.
    """
    out: list[tuple[dict[str, Any], EmailPersona]] = []
    for decky in deckies:
        personas: list[EmailPersona] = decky.get("_realism_personas") or []
        for persona in personas:
            if in_work_hours(persona.active_hours, now):
                out.append((decky, persona))
    return out


def pick(
    deckies: Sequence[dict[str, Any]],
    now: datetime,
    *,
    rand: Optional[secrets.SystemRandom] = None,
) -> Optional[Plan]:
    """Return a single :class:`Plan` for the orchestrator's tick.

    Stage-3 policy: create-only. Stage 3b extends with the
    create/edit/leave roll and the synthetic_files lookup for edits.

    Returns ``None`` when no eligible (decky, persona) pair exists;
    the orchestrator treats that as "skip this tick" the same way the
    pre-realism scheduler did.
    """
    rng = rand or secrets.SystemRandom()
    eligible = _eligible_pairs(deckies, now)
    if not eligible:
        return None

    decky, persona = rng.choice(eligible)

    # User vs system content: biased toward user (realism wins are
    # bigger there). Once stage 3b ships edit-in-place, the edit
    # branch will reuse the same content_class as the existing row;
    # the create branch picks fresh here.
    if rng.random() < 0.7:
        content_class = _weighted_pick(_USER_CLASS_WEIGHTS, rng)
    else:
        content_class = _weighted_pick(_SYSTEM_CLASS_WEIGHTS, rng)

    target_path = naming.make_path(content_class, persona.name, rand=rng)
    body_hint = bodies.make_body(content_class, persona.name, rand=rng)
    mtime = sample_mtime(persona.active_hours, now, rand=rng)

    return Plan(
        decky_uuid=decky["uuid"],
        decky_name=decky["name"],
        persona=persona.name,
        content_class=content_class,
        action="create",
        target_path=target_path,
        mtime=mtime,
        body_hint=body_hint,
        notes=(
            f"persona={persona.name}",
            f"class={content_class.value}",
            f"window={persona.active_hours}",
        ),
    )