refactor(emailgen): pluggable LLM backend (base/factory/impl)

Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:

  decnet/orchestrator/emailgen/llm/
    base.py         LLMBackend Protocol + LLMResult + LLMTimeout
    factory.py      get_llm() reads DECNET_EMAILGEN_LLM
    impl/ollama.py  current 'ollama run' subprocess path
    impl/fake.py    canned-output backend used by tests

Driver now takes an LLMBackend on construction (or inherits the
factory default). Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster.

Swapping Ollama for the Anthropic API / vLLM / llama.cpp is now a
third branch in factory.py; no driver rewrite needed. Mirrors the
convention used by decnet.web.db.factory + decnet.bus.factory per the
provider-subpackages-from-day-one rule in memory.
2026-04-26 22:43:36 -04:00
parent 4badc75fb2
commit 6d520eaa6f
10 changed files with 546 additions and 79 deletions
--- a/decnet/orchestrator/emailgen/llm/base.py
+++ b/decnet/orchestrator/emailgen/llm/base.py
@@ -0,0 +1,47 @@
+"""Backend protocol shared by every LLM transport.
+
+Deliberately narrow: emailgen needs one async ``generate`` call that
+takes a prompt string and returns the model's output text plus enough
+metadata for the worker to populate the orchestrator-email payload
+(model name, latency, success bit).  Streaming, embeddings, multi-turn
+chat — all out of scope here; emailgen only ever does one-shot
+single-prompt generations.
+"""
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from typing import Any, Protocol
+
+
+class LLMTimeout(Exception):
+    """Raised when a generation exceeds the backend's wall-clock cap.
+
+    Backends MUST raise this rather than returning silently empty
+    output; the driver discriminates timeout from "model produced
+    nothing useful" so payloads carry the right ``stage`` value.
+    """
+
+
+@dataclass
+class LLMResult:
+    """Outcome of one ``generate`` call.
+
+    ``success`` is ``False`` when the backend ran cleanly but produced
+    no usable output (e.g. an empty stdout).  Hard failures (subprocess
+    crash, network error) raise; soft failures land here so the driver
+    can persist + log them as one event.
+    """
+    success: bool
+    text: str
+    model: str
+    latency_ms: int
+    extra: dict[str, Any] = field(default_factory=dict)
+
+
+class LLMBackend(Protocol):
+    """Minimal contract for an emailgen LLM provider."""
+
+    model: str
+    timeout: float
+
+    async def generate(self, prompt: str) -> LLMResult: ...
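
Reviewer sketch (not part of this commit): the contract in base.py is small enough to exercise end to end. The block below restates the three base.py types so it runs on its own, then adds a canned-output backend, a factory branch, and a caller that separates a timeout from a soft failure. Everything beyond the base.py types is an assumption for illustration: the FakeBackend constructor arguments, the "fake" value for DECNET_EMAILGEN_LLM, the get_llm() signature, and the stage labels "ok", "empty_output", and "timeout" are hypothetical and may not match impl/fake.py, factory.py, or the driver.

```python
import asyncio
import os
import time
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Protocol


class LLMTimeout(Exception):
    """Restated from base.py: generation exceeded the wall-clock cap."""


@dataclass
class LLMResult:
    """Restated from base.py."""
    success: bool
    text: str
    model: str
    latency_ms: int
    extra: dict[str, Any] = field(default_factory=dict)


class LLMBackend(Protocol):
    """Restated from base.py."""
    model: str
    timeout: float

    async def generate(self, prompt: str) -> LLMResult: ...


class FakeBackend:
    """Hypothetical canned-output backend; the real impl/fake.py may differ."""

    def __init__(self, canned: str = "Hello from fake",
                 delay: float = 0.0, timeout: float = 1.0) -> None:
        self.model = "fake"
        self.timeout = timeout
        self._canned = canned
        self._delay = delay

    async def generate(self, prompt: str) -> LLMResult:
        start = time.monotonic()
        try:
            # Simulate model latency, bounded by the backend's own cap.
            await asyncio.wait_for(asyncio.sleep(self._delay),
                                   timeout=self.timeout)
        except asyncio.TimeoutError as exc:
            # The contract says: raise, never return silently empty output.
            raise LLMTimeout(f"{self.model} exceeded {self.timeout}s") from exc
        text = self._canned.strip()
        latency_ms = int((time.monotonic() - start) * 1000)
        # Empty output is a soft failure: success=False, no exception.
        return LLMResult(success=bool(text), text=text,
                         model=self.model, latency_ms=latency_ms)


def get_llm(env: Optional[Mapping[str, str]] = None) -> LLMBackend:
    """Hypothetical factory; the "fake" value and default are assumptions."""
    env = os.environ if env is None else env
    name = env.get("DECNET_EMAILGEN_LLM", "fake")
    if name == "fake":
        return FakeBackend()
    # A new transport would be one more branch here.
    raise ValueError(f"unknown DECNET_EMAILGEN_LLM backend: {name!r}")


async def generate_stage(backend: LLMBackend, prompt: str) -> str:
    """Return a hypothetical ``stage`` label for the payload."""
    try:
        result = await backend.generate(prompt)
    except LLMTimeout:
        return "timeout"
    return "ok" if result.success else "empty_output"
```

Because LLMBackend is a Protocol, FakeBackend needs no inheritance; matching the attributes and the generate signature is enough for a type checker. A test can then call asyncio.run(generate_stage(FakeBackend(delay=0.2, timeout=0.01), "hi")) to drive the timeout path without spawning a subprocess.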