refactor(emailgen): pluggable LLM backend (base/factory/impl)

Lift the Ollama subprocess shell-out out of EmailDriver and into a
proper provider subpackage shape:

    decnet/orchestrator/emailgen/llm/
      base.py          - LLMBackend Protocol + LLMResult + LLMTimeout
      factory.py       - get_llm() reads DECNET_EMAILGEN_LLM
      impl/ollama.py   - current 'ollama run' subprocess path
      impl/fake.py     - canned-output backend used by tests

Driver now takes an LLMBackend on construction (or inherits the
factory default). Tests inject FakeBackend instead of monkeypatching
the subprocess layer, which is cleaner and ~10x faster. Swapping
Ollama for the Anthropic API / vLLM / llama.cpp is now a third branch
in factory.py; no driver rewrite needed.

Mirrors the convention used by decnet.web.db.factory + decnet.bus.factory
per the provider-subpackages-from-day-one rule in memory.
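
The factory and injection pattern described in this message might look roughly like the sketch below. Only `get_llm`, `DECNET_EMAILGEN_LLM`, `EmailDriver`, `FakeBackend`, and the `LLMResult` fields come from the commit. The registry dict, the `"ollama"` default, the constructor arguments, the `OllamaBackend` stub, and the `draft` method are assumptions made for illustration and are not taken from factory.py.

```python
from __future__ import annotations

import asyncio
import os
from dataclasses import dataclass, field
from typing import Any, Callable, Mapping, Optional


@dataclass
class LLMResult:
    success: bool
    text: str
    model: str
    latency_ms: int
    extra: dict[str, Any] = field(default_factory=dict)


class FakeBackend:
    """Canned-output backend; the constructor arguments are assumed."""

    def __init__(self, canned: str = "Subject: hi", model: str = "fake") -> None:
        self.model = model
        self.timeout = 1.0
        self._canned = canned

    async def generate(self, prompt: str) -> LLMResult:
        # No subprocess, no network: return the canned text immediately.
        return LLMResult(True, self._canned, self.model, latency_ms=0)


class OllamaBackend:
    """Stub standing in for the real subprocess-backed implementation."""

    def __init__(self, model: str = "placeholder-model") -> None:
        self.model = model
        self.timeout = 60.0

    async def generate(self, prompt: str) -> LLMResult:
        raise NotImplementedError("illustrative stub only")


# Hypothetical registry: adding a provider means adding one entry here.
_BACKENDS: dict[str, Callable[[], Any]] = {
    "ollama": OllamaBackend,
    "fake": FakeBackend,
}


def get_llm(env: Optional[Mapping[str, str]] = None) -> Any:
    """Pick a backend from DECNET_EMAILGEN_LLM (default value is assumed)."""
    source = os.environ if env is None else env
    name = source.get("DECNET_EMAILGEN_LLM", "ollama").strip().lower()
    try:
        return _BACKENDS[name]()
    except KeyError:
        known = ", ".join(sorted(_BACKENDS))
        raise ValueError(f"unknown LLM backend {name!r}; expected one of: {known}")


class EmailDriver:
    """Takes a backend on construction, else falls back to the factory."""

    def __init__(self, llm: Any = None) -> None:
        self.llm = llm if llm is not None else get_llm()

    async def draft(self, prompt: str) -> str:  # hypothetical method name
        result = await self.llm.generate(prompt)
        return result.text
```

A test would then construct `EmailDriver(FakeBackend(canned="..."))` directly and never touch the subprocess layer, which is the injection style the message describes.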
decnet/orchestrator/emailgen/llm/base.py (normal file, 47 lines added)
@@ -0,0 +1,47 @@
"""Backend protocol shared by every LLM transport.

Deliberately narrow: emailgen needs one async ``generate`` call that
takes a prompt string and returns the model's output text plus enough
metadata for the worker to populate the orchestrator-email payload
(model name, latency, success bit). Streaming, embeddings, multi-turn
chat: all out of scope here; emailgen only ever does one-shot
single-prompt generations.
"""
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Protocol


class LLMTimeout(Exception):
    """Raised when a generation exceeds the backend's wall-clock cap.

    Backends MUST raise this rather than returning silently empty
    output; the driver discriminates timeout from "model produced
    nothing useful" so payloads carry the right ``stage`` value.
    """


@dataclass
class LLMResult:
    """Outcome of one ``generate`` call.

    ``success`` is ``False`` when the backend ran cleanly but produced
    no usable output (e.g. an empty stdout). Hard failures (subprocess
    crash, network error) raise; soft failures land here so the driver
    can persist + log them as one event.
    """
    success: bool
    text: str
    model: str
    latency_ms: int
    extra: dict[str, Any] = field(default_factory=dict)


class LLMBackend(Protocol):
    """Minimal contract for an emailgen LLM provider."""

    model: str
    timeout: float

    async def generate(self, prompt: str) -> LLMResult: ...
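
To make the timeout versus soft-failure contract concrete, here is a minimal sketch of a backend that satisfies the protocol and a caller that maps each outcome to a ``stage`` value. `CannedBackend`, `run_once`, and the stage strings (`"ok"`, `"empty_output"`, `"timeout"`) are hypothetical names chosen for illustration; the commit does not show the driver or its actual stage values. The three definitions from base.py are repeated so the block runs on its own.

```python
from __future__ import annotations

import asyncio
import time
from dataclasses import dataclass, field
from typing import Any, Protocol


class LLMTimeout(Exception):
    """Same role as in base.py: generation exceeded the wall-clock cap."""


@dataclass
class LLMResult:
    success: bool
    text: str
    model: str
    latency_ms: int
    extra: dict[str, Any] = field(default_factory=dict)


class LLMBackend(Protocol):
    model: str
    timeout: float

    async def generate(self, prompt: str) -> LLMResult: ...


class CannedBackend:
    """Hypothetical backend: returns fixed text after an optional delay."""

    def __init__(self, text: str, delay: float = 0.0, timeout: float = 0.5) -> None:
        self.model = "canned"
        self.timeout = timeout
        self._text = text
        self._delay = delay

    async def generate(self, prompt: str) -> LLMResult:
        started = time.monotonic()
        try:
            # The sleep stands in for the real model call.
            await asyncio.wait_for(asyncio.sleep(self._delay), timeout=self.timeout)
        except asyncio.TimeoutError as exc:
            # Timeouts are raised, never reported as empty output.
            raise LLMTimeout(f"{self.model} exceeded {self.timeout}s") from exc
        latency_ms = int((time.monotonic() - started) * 1000)
        text = self._text.strip()
        # Clean run with nothing usable is a soft failure: success=False.
        return LLMResult(bool(text), text, self.model, latency_ms)


async def run_once(backend: LLMBackend, prompt: str) -> dict[str, Any]:
    """Driver-side handling; the stage strings are illustrative only."""
    try:
        result = await backend.generate(prompt)
    except LLMTimeout:
        return {"stage": "timeout", "model": backend.model, "text": ""}
    stage = "ok" if result.success else "empty_output"
    return {"stage": stage, "model": result.model, "text": result.text}
```

Because `LLMBackend` is a `Protocol`, `CannedBackend` needs no inheritance: having `model`, `timeout`, and an async `generate` with the right signature is enough for a type checker to accept it.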