fix(bus): silently drop publishes on closed bus instead of raising

Worker bus instances (collector, ingester) close their private buses
in finally blocks on shutdown, but stream threads holding closure
references kept calling publish after close — one `RuntimeError:
publish on closed bus` per stream line, caught by publish_safely
and logged per call, flooding server logs.

Changes:
- `UnixSocketBus.publish()` now drops post-close calls. First drop
  WARNs loudly (bus is critical infra — silent drops would hide real
  problems); subsequent drops on the same instance log at DEBUG to
  prevent the flood. Sticky `_closed_publish_warned` flag, reset
  naturally per new bus instance.
- `make_thread_safe_publisher` short-circuits on a closed bus before
  marshalling a coroutine onto the loop. Avoids the wasted scheduling
  work in the hot shutdown path.

Degradation is safe: callers go through `publish_safely`, which
already treats exceptions as 'dropped notification, DB is source of
truth.' We just stop manufacturing the exception in the first place
for a known-benign condition.
This commit is contained in:
2026-04-23 18:00:47 -04:00
parent eb2308d9e1
commit 4418608a54
3 changed files with 132 additions and 1 deletions

View File

@@ -61,6 +61,14 @@ def make_thread_safe_publisher(
return lambda _topic, _payload, _event_type="": None
def _publish(topic: str, payload: dict[str, Any], event_type: str = "") -> None:
# Stream threads may keep draining after the bus owner closed it
# (shutdown race). Short-circuit here so we don't marshal a
# coroutine onto a dead loop just to have publish_safely swallow
# it. bus.publish's own WARN-once guard handles the rare case
# where _closed flips between this check and the coroutine
# actually running.
if getattr(bus, "_closed", False):
return
try:
asyncio.run_coroutine_threadsafe(
publish_safely(bus, topic, payload, event_type=event_type),