feat(clustering): revocable merges (merge + unmerge)

Reworks the clusterer's tick to handle multi-identity components and
re-evaluate prior merges. Two passes per tick:

Pass 1 — per-component reconciliation:
  * Fresh component → mint identity (commit 4 path).
  * Single-identity component → link unassigned observations.
  * Multi-identity component → soft-merge: pick the smallest-uuid
    winner deterministically, set merged_into_uuid on each loser,
    link unassigned observations to the winner. Observations stay
    FK'd to their original identity row — the merge is a soft
    pointer, not a re-point. Audit trail preserved; cached
    subscribers resolve through the chain.

Pass 2 — revocable-merge undo:
  * For each merged-out identity, check whether its observations
    still cluster with its winner's. If not, the merge is
    contradicted by new evidence — clear merged_into_uuid and emit
    identities_unmerged. The resurrected identity keeps its original
    uuid, so subscribers that cached it during the merged interval
    re-attach without a new lookup.

A pre-built merge-chain dict feeds Pass 1 so the effective-identity
lookup is O(1) per observation. The chain has a hop cap (paranoia
against accidental cycles in the underlying state).

Repo additions on BaseRepository + SQLModelRepository:
  * list_all_identities() — includes merged-out rows.
  * update_identity_merged_into(uuid, winner_or_None) — single
    setter for both merge and unmerge.
DummyRepo coverage stub updated.

Tests:
  * Two distinct identities bridged by a new observation merge with
    the smaller uuid as winner.
  * A pre-seeded soft-merge whose underlying observations diverge
    gets revoked; resurrected uuid emerges with merged_into_uuid
    cleared.
  * Tick is idempotent under no state changes.
This commit is contained in:
2026-04-26 08:33:32 -04:00
parent 87412da1ca
commit e364ef8859
5 changed files with 343 additions and 50 deletions

View File

@@ -1514,6 +1514,27 @@ class SQLModelRepository(BaseRepository):
await session.execute(statement)
await session.commit()
async def list_all_identities(self) -> list[dict[str, Any]]:
statement = select(AttackerIdentity).order_by(AttackerIdentity.created_at)
async with self._session() as session:
result = await session.execute(statement)
return [i.model_dump(mode="json") for i in result.scalars().all()]
async def update_identity_merged_into(
self, identity_uuid: str, winner_uuid: Optional[str],
) -> None:
statement = (
update(AttackerIdentity)
.where(AttackerIdentity.uuid == identity_uuid)
.values(
merged_into_uuid=winner_uuid,
updated_at=datetime.now(timezone.utc),
)
)
async with self._session() as session:
await session.execute(statement)
await session.commit()
async def get_attacker_commands(
self,
uuid: str,