feat(attackers): scanned vs. interacted service bucketing on detail page

Adds a new card on AttackerDetail: SCANNED · N services | INTERACTED
WITH · M services. Distinguishes port-scanners (N high, M=0) from
actual engagement (M>0) at a glance — the analyst's first question
when triaging a new attacker row.

Classifier lives in decnet/correlation/event_kinds.py, a single
source of truth for the event-type vocabulary:

- INTERACTION_EVENT_TYPES — command-family (command/exec/query/...),
  SMTP engagement (mail_from/rcpt_to/message_accepted), file/payload
  activity (file_captured/upload/download_attempt/retr), pub/sub
  (publish/subscribe), recorded TTY sessions.
- NOISE_EVENT_TYPES — DECNET-internal (startup/shutdown/parse_error/
  unknown_*).
- Everything else defaults to scan. Conservative by design: new
  template verbs show up as "scanned" until explicitly promoted.

Bucket logic: a service is "interacted" if ≥1 of its events
classifies as interaction; otherwise "scanned" if ≥1 scan event;
noise-only services drop. Disjoint by construction.
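
A minimal sketch of the classification and bucketing rules above, not the
contents of decnet/correlation/event_kinds.py. The function names, the
return shape, and the "connection"/"request" event types are assumptions
made for illustration; the interaction and noise sets list only the verbs
named in this message, so the command family is incomplete.

```python
from typing import Iterable

# Assumed, partial vocabularies: only the verbs named in the commit message.
INTERACTION_EVENT_TYPES = frozenset({
    "command", "exec", "query",
    "mail_from", "rcpt_to", "message_accepted",
    "file_captured", "upload", "download_attempt", "retr",
    "publish", "subscribe",
    "session_recorded",
})
NOISE_EVENT_TYPES = frozenset({"startup", "shutdown", "parse_error"})
NOISE_PREFIXES = ("unknown_",)


def classify_event_type(event_type: str) -> str:
    """Return "interaction", "noise", or "scan" (the default)."""
    if event_type in INTERACTION_EVENT_TYPES:
        return "interaction"
    if event_type in NOISE_EVENT_TYPES or event_type.startswith(NOISE_PREFIXES):
        return "noise"
    # Anything unrecognised is treated as a scan until promoted.
    return "scan"


def bucket_services_sketch(
    pairs: Iterable[tuple[str, str]],
) -> dict[str, list[str]]:
    """Split services into disjoint scanned / interacted lists."""
    kinds: dict[str, set[str]] = {}
    for service, event_type in pairs:
        kinds.setdefault(service, set()).add(classify_event_type(event_type))
    # One interaction event is enough to mark the service as interacted.
    interacted = sorted(s for s, k in kinds.items() if "interaction" in k)
    # Scanned: at least one scan event and no interaction events.
    scanned = sorted(
        s for s, k in kinds.items() if "interaction" not in k and "scan" in k
    )
    # Services with only noise events appear in neither list.
    return {"scanned": scanned, "interacted": interacted}


example_pairs = [
    ("ssh", "connection"),   # hypothetical scan-type event
    ("ssh", "command"),      # promotes ssh to interacted
    ("http", "request"),     # hypothetical scan-type event
    ("mqtt", "startup"),     # noise only, so mqtt is dropped
    ("ftp", "unknown_verb"), # noise via the unknown_ prefix
]
example_buckets = bucket_services_sketch(example_pairs)
```

Each service lands in at most one list because the scanned test excludes
any service that already has an interaction event.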

Deliberate no-schema path: compute on the fly in the detail endpoint
via SELECT DISTINCT service, event_type FROM logs. The result set is
small (tens of pairs per attacker), so the cost is trivial next to the
existing behavior/commands queries. Trade-off: one more DB round-trip
per detail view, in exchange for no ALTER TABLE migration and
classifier changes that take effect immediately.
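
A rough illustration of why the DISTINCT result stays small, using the
standard-library sqlite3 module and an in-memory table. The three-column
logs schema, the sample IPs, and the "connection"/"request" event types
are assumptions made for this sketch; the real query goes through
SQLModelRepository.

```python
import sqlite3


def distinct_pairs(conn: sqlite3.Connection, ip: str) -> list[tuple[str, str]]:
    """Return distinct (service, event_type) pairs for one attacker IP."""
    cur = conn.execute(
        "SELECT DISTINCT service, event_type FROM logs "
        "WHERE attacker_ip = ? ORDER BY service, event_type",
        (ip,),
    )
    return [(svc, evt) for svc, evt in cur.fetchall()]


conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real logs table has more columns.
conn.execute(
    "CREATE TABLE logs (attacker_ip TEXT, service TEXT, event_type TEXT)"
)
sample_rows = (
    [("203.0.113.7", "ssh", "connection")] * 500
    + [("203.0.113.7", "ssh", "command")] * 20
    + [("203.0.113.7", "http", "request")] * 300
    + [("198.51.100.2", "ssh", "connection")] * 10
)
conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", sample_rows)

# 820 log rows for this attacker collapse to three pairs.
pairs = distinct_pairs(conn, "203.0.113.7")
```

The number of pairs is bounded by services × event types, not by the
number of log rows.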

Profiler's _COMMAND_EVENT_TYPES stays as-is (strict subset of
interactions that carry executable text), with a comment pointing at
the new canonical module.

Closes DEVELOPMENT.md "Attacker Intelligence §Service-Level Behavioral
Profiling — Services actively interacted with".
2026-04-24 17:12:20 -04:00
parent ce6b4a4174
commit 351a8939c3
8 changed files with 322 additions and 1 deletions


@@ -247,6 +247,16 @@ class BaseRepository(ABC):
"""Return `session_recorded` log rows for this attacker, newest first."""
pass

    async def get_attacker_service_activity(
        self, attacker_uuid: str
    ) -> list[tuple[str, str]]:
        """Return the distinct ``(service, event_type)`` pairs observed
        for one attacker, for bucketing into scanned vs. interacted
        services. Default is NotImplementedError so non-SQLModel backends
        must opt in; SQLModelRepository overrides with a cheap DISTINCT
        query."""
        raise NotImplementedError

    @abstractmethod
    async def get_session_log(self, sid: str) -> Optional[dict[str, Any]]:
        """Look up the `session_recorded` Log row for a given session UUID."""


@@ -881,6 +881,32 @@ class SQLModelRepository(BaseRepository):
        page = commands[offset: offset + limit]
        return {"total": total, "data": page}

    async def get_attacker_service_activity(
        self, attacker_uuid: str
    ) -> list[tuple[str, str]]:
        """Return distinct ``(service, event_type)`` pairs for an attacker.

        Resolves IP then ``SELECT DISTINCT service, event_type FROM logs
        WHERE attacker_ip = :ip``. The result set is bounded by the
        cardinality of services × event_types (tens, not thousands), so
        this stays cheap even for attackers with long event streams.
        Caller applies `event_kinds.bucket_services` to split into
        scanned vs. interacted.
        """
        async with self._session() as session:
            ip_res = await session.execute(
                select(Attacker.ip).where(Attacker.uuid == attacker_uuid)
            )
            ip = ip_res.scalar_one_or_none()
            if not ip:
                return []
            rows = await session.execute(
                select(Log.service, Log.event_type)
                .where(Log.attacker_ip == ip)
                .distinct()
            )
            return [(svc, evt) for svc, evt in rows.all()]

    async def get_attacker_artifacts(self, uuid: str) -> list[dict[str, Any]]:
        """Return `file_captured` logs for the attacker identified by UUID.


@@ -2,6 +2,7 @@ from typing import Any
from fastapi import APIRouter, Depends, HTTPException
from decnet.correlation.event_kinds import bucket_services
from decnet.telemetry import traced as _traced
from decnet.web.dependencies import require_viewer, repo
@@ -27,4 +28,10 @@ async def get_attacker_detail(
    if not attacker:
        raise HTTPException(status_code=404, detail="Attacker not found")
    attacker["behavior"] = await repo.get_attacker_behavior(uuid)
    # Scanned vs. interacted-with: computed per-request from the log
    # stream, not persisted. Cheap (DISTINCT bounded by service ×
    # event_type cardinality), and changes to the classifier take effect
    # immediately without a profiler re-tick.
    pairs = await repo.get_attacker_service_activity(uuid)
    attacker["service_activity"] = bucket_services(pairs)
    return attacker