The threat-intel surface was IP-keyed on day one as an expedient — the
worker is woken by IP-bearing bus events. ANTI's call: don't carry that
debt. NO IPs as primary keys anywhere on the attacker-intel surface.

Schema:
- attacker_uuid is now the canonical key: UNIQUE + FK to attackers.uuid.
- attacker_ip stays as a denormalised, indexed, NON-UNIQUE value column.
  Updated on every upsert; useful for SIEM payloads and audit lookups,
  but explicitly NOT a key. Model docstring says so.
- Pre-v1, so no Alembic migration is needed. SQLModel.metadata.create_all()
  builds the new shape on fresh DBs; it does not alter existing tables,
  so existing dev DBs have to be recreated.
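
A minimal sketch of the constraint shape described in the schema bullets, using the standard-library sqlite3 module. This is an illustration only: the real table is generated from the SQLModel class, the column list here is trimmed to three columns, and the sample UUIDs and IPs are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE attackers (uuid TEXT PRIMARY KEY);
    CREATE TABLE attacker_intel (
        uuid          TEXT PRIMARY KEY,
        attacker_uuid TEXT NOT NULL UNIQUE REFERENCES attackers(uuid),
        attacker_ip   TEXT NOT NULL
    );
    -- Indexed for lookups, but deliberately not UNIQUE.
    CREATE INDEX ix_attacker_intel_attacker_ip ON attacker_intel (attacker_ip);
    """
)
conn.executemany("INSERT INTO attackers VALUES (?)", [("a1",), ("a2",)])

# Two attackers sharing one IP is legal: attacker_ip is a value column.
conn.execute("INSERT INTO attacker_intel VALUES ('r1', 'a1', '203.0.113.7')")
conn.execute("INSERT INTO attacker_intel VALUES ('r2', 'a2', '203.0.113.7')")
shared_ip_rows = conn.execute(
    "SELECT COUNT(*) FROM attacker_intel WHERE attacker_ip = '203.0.113.7'"
).fetchone()[0]


def try_insert(row):
    """Return 'ok' or the integrity error message for one attempted insert."""
    try:
        conn.execute("INSERT INTO attacker_intel VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


dup_uuid = try_insert(("r3", "a1", "198.51.100.9"))     # second row for attacker a1
orphan = try_insert(("r4", "missing", "198.51.100.9"))  # no such attacker
print(shared_ip_rows, dup_uuid, orphan, sep="\n")
```

The duplicate attacker_uuid is rejected by the UNIQUE constraint and the orphan row by the foreign key, while the shared IP goes through.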

Repo:
- upsert_attacker_intel now keys on attacker_uuid.
- get_attacker_intel_by_ip → get_attacker_intel_by_uuid.
- get_unenriched_attacker_ips → get_unenriched_attackers, returning a
  list of {uuid, ip} records so the worker writes by UUID and dispatches
  provider calls by IP without a second round-trip.
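
The repo behaviour above can be sketched against sqlite3. The function names come from this commit, but the signatures, the trimmed columns, the attackers.ip column, and the definition of "unenriched" (an attacker with no intel row yet) are assumptions made for the example, not the project's actual code.

```python
import sqlite3
import uuid as uuidlib

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE attackers (uuid TEXT PRIMARY KEY, ip TEXT NOT NULL);
    CREATE TABLE attacker_intel (
        uuid              TEXT PRIMARY KEY,
        attacker_uuid     TEXT NOT NULL UNIQUE REFERENCES attackers(uuid),
        attacker_ip       TEXT NOT NULL,
        aggregate_verdict TEXT
    );
    """
)


def upsert_attacker_intel(attacker_uuid, attacker_ip, verdict):
    """Insert or update the single intel row for this attacker UUID."""
    # The conflict target is attacker_uuid; attacker_ip is only refreshed.
    # ON CONFLICT ... DO UPDATE needs SQLite 3.24 or newer.
    conn.execute(
        """
        INSERT INTO attacker_intel (uuid, attacker_uuid, attacker_ip, aggregate_verdict)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (attacker_uuid) DO UPDATE SET
            attacker_ip = excluded.attacker_ip,
            aggregate_verdict = excluded.aggregate_verdict
        """,
        (uuidlib.uuid4().hex, attacker_uuid, attacker_ip, verdict),
    )


def get_unenriched_attackers():
    """Return [{uuid, ip}] for attackers that have no intel row yet."""
    rows = conn.execute(
        """
        SELECT a.uuid, a.ip
        FROM attackers a
        LEFT JOIN attacker_intel i ON i.attacker_uuid = a.uuid
        WHERE i.uuid IS NULL
        ORDER BY a.uuid
        """
    ).fetchall()
    return [{"uuid": r["uuid"], "ip": r["ip"]} for r in rows]


conn.executemany(
    "INSERT INTO attackers VALUES (?, ?)",
    [("a1", "203.0.113.7"), ("a2", "198.51.100.9")],
)
pending_before = get_unenriched_attackers()
upsert_attacker_intel("a1", "203.0.113.7", "unknown")
upsert_attacker_intel("a1", "203.0.113.99", "malicious")  # same attacker, new IP
pending_after = get_unenriched_attackers()
intel_rows = conn.execute("SELECT * FROM attacker_intel").fetchall()
```

Because each returned record carries both fields, a caller can hand the IP to providers and the UUID to the upsert without querying the attackers table again.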

Worker:
- _enrich_one(uuid, ip, ...): the UUID lands on the row, the IP rides
  along for provider egress.
- attacker.intel.enriched bus payload gains attacker_uuid alongside
  attacker_ip. Additive only: webhook → SIEM consumers get the new field
  and no existing field is removed.
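
A rough, synchronous sketch of the worker split described above: write by UUID, query providers by IP, publish both. Everything past the (uuid, ip) pair is hypothetical: the providers mapping, the dict-backed store, and the publish callback stand in for whatever the real worker uses.

```python
from typing import Callable, Dict, Optional

# Hypothetical provider shape: takes an IP, returns a verdict string or None.
Provider = Callable[[str], Optional[str]]


def _enrich_one(
    uuid: str,
    ip: str,
    providers: Dict[str, Provider],
    store: Dict[str, dict],
    publish: Callable[[str, dict], None],
) -> None:
    # The row is keyed by attacker UUID, never by IP.
    row = store.setdefault(uuid, {"attacker_uuid": uuid})
    row["attacker_ip"] = ip  # denormalised value, refreshed on every write
    for name, lookup in providers.items():
        try:
            row[name] = lookup(ip)  # provider egress is by IP
        except Exception:
            continue  # partial failure: leave that provider's column unchanged
    # Payload carries both identifiers, so IP-based consumers keep working.
    publish("attacker.intel.enriched", {"attacker_uuid": uuid, "attacker_ip": ip})


def failing_provider(ip: str) -> Optional[str]:
    raise TimeoutError("provider down")


events = []
store: Dict[str, dict] = {}
_enrich_one(
    "a1",
    "203.0.113.7",
    {"greynoise": lambda ip: "malicious", "abuseipdb": failing_provider},
    store,
    lambda topic, payload: events.append((topic, payload)),
)
```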

API:
- GET /api/v1/attackers/{ip}/intel deleted outright (rip-and-replace,
  never deployed beyond dev).
- GET /api/v1/attackers/{uuid}/intel is the only public route, matching
  every other /attackers/* route.

Frontend:
- <IntelPanel uuid={id!} /> uses the URL param directly, fetches in
  parallel with the rest of AttackerDetail rather than waiting on
  attacker.ip.

Tests: re-keyed in place, 39 passed (same coverage as before the
refactor). Provider-impl tests untouched.

DEBT-041: closed in DEBT.md (entry preserved as historical rationale,
summary table flipped to ✅, remaining-open list shortened by one).
94 lines
4.5 KiB
Python
"""Threat-intel enrichment row — one per attacker IP, TTL-cached."""
from datetime import datetime, timezone
from typing import Optional

from sqlalchemy import Column
from sqlmodel import Field, SQLModel

from ._base import _BIG_TEXT


class AttackerIntel(SQLModel, table=True):
    """Aggregated threat-intel verdict for a single attacker IP.

    Populated by the ``decnet enrich`` worker, which queries multiple
    free-tier intel providers (GreyNoise Community, AbuseIPDB,
    abuse.ch Feodo Tracker + ThreatFox) and writes one row per
    attacker IP. The row is TTL-cached via ``expires_at`` so re-firings
    inside the cache window short-circuit before any HTTP egress.

    Per-provider columns are nullable until each provider has answered;
    the enrichment pass writes whichever providers succeeded and leaves
    the rest unchanged on a partial failure.

    ``schema_version`` is committed to storage from day one — federation
    gossip in v2/v3 requires cross-operator compatibility, and
    retrofitting a version column after rows exist is painful. Mirrors
    the rationale on :class:`SessionProfile`.
    """

    __tablename__ = "attacker_intel"

    uuid: str = Field(primary_key=True)  # uuid.uuid4().hex, generated by writer
    # Canonical key. One intel row per attacker UUID; FK guarantees no orphan
    # rows when an attacker is deleted, and UNIQUE keeps upserts honest.
    attacker_uuid: str = Field(
        foreign_key="attackers.uuid",
        unique=True,
        index=True,
    )
    # DENORMALISED — NOT a key. The IP the worker queried providers with at
    # write time. Useful for SIEM payloads and audit lookups; updated on every
    # upsert if the attacker rotates IPs. Never use this column as a lookup
    # key; ``attacker_uuid`` is the only canonical identifier here.
    attacker_ip: str = Field(index=True)
    schema_version: int = Field(default=1)

    # ── GreyNoise Community ─────────────────────────────────────────────
    # classification ∈ {"benign", "malicious", "suspicious", "unknown"}
    greynoise_classification: Optional[str] = Field(default=None, max_length=32)
    greynoise_raw: str = Field(
        default="{}",
        sa_column=Column("greynoise_raw", _BIG_TEXT, nullable=False, default="{}"),
    )
    greynoise_queried_at: Optional[datetime] = Field(default=None)

    # ── AbuseIPDB ────────────────────────────────────────────────────────
    # 0..100 abuse confidence score
    abuseipdb_score: Optional[int] = Field(default=None)
    abuseipdb_raw: str = Field(
        default="{}",
        sa_column=Column("abuseipdb_raw", _BIG_TEXT, nullable=False, default="{}"),
    )
    abuseipdb_queried_at: Optional[datetime] = Field(default=None)

    # ── abuse.ch Feodo Tracker ───────────────────────────────────────────
    feodo_listed: Optional[bool] = Field(default=None)
    feodo_raw: str = Field(
        default="{}",
        sa_column=Column("feodo_raw", _BIG_TEXT, nullable=False, default="{}"),
    )
    feodo_queried_at: Optional[datetime] = Field(default=None)

    # ── abuse.ch ThreatFox ───────────────────────────────────────────────
    threatfox_listed: Optional[bool] = Field(default=None)
    threatfox_raw: str = Field(
        default="{}",
        sa_column=Column("threatfox_raw", _BIG_TEXT, nullable=False, default="{}"),
    )
    threatfox_queried_at: Optional[datetime] = Field(default=None)

    # ── Aggregate verdict ────────────────────────────────────────────────
    # Synthesised from per-provider columns. ∈ {"malicious", "suspicious",
    # "benign", "unknown"}. Used by the dashboard and webhook consumers
    # that don't want to reason over four provider columns.
    aggregate_verdict: Optional[str] = Field(
        default=None, max_length=32, index=True
    )

    # ── TTL bookkeeping ──────────────────────────────────────────────────
    cached_at: datetime = Field(
        default_factory=lambda: datetime.now(timezone.utc), index=True
    )
    expires_at: datetime = Field(index=True)
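
The TTL short-circuit mentioned in the class docstring could look roughly like the sketch below. It is not part of the file above: is_fresh, next_expiry, and the 24-hour TTL are hypothetical names and values chosen for illustration, and both timestamps are assumed to be timezone-aware UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_expiry(cached_at: datetime, ttl: timedelta) -> datetime:
    """Compute the expires_at value to store alongside cached_at."""
    return cached_at + ttl


def is_fresh(expires_at: Optional[datetime], now: Optional[datetime] = None) -> bool:
    """True while a cached row is still inside its TTL window.

    A missing row (expires_at is None) is never fresh. Both datetimes must
    be timezone-aware; comparing aware and naive values raises TypeError.
    """
    if expires_at is None:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    return now < expires_at


cached_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expires_at = next_expiry(cached_at, timedelta(hours=24))  # assumed TTL

inside_window = is_fresh(expires_at, now=cached_at + timedelta(hours=1))
after_window = is_fresh(expires_at, now=cached_at + timedelta(hours=25))
no_row_yet = is_fresh(None)
```

A worker would run a check like this before dispatching any provider call and skip the attacker when it returns True.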