Adds per-mint nonce gating, structural shape validation, mint UUID
consistency checks, and a per-(token, IP) rate limiter to the canary
worker so attackers who extract a canary from a decky filesystem cannot
poison fingerprint forensics by replaying or forging ?d= submissions.

Changes:

base.py
  fingerprint_nonce: Optional[str] added to CanaryArtifact so generators
  can surface the nonce to the cultivator without coupling the generator
  directly to DB code.

obfuscator.py
  nonce_for(callback_token, mint_uuid): HMAC-SHA256 keyed on
  DECNET_CANARY_FINGERPRINT_SECRET, truncated to 16 hex chars.
  FingerprintSecretMissing raised at mint time if env var is unset.
  render_fingerprint_js() now accepts nonce= and substitutes MINT_NONCE.

fingerprint_payload.js
  New MINT_NONCE placeholder. Appended as &k= on all beacon URLs
  (bare-open, single-shot, chunked). Using &k= avoids colliding with
  &n= (chunk total).

fingerprint_html.py / fingerprint_svg.py
  Derive nonce via nonce_for() and pass to render_fingerprint_js().
  Set artifact.fingerprint_nonce so the cultivator can persist it.

cultivator.py
  Passes fingerprint_nonce into create_canary_token() when present on
  the artifact; NULL for all non-fingerprint generators.

canary.py (model)
  fingerprint_nonce: Optional[str] = Field(default=None, max_length=16)
  added to CanaryToken. None for non-fingerprint tokens.

worker.py
  _extract_fingerprint now returns (meta_dict, parsed_fp) tuple.
  _record_hit accepts parsed_fp + raw_nonce and runs 4 layers after
  token lookup: nonce match, shape check, mint UUID consistency, rate
  limit. Each failure sets _fp_invalid_* flag and drops structured _fp.
  Trigger row always lands regardless.

tests/canary/conftest.py
  Session-scoped autouse fixture sets DECNET_CANARY_FINGERPRINT_SECRET
  so fingerprint generator and worker tests work offline.

tests
  5 new worker HTTP tests and 2 new generator tests covering each
  validation layer.
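The nonce derivation described for obfuscator.py can be sketched roughly as below. This is a minimal illustration, not the code from the commit: it assumes the HMAC message is the plain concatenation callback_token + mint_uuid (as the model comment suggests) encoded as UTF-8, and nonce_matches() is a hypothetical helper showing one way the worker's first layer could compare the stored nonce with the incoming &k= value in constant time.

```python
import hashlib
import hmac
import os
from typing import Optional

SECRET_ENV = "DECNET_CANARY_FINGERPRINT_SECRET"


class FingerprintSecretMissing(RuntimeError):
    """Raised at mint time when the HMAC secret env var is unset."""


def nonce_for(callback_token: str, mint_uuid: str) -> str:
    """Derive a 16-hex-char nonce from the token slug and mint UUID."""
    secret = os.environ.get(SECRET_ENV)
    if not secret:
        raise FingerprintSecretMissing(f"{SECRET_ENV} is not set")
    # Assumption: message is the two identifiers concatenated, UTF-8 encoded.
    message = (callback_token + mint_uuid).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return digest[:16]


def nonce_matches(stored: Optional[str], supplied: Optional[str]) -> bool:
    """Hypothetical worker-side check: constant-time compare, False if either is missing."""
    if not stored or not supplied:
        return False
    # Compare bytes so non-ASCII attacker input cannot raise TypeError.
    return hmac.compare_digest(stored.encode("utf-8"), supplied.encode("utf-8"))
```

Because the nonce depends on a server-side secret, knowing the callback slug alone is not enough to produce a value that passes the comparison.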
260 lines
11 KiB
Python
"""Canary token tables + CRUD DTOs.

Canary tokens are decoy artifacts (operator-uploaded honeydocs / synthesised
fake configs) planted inside a decky's filesystem. When an attacker exfils
the artifact and uses it, an HTTP slug or DNS subdomain encoded into the
file is hit; the ``decnet canary`` worker observes the callback and
publishes ``canary.{token_id}.triggered`` on the bus. The webhook fanout
+ correlator pick it up the same way they handle any other attacker
event — no canary-specific consumer wiring needed downstream.

Three tables:

* :class:`CanaryBlob` — operator-uploaded source artifact, deduped by
  sha256. The original bytes live on disk under
  ``/var/lib/decnet/canary/blobs/{sha256}``; this row carries metadata
  + refcount-aware deletion.
* :class:`CanaryToken` — one planted artifact in one decky. Either
  references a blob (``blob_uuid``) and an instrumenter, or is a wholly
  synthesised fake (e.g. ``aws_creds`` / ``git_config`` from a
  generator) and ``blob_uuid`` is NULL. ``callback_token`` is the short
  random slug embedded into HTTP URLs and DNS labels — unique across
  the fleet so the worker can resolve a hit to a row in one query.
* :class:`CanaryTrigger` — append-only log of every callback hit.
  ``attacker_id`` is back-filled by the correlator after it attributes
  ``src_ip`` to an existing :class:`Attacker`; NULL until then.

We follow the project convention from :mod:`webhooks` and
:mod:`orchestrator`: stringly-typed UUIDs (``str`` PKs via
``str(uuid4())``), no FK to the composite-PK fleet table, indexes on
the join keys. Pydantic request/response shapes live in this same
file (per :mod:`feedback_models_single_source`).
"""
from __future__ import annotations

import json
from datetime import datetime, timezone
from typing import Any, List, Literal, Optional
from uuid import uuid4

from pydantic import BaseModel, Field as PydanticField
from sqlalchemy import Column, Index, Text
from sqlmodel import Field, SQLModel

from ._base import _BIG_TEXT


# --- Enum-shaped string literals -------------------------------------------

CanaryKind = Literal["http", "dns", "aws_passive"]
"""Detection mechanism for a token.

* ``http`` — slug embedded in artifact; attacker fetches our HTTP endpoint.
* ``dns`` — subdomain embedded; attacker's resolver looks up our DNS server.
* ``aws_passive`` — fake AWS credentials with no callback wiring. Trips
  zero alerts on its own; useful only as bait + as evidence the attacker
  read the file when correlated with other timing signals.
"""

CanaryState = Literal["planted", "revoked", "failed"]
"""Lifecycle state of a token row.

* ``planted`` — file is in the decky and the slug/host is live.
* ``revoked`` — operator deleted the token; planter unlinked the file
  (best-effort) and the slug/host stops resolving.
* ``failed`` — placement failed (docker exec error, instrumenter
  rejected the blob, etc.); surfaced in the UI so the operator can
  retry or pick a different kind.
"""


# --- DB tables -------------------------------------------------------------

class CanaryBlob(SQLModel, table=True):
    """Operator-uploaded source artifact, deduped by sha256.

    The same bytes uploaded twice produce the same row (insert-or-get
    semantics in the repository). We never store the bytes inline —
    only the disk path derived from ``sha256``. Deletion is
    refcount-aware: ``DELETE`` is rejected while at least one
    :class:`CanaryToken` references the blob.
    """
    __tablename__ = "canary_blobs"

    uuid: str = Field(default_factory=lambda: str(uuid4()), primary_key=True)
    sha256: str = Field(index=True, unique=True)
    filename: str  # original filename — UI display only, not used for path resolution
    content_type: str  # sniffed MIME (python-magic); drives instrumenter selection
    size_bytes: int
    uploaded_by: str = Field(index=True)  # User.uuid
    uploaded_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))


class CanaryToken(SQLModel, table=True):
    """One canary artifact planted inside one decky."""
    __tablename__ = "canary_tokens"
    __table_args__ = (
        Index("ix_canary_tokens_decky", "decky_name", "state"),
    )

    uuid: str = Field(default_factory=lambda: str(uuid4()), primary_key=True)
    kind: str = Field(index=True)  # CanaryKind literal at the API layer
    decky_name: str = Field(index=True)  # FleetDecky.name; no FK (composite PK)
    # When NULL, the token is on a fleet decky (decky_name resolves to
    # ``<name>-ssh``). When set, it points at a MazeNET topology — the
    # planter resolves the container via :func:`resolve_topology_container`.
    # No FK: topologies are mutable and we don't want a row to vanish on
    # cascade; the row is the historical record of placement.
    topology_id: Optional[str] = Field(default=None, index=True)
    blob_uuid: Optional[str] = Field(
        default=None, foreign_key="canary_blobs.uuid", index=True,
    )
    # Which instrumenter mutated the blob (``docx``/``xlsx``/``pdf``/``html``/
    # ``image``/``plain``/``passthrough``). NULL when the artifact came
    # from a synthesizer (``git_config``/``env_file``/``ssh_key``/
    # ``aws_creds``/``honeydoc``); ``generator`` carries that name instead.
    instrumenter: Optional[str] = Field(default=None)
    generator: Optional[str] = Field(default=None)
    placement_path: str  # absolute path inside the container
    # Short random slug (e.g. 16 url-safe bytes). Embedded in HTTP URLs
    # *and* DNS labels — same value, different envelope, so both
    # detection paths resolve to the same token row.
    callback_token: str = Field(unique=True, index=True)
    # Stable secret used by re-instrumentation: same blob + same seed
    # = same mutated bytes, so re-seeding produces the same on-disk
    # artifact and the planter is naturally idempotent.
    secret_seed: str
    placed_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    last_triggered_at: Optional[datetime] = Field(default=None, index=True)
    trigger_count: int = Field(default=0)
    created_by: str = Field(index=True)  # User.uuid; "system" for baseline-seeded tokens
    state: str = Field(default="planted", index=True)
    last_error: Optional[str] = Field(
        default=None, sa_column=Column("last_error", Text, nullable=True),
    )
    # 16-hex HMAC nonce embedded in fingerprint canary JS payloads. NULL for
    # all non-fingerprint generators. Derived at mint time from
    # HMAC-SHA256(DECNET_CANARY_FINGERPRINT_SECRET, callback_token + mint_uuid)
    # truncated to 16 hex chars; the worker validates the incoming ``k=``
    # query parameter against this value to reject slug-only fingerprint
    # spoofs. (``n=`` is already taken by the chunk total.)
    fingerprint_nonce: Optional[str] = Field(default=None, max_length=16)


class CanaryTrigger(SQLModel, table=True):
    """Append-only log of one callback hit."""
    __tablename__ = "canary_triggers"
    __table_args__ = (
        Index("ix_canary_triggers_token_ts", "token_uuid", "occurred_at"),
        Index("ix_canary_triggers_attacker", "attacker_id"),
    )

    uuid: str = Field(default_factory=lambda: str(uuid4()), primary_key=True)
    token_uuid: str = Field(foreign_key="canary_tokens.uuid", index=True)
    occurred_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    src_ip: str = Field(index=True)
    user_agent: Optional[str] = None
    request_path: Optional[str] = None  # HTTP path including the slug
    dns_qname: Optional[str] = None  # DNS qname when the hit came over DNS
    # JSON-encoded request headers (HTTP) or empty for DNS. Stored as
    # TEXT for cross-dialect portability — same trick as
    # :attr:`WebhookSubscription.topic_patterns`.
    raw_headers: str = Field(
        default="{}",
        sa_column=Column("raw_headers", _BIG_TEXT, nullable=False, default="{}"),
    )
    # Set by the correlator once it attributes ``src_ip`` to an existing
    # :class:`Attacker`. NULL until correlation runs (which happens on
    # the bus event we publish, so latency is sub-second).
    attacker_id: Optional[str] = Field(default=None, index=True)

    def headers(self) -> dict[str, Any]:
        """Decode :attr:`raw_headers` JSON; ``{}`` on bad/empty input."""
        try:
            raw = json.loads(self.raw_headers or "{}")
        except (ValueError, TypeError):
            return {}
        return raw if isinstance(raw, dict) else {}


# --- API request / response shapes -----------------------------------------

class CanaryBlobResponse(BaseModel):
    uuid: str
    sha256: str
    filename: str
    content_type: str
    size_bytes: int
    uploaded_by: str
    uploaded_at: datetime
    # Number of tokens currently referencing this blob. Surfaces in the
    # UI so operators don't try to delete a blob that's still in use,
    # and the API uses it to gate ``DELETE`` (returns 409).
    token_count: int = 0


class CanaryTokenCreateRequest(BaseModel):
    """Generate + plant a new token.

    Exactly one of ``blob_uuid`` (operator-supplied artifact) or
    ``generator`` (synthesised fake) must be set. Validated in the
    router so the 400 carries a clear detail message.
    """
    decky_name: str = PydanticField(..., min_length=1)
    # When set, ``decky_name`` is interpreted as a MazeNET topology decky
    # name; the server validates membership and resolves the container
    # accordingly. Absent ⇒ fleet semantics (today's behavior).
    topology_id: Optional[str] = None
    kind: CanaryKind
    placement_path: str = PydanticField(..., min_length=1)
    blob_uuid: Optional[str] = None
    generator: Optional[str] = None  # git_config | env_file | ssh_key | aws_creds | honeydoc
    # Optional override for the path-mapping helper — useful when the
    # operator wants a specific Windows-shaped path on a windows-persona
    # decky. Defaults to placement_path verbatim.
    persona_path_hint: Optional[str] = None


class CanaryTokenResponse(BaseModel):
    uuid: str
    kind: CanaryKind
    decky_name: str
    topology_id: Optional[str] = None
    blob_uuid: Optional[str]
    instrumenter: Optional[str]
    generator: Optional[str]
    placement_path: str
    callback_token: str
    placed_at: datetime
    last_triggered_at: Optional[datetime]
    trigger_count: int
    created_by: str
    state: CanaryState
    last_error: Optional[str]


class CanaryTriggerResponse(BaseModel):
    uuid: str
    token_uuid: str
    occurred_at: datetime
    src_ip: str
    user_agent: Optional[str]
    request_path: Optional[str]
    dns_qname: Optional[str]
    headers: dict[str, Any] = PydanticField(default_factory=dict)
    attacker_id: Optional[str]


class CanaryTokensResponse(BaseModel):
    tokens: List[CanaryTokenResponse]
    total: int


class CanaryTriggersResponse(BaseModel):
    triggers: List[CanaryTriggerResponse]
    total: int


class CanaryBlobsResponse(BaseModel):
    blobs: List[CanaryBlobResponse]
    total: int
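
The exactly-one-of rule on CanaryTokenCreateRequest is enforced in the router, which is not part of this file. A minimal sketch of that check might look like the following; the function name check_token_source and the use of ValueError are assumptions for illustration (the real router presumably translates the failure into an HTTP 400 with a detail message).

```python
from typing import Optional


def check_token_source(blob_uuid: Optional[str], generator: Optional[str]) -> str:
    """Return "blob" or "generator"; raise ValueError unless exactly one is set.

    Hypothetical helper: mirrors the rule documented on
    CanaryTokenCreateRequest, not the project's actual router code.
    """
    has_blob = blob_uuid is not None
    has_generator = generator is not None
    # Both set or both missing are equally invalid.
    if has_blob == has_generator:
        raise ValueError("exactly one of blob_uuid or generator must be set")
    return "blob" if has_blob else "generator"
```

Keeping the check outside the Pydantic model, as the docstring describes, lets the router choose the status code and wording of the error rather than returning a generic validation failure.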