feat(creds): future-proof Credential storage model
Replaces the opaque Bounty.bounty_type='credential' path with a
dedicated `credentials` table whose schema is forward-compatible
across every auth-bearing service in the fleet. Hoisted indexed
columns (secret_sha256, principal, service, attacker_ip) carry the
universal reuse-analytics signal; service-specific JSON keys ride
in `fields`. Cross-service reuse queries become an indexed lookup
on secret_sha256 instead of JSON_EXTRACT scans.
Schema decisions baked in (per ANTI):
- New `Credential` table, not extension to Bounty
- Hoisted `principal` column for cross-service principal-reuse
- Standardized JSON keys: every payload carries secret_b64 +
secret_printable + principal universally; service-specific extras
(user, domain, dn, mech, …) ride alongside
The auth-helper SD-block emits the new shape natively. The ingester
forks at _extract_bounty:
- Native shape (SSH/Telnet, future emitters): secret_b64 present →
direct upsert_credential
- Legacy shape (FTP/POP3/IMAP/SMTP today): username + password →
adapter synthesizes secret_{b64,sha256,printable} on the fly,
upserts into the same Credential table. Tracked as DEBT-039;
one-shot bridge until those service templates migrate.
Defense-in-depth across five layers (input validation):
- C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424
escape rules for \\, ", ]; b64 preserves exact bytes
- Ingester native branch: rejects malformed secret_b64 (regex), drops
the credential row but keeps the underlying Log
- Ingester legacy adapter: same printable-ASCII filter as the C
code; sha256 + b64 over the original utf-8 bytes (lossless, even
when secret_printable is sanitized)
- DB column caps with truncation warning; sha256 always over the
full pre-truncation bytes so reuse queries match across truncation
- JSON serialized with ensure_ascii=True so utf8mb4 columns stay
safe even with non-ASCII service-specific keys
Bounty.bounty_type='credential' is no longer written. Pre-v1: no
historical backfill; existing rows stay untouched but unused.
595 tests pass; new tests cover the model + repo (upsert dedup,
null-principal independence, cross-service reuse, filters), both
ingester branches, b64 validation, sanitization preserving the
fingerprinting signal in b64.
This commit is contained in:
@@ -13,12 +13,23 @@
|
||||
* 55555, MSGID `auth_attempt` (matches FTP's existing event type so
|
||||
* the parser + dashboard pick it up with zero changes).
|
||||
*
|
||||
* Two password fields ride in the SD-block:
|
||||
* password RFC 5424-escaped ASCII-printable, '?' for non-printables.
|
||||
* FTP-compatible; consumed by existing dashboard rendering.
|
||||
* password_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain field
|
||||
* would silently drop — useful fingerprinting signal.
|
||||
* SD-block carries the standardized credential shape (matches
|
||||
* decnet/web/db/models/logs.py:Credential). Universal keys consumed
|
||||
* directly by the ingester's native-shape branch:
|
||||
* principal the human-meaningful identity the attacker sent
|
||||
* (username for SSH/Telnet; would be a domain for
|
||||
* SMTP, a DN for LDAP, etc.)
|
||||
* secret_printable RFC 5424-escaped ASCII-printable, '?' for non-
|
||||
* printables. Best-effort display form; may be
|
||||
* lossy on non-UTF8 bytes.
|
||||
* secret_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain
|
||||
* field would silently drop — useful fingerprinting
|
||||
* signal that survives display sanitization.
|
||||
*
|
||||
* `username` rides alongside as a service-specific identity field for
|
||||
* SSH/Telnet (mirrors `principal`); future emitters (SMTP, LDAP, …)
|
||||
* drop `username` in favor of their service-native identity field.
|
||||
*
|
||||
* Fail-open: every error path silently exits 0. The PAM line is `optional`
|
||||
* so a malfunctioning helper must never break sshd auth.
|
||||
@@ -150,13 +161,19 @@ pw_done:;
|
||||
b64_encode(pw_raw, pw_len, pw_b64, sizeof(pw_b64));
|
||||
|
||||
/* Priority: facility=local0(16), severity=INFO(6) → <16*8+6> = <134>.
|
||||
* Matches the syslog_bridge.py default exactly. */
|
||||
* Matches the syslog_bridge.py default exactly.
|
||||
*
|
||||
* SD-block keys match the Credential storage model: principal +
|
||||
* secret_printable + secret_b64 are the universal keys the ingester
|
||||
* keys off; username is emitted alongside principal so existing
|
||||
* dashboards that read SSH/Telnet `username=` keep working until
|
||||
* the cred-reuse UI lands. */
|
||||
char line[LINE_BUF];
|
||||
int n = snprintf(line, sizeof(line),
|
||||
"<134>1 %s %s auth-helper - auth_attempt "
|
||||
"[relay@55555 username=\"%s\" password=\"%s\" "
|
||||
"password_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
"[relay@55555 username=\"%s\" principal=\"%s\" "
|
||||
"secret_printable=\"%s\" secret_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
if (n <= 0 || (size_t)n >= sizeof(line)) return 0;
|
||||
|
||||
/* /proc/1/fd/1 is the entrypoint's stdout — the fd Docker captures
|
||||
|
||||
@@ -13,12 +13,23 @@
|
||||
* 55555, MSGID `auth_attempt` (matches FTP's existing event type so
|
||||
* the parser + dashboard pick it up with zero changes).
|
||||
*
|
||||
* Two password fields ride in the SD-block:
|
||||
* password RFC 5424-escaped ASCII-printable, '?' for non-printables.
|
||||
* FTP-compatible; consumed by existing dashboard rendering.
|
||||
* password_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain field
|
||||
* would silently drop — useful fingerprinting signal.
|
||||
* SD-block carries the standardized credential shape (matches
|
||||
* decnet/web/db/models/logs.py:Credential). Universal keys consumed
|
||||
* directly by the ingester's native-shape branch:
|
||||
* principal the human-meaningful identity the attacker sent
|
||||
* (username for SSH/Telnet; would be a domain for
|
||||
* SMTP, a DN for LDAP, etc.)
|
||||
* secret_printable RFC 5424-escaped ASCII-printable, '?' for non-
|
||||
* printables. Best-effort display form; may be
|
||||
* lossy on non-UTF8 bytes.
|
||||
* secret_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain
|
||||
* field would silently drop — useful fingerprinting
|
||||
* signal that survives display sanitization.
|
||||
*
|
||||
* `username` rides alongside as a service-specific identity field for
|
||||
* SSH/Telnet (mirrors `principal`); future emitters (SMTP, LDAP, …)
|
||||
* drop `username` in favor of their service-native identity field.
|
||||
*
|
||||
* Fail-open: every error path silently exits 0. The PAM line is `optional`
|
||||
* so a malfunctioning helper must never break sshd auth.
|
||||
@@ -150,13 +161,19 @@ pw_done:;
|
||||
b64_encode(pw_raw, pw_len, pw_b64, sizeof(pw_b64));
|
||||
|
||||
/* Priority: facility=local0(16), severity=INFO(6) → <16*8+6> = <134>.
|
||||
* Matches the syslog_bridge.py default exactly. */
|
||||
* Matches the syslog_bridge.py default exactly.
|
||||
*
|
||||
* SD-block keys match the Credential storage model: principal +
|
||||
* secret_printable + secret_b64 are the universal keys the ingester
|
||||
* keys off; username is emitted alongside principal so existing
|
||||
* dashboards that read SSH/Telnet `username=` keep working until
|
||||
* the cred-reuse UI lands. */
|
||||
char line[LINE_BUF];
|
||||
int n = snprintf(line, sizeof(line),
|
||||
"<134>1 %s %s auth-helper - auth_attempt "
|
||||
"[relay@55555 username=\"%s\" password=\"%s\" "
|
||||
"password_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
"[relay@55555 username=\"%s\" principal=\"%s\" "
|
||||
"secret_printable=\"%s\" secret_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
if (n <= 0 || (size_t)n >= sizeof(line)) return 0;
|
||||
|
||||
/* /proc/1/fd/1 is the entrypoint's stdout — the fd Docker captures
|
||||
|
||||
@@ -13,12 +13,23 @@
|
||||
* 55555, MSGID `auth_attempt` (matches FTP's existing event type so
|
||||
* the parser + dashboard pick it up with zero changes).
|
||||
*
|
||||
* Two password fields ride in the SD-block:
|
||||
* password RFC 5424-escaped ASCII-printable, '?' for non-printables.
|
||||
* FTP-compatible; consumed by existing dashboard rendering.
|
||||
* password_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain field
|
||||
* would silently drop — useful fingerprinting signal.
|
||||
* SD-block carries the standardized credential shape (matches
|
||||
* decnet/web/db/models/logs.py:Credential). Universal keys consumed
|
||||
* directly by the ingester's native-shape branch:
|
||||
* principal the human-meaningful identity the attacker sent
|
||||
* (username for SSH/Telnet; would be a domain for
|
||||
* SMTP, a DN for LDAP, etc.)
|
||||
* secret_printable RFC 5424-escaped ASCII-printable, '?' for non-
|
||||
* printables. Best-effort display form; may be
|
||||
* lossy on non-UTF8 bytes.
|
||||
* secret_b64 base64 of the exact PAM_AUTHTOK bytes. Lossless.
|
||||
* Preserves NUL/0xff/control bytes that the plain
|
||||
* field would silently drop — useful fingerprinting
|
||||
* signal that survives display sanitization.
|
||||
*
|
||||
* `username` rides alongside as a service-specific identity field for
|
||||
* SSH/Telnet (mirrors `principal`); future emitters (SMTP, LDAP, …)
|
||||
* drop `username` in favor of their service-native identity field.
|
||||
*
|
||||
* Fail-open: every error path silently exits 0. The PAM line is `optional`
|
||||
* so a malfunctioning helper must never break sshd auth.
|
||||
@@ -150,13 +161,19 @@ pw_done:;
|
||||
b64_encode(pw_raw, pw_len, pw_b64, sizeof(pw_b64));
|
||||
|
||||
/* Priority: facility=local0(16), severity=INFO(6) → <16*8+6> = <134>.
|
||||
* Matches the syslog_bridge.py default exactly. */
|
||||
* Matches the syslog_bridge.py default exactly.
|
||||
*
|
||||
* SD-block keys match the Credential storage model: principal +
|
||||
* secret_printable + secret_b64 are the universal keys the ingester
|
||||
* keys off; username is emitted alongside principal so existing
|
||||
* dashboards that read SSH/Telnet `username=` keep working until
|
||||
* the cred-reuse UI lands. */
|
||||
char line[LINE_BUF];
|
||||
int n = snprintf(line, sizeof(line),
|
||||
"<134>1 %s %s auth-helper - auth_attempt "
|
||||
"[relay@55555 username=\"%s\" password=\"%s\" "
|
||||
"password_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
"[relay@55555 username=\"%s\" principal=\"%s\" "
|
||||
"secret_printable=\"%s\" secret_b64=\"%s\" src_ip=\"%s\"]\n",
|
||||
tsbuf, host, user_esc, user_esc, pw_esc, pw_b64, rhost_esc);
|
||||
if (n <= 0 || (size_t)n >= sizeof(line)) return 0;
|
||||
|
||||
/* /proc/1/fd/1 is the entrypoint's stdout — the fd Docker captures
|
||||
|
||||
@@ -48,6 +48,8 @@ from .health import (
|
||||
from .logs import (
|
||||
Bounty,
|
||||
BountyResponse,
|
||||
Credential,
|
||||
CredentialsResponse,
|
||||
Log,
|
||||
LogsResponse,
|
||||
State,
|
||||
@@ -167,6 +169,8 @@ __all__ = [
|
||||
# logs
|
||||
"Bounty",
|
||||
"BountyResponse",
|
||||
"Credential",
|
||||
"CredentialsResponse",
|
||||
"Log",
|
||||
"LogsResponse",
|
||||
"State",
|
||||
|
||||
@@ -1,9 +1,9 @@
|
||||
"""Log / Bounty / State tables + their list-response DTOs."""
|
||||
"""Log / Bounty / Credential / State tables + their list-response DTOs."""
|
||||
from datetime import datetime, timezone
|
||||
from typing import Any, List, Optional
|
||||
|
||||
from pydantic import BaseModel
|
||||
from sqlalchemy import Column, Text
|
||||
from sqlalchemy import Column, Index, Text
|
||||
from sqlmodel import Field, SQLModel
|
||||
|
||||
from ._base import _BIG_TEXT
|
||||
@@ -40,6 +40,58 @@ class Bounty(SQLModel, table=True):
|
||||
payload: str = Field(sa_column=Column("payload", Text, nullable=False))
|
||||
|
||||
|
||||
class Credential(SQLModel, table=True):
|
||||
"""One observed credential attempt against a decky service.
|
||||
|
||||
Forward-compatible across every auth-bearing service in the fleet:
|
||||
SSH user+pass, Telnet user+pass, SMTP domain+pass, LDAP dn+pass,
|
||||
Redis password-only, etc. The two universal lossless representations
|
||||
(``secret_b64`` + ``secret_sha256``) hoist to indexed columns so
|
||||
cross-service reuse queries don't scan opaque JSON.
|
||||
|
||||
Per-service identity (the human-meaningful "who's authenticating")
|
||||
lives in ``principal`` — username for SSH, domain for SMTP, dn for
|
||||
LDAP. Nullable for principal-less mechanisms (Redis AUTH, bearer
|
||||
tokens). Fully service-specific keys ride in ``fields`` JSON.
|
||||
|
||||
Dedup contract: same (attacker_uuid, decky, service, secret_sha256,
|
||||
principal_or_empty) tuple → upsert, bumps ``attempt_count`` and
|
||||
``last_seen``. Different secret or different principal → new row.
|
||||
"""
|
||||
__tablename__ = "credentials"
|
||||
__table_args__ = (
|
||||
Index("ix_credentials_secret_service", "secret_sha256", "service"),
|
||||
Index("ix_credentials_principal_service", "principal", "service"),
|
||||
)
|
||||
id: Optional[int] = Field(default=None, primary_key=True)
|
||||
# Keyed by attacker IP (not attackers.uuid) to match Bounty's pattern
|
||||
# and avoid the chicken-and-egg of writing a credential row before
|
||||
# the profiler has minted the Attacker. Index covers the join path
|
||||
# cred_reuse → Attacker.ip.
|
||||
attacker_ip: str = Field(index=True)
|
||||
decky_name: str = Field(index=True)
|
||||
service: str = Field(index=True)
|
||||
principal: Optional[str] = Field(default=None, index=True, max_length=256)
|
||||
# Universal lossless secret representations.
|
||||
secret_sha256: str = Field(index=True, max_length=64)
|
||||
secret_b64: Optional[str] = Field(default=None, max_length=2048)
|
||||
# Best-effort printable form — non-printable bytes collapsed to '?'
|
||||
# by either auth-helper.c (SSH/Telnet) or the ingester's legacy
|
||||
# adapter (FTP/POP3/IMAP/SMTP). May be lossy on non-UTF8.
|
||||
secret_printable: Optional[str] = Field(default=None, max_length=512)
|
||||
outcome: Optional[str] = Field(default=None, max_length=16) # success|failure|observed
|
||||
fields: str = Field(
|
||||
sa_column=Column("fields", _BIG_TEXT, nullable=False, default="{}")
|
||||
)
|
||||
first_seen: datetime = Field(
|
||||
default_factory=lambda: datetime.now(timezone.utc), index=True
|
||||
)
|
||||
last_seen: datetime = Field(
|
||||
default_factory=lambda: datetime.now(timezone.utc), index=True
|
||||
)
|
||||
attempt_count: int = Field(default=1)
|
||||
|
||||
|
||||
class State(SQLModel, table=True):
|
||||
__tablename__ = "state"
|
||||
key: str = Field(primary_key=True)
|
||||
@@ -62,6 +114,13 @@ class BountyResponse(BaseModel):
|
||||
data: List[dict[str, Any]]
|
||||
|
||||
|
||||
class CredentialsResponse(BaseModel):
|
||||
total: int
|
||||
limit: int
|
||||
offset: int
|
||||
data: List[dict[str, Any]]
|
||||
|
||||
|
||||
class StatsResponse(BaseModel):
|
||||
total_logs: int
|
||||
unique_attackers: int
|
||||
|
||||
@@ -110,6 +110,55 @@ class BaseRepository(ABC):
|
||||
"""Retrieve the total count of bounties, optionally filtered."""
|
||||
pass
|
||||
|
||||
# ---- credentials ---------------------------------------------------
|
||||
|
||||
@abstractmethod
|
||||
async def upsert_credential(self, data: dict[str, Any]) -> int:
|
||||
"""Insert or upsert a credential attempt; returns the row id.
|
||||
|
||||
Dedup tuple: (attacker_ip, decky_name, service, secret_sha256,
|
||||
principal_or_None). On dedup match, ``attempt_count`` is bumped
|
||||
and ``last_seen`` updated; the originally-seen ``first_seen``
|
||||
and ``fields`` JSON are preserved.
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def get_credentials(
|
||||
self,
|
||||
limit: int = 50,
|
||||
offset: int = 0,
|
||||
search: Optional[str] = None,
|
||||
service: Optional[str] = None,
|
||||
attacker_ip: Optional[str] = None,
|
||||
) -> list[dict[str, Any]]:
|
||||
"""Paginated credential rows, with optional filters."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def get_total_credentials(
|
||||
self,
|
||||
search: Optional[str] = None,
|
||||
service: Optional[str] = None,
|
||||
attacker_ip: Optional[str] = None,
|
||||
) -> int:
|
||||
"""Total credential count under the same filters as get_credentials."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def get_credentials_for_attacker(
|
||||
self, attacker_ip: str
|
||||
) -> list[dict[str, Any]]:
|
||||
"""Every credential row from the given attacker IP."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def get_credential_reuse(
|
||||
self, secret_sha256: str
|
||||
) -> list[dict[str, Any]]:
|
||||
"""Every (attacker, decky, service, principal) row sharing this secret hash."""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
async def get_state(self, key: str) -> Optional[dict[str, Any]]:
|
||||
"""Retrieve a specific state entry by key."""
|
||||
|
||||
@@ -31,6 +31,7 @@ from decnet.web.db.models import (
|
||||
User,
|
||||
Log,
|
||||
Bounty,
|
||||
Credential,
|
||||
State,
|
||||
Attacker,
|
||||
AttackerBehavior,
|
||||
@@ -448,6 +449,9 @@ class SQLModelRepository(BaseRepository):
|
||||
async with self._session() as session:
|
||||
logs_deleted = (await session.execute(text("DELETE FROM logs"))).rowcount
|
||||
bounties_deleted = (await session.execute(text("DELETE FROM bounty"))).rowcount
|
||||
credentials_deleted = (
|
||||
await session.execute(text("DELETE FROM credentials"))
|
||||
).rowcount
|
||||
# attacker_behavior has FK → attackers.uuid; delete children first.
|
||||
await session.execute(text("DELETE FROM attacker_behavior"))
|
||||
attackers_deleted = (await session.execute(text("DELETE FROM attackers"))).rowcount
|
||||
@@ -455,6 +459,7 @@ class SQLModelRepository(BaseRepository):
|
||||
return {
|
||||
"logs": logs_deleted,
|
||||
"bounties": bounties_deleted,
|
||||
"credentials": credentials_deleted,
|
||||
"attackers": attackers_deleted,
|
||||
}
|
||||
|
||||
@@ -535,6 +540,169 @@ class SQLModelRepository(BaseRepository):
|
||||
result = await session.execute(statement)
|
||||
return result.scalar() or 0
|
||||
|
||||
# ─── credentials ──────────────────────────────────────────────────────
|
||||
|
||||
async def upsert_credential(self, data: dict[str, Any]) -> int:
|
||||
"""Upsert a credential attempt; returns the row id.
|
||||
|
||||
Dedup tuple: (attacker_ip, decky_name, service, secret_sha256,
|
||||
principal_or_None). On match, ``attempt_count`` += 1 and
|
||||
``last_seen`` advances; ``first_seen`` and ``fields`` are
|
||||
preserved from the original sighting.
|
||||
"""
|
||||
payload = dict(data)
|
||||
if "fields" in payload and isinstance(payload["fields"], dict):
|
||||
# ensure_ascii=True keeps utf8mb4 columns safe even when
|
||||
# service-specific keys carry non-ASCII bytes.
|
||||
payload["fields"] = json.dumps(payload["fields"], ensure_ascii=True)
|
||||
|
||||
principal = payload.get("principal")
|
||||
async with self._session() as session:
|
||||
stmt = select(Credential).where(
|
||||
Credential.attacker_ip == payload["attacker_ip"],
|
||||
Credential.decky_name == payload["decky_name"],
|
||||
Credential.service == payload["service"],
|
||||
Credential.secret_sha256 == payload["secret_sha256"],
|
||||
# NULL == NULL is False under SQL — branch the predicate.
|
||||
(Credential.principal == principal) if principal is not None
|
||||
else Credential.principal.is_(None),
|
||||
)
|
||||
existing = (await session.execute(stmt)).scalar_one_or_none()
|
||||
now = datetime.now(timezone.utc)
|
||||
if existing is not None:
|
||||
existing.attempt_count = (existing.attempt_count or 1) + 1
|
||||
existing.last_seen = now
|
||||
if payload.get("outcome") is not None:
|
||||
existing.outcome = payload["outcome"]
|
||||
session.add(existing)
|
||||
await session.commit()
|
||||
return existing.id # type: ignore[return-value]
|
||||
row = Credential(
|
||||
attacker_ip=payload["attacker_ip"],
|
||||
decky_name=payload["decky_name"],
|
||||
service=payload["service"],
|
||||
principal=principal,
|
||||
secret_sha256=payload["secret_sha256"],
|
||||
secret_b64=payload.get("secret_b64"),
|
||||
secret_printable=payload.get("secret_printable"),
|
||||
outcome=payload.get("outcome"),
|
||||
fields=payload.get("fields", "{}"),
|
||||
first_seen=now,
|
||||
last_seen=now,
|
||||
attempt_count=1,
|
||||
)
|
||||
session.add(row)
|
||||
await session.commit()
|
||||
await session.refresh(row)
|
||||
return row.id # type: ignore[return-value]
|
||||
|
||||
def _apply_credential_filters(
|
||||
self,
|
||||
statement: SelectOfScalar,
|
||||
search: Optional[str],
|
||||
service: Optional[str],
|
||||
attacker_ip: Optional[str],
|
||||
) -> SelectOfScalar:
|
||||
if service:
|
||||
statement = statement.where(Credential.service == service)
|
||||
if attacker_ip:
|
||||
statement = statement.where(Credential.attacker_ip == attacker_ip)
|
||||
if search:
|
||||
lk = f"%{search}%"
|
||||
statement = statement.where(
|
||||
or_(
|
||||
Credential.decky_name.like(lk),
|
||||
Credential.service.like(lk),
|
||||
Credential.principal.like(lk),
|
||||
Credential.secret_printable.like(lk),
|
||||
)
|
||||
)
|
||||
return statement
|
||||
|
||||
async def get_credentials(
|
||||
self,
|
||||
limit: int = 50,
|
||||
offset: int = 0,
|
||||
search: Optional[str] = None,
|
||||
service: Optional[str] = None,
|
||||
attacker_ip: Optional[str] = None,
|
||||
) -> List[dict[str, Any]]:
|
||||
statement = (
|
||||
select(Credential)
|
||||
.order_by(desc(Credential.last_seen))
|
||||
.offset(offset)
|
||||
.limit(limit)
|
||||
)
|
||||
statement = self._apply_credential_filters(
|
||||
statement, search, service, attacker_ip
|
||||
)
|
||||
async with self._session() as session:
|
||||
result = await session.execute(statement)
|
||||
out: List[dict[str, Any]] = []
|
||||
for item in result.scalars().all():
|
||||
d = item.model_dump(mode="json")
|
||||
try:
|
||||
d["fields"] = json.loads(d["fields"])
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
pass
|
||||
out.append(d)
|
||||
return out
|
||||
|
||||
async def get_total_credentials(
|
||||
self,
|
||||
search: Optional[str] = None,
|
||||
service: Optional[str] = None,
|
||||
attacker_ip: Optional[str] = None,
|
||||
) -> int:
|
||||
statement = select(func.count()).select_from(Credential)
|
||||
statement = self._apply_credential_filters(
|
||||
statement, search, service, attacker_ip
|
||||
)
|
||||
async with self._session() as session:
|
||||
result = await session.execute(statement)
|
||||
return result.scalar() or 0
|
||||
|
||||
async def get_credentials_for_attacker(
|
||||
self, attacker_ip: str
|
||||
) -> List[dict[str, Any]]:
|
||||
async with self._session() as session:
|
||||
result = await session.execute(
|
||||
select(Credential)
|
||||
.where(Credential.attacker_ip == attacker_ip)
|
||||
.order_by(desc(Credential.last_seen))
|
||||
)
|
||||
out: List[dict[str, Any]] = []
|
||||
for item in result.scalars().all():
|
||||
d = item.model_dump(mode="json")
|
||||
try:
|
||||
d["fields"] = json.loads(d["fields"])
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
pass
|
||||
out.append(d)
|
||||
return out
|
||||
|
||||
async def get_credential_reuse(
|
||||
self, secret_sha256: str
|
||||
) -> List[dict[str, Any]]:
|
||||
"""Every (attacker_ip, decky, service, principal) row sharing this
|
||||
secret hash. Indexed lookup via ix_credentials_secret_service.
|
||||
"""
|
||||
async with self._session() as session:
|
||||
result = await session.execute(
|
||||
select(Credential)
|
||||
.where(Credential.secret_sha256 == secret_sha256)
|
||||
.order_by(desc(Credential.last_seen))
|
||||
)
|
||||
out: List[dict[str, Any]] = []
|
||||
for item in result.scalars().all():
|
||||
d = item.model_dump(mode="json")
|
||||
try:
|
||||
d["fields"] = json.loads(d["fields"])
|
||||
except (json.JSONDecodeError, TypeError):
|
||||
pass
|
||||
out.append(d)
|
||||
return out
|
||||
|
||||
async def get_state(self, key: str) -> Optional[dict[str, Any]]:
|
||||
async with self._session() as session:
|
||||
statement = select(State).where(State.key == key)
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
import asyncio
|
||||
import base64
|
||||
import contextlib
|
||||
import hashlib
|
||||
import ipaddress
|
||||
import os
|
||||
import json
|
||||
@@ -195,6 +197,128 @@ async def _flush_batch(
|
||||
return _new_position
|
||||
|
||||
|
||||
# RFC 5424-ish SD-PARAM-VALUE sanitization, mirrored from auth-helper.c.
|
||||
# Bytes outside [0x20, 0x7f) collapse to '?', matching the C escape rule.
|
||||
# The hash is always computed over the *original* bytes so reuse queries
|
||||
# survive any sanitization on the printable form.
|
||||
_SECRET_B64_RE = re.compile(r"^[A-Za-z0-9+/]*={0,2}$")
|
||||
_SECRET_PRINTABLE_MAX = 512 # mirrors Credential.secret_printable max_length
|
||||
_PRINCIPAL_MAX = 256
|
||||
_SECRET_B64_MAX = 2048
|
||||
|
||||
|
||||
def _printable_filter(s: str) -> str:
|
||||
"""Replace bytes outside [0x20, 0x7f) with '?', matching auth-helper.c.
|
||||
|
||||
Operates on the str's UTF-8 encoded bytes so we don't accidentally
|
||||
let a `\\u202e` Unicode override slip through display layers.
|
||||
"""
|
||||
out: list[int] = []
|
||||
for b in s.encode("utf-8", errors="replace"):
|
||||
out.append(b if 0x20 <= b < 0x7f else ord("?"))
|
||||
return bytes(out).decode("ascii")
|
||||
|
||||
|
||||
def _truncate_with_warn(s: Optional[str], cap: int, label: str) -> Optional[str]:
|
||||
if s is None:
|
||||
return None
|
||||
if len(s) <= cap:
|
||||
return s
|
||||
logger.warning("ingester: %s truncated %d → %d chars", label, len(s), cap)
|
||||
return s[:cap]
|
||||
|
||||
|
||||
async def _ingest_credential_native(
|
||||
repo: BaseRepository,
|
||||
log_data: dict[str, Any],
|
||||
fields: dict[str, Any],
|
||||
) -> None:
|
||||
"""Native-shape credential: SD-block already carries secret_b64.
|
||||
|
||||
Validates the b64, computes sha256 over the decoded bytes, hands off
|
||||
to the repo upsert. Drops the row on validation failure (the
|
||||
underlying Log row still lands).
|
||||
"""
|
||||
b64 = fields.get("secret_b64")
|
||||
if not isinstance(b64, str) or not _SECRET_B64_RE.match(b64):
|
||||
logger.warning(
|
||||
"ingester: dropping credential — invalid secret_b64 from %s/%s",
|
||||
log_data.get("decky"), log_data.get("service"),
|
||||
)
|
||||
return
|
||||
try:
|
||||
raw = base64.b64decode(b64, validate=True)
|
||||
except (ValueError, TypeError):
|
||||
logger.warning(
|
||||
"ingester: dropping credential — secret_b64 decode failed from %s/%s",
|
||||
log_data.get("decky"), log_data.get("service"),
|
||||
)
|
||||
return
|
||||
|
||||
sha256_hex = hashlib.sha256(raw).hexdigest()
|
||||
principal = fields.get("principal") or fields.get("username")
|
||||
secret_printable = fields.get("secret_printable")
|
||||
|
||||
await repo.upsert_credential({
|
||||
"attacker_ip": log_data.get("attacker_ip"),
|
||||
"decky_name": log_data.get("decky"),
|
||||
"service": log_data.get("service"),
|
||||
"principal": _truncate_with_warn(principal, _PRINCIPAL_MAX, "principal"),
|
||||
"secret_sha256": sha256_hex,
|
||||
"secret_b64": _truncate_with_warn(b64, _SECRET_B64_MAX, "secret_b64"),
|
||||
"secret_printable": _truncate_with_warn(
|
||||
secret_printable, _SECRET_PRINTABLE_MAX, "secret_printable"
|
||||
),
|
||||
"outcome": fields.get("outcome"),
|
||||
"fields": fields, # repo handles json.dumps with ensure_ascii=True
|
||||
})
|
||||
|
||||
|
||||
async def _ingest_credential_legacy(
|
||||
repo: BaseRepository,
|
||||
log_data: dict[str, Any],
|
||||
fields: dict[str, Any],
|
||||
) -> None:
|
||||
"""Legacy-shape credential: SD-block has username + password.
|
||||
|
||||
Synthesizes secret_b64 (from utf8-encoded password bytes), the
|
||||
sha256 hash (over those same bytes — lossless before any printable
|
||||
sanitization), and a printable-filtered secret_printable. FTP /
|
||||
POP3 / IMAP / SMTP go through this branch until DEBT-039 lands.
|
||||
"""
|
||||
user = fields.get("username")
|
||||
pw = fields.get("password")
|
||||
if not isinstance(pw, str):
|
||||
return
|
||||
|
||||
raw = pw.encode("utf-8", errors="replace")
|
||||
sha256_hex = hashlib.sha256(raw).hexdigest()
|
||||
b64 = base64.b64encode(raw).decode("ascii")
|
||||
printable = _printable_filter(pw)
|
||||
|
||||
# Synthesize the universal keys into a copy of fields so the JSON
|
||||
# blob carries the standardized shape too — lets downstream readers
|
||||
# treat every credential row identically regardless of emitter.
|
||||
synthesized_fields = dict(fields)
|
||||
synthesized_fields.setdefault("principal", user)
|
||||
synthesized_fields.setdefault("secret_printable", printable)
|
||||
synthesized_fields.setdefault("secret_b64", b64)
|
||||
|
||||
await repo.upsert_credential({
|
||||
"attacker_ip": log_data.get("attacker_ip"),
|
||||
"decky_name": log_data.get("decky"),
|
||||
"service": log_data.get("service"),
|
||||
"principal": _truncate_with_warn(user, _PRINCIPAL_MAX, "principal"),
|
||||
"secret_sha256": sha256_hex,
|
||||
"secret_b64": _truncate_with_warn(b64, _SECRET_B64_MAX, "secret_b64"),
|
||||
"secret_printable": _truncate_with_warn(
|
||||
printable, _SECRET_PRINTABLE_MAX, "secret_printable"
|
||||
),
|
||||
"outcome": fields.get("outcome"),
|
||||
"fields": synthesized_fields,
|
||||
})
|
||||
|
||||
|
||||
@_traced("ingester.extract_bounty")
|
||||
async def _extract_bounty(repo: BaseRepository, log_data: dict[str, Any]) -> None:
|
||||
"""Detect and extract valuable artifacts (bounties) from log entries."""
|
||||
@@ -202,21 +326,21 @@ async def _extract_bounty(repo: BaseRepository, log_data: dict[str, Any]) -> Non
|
||||
if not isinstance(_fields, dict):
|
||||
return
|
||||
|
||||
# 1. Credentials (User/Pass)
|
||||
_user = _fields.get("username")
|
||||
_pass = _fields.get("password")
|
||||
|
||||
if _user and _pass:
|
||||
await repo.add_bounty({
|
||||
"decky": log_data.get("decky"),
|
||||
"service": log_data.get("service"),
|
||||
"attacker_ip": log_data.get("attacker_ip"),
|
||||
"bounty_type": "credential",
|
||||
"payload": {
|
||||
"username": _user,
|
||||
"password": _pass
|
||||
}
|
||||
})
|
||||
# 1. Credentials — fork on emitter shape.
|
||||
#
|
||||
# New shape (SSH/Telnet auth-helper, future emitters): SD-block
|
||||
# carries `secret_b64` directly. Universal across services.
|
||||
#
|
||||
# Legacy shape (FTP/POP3/IMAP/SMTP today): SD-block has `username`
|
||||
# + `password`. Adapter synthesizes `secret_b64` + `secret_sha256`
|
||||
# on the fly so those services land in the same Credential table
|
||||
# without requiring a per-template emitter rewrite. Tracked as
|
||||
# DEBT-039 — eventually those services emit the new shape natively
|
||||
# and this branch dies.
|
||||
if "secret_b64" in _fields:
|
||||
await _ingest_credential_native(repo, log_data, _fields)
|
||||
elif _fields.get("username") and _fields.get("password"):
|
||||
await _ingest_credential_legacy(repo, log_data, _fields)
|
||||
|
||||
# 2. HTTP User-Agent fingerprint
|
||||
_h_raw = _fields.get("headers")
|
||||
|
||||
Reference in New Issue
Block a user