DECNET/decnet/prober/osfp/base.py
anti 41ff6b4b03 feat(prober/osfp): p0f v2 .fp parser + Signature scoring
First code layer of the OS-fingerprinting work on top of yesterday's
vendored p0f v2 database. Three new modules, all pure (no I/O outside
of the parser's file read):

- decnet/prober/osfp/base.py — Provider protocol + OsMatch dataclass
  matching the established Provider convention in decnet/geoip and
  decnet/bus. Docstring spells out the never-raise invariant: malformed
  input returns None, so a single bad event can't wedge a whole
  attacker-profile rebuild.

- decnet/prober/osfp/p0f/signature.py — Signature dataclass + three
  predicate helpers (WindowSpec / IntSpec / OptionToken) encoding the
  p0f v2 DSL's wildcard / modulo / MSS-multiple / MTU-multiple
  semantics. Scoring is our extension on top of upstream p0f's
  first-match-wins policy: each signature carries a precomputed
  specificity in [0, 1] so the factory can pick the most-specific
  match when multiple signatures fire against one observation.

- decnet/prober/osfp/p0f/format.py — .fp line parser. Every shipped
  field variant from the DSL spec at the top of p0f.fp is covered
  (Snn / Tnn / %nnn / * for window; T0 vs T; -/@/* os-genre prefixes;
  quirks as concatenated single-letter flags; '.' sentinels for
  no-options / no-quirks). Malformed lines log a warning and are
  skipped instead of aborting the whole file: one bad row must not
  cost the other 374.
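
As a rough illustration of the window-field variants and the skip-on-malformed policy described above, here is a minimal sketch. `WindowRule`, `parse_window`, and `parse_windows` are hypothetical names, not the real `WindowSpec` or parser API. The matching rules assume the usual reading of the p0f v2 DSL (`Snn` = nn x MSS, `Tnn` = nn x (MSS + 40), `%nnn` = multiple of nnn), which may differ in detail from what format.py and signature.py actually do.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("osfp.sketch")  # hypothetical logger name


@dataclass(frozen=True)
class WindowRule:
    """Hypothetical stand-in for WindowSpec: a kind tag plus its number."""

    kind: str  # "any", "exact", "mod", "mss", or "mtu"
    value: int = 0

    def matches(self, window: int, mss: Optional[int]) -> bool:
        if self.kind == "any":
            return True
        if self.kind == "exact":
            return window == self.value
        if self.kind == "mod":
            return window % self.value == 0
        if mss is None:
            # S/T rules need an observed MSS; without one, report no match.
            return False
        # Assumption: MTU is taken as MSS + 40 (IP + TCP headers).
        unit = mss if self.kind == "mss" else mss + 40
        return window == self.value * unit


def parse_window(field: str) -> WindowRule:
    """Parse one window field; raises ValueError on malformed input."""
    if field == "*":
        return WindowRule("any")
    prefixes = {"%": "mod", "S": "mss", "T": "mtu"}
    kind = prefixes.get(field[:1], "exact")
    digits = field[1:] if kind != "exact" else field
    # Reject non-numeric bodies and a zero multiplier/modulus.
    if not digits.isdigit() or (kind != "exact" and int(digits) == 0):
        raise ValueError(f"bad window field: {field!r}")
    return WindowRule(kind, int(digits))


def parse_windows(lines: list[str]) -> list[WindowRule]:
    """Parse every line, logging and skipping the ones that fail."""
    rules = []
    for lineno, line in enumerate(lines, start=1):
        try:
            rules.append(parse_window(line.strip()))
        except ValueError as exc:
            log.warning("line %d skipped: %s", lineno, exc)
    return rules
```

With this shape, feeding `["*", "S4", "Sxx"]` to `parse_windows` yields two rules and one logged warning, so a single bad row does not discard its neighbours.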

20 parser tests + 14 scoring tests. Full vendored-DB smoke tests
confirm all 375 signatures parse round-trip (262 SYN + 61 SYN-ACK +
46 RST + 6 stray) and every computed specificity lands in [0, 1].
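
The [0, 1] bound on specificity and the most-specific-match selection can be sketched as follows. This is a simplified illustration, not the code in signature.py: `ToySignature` and `best_match` are hypothetical names, every non-exact predicate is collapsed to `None` (wildcard), and specificity is assumed to be the fraction of constrained fields. The `isinstance` guard mirrors the never-raise invariant from base.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToySignature:
    """Hypothetical signature: (field name, required value) pairs.

    A required value of None stands in for any wildcard-like predicate.
    """

    label: str
    fields: tuple[tuple[str, Optional[int]], ...]

    @property
    def specificity(self) -> float:
        # Fraction of constrained fields, so the result is always in [0, 1].
        if not self.fields:
            return 0.0
        constrained = sum(1 for _, want in self.fields if want is not None)
        return constrained / len(self.fields)

    def matches(self, obs: dict) -> bool:
        # Missing observation fields only fail against constrained fields.
        return all(
            want is None or obs.get(name) == want for name, want in self.fields
        )


def best_match(sigs: list[ToySignature], obs: object) -> Optional[ToySignature]:
    """Return the most-specific matching signature, or None.

    Malformed input (anything that is not a dict) yields None, not an error.
    """
    if not isinstance(obs, dict):
        return None
    hits = [s for s in sigs if s.matches(obs)]
    return max(hits, key=lambda s: s.specificity, default=None)
```

When both a fully constrained and a mostly wildcarded signature fire on the same observation, `best_match` returns the fully constrained one, unlike first-match-wins, which would return whichever appears first in the file.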
2026-04-24 11:47:54 -04:00

60 lines
2.1 KiB
Python

"""OS-fingerprint provider protocol + OsMatch result shape.
Each concrete provider (p0f v2 today; nmap-osdb / DECNET-observed DB
later) implements `Provider`. Callers go through
:func:`decnet.prober.osfp.factory.get_provider` or
:func:`decnet.prober.osfp.factory.get_all_providers` — direct imports
of a concrete class are forbidden, mirroring the convention in
``decnet/geoip`` and ``decnet/bus``.
"""
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class OsMatch:
    """The result of matching an observation against a provider's DB.

    Consumers should prefer higher ``confidence``. Providers compute
    confidence as the fraction of signature fields that matched exactly
    (vs. wildcard / modulo / "any" predicates): a signature with every
    field constrained scores 1.0, one with every field wildcarded
    approaches 0.0. This is explicit so the profiler can pick the
    most-specific match when multiple providers fire.
    """

    os: str
    flavor: str
    confidence: float
    provider: str
    is_userland: bool = False

    def __str__(self) -> str:
        tag = "userland" if self.is_userland else self.os
        return f"{tag} {self.flavor} ({self.confidence:.2f} via {self.provider})"


class Provider(ABC):
    """Abstract OS-fingerprint source.

    Providers consume a dict of observed TCP/IP quirks (``window``,
    ``wscale``, ``mss``, ``options_sig``, ``ttl``, ``df``,
    ``total_len``, ``quirks``; not all fields are required) and return
    a best-match :class:`OsMatch` or ``None`` when nothing matches.

    Providers MUST NOT raise on malformed or partial input: the
    upstream caller (``profiler/fingerprint.py::sniffer_rollup``) runs
    on data that may be missing any or all fields depending on the
    event mix, and a raising provider would wedge every attacker
    profile rebuild. Return ``None`` instead.
    """

    name: str

    @abstractmethod
    def match(self, obs: dict[str, Any]) -> Optional[OsMatch]:
        """Return best-match OsMatch for *obs*, or None."""