Commit Graph

2 Commits

Author SHA1 Message Date
41ff6b4b03 feat(prober/osfp): p0f v2 .fp parser + Signature scoring
First code layer of the OS-fingerprinting work on top of yesterday's
vendored p0f v2 database. Three new modules, all pure (no I/O outside
of the parser's file read):

- decnet/prober/osfp/base.py — Provider protocol + OsMatch dataclass
  matching the established Provider convention in decnet/geoip and
  decnet/bus. Docstring spells out the never-raise invariant: malformed
  input returns None, so a single bad event can't wedge a whole
  attacker-profile rebuild.

- decnet/prober/osfp/p0f/signature.py — Signature dataclass + three
  predicate helpers (WindowSpec / IntSpec / OptionToken) encoding the
  p0f v2 DSL's wildcard / modulo / MSS-multiple / MTU-multiple
  semantics. Scoring is our extension on top of upstream p0f's
  first-match-wins policy: each signature carries a precomputed
  specificity in [0, 1] so the factory can pick the most-specific
  match when multiple signatures fire against one observation.

- decnet/prober/osfp/p0f/format.py — .fp line parser. Every shipped
  field variant from the DSL spec at the top of p0f.fp is covered
  (Snn / Tnn / %nnn / * for window; T0 vs T; -/@/* os-genre prefixes;
  quirks as concatenated single-letter flags; '.' sentinels for
  no-options / no-quirks). Malformed lines log a warning and skip
  instead of aborting the whole file — 1 bad row must not cost the
  other 374.

20 parser tests + 14 scoring tests. Full vendored-DB smoke tests
confirm all 375 signatures parse round-trip (262 SYN + 61 SYN-ACK +
46 RST + 6 stray) and every computed specificity lands in [0, 1].
2026-04-24 11:47:54 -04:00
620e1f5b1d feat(prober): vendor p0f v2 TCP/IP fingerprint database (LGPL-2.1 → GPLv3 via §3)
Ships the p0f v2.0.8 signature database for passive + active OS
fingerprinting. 375 total signatures across four probe contexts:

- p0f.fp  (262 sigs) — passive SYN fingerprints
- p0fa.fp ( 61 sigs) — SYN-ACK response, for active probes
- p0fr.fp ( 46 sigs) — RST response quirks
- p0fo.fp (  6 sigs) — "stray" packet fingerprints

Replaces reliance on the 10-signature hand-rolled p0f-lite table in
decnet/sniffer/p0f.py for any match job the upstream DB covers.
Keeping the hand-rolled table as a fallback for modern kernels the
v2 DB pre-dates — v2 froze in 2006 so post-Win10 / post-Linux-3.x
kernels won't match against upstream directly. DECNET-authored
additions will go in a sibling p0f-decnet.fp under GPLv3 (not yet
committed; added as the ingester observes real honeypot traffic).

Provenance (full chain in data/README.md):

- Source: Debian snapshot of p0f_2.0.8.orig.tar.gz
- SHA1 matches Debian-recorded 7b4d5b2f24af4b5a299979134bc7f6d7b1eaf875
- Files byte-identical to upstream tarball (verified by hash)

License chain:

- Upstream: LGPL-2.1 (doc/COPYING preserved verbatim as
  data/LICENSE.p0f-upstream, Michal Zalewski's copyright intact).
- DECNET uses the LGPL-2.1 §3 explicit permission to convert to any
  version of the GPL. These files, as consumed in DECNET, are
  effectively GPL-3.0. Chain documented in data/README.md so an
  auditor sees the full reasoning.
- LGPL-2.1 → GPL-3.0 §3 conversion is a settled compat path; same
  mechanism the kernel uses for LGPL userland glue and many other
  projects apply daily.

Rejected path — nmap-os-db under NPSL — because NPSL adds
restrictions GPLv3 §7 prohibits us from accepting. An email is out
to Fyodor requesting an open-source-author exception grant, but we
don't block on it: p0f v2 is a genuine accuracy improvement in
its own right, and adding nmap-osdb later (if granted) plugs into
the same provider interface with zero refactor.

Directory layout mirrors the established provider-subpackage pattern
(see decnet/geoip/, decnet/bus/) per the feedback_provider_
subpackages memory: base + factory + impl/ subpackages, no flat
files. Parser + matcher + factory wiring land in the next commit
sequence.
2026-04-24 11:39:33 -04:00