feat(profiler): extract motor.digraph_simhash keystroke biometric

Per-session 64-bit SimHash of inter-keystroke digraph flight times:
walk single-char input events, accumulate flight time per (c1,c2),
bucket the median, Charikar-SimHash the bucketed pairs. Locality-
sensitive so the same typist is Hamming-close across sessions; pastes
and think-pauses break the chain; silent below the sample-size floor.

New shared decnet/util/simhash.py (simhash64/hamming64/bytes helpers).
Registered as a conditional Tier-A primitive (count 37->38); requires
behave-shell>=0.1.2.
2026-06-16 16:59:57 -04:00
parent 372375194c
commit 66c73ce59d
12 changed files with 283 additions and 5 deletions

@@ -1139,6 +1139,25 @@ own plan.
---
## Post-v0 addition — `motor.digraph_simhash` (38th Tier-A primitive)
Added in behave-shell 0.1.2 (the v0 corpus above had 37 primitives). It is the
**keystroke-rhythm biometric**: a 64-bit Charikar SimHash of the
operator's per-digraph (two-key) flight times, with the median flight
time bucketed per character pair. The hash is locality-sensitive: the same
typist lands Hamming-close across sessions and decoys, so it links one
human behind multiple identities.
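As a rough illustration of the technique named above, here is a minimal sketch of a 64-bit Charikar SimHash over bucketed digraph medians. It is not the code in `decnet/util/simhash.py`; the function names, the BLAKE2b token hash, the token format, and the 20 ms bucket width are all assumptions made for the example.

```python
import hashlib
from statistics import median

BUCKET_MS = 20  # assumed bucket width; the real value is not stated here


def token_hash64(token: str) -> int:
    """Hash a token to a 64-bit integer (BLAKE2b is an assumed choice)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def simhash64_sketch(tokens) -> int:
    """Charikar SimHash: each token votes +1/-1 on every bit position."""
    votes = [0] * 64
    for token in tokens:
        h = token_hash64(token)
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    out = 0
    for bit in range(64):
        if votes[bit] > 0:  # bit is set when the positive votes win
            out |= 1 << bit
    return out


def hamming64_sketch(a: int, b: int) -> int:
    """Number of differing bits between two 64-bit values."""
    return bin((a ^ b) & ((1 << 64) - 1)).count("1")


def digraph_tokens(flights) -> list:
    """Turn {(c1, c2): [flight_ms, ...]} into 'c1c2:bucket' tokens."""
    return [
        f"{c1}{c2}:{int(median(times) // BUCKET_MS)}"
        for (c1, c2), times in sorted(flights.items())
    ]


session = {("t", "h"): [80, 90, 100], ("h", "e"): [110, 130]}
fingerprint = simhash64_sketch(digraph_tokens(session))
```

Because every token contributes one vote per bit, two sessions that share most of their bucketed digraphs tend to agree on most bits, which is what makes Hamming distance a usable similarity measure here.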
- **Extractor:** `_features/motor.py:digraph_simhash`, `ValueKind.HASH`,
conditional (gated by the `MIN_DIGRAPHS_FOR_SIMHASH` / `MIN_DIGRAPH_SAMPLES`
floors; lives in `PHASE_G_CONDITIONAL_PRIMITIVES`). Live-typed input
only: pastes and escape bursts break the digraph chain.
- **Rollup:** the identity clusterer folds the session SimHashes into a
bitwise-majority centroid written to `AttackerIdentity.kd_digraph_simhash`;
the campaign clusterer adds a Hamming-proximity edge. STIX export
carries the centroid (hex). Tier-A count is now **38**.
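The rollup described above can be sketched as follows. This is only an illustration of a bitwise-majority centroid and a Hamming-proximity check, not the clusterer's actual code: `majority_centroid64`, `within_hamming`, the tie rule (ties resolve to 0), and the threshold value are all hypothetical.

```python
def majority_centroid64(hashes) -> int:
    """Bitwise majority over 64-bit SimHashes; ties resolve to 0 (assumed)."""
    if not hashes:
        raise ValueError("need at least one session SimHash")
    n = len(hashes)
    centroid = 0
    for bit in range(64):
        ones = sum((h >> bit) & 1 for h in hashes)
        if ones * 2 > n:  # strict majority of sessions have this bit set
            centroid |= 1 << bit
    return centroid


def within_hamming(a: int, b: int, max_bits: int) -> bool:
    """True when two 64-bit hashes differ in at most max_bits positions."""
    return bin((a ^ b) & ((1 << 64) - 1)).count("1") <= max_bits


sessions = [0b1100, 0b1010, 0b1000]
centroid = majority_centroid64(sessions)
centroid_hex = f"{centroid:016x}"  # fixed-width hex, e.g. for an export field
```

A strict majority keeps the centroid stable when one noisy session disagrees with the rest, and the fixed-width hex form keeps leading zero bits visible in the exported value.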
---
**Owner:** ANTI.
**Implementation gate:** Step 0 starts after this doc is reviewed and
Phase 1 of `BEHAVE-INTEGRATION.md` lands (storage table exists).