DECNET

Author	SHA1	Message	Date
anti	92632d7afd	feat(pr2): HTTP/2+HTTP/3 fingerprint extractors — JA4H, H2 SETTINGS, JA4-QUIC	2026-05-10 00:47:19 -04:00
anti	65ddaaa681	fix(behave_shell/F.0): tighten prompt detector; log lines ending in '>' no longer vote. _detect_prompt_suffix accepted ANY line ending in $#%> as a PS1 prompt, so a single `cat /var/log/dpkg.log` (195 lines closing in `<none>`) flooded environmental.shell_type votes and flipped a plainly-bash session to fish. A prompt line now requires either a trailing space after the suffix (default PS1 shape across bash/zsh/fish/PowerShell) or a PS1-shape token (user@host, "PS " prefix, or a Windows drive-letter prefix). Regression tests pin the dpkg.log false-positive and a $-terminated prose line.	2026-05-09 02:57:40 -04:00
anti	e94ab608d9	fix(profiler/behave_shell): tolerate non-UTF-8 bytes in shard reads Real-world bug surfaced on the first live decky run: sessrec.c's json_escape (decnet/templates/_shared/sessrec/sessrec.c:111-141) only escapes bytes < 0x20 + DEL — bytes >= 0x80 pass through raw. An attacker pasting Latin-1 / GB18030 / any non-UTF-8 8-bit text yields a shard line that chokes Python's default UTF-8 text-mode read with 'utf-8 codec can't decode byte 0xac'. Three changes: 1. _events_for_sid now opens with errors='surrogateescape', preserving byte fidelity through the JSON parse. Surrogate-half chars correctly fail isascii() / isalpha() so the typed-letter histograms filter them out automatically. Tightening sessrec.c to escape >= 0x80 is filed for v0.2 — that's the proper forensic-data fix; the surrogateescape read makes the engine robust meanwhile. 2. Regression test (test_handler_tolerates_non_utf8_bytes_in_shard) builds a shard with raw 0xAC bytes inside a JSON 'data' string and asserts the handler still persists observations. 3. Collector's _emit_session now logs at WARNING (was DEBUG) when find_shard_with_sid returns None, citing the three usual causes (ARTIFACTS_ROOT perms, _SERVICE_RE whitelist, sessrec/collector race). Surfaces the silent-skip class of bug in seconds instead of hours — the first live run hid a perm mismatch (User=anti without SupplementaryGroups=decnet) for an entire session window before the symptom was traced upstream.	2026-05-08 22:52:46 -04:00
anti	5116023bf7	feat(profiler/behave_shell): stamp attacker_uuid on bus payload (Phase 5 prep) The profiler worker's per-observation publish now re-merges attacker_uuid into the bus payload alongside id/ts/v. Same shape as the existing DECNET-side deviation from BEHAVE's wire-format docstring (BEHAVE-INTEGRATION.md §339-366) — widens the deviation by one DECNET denorm field. Phase 5's per-attacker SSE route can now filter attacker.observation.* events to one attacker in O(1) without a repo round-trip per event. identity_ref stays None today (until the attribution engine ships); attacker_uuid is independent. Two test changes: * test_happy_path_persists_and_publishes asserts attacker_uuid is in every published payload. * New test_attacker_uuid_in_payload_for_filter pins the field explicitly and confirms it doesn't conflate with identity_ref.	2026-05-08 20:18:32 -04:00
anti	5ff89eefe7	feat(profiler): wire BEHAVE-SHELL extraction onto attacker.session.ended The profiler worker now consumes attacker.session.ended on the bus AND walks unprofiled session_recorded log rows on every tick. Both paths converge on a single handler that: 1. Validates required payload fields (session_id, decky_id, service, attacker_ip, shard_path). 2. Builds evidence_ref shard:{decky}/{service}/{shard_basename}#{sid} and skips when has_observations_for_evidence is True (idempotent re-runs). 3. Resolves attacker_uuid via get_attacker_uuid_by_ip; defers if the profiler tick hasn't materialised the row yet. 4. Reads the asciinema shard, slices events for the sid, calls extract_session, persists each Observation via upsert_observation (per-row; batch transaction filed as follow-up), then publishes each on the bus best-effort (fire-and-forget per DEBT-029 §6). Architecture: * Handler lives in decnet/profiler/behave_shell/_handler.py — pure function, unit-tested in isolation. * Worker.py adds _behave_pump (queue feed), _drain_behave_queue (per-tick drain), _behave_poll_tick (cursor scan over session_recorded logs), and _payload_from_log_row (Log → bus-shape payload projection). * Poll cursor uses a separate state key (attacker_worker_session_cursor) so the correlation tick's cursor doesn't conflate. * has_observations_for_evidence promoted to BaseRepository abstract. 22 new tests across handler / drain / poll layers covering happy path, all skip paths, isolation against handler exceptions, idempotency on re-run, and cursor key separation. TTP worker bus tests still green — payload field is purely additive. Closes BEHAVE-INTEGRATION.md Phase 4.	2026-05-08 18:57:45 -04:00
anti	834aa613b1	feat(pyproject): pin decnet-behave-{core,shell} >=0.1.0,<0.2 Lock the BEHAVE library versions per BEHAVE-INTEGRATION.md §Versioning. The profiler worker (Phase 4 wiring) imports `Observation`/`Window` from `decnet_behave_core.spec.envelope` and `event_topic_for`/`to_event_payload` from `decnet_behave_shell.spec.event_adapter`; without the pin a broken wheel or missing install would only show up on first publish. Four-test smoke pins the public surface: envelope construction, registry import non-empty, event-adapter topic shape, and the adapter's id/ts/v exclusion contract.	2026-05-08 18:51:30 -04:00
anti	aba1e37389	feat(profiler/behave_shell): H.5-pre extractor version marker (0.1.0-pre) decnet.profiler.behave_shell.__version__ = '0.1.0-pre'. The -pre suffix is honest: the extractor is feature-complete (37/37 Tier-A primitives emit, calibration grid honest), but the engine package — worker wiring, observations writes, AttackerDetail panel — still rides BEHAVE-INTEGRATION.md Phase 4. The actual 0.1.0 tag lands when Phase 4 lands. The marker version-tracks the engine, not the spec library (decnet-behave-shell already at 0.1.0); they version independently.	2026-05-08 18:34:23 -04:00
anti	9ebaca410a	test(profiler/behave_shell): H.2 calibration grid full sweep Run the five-class calibration grid (HUMAN / YOU-sim / LW-sim / CLAUDE-FF / CLAUDE-CL) against the 2026-05-02 shards. * Hard gate green for 27 primitives across all 5 shards. * environmental.keyboard_layout moved from hard gate to PHASE_F_CONDITIONAL_PRIMITIVES — short SSH-recon corpus maxes at ~90 typed letters per session, well below the LAYOUT_MIN_TYPED_LETTERS (200) floor. The 200-floor stays per the per-phase "v0 ships when honest" rule; longer-text corpora will surface the layout signal. * Three primitives never fire on the 2026-05-02 corpus, all already conditional and all expected: - cognitive.error_resilience.frustration_typing - environmental.locale - environmental.keyboard_layout No D / F / G threshold re-tunes needed; only the keyboard_layout binding-set move. Phase H step log appended to BEHAVE-EXTRACTOR.md with per-class observation counts.	2026-05-08 18:33:51 -04:00
anti	ac04751c18	test(profiler/behave_shell): H.1 registry-coverage test. Static assertion that every Tier-A primitive in PRIMITIVE_REGISTRY has a slot in the calibration grid (hard gate or conditional set). Excludes Tier B (8 cross-session primitives) and Tier C (toolchain.) by explicit allow-list and prefix filter. Three checks: * every Tier-A primitive is covered (forward direction) * no extractor set drifts from the registry (reverse, catches typos) * Tier-A count == 37 (design doc invariant) CI now fails before a registry addition ships without a feature function.	2026-05-08 18:30:50 -04:00
anti	f10931f24d	test(profiler/behave_shell): Phase G grid lockdown + completion log Widen calibration binding from PHASE_ABCDEF_PRIMITIVES (25) to PHASE_ABCDEFG_PRIMITIVES (28 hard). Three Phase G primitives that emit on any session-with-commands ride the hard gate: * operational.opsec_discipline * operational.cleanup_behavior * emotional_valence.stress_response The remaining five Phase G primitives ride a new PHASE_G_CONDITIONAL_PRIMITIVES because their sample-size floors make them legitimately absent from short shards: * operational.objective (≥ 3 classified commands) * operational.multi_actor_indicators (≥ 8 commands) * emotional_valence.arousal (typing bursts) * emotional_valence.valence (≥ 80 typed letters) * emotional_valence.frustration_venting (≥ 30 typed letters) Backwards-compat alias PHASE_ABCDEF_PRIMITIVES kept. Phase G completion log + checkbox flips in BEHAVE-EXTRACTOR.md. Tier-A corpus delta: all 37 Tier-A primitives now emit. Phase H (full-corpus lockdown + v0 release) is next.	2026-05-08 16:40:13 -04:00
anti	79f253c969	feat(profiler/behave_shell): G.8 emotional_valence.frustration_venting Binary read of ctx.obscenity_hits (G.0 lexical counter): * detected — obscenity_hits ≥ 1 * none — zero hits Skip below FRUST_VENT_MIN_TYPED_CHARS (30). Confidence hard-capped at 0.5: 0.40 when detected, 0.50 only when cleanly absent over ≥ 200 typed letters, 0.30 otherwise.	2026-05-08 16:37:29 -04:00
anti	40a283a7ec	feat(profiler/behave_shell): G.7 emotional_valence.stress_response Compare median post-error intra-command IATs against baseline (commands not immediately following an errored command): * ratio ≥ STRESS_EUSTRESS_RATIO_MIN (1.20) → eustress_positive * ratio ≤ 1/STRESS_DISTRESS_RATIO_MIN → distress_negative * otherwise → none Confidence hard-capped at 0.5; 0.30 below STRESS_MIN_ERRORED_WITH_IATS (2).	2026-05-08 16:36:34 -04:00
anti	d4dc7dff81	feat(profiler/behave_shell): G.6 emotional_valence.arousal. high_agitated when any of: * caps_run_max ≥ 5 * bang_run_max ≥ 3 * fastest typing burst median IAT < 0.06s with ≥ 30 IATs total. low_calm when slowest qualifying burst median IAT > 0.30s with ≥ 30 IATs. Else medium_engaged. Confidence hard-capped at 0.5; 0.30 below AROUSAL_MIN_IATS.	2026-05-08 16:35:29 -04:00
anti	3ba7e22b71	feat(profiler/behave_shell): G.5 emotional_valence.valence Soft primitive — pure ratio over G.0 lexical counters: * positive — positive_lex_hits > negative + obscenity, ≥ VALENCE_MIN_HITS * negative — (negative + obscenity) > positive, sum ≥ VALENCE_MIN_HITS * neutral — fall-through Skip below VALENCE_MIN_TYPED_CHARS (80). Confidence hard-capped at EMOTIONAL_VALENCE_CONFIDENCE_CAP (0.5) inside the feature function; 0.30 below VALENCE_FULL_CONFIDENCE_MIN (200). Cap is registry convention.	2026-05-08 16:34:27 -04:00
anti	acf8382bcf	feat(profiler/behave_shell): G.4 operational.multi_actor_indicators Compare median intra-command IATs of the two temporal halves of the session. ≥ MULTI_ACTOR_HALF_MIN_COMMANDS (4) per half required; relative delta > MULTI_ACTOR_HANDOFF_DELTA (0.5) → handoff_detected. team_coordinated is Tier B (cross-session); never emitted from a single session. Confidence 0.55 with both halves ≥ 8 commands; 0.40 otherwise.	2026-05-08 16:33:15 -04:00
anti	17b53dad4d	feat(profiler/behave_shell): G.3 operational.cleanup_behavior * thorough — ≥ CLEANUP_THOROUGH_MIN_DISTINCT (3) distinct cleanup-family hashes in tail-CLEANUP_TAIL_K (5). * partial — 1-2 distinct. * none — zero hits. Adjacent to E.4's binary exit_behavior=cleanup; G.3 graduates the intensity. Confidence 0.55 above 8 commands; 0.35 below.	2026-05-08 16:32:08 -04:00
anti	09f598ce47	feat(profiler/behave_shell): G.2 operational.opsec_discipline * careful — operator hits OPSEC_HISTORY_TOKENS AND tail-K commands include _CLEANUP_TOKEN_HASHES (re-imported from temporal.py). * learning — history hit without cleanup-tail follow-through. * careless — no history-clearing vocabulary at all. Confidence 0.45 (small lexicon, soft); 0.30 below MIN_COMMANDS_FOR_FULL_CONFIDENCE.	2026-05-08 16:29:48 -04:00
anti	c11f3605be	feat(profiler/behave_shell): G.1 operational.objective Per-command intent classification via the G.0 lexicon (`destructive > persistence > exfil > lateral > recon` precedence); majority vote across classified commands. Skip emission below INTENT_MIN_COMMANDS=3 classified hits. Confidence 0.40 below INTENT_FULL_CONFIDENCE_MIN=6, 0.60 above.	2026-05-08 16:28:45 -04:00
anti	289a64014c	feat(profiler/behave_shell): G.0 intent lexicon + lexical counter pass Phase G shared infrastructure (no primitive yet emitted): * New `_intent.py` — five precomputed first-token-hash sets (recon / exfil / persistence / lateral / destructive) with documented precedence, plus opsec-history and three lexeme sets (positive / negative / obscenity) for the typed-text counter pass. Stop words that collide with registry value vocabulary (`no`, `hell`, `ok`) are deliberately excluded — the PII regression test catches such collisions. * `_typed_char_histograms()` extended with five integer counters populated in the same single-pass walk: `obscenity_hits`, `positive_lex_hits`, `negative_lex_hits`, `caps_run_max`, `bang_run_max`. Longest-suffix match against bounded lexicon (`LEXEME_MAX_LEN`); paste-class events excluded. * `SessionContext` widened by the same five fields. Drives G.5 (valence), G.6 (arousal), G.8 (frustration_venting) without retaining raw operator text. * Bump twisted >= 26.4.0rc2 to clear CVE-2026-42304 (pre-existing, caught by pre-commit pip-audit). Adjust ftp template type-ignore code from attr-defined to misc to match the new Twisted typing. PII discipline: same shape as F.4 — fixed-vocabulary integer counters on ctx, never on observations.	2026-05-08 16:27:25 -04:00
anti	a25f4a890d	test(profiler/behave_shell): Phase F + E.4 grid lockdown + completion log Widens the binding calibration set from PHASE_ABCDE_PRIMITIVES (20) to PHASE_ABCDEF_PRIMITIVES (25). The five new entries: * environmental.shell_type (per-shard hard gate) * environmental.terminal_multiplexer (per-shard hard gate) * environmental.keyboard_layout (per-shard hard gate; PII boundary lifted by ANTI; emits all 4 registry values) * environmental.numpad_usage (per-shard hard gate) * temporal.lifecycle_markers.exit_behavior (resolution of the E.4 hold; uses Command.followed_by_prompt from F.0) environmental.locale joins a new PHASE_F_CONDITIONAL_PRIMITIVES set (only fires on shards with an env / locale dump in the output). Phase F completion log appended to BEHAVE-EXTRACTOR.md. The original F.0 row hinted at D.0 subsumption; reversed in the log — D.0 is enriched, not subsumed (regex catches errors when PS1 is suppressed). Tier-A corpus delta: 25 of 37 primitives now emit. Phase G is next.	2026-05-04 00:44:22 -04:00
anti	51ecd0924e	feat(profiler/behave_shell): emit temporal.lifecycle_markers.exit_behavior Resolves the E.4 hold from Phase E. F.0's Command.followed_by_prompt gives us the exit-code proxy (prompt-after-last-command) we couldn't get in Phase E. Logic: last command without trailing prompt → abrupt; first_token_hash in {exit, logout, quit, logoff} → graceful; any of the last K=3 commands' first_token_hash in {history, unset, rm, shred, clear, kill} → cleanup; else → graceful (clean Ctrl-D / window close).	2026-05-04 00:42:25 -04:00
anti	c8166a6071	feat(profiler/behave_shell): emit environmental.numpad_usage Sliding-window scan over single-char digit input events. A run of NUMPAD_RUN_MIN (4) consecutive digit events whose pairwise IATs are all ≤ NUMPAD_FAST_IAT_S (50ms) → detected. Otherwise → not_detected. Skips below NUMPAD_MIN_TYPED_CHARS (50) typed chars. Confidence cap 0.50 per the registry's weak-signal flag.	2026-05-04 00:40:42 -04:00
anti	cd7c7ea5a2	feat(profiler/behave_shell): emit environmental.keyboard_layout ANTI authorised dropping the PII boundary for this primitive. ctx gains typed_unigram_counts / typed_bigram_counts / typed_letter_count populated during the existing single-pass input walk (paste-class events excluded). Two-axis classifier: * layout-artefact unigrams take priority — q rate above floor with low English saturation → azerty; z above floor with y below → qwertz * fallback to English-bigram saturation: ≥ floor → qwerty, else other Sample-size floor 200 typed letters; bigram histogram capped at top-64 to bound memory. Confidence cap stays moderate (0.40-0.55) — heuristic discriminator.	2026-05-04 00:38:24 -04:00
anti	b7ff5d2cc1	feat(profiler/behave_shell): emit environmental.locale Searches ANSI-stripped output for LANG / LC_ALL / LC_CTYPE envvar substrings emitted by env / locale / printenv. Highest-priority key wins (LC_ALL > LANG > LC_CTYPE); POSIX value normalised to BCP-47: en_US.UTF-8 → en-US, pt_BR.UTF-8 → pt-BR, C/POSIX → und. Free-string registry value emitted directly. PII discipline: only the parsed locale value enters observations; surrounding output is read once for matching and dropped.	2026-05-04 00:35:31 -04:00
anti	4257f7b6e2	feat(profiler/behave_shell): emit environmental.terminal_multiplexer Scans RAW output (multiplexer escapes are themselves ANSI; never strip first) for tmux markers (DCS passthrough, focus-reporting, window-title with tmux marker) and screen markers (DCS, screen-OSC). Detected → tmux/screen at 0.85; otherwise → none at 0.55. Skips emission entirely when no commands — silence on a pure-echo or empty session, per the smoke gates. When both detected (nested mux), prefer tmux.	2026-05-04 00:33:44 -04:00
anti	07ff5ff0c9	feat(profiler/behave_shell): emit environmental.shell_type Per-prompt classification mode over ctx.prompt_lines. $/# → bash; % → zsh; > with 'PS ' prefix → powershell; > with 'C:\' substring → cmd.exe; > otherwise → fish. New _features/environmental.py module opens Phase F.	2026-05-04 00:30:24 -04:00
anti	1ff02f0c77	feat(profiler/behave_shell): F.0 prompt-line detector Adds PromptLine dataclass + extract_prompt_lines() helper. PromptLine carries ts, suffix_char ($/#/%/>), raw_line (ANSI-stripped, capped), is_root flag. Populated during the existing single-pass output-window walk; SessionContext gains prompt_lines, Command gains followed_by_prompt. PII trade-off (ANTI-authorised at Phase F): PS1 text retained on ctx so F.1 / F.3 / E.4 can read it. Capped at PROMPT_LINE_MAX_CHARS=256. Observations still only carry derived primitive values. D.0's regex error helpers stay alongside (NOT subsumed) — they fire even when PS1 echo is suppressed. F.0 enriches D.0 rather than replacing it.	2026-05-04 00:29:08 -04:00
anti	96a4039366	test(profiler/behave_shell): Phase E grid lockdown + completion log (E.4 held) Widens the binding calibration set from PHASE_ABCD_PRIMITIVES (17) to PHASE_ABCDE_PRIMITIVES (20). The three shipped Phase E primitives (session_duration, escalation_pattern, landing_ritual) join the per-shard hard gate. E.4 (temporal.lifecycle_markers.exit_behavior) is held at ANTI's direction pending Phase F.0's prompt parser — abrupt-vs-cleanup needs exit-code visibility to be honest, and first-token membership alone over-fires on benign rm / clear mid-session. E.4 picks up at the tail of Phase F. Phase E completion log appended to BEHAVE-EXTRACTOR.md; E.1-E.3 checkboxes flipped, E.4 left unchecked with a held note.	2026-05-04 00:16:33 -04:00
anti	1341df2705	feat(profiler/behave_shell): emit temporal.lifecycle_markers.landing_ritual Inspect the first N commands; if at least K of their first_token_hashes match the recon-survey vocabulary (uname/id/whoami/pwd/hostname/w/who), emit present, else absent. Hashes precomputed at module load; PII-safe. v0.1 N=5, K=2.	2026-05-04 00:15:05 -04:00
anti	d40495d71b	feat(profiler/behave_shell): emit temporal.escalation_pattern Bin commands into non-overlapping windows of width max(ESCALATION_WINDOW_MIN_S, duration_s / ESCALATION_WINDOW_TARGET). CV of per-window counts + zero-window fraction classify bursty / sustained / erratic. v0.1; corpus re-tune deferred.	2026-05-04 00:13:45 -04:00
anti	627fa59c15	feat(profiler/behave_shell): emit temporal.session_duration Bucket ctx.duration_s against SESSION_DURATION_SHORT_MAX (60s) / MEDIUM_MAX (600s) / LONG_MAX (3600s); else marathon. Direct measurement, confidence 0.85. Skip emission only when no commands and zero duration. New _features/temporal.py module opens Phase E.	2026-05-04 00:10:57 -04:00
anti	46775fc0e5	test(profiler/behave_shell): Phase D calibration-grid lockdown + completion log Widens the binding calibration set from PHASE_ABC_PRIMITIVES (13) to PHASE_ABCD_PRIMITIVES (17). The four unconditional Phase D primitives (cognitive_load, exploration_style, planning_depth, tool_vocabulary) join the per-shard hard gate. The three error_resilience.* primitives are conditional on at least one errored command in the shard and tracked in PHASE_D_CONDITIONAL_PRIMITIVES — excluded from the per-shard required-emission set, included in the cross-class discrimination check. cognitive_load empirical re-tune deferred to the next BEHAVE_CALIBRATION_DIR run; v0.1 thresholds ship. Phase D completion log appended to BEHAVE-EXTRACTOR.md; Phase D checkboxes flipped to [x].	2026-05-04 00:03:46 -04:00
anti	0fba6b6113	feat(profiler/behave_shell): emit cognitive.error_resilience.fallback_to_man For each errored command, check whether the next command's first_token_hash is in {man, help, info} (precomputed at module load). At least one match → present, else absent. The --help / -h flag forms aren't first tokens; v0.2 will reconsider once arg-token hashing is justified by corpus.	2026-05-04 00:01:45 -04:00
anti	8183218d29	feat(profiler/behave_shell): emit cognitive.error_resilience.frustration_typing Compares median within-command IAT for commands following an errored command vs commands following a successful one. Relative absolute delta buckets to low / moderate / high. Skips when either group is empty (no errors, or no clean baseline). v0.1; D.8 re-tunes.	2026-05-04 00:00:36 -04:00
anti	b704352783	feat(profiler/behave_shell): emit cognitive.error_resilience.retry_tactic Modal response across Command.errored=True commands: * same first_token_hash on next command → rerun * different first_token_hash → switch * no next command → abort Tiebreak in registry order. The fourth registry value 'modify' requires within-command arg diffing (PII boundary); deferred to v0.2.	2026-05-03 23:58:58 -04:00
anti	f286c84d95	feat(profiler/behave_shell): emit cognitive.tool_vocabulary Absolute distinct first_token_hash count, bucketed against TOOL_VOCAB_NARROW_MAX / TOOL_VOCAB_BROAD_MIN. v0.1; D.8 re-tunes.	2026-05-03 23:56:22 -04:00
anti	6c2e4ada83	feat(profiler/behave_shell): emit cognitive.planning_depth Distribution of inter-command IATs bucketed against IKI_THINK_MAX_S (deep) and INTER_CMD_INSTANT_MAX (reactive); fall-through is shallow. v0.1 thresholds; D.8 re-tunes.	2026-05-03 23:55:16 -04:00
anti	2254651270	feat(profiler/behave_shell): emit cognitive.exploration_style Two-axis classification over the first_token_hash sequence: repetition_rate (drilling) vs backtrack_rate (jumping among prior tools). chaotic/targeted/methodical buckets. v0.1 thresholds; D.8 re-tunes.	2026-05-03 23:54:03 -04:00
anti	f948e10830	feat(profiler/behave_shell): emit cognitive.cognitive_load Composite over three [0, 1]-clipped sub-signals (chunking variance, error rate from D.0's Command.errored, pace variability), mean-aggregated and bucketed against COGNITIVE_LOAD_LOW_MAX / COGNITIVE_LOAD_MEDIUM_MAX. Components missing data drop out of the mean rather than zeroing it. v0.1 thresholds; D.8 re-tunes once D.2-D.7 are stable. Confidence held at 0.60 (composite over soft sub-signals) and halved below the 5-command sample-size floor.	2026-05-03 23:52:29 -04:00
anti	601986bd6d	feat(profiler/behave_shell): output error-signal helper for Phase D Lifts the error-signal slice of F.0 forward as a D.0 prelude. ANSI strip + canonical bash/sh error fingerprints classify each command's post-execution output window; Command gains errored / output_bytes fields. PII discipline preserved — only a bool and an int leave the helper, the stripped output text is dropped on return. Drives D.1 (cognitive_load error_rate term) and D.5–D.7 (error_resilience family). Phase F.0 will subsume this with PS1 + exit-code parsing.	2026-05-03 23:46:31 -04:00
anti	bc62e42ce1	feat(profiler/behave_shell): emit motor.shell_mastery.pipe_chaining_depth	2026-05-03 23:34:54 -04:00
anti	4fc980e968	feat(profiler/behave_shell): emit motor.shell_mastery.shortcut_usage	2026-05-03 23:33:07 -04:00
anti	a077cf67c8	feat(profiler/behave_shell): emit motor.shell_mastery.tab_completion	2026-05-03 23:31:20 -04:00
anti	8161c67ec5	feat(profiler/behave_shell): emit motor.command_chunking BEHAVE-EXTRACTOR.md Phase B Step B.4. First implementation — prototype doesn't ship this primitive. * SessionContext gains intra_command_iats: per-command tuple of IATs between consecutive input events whose timestamps fall inside [cmd.start_ts, cmd.end_ts). Excludes the terminator IAT. Built by _per_command_iats. * _features/motor.py:command_chunking(ctx) emits one Observation in {fluent, fragmented, single_command}. - 0 commands → skip emit - 1 command → single_command (registry-allowed point) - ≥2 commands → median CV across per-command typed-IATs; < CMD_CHUNKING_FLUENT_CV_MAX (0.50) → fluent, else fragmented - paste-only sessions (no command has ≥3 typed IATs) → skip emit (no honest within-command rhythm to measure) Confidence 0.80 / 0.65 / 0.60. * Calibration grid widened to include motor.command_chunking; green across all five shards. Phase B primitive set complete. Tests: no commands → skip, 1 command → single_command, uniform typing → fluent, alternating fast/slow → fragmented, paste-only multi-command → skip emit.	2026-05-03 21:29:31 -04:00
anti	d04f91cd8c	feat(profiler/behave_shell): emit motor.error_correction BEHAVE-EXTRACTOR.md Phase B Step B.3. Replaces the prototype's two-line "0 vs >0 backspaces" placeholder with a backspace-timing classifier that honours the registry's full vocabulary. * SessionContext gains backspace_count, backspace_iats (IAT from each backspace back to the preceding non-backspace input event), and kill_line_count (^U / ^W). Built by _scan_correction_signals, which retains only counts and timing aggregates — no character data leaves the helper, in line with the BEHAVE PII discipline. * _features/motor.py:error_correction(ctx) emits one Observation in {immediate, deferred, absent, route_around}. - 0 backspaces + ≥1 ^U/^W → route_around (rewrite, not correct) - 0 backspaces + 0 kill-lines → absent - backspaces with median IAT ≤ 500 ms → immediate - slower → deferred Confidence 0.65 / 0.65 / 0.55 / 0.55. * < 3 inputs → skip emit. * Calibration grid widened to include motor.error_correction; green across all five shards. Tests cover all four buckets, the < 3 inputs skip, and the PII regression (raw command body never appears in the serialised observation).	2026-05-03 21:27:46 -04:00
anti	0737fcfe93	feat(profiler/behave_shell): emit motor.motor_stability BEHAVE-EXTRACTOR.md Phase B Step B.2. First principled implementation — the prototype doesn't ship this primitive at all. * _features/motor.py:motor_stability(ctx) emits one Observation in {steady, variable, tremor}. Reuses ctx.typing_bursts from B.1. * Tremor proxy: fraction of within-burst IATs below TREMOR_FAST_FLOOR_S (30 ms — humans can't sustain sub-50 ms IATs). ≥ TREMOR_RATE_MIN (10%) sub-floor → tremor (double-press / motor twitch / stuck-key). * Otherwise median burst CV decides: < CV_STEADY_MAX → steady, else → variable. Confidence 0.70 / 0.60 / 0.65. * No typing bursts or fewer than 5 within-burst IATs → skip emit. * Calibration grid widened to include motor.motor_stability; green across all five shards. Tests cover all three buckets + skip paths.	2026-05-03 21:25:54 -04:00
anti	d90c8b70ce	feat(profiler/behave_shell): emit motor.keystroke_cadence BEHAVE-EXTRACTOR.md Phase B Step B.1. * SessionContext gains typing_bursts: tuple[tuple[float, ...], ...] built by _split_typing_bursts(iats) — splits at gaps > IKI_THINK_MAX_S (1.5s) and drops bursts of fewer than 3 IATs. Mirrors prototype's _split_into_bursts at BEHAVE/prototype_extractors/shell/extract.py:275. * _features/motor.py:keystroke_cadence(ctx) emits one Observation in {steady, bursty, hunt_and_peck, machine}. Median CV across typing bursts; mean IKI < IKI_MACHINE_MAX_S paired with CV < CV_MACHINE_MAX → machine. Confidence 0.85/0.70/0.65/0.60 per the prototype's calibration history. * < MIN_INPUTS_FOR_CADENCE inputs or zero typing bursts → skip emission. v0.1 emits only the burst-CV variant; the prototype's NAIVE session-CV variant is parked for v0.2. * Calibration grid widened (PHASE_A_PRIMITIVES → PHASE_AB_PRIMITIVES) to include motor.keystroke_cadence. Grid green across all five shards. Tests: too-few-inputs → no emit, all-think-pauses → no burst → no emit, uniform IATs → steady, sub-5ms → machine, mixed-pace → bursty, extreme bimodal → hunt_and_peck.	2026-05-03 21:24:13 -04:00
anti	640294f3dc	test(profiler/behave_shell): five-class calibration grid lockdown BEHAVE-EXTRACTOR.md Phase A Step 9 — the gate. Runs the pure engine against each of the five 2026-05-02 calibration shards and pins the contract that all subsequent Phase B-G PRs must keep green: every Phase A primitive (motor.input_modality, motor.paste_burst_rate, cognitive.inter_command_latency_class, cognitive.command_branch_diversity, cognitive.feedback_loop_engagement, cognitive.inter_command_consistency) fires at least once per shard. * tests/profiler/behave_shell/test_calibration_grid.py parametrized over (shard_file, class_label) for HUMAN / YOU-sim / LW-sim / CLAUDE-FF / CLAUDE-CL. Skips entirely when BEHAVE_CALIBRATION_DIR is unset (CI provides the path; local dev doesn't have to). * Plus a discrimination-smoke check: at least one primitive produces different majority values across present classes — catches the "constant-output regression" failure mode where the engine quietly degenerates to a stub. Calibration tweak: BRANCH_DIVERSITY_LINEAR_MIN dropped from 0.80 to 0.70 to align with the prototype's empirical anchors (CLAUDE-CL ≈ 0.55-0.60 adaptive; YOU-sim / CLAUDE-FF scripted recon ≈ 0.75+ linear). Test for the middle band re-pinned at the new boundary. Per-class value pinning (e.g. HUMAN must emit inter_command_consistency=bimodal) is intentionally NOT a hard gate yet — v0.1 thresholds put real human sessions in "variable", and true bimodal detection (Hartigan dip / two-peak) is registry-flagged for v0.2. Tighter pinning lands as the corpus grows.	2026-05-03 08:00:50 -04:00
anti	842b7de950	feat(profiler/behave_shell): emit cognitive.inter_command_consistency BEHAVE-EXTRACTOR.md Phase A Step 8. Dispersion / bimodality of inter-command pauses. HUMAN-bimodal vs LLM-metronomic. * _features/cognitive.py:inter_command_consistency(ctx) emits one Observation in {metronomic, variable, bimodal}. * CV = stdev / mean of ctx.inter_cmd_iats. CV < 0.40 → metronomic (LLM-pure; corpus anchor 0.24); CV ≥ 1.50 → bimodal heuristic (LLM-assisted human; v0.1 placeholder, true bimodal via Hartigan dip is registry-flagged for v0.2); else → variable (human; corpus anchor 0.94). * < 2 IATs or zero mean → skip emission. < 5 commands halves confidence (0.40 vs 0.75) per sample-size honesty. Tests: too-few IATs → no emission, uniform → metronomic, human-like dispersion → variable, extreme bursts+gaps → bimodal, low-sample-count → reduced confidence. Step 8 closes the six-primitive calibration floor for Phase A. Step 9 (calibration grid lockdown) is the gate that pins it.	2026-05-03 07:56:49 -04:00
anti	2f8c107e70	feat(profiler/behave_shell): emit cognitive.feedback_loop_engagement BEHAVE-EXTRACTOR.md Phase A Step 7. The orthogonal axis — does the operator's pause-after-command correlate with bytes of output they just saw? Splits HUMAN/CLAUDE-CL (closed_loop) from LW-sim/CLAUDE-FF (fire_and_forget); cuts ACROSS the LLM/human axis. * _features/cognitive.py:feedback_loop_engagement(ctx) emits one Observation in {closed_loop, fire_and_forget, unknown}. * Pearson correlation between ctx.output_per_cmd[i] and ctx.inter_cmd_iats[i] (paired by construction in Step 4); via statistics.correlation with constant-series fallback to "unknown". * r > FEEDBACK_CORRELATION_MIN (0.30) → closed_loop; otherwise (zero, negative, or undefined) → fire_and_forget. * First primitive that depends on output events: zero output events in the shard or fewer than FEEDBACK_MIN_PAIRS (5) pairs → emit "unknown" at confidence 1.0 (the absence-of-data is itself a high-confidence answer). Zero-command session skips entirely. Tests: no-output → unknown, few-pairs → unknown, strong positive r → closed_loop, constant pace → fire_and_forget/unknown, negative r → fire_and_forget.	2026-05-03 07:55:38 -04:00
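
Commit e94ab608d9 above describes reading shard lines with errors='surrogateescape' so that raw bytes >= 0x80 survive the JSON parse and are later filtered out by isascii() / isalpha(). The following is a minimal sketch of that idea using only the standard library. The names `read_shard_lines`, `typed_letters`, and `demo` are hypothetical, and the shard layout (one JSON object per line with a `data` string) is an assumption taken from the commit text, not the real `_events_for_sid`.

```python
import json
import os
import tempfile


def read_shard_lines(path):
    """Yield parsed JSON objects from a line-delimited shard.

    errors='surrogateescape' maps each undecodable byte 0xNN to the lone
    surrogate U+DCNN instead of raising UnicodeDecodeError.
    """
    with open(path, "r", encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def typed_letters(text):
    """Keep only ASCII letters; surrogate-escaped bytes fail both checks."""
    return [ch for ch in text if ch.isascii() and ch.isalpha()]


def demo():
    """Write a shard line with two raw 0xAC bytes inside a JSON string, then read it back."""
    raw = b'{"sid": "s1", "data": "ab\xac\xaccd"}\n'
    with tempfile.NamedTemporaryFile("wb", suffix=".jsonl", delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    try:
        return list(read_shard_lines(path))
    finally:
        os.unlink(path)


if __name__ == "__main__":
    events = demo()
    data = events[0]["data"]
    print(typed_letters(data))
    # The original bytes can be recovered losslessly for forensic use.
    print(data.encode("utf-8", errors="surrogateescape"))
```

The read stays lossless: encoding the parsed string back with the same error handler reproduces the original bytes, which is why this works as a stopgap until the recorder escapes high bytes itself.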


64 Commits
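
Commit b7ff5d2cc1 in the log above describes picking the highest-priority locale variable (LC_ALL > LANG > LC_CTYPE) from command output and normalising the POSIX value to BCP-47. One way this could look, as a sketch only: `posix_to_bcp47`, `detect_locale`, and the regular expression are hypothetical and not taken from the repository, and the input is assumed to be already ANSI-stripped.

```python
import re

# Priority order quoted in the commit message.
_PRIORITY = ("LC_ALL", "LANG", "LC_CTYPE")

# Hypothetical pattern: KEY=value lines as printed by env / locale / printenv,
# with optional double quotes around the value.
_ENV_RE = re.compile(r'^(LC_ALL|LANG|LC_CTYPE)="?([^\s"]*)"?\s*$', re.MULTILINE)


def posix_to_bcp47(value):
    """Drop the codeset and modifier, then swap '_' for '-'; C/POSIX map to 'und'."""
    base = value.split(".", 1)[0].split("@", 1)[0]
    if base in ("C", "POSIX") or not base:
        return "und"
    return base.replace("_", "-")


def detect_locale(output):
    """Return the normalised locale from the highest-priority non-empty key, or None."""
    found = {}
    for key, value in _ENV_RE.findall(output):
        if value and key not in found:
            found[key] = value
    for key in _PRIORITY:
        if key in found:
            return posix_to_bcp47(found[key])
    return None


if __name__ == "__main__":
    sample = 'LANG=en_US.UTF-8\nLC_CTYPE="pt_BR.UTF-8"\nLC_ALL=\n'
    print(detect_locale(sample))
```

Only the parsed value leaves `detect_locale`, which matches the PII note in the commit: the surrounding output is scanned once and not retained.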