test(profiler/behave_shell): H.2 calibration grid full sweep
Run the five-class calibration grid (HUMAN / YOU-sim / LW-sim / CLAUDE-FF / CLAUDE-CL) against the 2026-05-02 shards. * Hard gate green for 27 primitives across all 5 shards. * environmental.keyboard_layout moved from hard gate to PHASE_F_CONDITIONAL_PRIMITIVES — short SSH-recon corpus maxes at ~90 typed letters per session, well below the LAYOUT_MIN_TYPED_LETTERS (200) floor. The 200-floor stays per the per-phase "v0 ships when honest" rule; longer-text corpora will surface the layout signal. * Three primitives never fire on the 2026-05-02 corpus, all already conditional and all expected: - cognitive.error_resilience.frustration_typing - environmental.locale - environmental.keyboard_layout No D / F / G threshold re-tunes needed; only the keyboard_layout binding-set move. Phase H step log appended to BEHAVE-EXTRACTOR.md with per-class observation counts.
This commit is contained in:
@@ -59,10 +59,10 @@ PHASE_ABCDEFG_PRIMITIVES: frozenset[str] = frozenset({
|
||||
"temporal.escalation_pattern",
|
||||
"temporal.lifecycle_markers.landing_ritual",
|
||||
# Phase F — environmental.* output-stream block + carry-over E.4
|
||||
# (locale is conditional, see PHASE_F_CONDITIONAL_PRIMITIVES)
|
||||
# (locale and keyboard_layout are conditional — see
|
||||
# PHASE_F_CONDITIONAL_PRIMITIVES)
|
||||
"environmental.shell_type",
|
||||
"environmental.terminal_multiplexer",
|
||||
"environmental.keyboard_layout",
|
||||
"environmental.numpad_usage",
|
||||
"temporal.lifecycle_markers.exit_behavior",
|
||||
# Phase G — operational.* + emotional_valence.* (hard subset)
|
||||
@@ -85,12 +85,18 @@ PHASE_D_CONDITIONAL_PRIMITIVES: frozenset[str] = frozenset({
|
||||
"cognitive.error_resilience.fallback_to_man",
|
||||
})
|
||||
|
||||
# Phase F primitives conditional on shard content. ``environmental.locale``
|
||||
# fires only when the shard's output contains an env / locale dump
|
||||
# (LANG=, LC_ALL=, LC_CTYPE=). It's tracked here, not in the per-shard
|
||||
# hard gate.
|
||||
# Phase F primitives conditional on shard content.
|
||||
# * ``environmental.locale`` fires only when the shard's output contains
|
||||
# an env / locale dump (LANG=, LC_ALL=, LC_CTYPE=).
|
||||
# * ``environmental.keyboard_layout`` requires LAYOUT_MIN_TYPED_LETTERS
|
||||
# (200) typed letters per session — short SSH-recon shards (the
|
||||
# 2026-05-02 calibration corpus) max out around 90 typed letters
|
||||
# per session because most input is pasted rather than typed.
|
||||
# v0 keeps the 200-floor honesty rather than tuning to pass; longer-
|
||||
# text corpora will surface it.
|
||||
PHASE_F_CONDITIONAL_PRIMITIVES: frozenset[str] = frozenset({
|
||||
"environmental.locale",
|
||||
"environmental.keyboard_layout",
|
||||
})
|
||||
|
||||
# Phase G primitives that ride sample-size floors and may legitimately
|
||||
|
||||
Reference in New Issue
Block a user