diff --git a/development/BEHAVE-EXTRACTOR.md b/development/BEHAVE-EXTRACTOR.md
index 8fa2afde..91d8ef03 100644
--- a/development/BEHAVE-EXTRACTOR.md
+++ b/development/BEHAVE-EXTRACTOR.md
@@ -1058,6 +1058,85 @@ primitives, and the `team_coordinated` value of
 `operational.multi_actor_indicators`) remains the attribution
 engine's job — never the extractor's.
 
+## Phase H step log (extractor-slice)
+
+Phase H ships in two slices: an **extractor slice** (H.1, H.2, and a
+`0.1.0-pre` version marker) closing the engine itself, and an
+**integration slice** (H.3 live smoke + H.4 worker wiring + H.5
+proper v0 tag) that rides `BEHAVE-INTEGRATION.md` Phase 4. This log
+covers the extractor slice.
+
+### H.1 — Registry-coverage test
+
+`tests/profiler/behave_shell/test_registry_coverage.py` walks
+`PRIMITIVE_REGISTRY` and asserts every Tier-A primitive has a slot
+in the calibration grid (hard or conditional). Tier B
+(8 cross-session primitives) is excluded by an explicit allow-list;
+Tier C (`toolchain.*`) is excluded by prefix. Three checks ride:
+forward (every Tier-A covered), reverse (no extractor-set drift from
+the registry), and a `len(tier_a) == 37` invariant. CI now fails
+before a registry addition can ship without a feature function.
+
+### H.2 — Calibration grid full sweep (2026-05-02 corpus)
+
+Shards at `/home/anti/Tools/BEHAVE/prototype_extractors/shell/` —
+five classes, 15 sessions total. Per-class observation counts:
+
+| Class | Sessions | Observations | Distinct primitives |
+|---|---|---|---|
+| HUMAN | 1 | 34 | 34 |
+| YOU-sim | 2 | 59 | 34 |
+| LW-sim | 5 | 136 | 34 |
+| CLAUDE-FF | 3 | 84 | 34 |
+| CLAUDE-CL | 4 | 111 | 34 |
+
+**One real-shard regression surfaced and fixed:**
+`environmental.keyboard_layout` was on the per-shard hard gate but
+the calibration corpus maxes at ~90 typed letters per session — well
+below `LAYOUT_MIN_TYPED_LETTERS=200`. Most input on these
+SSH-recon shards is *pasted*, not typed. The honest fix per the
+per-phase rule "v0 ships when honest, not when convenient" is to
+move `environmental.keyboard_layout` from `PHASE_ABCDEFG_PRIMITIVES`
+to `PHASE_F_CONDITIONAL_PRIMITIVES`, alongside `environmental.locale`.
+The 200-letter floor stays — the keyboard-layout signal genuinely
+needs richer typed text than this corpus has, and tuning the
+threshold to pass would corrupt the signal.
+
+**Three Tier-A primitives never fire across the 2026-05-02 corpus,
+all conditional and all expected:**
+
+* `cognitive.error_resilience.frustration_typing` — needs ≥ 2
+  errored commands plus successful baseline (Phase D conditional).
+* `environmental.locale` — needs an `env` / `locale` / `printenv`
+  dump in the output stream (Phase F conditional).
+* `environmental.keyboard_layout` — needs ≥ 200 typed letters
+  per session (Phase F conditional after H.2).
+
+The hard gate (28 primitives) fires on every shard with commands;
+the discrimination smoke-check passes too. No threshold re-tunes
+needed for D / F / G — the corpus surfaced the keyboard_layout shape
+mismatch only.
+
+**Calibration-grid binding update.** `PHASE_ABCDEFG_PRIMITIVES`
+shrinks 28 → 27 (keyboard_layout moves to conditional);
+`PHASE_F_CONDITIONAL_PRIMITIVES` grows 1 → 2.
+
+### H.5-pre — Extractor version marker
+
+`decnet/profiler/behave_shell/__init__.py` exports
+`__version__ = "0.1.0-pre"`. The `-pre` suffix is honest: the
+**extractor** is feature-complete (37/37 Tier-A primitives emit, all
+green tests, calibration grid honest), but the *engine package* —
+worker wiring, observations-table writes, AttackerDetail panel —
+still rides the integration track. The actual `0.1.0` tag bumps the
+suffix off only after `BEHAVE-INTEGRATION.md` Phase 4 lands.
+
+**Tier-A corpus delta (final extractor-slice):** all 37 of 37
+primitives have a feature function; **27** ride the per-shard hard
+gate (28 pre-H.2; keyboard_layout moved out); **10** ride conditional
+sets. Phase H integration slice (H.3 + H.4 + proper H.5 tag) is its
+own plan.
+
 ---
 
 **Owner:** ANTI.
diff --git a/tests/profiler/behave_shell/test_calibration_grid.py b/tests/profiler/behave_shell/test_calibration_grid.py
index bf965451..f435255b 100644
--- a/tests/profiler/behave_shell/test_calibration_grid.py
+++ b/tests/profiler/behave_shell/test_calibration_grid.py
@@ -59,10 +59,10 @@ PHASE_ABCDEFG_PRIMITIVES: frozenset[str] = frozenset({
     "temporal.escalation_pattern",
     "temporal.lifecycle_markers.landing_ritual",
     # Phase F — environmental.* output-stream block + carry-over E.4
-    # (locale is conditional, see PHASE_F_CONDITIONAL_PRIMITIVES)
+    # (locale and keyboard_layout are conditional — see
+    # PHASE_F_CONDITIONAL_PRIMITIVES)
     "environmental.shell_type",
     "environmental.terminal_multiplexer",
-    "environmental.keyboard_layout",
     "environmental.numpad_usage",
     "temporal.lifecycle_markers.exit_behavior",
     # Phase G — operational.* + emotional_valence.* (hard subset)
@@ -85,12 +85,18 @@ PHASE_D_CONDITIONAL_PRIMITIVES: frozenset[str] = frozenset({
     "cognitive.error_resilience.fallback_to_man",
 })
 
-# Phase F primitives conditional on shard content. ``environmental.locale``
-# fires only when the shard's output contains an env / locale dump
-# (LANG=, LC_ALL=, LC_CTYPE=). It's tracked here, not in the per-shard
-# hard gate.
+# Phase F primitives conditional on shard content.
+# * ``environmental.locale`` fires only when the shard's output contains
+#   an env / locale dump (LANG=, LC_ALL=, LC_CTYPE=).
+# * ``environmental.keyboard_layout`` requires LAYOUT_MIN_TYPED_LETTERS
+#   (200) typed letters per session — short SSH-recon shards (the
+#   2026-05-02 calibration corpus) max out around 90 typed letters
+#   per session because most input is pasted rather than typed.
+#   v0 keeps the 200-floor honesty rather than tuning to pass; longer-
+#   text corpora will surface it.
 PHASE_F_CONDITIONAL_PRIMITIVES: frozenset[str] = frozenset({
     "environmental.locale",
+    "environmental.keyboard_layout",
 })
 
 # Phase G primitives that ride sample-size floors and may legitimately