docs: reframe BEHAVE-SHELL as a spec, not a DECNET component — add stylometry/lexicometry scope, BEHAVE-TEXT/EYENET cross-reference

2026-05-10 04:31:36 -04:00
parent 58915d8115
commit 5a39211645
1 changed file with 254 additions and 377 deletions
--- a/BEHAVE-SHELL.md
+++ b/BEHAVE-SHELL.md
@@ -1,95 +1,106 @@
 # BEHAVE-SHELL
-BEHAVE-SHELL is DECNET's behavioural biometrics engine for interactive shell
+BEHAVE-SHELL is a **behavioural biometrics specification** for interactive
-sessions.  It transforms raw PTY recordings into 37 attribution primitives
+shell sessions.  It defines a set of attribution primitives — observable,
-that fingerprint *how* an operator works — their motor patterns, cognitive
+computable signals — that characterise *how* an operator works at a terminal,
-style, OPSEC habits, and emotional state — independently of what IP address
+independently of what IP address, credential, or tooling they use.
 or tooling they use.
-The primitives feed the [Identity-Resolution](Identity-Resolution) attribution
+The spec was born out of DECNET's need to correlate attackers across sessions
-state machine, which accumulates evidence across sessions to answer: *is this
+and IP changes, but it is not DECNET-specific.  Any system that records PTY
-the same hands?*
+sessions can implement BEHAVE-SHELL extraction and feed the resulting
 primitives into an attribution engine.  DECNET is the reference implementation.
 A sibling specification, **BEHAVE-TEXT**, defines equivalent primitives for
 written text — stylometry, lexicometry, and discourse structure — and is
 implemented by [EYENET](https://github.com/xmartlab/eyenet).
 ---
 ## Scope
 BEHAVE-SHELL has grown beyond its original keystroke-dynamics focus.  The
 current specification covers three broad domains:
 | Domain | What it captures |
 |---|---|
 | **Motor biometrics** | Keystroke timing, error correction, paste vs. type habits, shell mastery signals |
 | **Cognitive / behavioural** | Command planning depth, feedback loop engagement, tool vocabulary, exploration style, response to failure |
 | **Stylometry / lexicometry** | Lexical choices, sentiment, OPSEC vocabulary, keyboard layout fingerprinting from bigram distributions |
 The emotional valence cluster (`valence`, `arousal`, `stress_response`,
 `frustration_venting`) sits at the boundary of motor and stylometric signal —
 it measures both typing speed changes and lexical content after stress events.
 ---
 ## Design principles
- **Pure extraction library.** `extract_session()` takes an iterable of
+- **Extraction is pure.** The spec defines a function
-  asciinema events and yields `Observation` envelopes.  No I/O, no DB access,
+  `extract_session(events) → Observations` that takes an iterable of timestamped
-  no bus calls.  The worker owns all side effects.
+  PTY events and yields structured observations.  No I/O.  No database.
- **PII by design.** Command text is never stored in plain form — only the
+  No side effects.  Implementations are free to run this in any context.
 - **PII by design.** Command text is never stored in plain form.  Only the
  SHA-256 of the first token is retained.  Output is reduced to a byte count
  and an error verdict.  Prompt lines are ANSI-stripped and capped at 256
-  characters.
+  characters.  Raw bigram/unigram counts are used for layout fingerprinting —
- **Idempotent persistence.** `UniqueConstraint(evidence_ref, primitive)`
+  not the text itself.
-  on the observations table means replaying a shard never duplicates rows.
+
- **Confidence capping.** Emotional-valence features carry a hard confidence
+- **Confidence is explicit.** Every observation carries a confidence value
-  cap of 0.50 — they contribute, but never dominate an attribution decision.
+  [0.0–1.0].  Features that are inherently noisier have hard confidence caps
  (emotional valence: 0.50).  Attribution engines must propagate confidence
  rather than treating all observations as equal.
 - **Skip conditions over imputation.** A feature that cannot be computed on a
  given session (e.g. `error_resilience` features when no errors occurred)
  yields no observation rather than a default value.  Attribution engines
  treat absence of an observation differently from an `unknown` state.
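The last two principles can be sketched as follows. This is a minimal illustration, not the reference implementation: the `Observation` dataclass, `make_observation`, and the bucket boundaries inside `frustration_venting` are assumptions made for the example; only the 0.50 cap and the skip-instead-of-impute rule come from the text above.

```python
from dataclasses import dataclass
from typing import Iterator, Optional

# The 0.50 cap is taken from the spec text; every name here is illustrative.
EMOTIONAL_VALENCE_CONFIDENCE_CAP = 0.50


@dataclass(frozen=True)
class Observation:
    primitive: str
    value: str
    confidence: float  # always within [0.0, 1.0]


def make_observation(primitive: str, value: str, confidence: float,
                     cap: float = 1.0) -> Observation:
    """Clamp confidence into [0.0, cap] before emitting the observation."""
    return Observation(primitive, value, max(0.0, min(confidence, cap)))


def frustration_venting(post_error_hits: Optional[int]) -> Iterator[Observation]:
    """Yield nothing when the feature cannot be computed (skip, not impute)."""
    if post_error_hits is None:  # e.g. the session contained no errors
        return
    # Placeholder bucket boundaries, NOT values from the spec.
    if post_error_hits == 0:
        label = "low"
    elif post_error_hits < 3:
        label = "moderate"
    else:
        label = "high"
    yield make_observation("frustration_venting", label, 0.9,
                           cap=EMOTIONAL_VALENCE_CONFIDENCE_CAP)
```

A consumer sees an empty iterator for a skipped feature, which keeps "no observation" distinct from any emitted label.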
 ---
-## Data flow
+## Input format
-```
+BEHAVE-SHELL operates on **asciinema-compatible event streams**: sequences of
-PTY session
+`(t: float, ch: "i"|"o", d: str)` tuples representing timestamped input and
-    │
+output chunks from a PTY session.  `"i"` is operator input; `"o"` is terminal
-    ▼
+output.  Non-UTF-8 bytes are handled via surrogateescape.
-sessrec.c — writes JSONL shard per session
+
-    │  {"sid": id, "t": ts, "ch": "i"|"o", "d": data}
+The DECNET implementation records these as JSONL shards via `sessrec.c`:
-    │  Non-UTF-8 bytes handled via surrogateescape
+
-    ▼
+```json
-attacker.session.ended bus event
+{"sid": "abc123", "t": 1.234, "ch": "i", "d": "ls -la\r"}
-    │
+{"sid": "abc123", "t": 1.891, "ch": "o", "d": "total 48\r\n..."}
    ▼
 _handler.handle_session_ended()
    │  Reads shard from disk → parse_shard_line() → AsciinemaEvent tuples
    ▼
 build_session_context()   (_ctx.py, ~573 lines)
    │  Seven derivation steps (see below)
    ▼
 extract_session()   (extract.py)
    │  Fan-out across 37 registered feature functions (FEATURES registry)
    │  Each yields 0..N Observation envelopes
    ▼
 Upsert ObservationRow  →  publish attacker.observation.*
    │
    ▼
 attribution_worker  (attribution_worker.py)
    │  Consumes attacker.observation.> bus events
    │  Runs aggregate() per (identity_uuid, primitive)
    ▼
 AttributionStateRow   state ∈ {unknown, stable, drifting, conflicted, multi_actor}
 ```
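The JSONL shard lines shown under Input format map directly onto `(t, ch, d)` tuples. A minimal parsing sketch, assuming one JSON object per line; the `Event` type and `parse_shard` function are hypothetical names for this example and are not DECNET's `parse_shard_line()`.

```python
import json
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    t: float  # timestamp in seconds
    ch: str   # "i" = operator input, "o" = terminal output
    d: str    # data chunk


def parse_shard(lines: Iterable[str]) -> Iterator[Event]:
    """Turn JSONL shard lines into Event tuples, ignoring blank lines
    and records whose channel is neither "i" nor "o"."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec.get("ch") not in ("i", "o"):
            continue
        yield Event(float(rec["t"]), rec["ch"], rec["d"])
```

The `sid` field is dropped here because it is constant within a shard; an implementation that mixes sessions in one file would need to keep it.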
 ---
 ## Session context derivation
-`build_session_context()` performs a single-pass walk over the raw event
+Before feature extraction, a single-pass walk over the event stream builds a
-stream and produces a `SessionContext` that all 37 feature functions read.
+`SessionContext` — a set of derived signals that all feature functions share.
-The seven derivation steps, in order:
+The derivation steps, in order:
-| Step | What it computes |
+| Step | Output |
 |---|---|
-| **Paste-burst detection** | Groups consecutive paste-class events (≥4 chars within 200 ms) into `paste_bursts` |
+| **Paste-burst detection** | Groups consecutive paste-class events (≥4 chars, within 200 ms) into `paste_bursts` |
-| **Typing-burst segmentation** | Splits the keystroke stream at think-pauses > 2.0 s into `typing_bursts[][]` (dropped if < 3 IATs) |
+| **Typing-burst segmentation** | Splits keystroke stream at think-pauses > 2.0 s into `typing_bursts[][]`; drops bursts < 3 IKIs |
-| **Correction signals** | Counts backspaces (`0x7f`, `0x08`) and kill-line sequences (`0x15`, `0x17`); records IATs between each backspace and the preceding keystroke |
+| **Correction signals** | Counts backspaces (`0x7f`, `0x08`) and kill-line (`0x15`, `0x17`); records IKI between each backspace and the preceding keystroke |
-| **Per-command intra-typing IATs** | For each command, extracts keystroke inter-arrival times from that command's span only |
+| **Per-command intra-typing IKIs** | For each command, IKIs from that command's span only |
-| **Command segmentation** | Splits on `\r`/`\n`; per command records `first_token_hash` (SHA-256), tab count, readline shortcut count, and pipe count |
+| **Command segmentation** | Splits on `\r`/`\n`; per command: `first_token_hash` (SHA-256), tab count, readline shortcut count, pipe count |
-| **Inter-command IAT gaps** | Time between consecutive commands |
+| **Inter-command IKI gaps** | Time between consecutive commands |
-| **Error detection** | Scans output between commands for canonical error patterns (`"command not found"`, `"Permission denied"`, `"No such file"`) to set `command.errored` |
+| **Error detection** | Scans output for canonical error patterns (`"command not found"`, `"Permission denied"`, `"No such file"`) to set `command.errored` |
-| **PS1 prompt detection** | Regex for `$`, `#`, `%`, `>` suffix after ANSI stripping; caps at 256 chars |
+| **PS1 prompt detection** | Regex for `$`, `#`, `%`, `>` suffix; ANSI-stripped, capped at 256 chars |
-| **Keyboard layout fingerprinting** | Builds unigram and bigram histograms from typed letters |
+| **Keyboard layout fingerprinting** | Unigram and bigram histograms from typed letters |
 | **Lexical counters** | Obscenity hits, positive/negative sentiment tokens, max caps run, max consecutive `!` run |
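As an illustration of the typing-burst segmentation row, the sketch below splits a list of keystroke timestamps at think-pauses longer than 2.0 s and drops bursts with fewer than 3 intervals. The function name and the plain-list input are assumptions; the two thresholds are the ones quoted in the table.

```python
from typing import List, Sequence

IKI_THINK_MAX_S = 2.0    # think-pause threshold from the table
MIN_IKIS_PER_BURST = 3   # bursts with fewer intervals are dropped


def segment_typing_bursts(key_times: Sequence[float]) -> List[List[float]]:
    """Split keystroke timestamps into bursts of inter-keystroke intervals."""
    bursts: List[List[float]] = []
    current: List[float] = []
    for prev, cur in zip(key_times, key_times[1:]):
        interval = cur - prev
        if interval > IKI_THINK_MAX_S:
            # A think-pause closes the current burst; the pause itself
            # is not part of any burst.
            if len(current) >= MIN_IKIS_PER_BURST:
                bursts.append(current)
            current = []
        else:
            current.append(interval)
    if len(current) >= MIN_IKIS_PER_BURST:
        bursts.append(current)
    return bursts
```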
-### Key data structures
+### Key structures
 ```
 SessionContext
  sid: str
  t_start, t_end, duration_s: float
-  input_events, output_events: tuple[AsciinemaEvent]
+  input_events, output_events: tuple[Event]
-  iats: tuple[float]                      # inter-keystroke intervals
+  iats: tuple[float]                       # inter-keystroke intervals
  paste_bursts: tuple[PasteBurst]
  typing_bursts: tuple[tuple[float]]
  backspace_count, kill_line_count: int
@@ -104,7 +115,7 @@ SessionContext
 Command
  start_ts, end_ts: float
-  first_token_hash: str    # SHA-256 of first token only
+  first_token_hash: str    # SHA-256, first token only
  tab_count, shortcut_count, pipe_count: int
  errored: bool
  output_bytes: int
@@ -121,507 +132,373 @@ PromptLine
 ## The 37 primitives
-### Motor (9) — muscle memory and physical interaction style
+### Motor (9)
-These primitives capture *how* an operator's fingers interact with the
+Motor primitives capture muscle memory and physical interaction patterns.
-keyboard — patterns that persist across sessions, accounts, and even
+They are among the most stable signals across sessions and across different
-operating systems.
+machines used by the same operator.
-#### 1. `input_modality`
+#### `input_modality`
 Values: `typed` | `pasted` | `mixed`
-Ratio of paste events to total input events.  ≥40 % pasted and ≤5 %
+Ratio of paste events to total input events.  ≥40 % pasted and ≤5 % typed
-typed → `pasted`; ≤5 % pasted → `typed`; otherwise `mixed`.
+→ `pasted`.  ≤5 % pasted → `typed`.  Otherwise `mixed`.
 A script kiddie running pre-written one-liners pastes habitually.  A
 seasoned operator types most commands from memory.
-#### 2. `paste_burst_rate`
+#### `paste_burst_rate`
 Values: `none` | `occasional` | `habitual`
-Coarser bucketing of the paste ratio.  ≥50 % → `habitual`,
+Coarser paste-ratio bucketing.  ≥50 % → `habitual`, ≥10 % → `occasional`.
 ≥10 % → `occasional`.
-#### 3. `keystroke_cadence`
+#### `keystroke_cadence`
 Values: `steady` | `bursty` | `hunt_and_peck` | `machine`
 Median coefficient of variation (CV) of within-burst inter-keystroke
-intervals (IKIs).
+intervals (IKIs):
 | CV | Mean IKI | Label |
 |---|---|---|
-| < 0.30 | < 30 ms | `machine` |
+| < 0.30 | < 30 ms | `machine` — inhumanly uniform |
-| < 0.45 | any | `steady` |
+| < 0.45 | any | `steady` — trained touch typist |
-| < 0.70 | any | `bursty` |
+| < 0.70 | any | `bursty` — thinks between phrases |
 | ≥ 0.70 | any | `hunt_and_peck` |
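The cadence table can be read as a small decision function. The sketch below is one possible reading, with assumptions stated in the comments: CV is computed with the population standard deviation, "mean IKI" is taken over all intervals in all bursts, and a session with no usable burst yields no label.

```python
from statistics import mean, median, pstdev
from typing import Optional, Sequence


def keystroke_cadence(bursts: Sequence[Sequence[float]]) -> Optional[str]:
    """Classify cadence from per-burst inter-keystroke intervals (seconds)."""
    # Assumption: a burst needs at least 2 intervals and a positive mean
    # for its coefficient of variation to be meaningful.
    usable = [list(b) for b in bursts if len(b) >= 2 and mean(b) > 0]
    if not usable:
        return None  # skip condition: nothing to measure
    cv = median([pstdev(b) / mean(b) for b in usable])
    mean_iki = mean([x for b in usable for x in b])
    if cv < 0.30 and mean_iki < 0.030:
        return "machine"
    if cv < 0.45:
        return "steady"
    if cv < 0.70:
        return "bursty"
    return "hunt_and_peck"
```

Note that the `machine` row is checked first: a low CV alone maps to `steady`, and only the combination with a sub-30 ms mean interval marks the input as automated.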
-`machine` catches automated input that passes as human visually but has
+#### `motor_stability`
 inhumanly uniform inter-key timing.
 #### 4. `motor_stability`
 Values: `steady` | `variable` | `tremor`
-Fraction of IKIs below the tremor floor (30 ms).  ≥20 % → `tremor`
+Fraction of IKIs below 30 ms.  ≥20 % → `tremor` (physiological or
-(physiological or tool-simulated).  Otherwise the median CV classifies
+tool-simulated).  Otherwise CV classifies `steady` vs `variable`.
 `steady` vs `variable`.
-#### 5. `error_correction`
+#### `error_correction`
 Values: `immediate` | `deferred` | `absent` | `route_around`
 Timing of backspace relative to the preceding keystroke.  Median ≤500 ms
-→ `immediate` (noticed fast, muscle-memory correction).  Median > 500 ms
+→ `immediate`.  Median > 500 ms → `deferred`.  No backspaces but kill-line
-→ `deferred` (reads output then corrects).  Zero backspaces but kill-line
+present → `route_around` (ctrl-u / ctrl-w).  Nothing → `absent`.
 present → `route_around` (ctrl-u / ctrl-w).  No corrections at all →
 `absent`.
-#### 6. `command_chunking`
+#### `command_chunking`
 Values: `fluent` | `fragmented` | `single_command`
 Median CV of per-command intra-typing IKIs.  < 0.40 → `fluent` (commands
-typed as rehearsed phrases).  Otherwise `fragmented`.  Only one command
+typed as rehearsed phrases).
 in session → `single_command`.
-#### 7. `shell_mastery.tab_completion`
+#### `shell_mastery.tab_completion`
 Values: `none` | `occasional` | `habitual`
-Fraction of commands containing at least one `0x09` (tab) keystroke.
+Fraction of commands containing ≥1 tab keystroke.  0 → `none`,
-0 → `none`, < 50 % → `occasional`, ≥ 50 % → `habitual`.
+< 50 % → `occasional`, ≥50 % → `habitual`.
-Operators who tab-complete heavily know the filesystem; those who never do
+#### `shell_mastery.shortcut_usage`
 either memorise paths or are running a prepared script.
 #### 8. `shell_mastery.shortcut_usage`
 Values: `none` | `moderate` | `heavy`
-Readline control-byte count (ctrl-a, ctrl-e, ctrl-r, etc.) per command.
+Readline control-byte count per command.  < 0.05 → `none`,
-< 0.05 → `none`, < 0.15 → `moderate`, ≥ 0.15 → `heavy`.
+< 0.15 → `moderate`, ≥0.15 → `heavy`.
-#### 9. `shell_mastery.pipe_chaining_depth`
+#### `shell_mastery.pipe_chaining_depth`
 Values: `shallow` | `moderate` | `deep`
 Median pipe count per command.  ≤1 → `shallow`, 2 → `moderate`, ≥3 → `deep`.
 ---
-### Cognitive (11) — decision-making and planning style
+### Cognitive (11)
-These primitives capture *how* an operator thinks — their command repertoire,
+Cognitive primitives capture decision-making style, planning depth, and how
-response to failure, and how much they read output before acting.
+the operator processes feedback.
-#### 10. `inter_command_latency_class`
+#### `inter_command_latency_class`
 Values: `instant` | `typing_speed` | `deliberate` | `llm_lightweight` | `llm_heavyweight` | `long`
 Median inter-command pause bucketed against calibrated thresholds:
-| Threshold | Label | What it suggests |
+| Threshold | Label | Interpretation |
 |---|---|---|
 | ≤ 0.30 s | `instant` | Scripted or replay |
 | ≤ 1.50 s | `typing_speed` | Commands prepared, typing only |
-| ≤ 2.00 s | `deliberate` | Reads output before next command |
+| ≤ 2.00 s | `deliberate` | Reads output before acting |
-| ≤ 8.00 s | `llm_lightweight` | May be consulting a fast LLM / notes |
+| ≤ 8.00 s | `llm_lightweight` | Consulting a fast LLM or notes |
 | ≤ 30.00 s | `llm_heavyweight` | Consulting a slow LLM or manual reference |
-| > 30.00 s | `long` | Long pauses — possibly interrupted or cautious |
+| > 30.00 s | `long` | Interrupted or cautious |
-`llm_lightweight` and `llm_heavyweight` were calibrated against Claude
+The `llm_*` thresholds were calibrated against real sessions of Claude-assisted
-Free (fast) and Claude (slow) assisted operator sessions — a novel class
+operators — a novel adversary class BEHAVE-SHELL is explicitly designed to
-of adversary DECNET is designed to detect.
+detect.
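The threshold table for `inter_command_latency_class` amounts to a lookup of the median pause against ordered upper bounds. A minimal sketch, assuming the bounds are inclusive as written (`≤`) and that an empty pause list yields no label; the function and constant names are illustrative.

```python
from bisect import bisect_left
from statistics import median
from typing import Optional, Sequence

# Inclusive upper bounds in seconds, in the order given by the table.
_UPPER_BOUNDS_S = [0.30, 1.50, 2.00, 8.00, 30.00]
_LABELS = ["instant", "typing_speed", "deliberate",
           "llm_lightweight", "llm_heavyweight", "long"]


def inter_command_latency_class(pauses_s: Sequence[float]) -> Optional[str]:
    """Bucket the median inter-command pause into a latency class."""
    if not pauses_s:
        return None  # assumption: no pauses means no observation
    # bisect_left returns the first index whose bound is >= the median,
    # which makes each upper bound inclusive.
    return _LABELS[bisect_left(_UPPER_BOUNDS_S, median(pauses_s))]
```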
-#### 11. `command_branch_diversity`
+#### `command_branch_diversity`
 Values: `linear_playbook` | `adaptive_branching` | `unknown`
-Unique first-token / total command ratio.  < 5 commands → `unknown`.
+Unique first-token ratio.  < 5 commands → `unknown`.  ≥70 % unique →
-≥ 70 % unique → `linear_playbook` (each command is different — following
+`linear_playbook` (following a prepared list).  < 70 % →
-a prepared list).  < 70 % → `adaptive_branching` (repeating tools,
+`adaptive_branching` (iterating on a problem).
 iterating on a problem).
-#### 12. `feedback_loop_engagement`
+#### `feedback_loop_engagement`
 Values: `closed_loop` | `fire_and_forget` | `unknown`
 Pearson correlation between per-command output bytes and the following
 inter-command pause.  r > 0.30 → `closed_loop` (pauses longer when there
-is more output to read).  Otherwise `fire_and_forget`.  Requires ≥5
+is more to read).  Requires ≥5 triples.
 command/output/pause triples.
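A sketch of the correlation test described for `feedback_loop_engagement`. The Pearson coefficient is computed by hand to stay dependency-free. Two choices here are assumptions rather than spec text: fewer than 5 pairs returns `unknown`, and an undefined correlation (zero variance on either side) is also treated as `unknown`.

```python
from math import sqrt
from typing import Optional, Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Pearson correlation; None when either series has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def feedback_loop_engagement(output_bytes: Sequence[float],
                             next_pause_s: Sequence[float]) -> str:
    """Correlate per-command output size with the pause that follows it."""
    if len(output_bytes) != len(next_pause_s):
        raise ValueError("series must be the same length")
    if len(output_bytes) < 5:
        return "unknown"
    r = pearson_r(output_bytes, next_pause_s)
    if r is None:
        return "unknown"  # assumption: undefined correlation is not classified
    return "closed_loop" if r > 0.30 else "fire_and_forget"
```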
-#### 13. `inter_command_consistency`
+#### `inter_command_consistency`
 Values: `metronomic` | `variable` | `bimodal`
 CV of inter-command IKIs.  < 0.40 → `metronomic` (scripts, beacons).
-> 1.50 → `bimodal` (two distinct paces — often short commands interleaved
+> 1.50 → `bimodal` (short commands interleaved with long waits for
-with long waits for a compile or download).  Otherwise `variable`.
+compiles or downloads).
-#### 14. `cognitive_load`
+#### `cognitive_load`
 Values: `low` | `medium` | `high`
-Composite score: mean of (intra-typing CV / 1.0, error rate, pause CV / 1.5).
+Composite: mean(intra-typing CV / 1.0, error rate, pause CV / 1.5).
 < 0.33 → `low`, < 0.67 → `medium`, otherwise `high`.
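The composite can be written out literally. This sketch follows the formula as stated and does not clamp the individual terms, so a very large pause CV can push the score above 1.0; whether the reference implementation clamps is not stated here, and the function name is illustrative.

```python
def cognitive_load(intra_typing_cv: float, error_rate: float,
                   pause_cv: float) -> str:
    """Mean of three normalised terms, bucketed at 0.33 and 0.67."""
    score = (intra_typing_cv / 1.0 + error_rate + pause_cv / 1.5) / 3
    if score < 0.33:
        return "low"
    if score < 0.67:
        return "medium"
    return "high"
```

For example, an intra-typing CV of 0.6, an error rate of 0.2 and a pause CV of 0.9 give (0.6 + 0.2 + 0.6) / 3 ≈ 0.47, which falls in the `medium` bucket.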
-High cognitive load across multiple sessions on the same identity is a
+#### `exploration_style`
 signal of an operator working outside their comfort zone — new target OS,
 unfamiliar tooling, or time pressure.
 #### 15. `exploration_style`
 Values: `methodical` | `targeted` | `chaotic`
-`repetition_rate` = 1 − unique/total commands.
+`backtrack_rate` ≥30 % → `chaotic`.  `repetition_rate` ≥50 % → `targeted`.
 `backtrack_rate` = fraction of commands that jump back to a previously used
 tool category.  Backtrack ≥30 % → `chaotic`.  Repetition ≥50 % → `targeted`
 (narrow focus, known objective).  Otherwise `methodical`.
-#### 16. `planning_depth`
+#### `planning_depth`
 Values: `deep` | `reactive` | `shallow`
-`deep_pause_frac` = fraction of inter-command IKIs > 2.0 s.
+Fraction of inter-command IKIs > 2.0 s (deep) vs ≤ 0.30 s (reactive).
 `reactive_frac` = fraction ≤ 0.30 s.  ≥40 % deep pauses → `deep`.
 ≥50 % reactive → `reactive`.  Otherwise `shallow`.
-#### 17. `tool_vocabulary`
+#### `tool_vocabulary`
 Values: `narrow` | `moderate` | `broad`
-Distinct first-token count (absolute).  ≤3 → `narrow`, ≥10 → `broad`.
+Distinct first-token count.  ≤3 → `narrow`, ≥10 → `broad`.
-#### 18. `error_resilience.retry_tactic`
+#### `error_resilience.retry_tactic`
 Values: `retry_same` | `pivot` | `fallback`
-Post-error behaviour: does the operator retry the same command, switch to
+Post-error behaviour pattern.  Skipped if no errors.
 a different approach, or fall back to reconnaissance?  Skipped if no errors
 occurred in the session.
-#### 19. `error_resilience.frustration_typing`
+#### `error_resilience.frustration_typing`
 Values: `low` | `moderate` | `high`
 Delta between median intra-IKI after an error vs. after a success.
 < 10 % delta → `low`, < 30 % → `moderate`, ≥30 % → `high`.
-Fast typing after errors suggests frustration; slow typing suggests
+#### `error_resilience.fallback_to_man`
 deliberation.
 #### 20. `error_resilience.fallback_to_man`
 Values: `present` | `absent`
-After an error, does the next command start with `man`, `help`, or `info`?
+After an error, does the next command start with `man`/`help`/`info`?
 Skipped if no errors.  `present` indicates an operator consulting
 documentation — less automated, less rehearsed.
 ---
-### Temporal (4) — session rhythm and pacing
+### Temporal (4)
-#### 21. `session_duration`
+#### `session_duration`
 Values: `short` | `medium` | `long` | `marathon`
-| Duration | Label |
+< 60 s / < 600 s / < 3600 s / ≥ 3600 s.
 |---|---|
 | < 60 s | `short` — single recon or scan |
 | < 600 s | `medium` — targeted interaction |
 | < 3600 s | `long` — sustained operation |
 | ≥ 3600 s | `marathon` — extended presence / slow-burn APT |
-#### 22. `escalation_pattern`
+#### `escalation_pattern`
 Values: `bursty` | `sustained`
-Dynamic window analysis (window width = max(10 s, duration / target)).
+Dynamic window analysis of activity density over the session lifetime.
 CV and zero-window fraction classify whether activity clusters into bursts
 separated by idle periods, or maintains a consistent level throughout.
-#### 23. `landing_ritual`
+#### `landing_ritual`
 Values: `cleanup` | `exploration` | `passive`
-First ~5 commands classified by intent tokens.  `cleanup` if the operator
+Intent of the first ~5 commands.
 immediately starts removing evidence; `exploration` if they run
 reconnaissance commands (`id`, `whoami`, `uname`, `ls`); `passive` if
 they do nothing that reveals intent.
-#### 24. `exit_behavior`
+#### `exit_behavior`
 Values: `cleanup` | `standard` | `anomalous`
-Last ~5 commands.  `cleanup` if history/log deletion or `exit`/`logout`
+Intent of the last ~5 commands.
 appears.  `anomalous` if the session ends abruptly with no recognisable
 closing pattern.
 ---
-### Environmental (5) — operator's local setup
+### Environmental (5)
-These are stable across an operator's career and change only when they
+Environmental primitives are stable across an operator's career — they change
-switch machines or retool.
+only when the operator switches machines or deliberately retools.
-#### 25. `shell_type`
+#### `shell_type`
 Values: `bash` | `sh` | `zsh` | `fish` | `unknown`
-Detected from PS1 prompt regex patterns after ANSI stripping.
+Detected from PS1 prompt regex patterns.
-#### 26. `terminal_multiplexer`
+#### `terminal_multiplexer`
 Values: `tmux` | `screen` | `none`
-Detected from PS1 markers and characteristic escape sequences.
+Detected from PS1 markers and escape sequences.
-#### 27. `locale`
+#### `locale`
 Values: `en-US` | `en` | `other` | `unknown`
 Language-specific keywords in prompt lines and error messages.
-#### 28. `keyboard_layout`
+#### `keyboard_layout`
 Values: `qwerty` | `dvorak` | `colemak` | `other`
-Bigram frequency analysis of the typed character stream.  Operators who
+Bigram frequency analysis of the typed character stream.  An operator who
-touch-type on Dvorak produce a statistically distinct bigram distribution
+touch-types on Dvorak produces a statistically distinct bigram distribution
-that persists even when typing non-English commands.
+that persists even when typing non-English commands — this is a pure
 stylometric signal derived from motor habit.
-#### 29. `numpad_usage`
+#### `numpad_usage`
 Values: `occasional` | `frequent` | `none`
 Keystroke pattern detection for numpad-originated digits.
 ---
-### Operational (4) — mission and OPSEC posture
+### Operational (4)
-#### 30. `objective`
+#### `objective`
 Values: `recon` | `exfil` | `persistence` | `lateral` | `destructive`
-Token-based intent classification of command first-tokens.  Majority vote
+Token-based intent classification.  Majority vote; skipped if < 3
-across classified tokens; precedence order applied for ties.  Skipped if
+classified tokens.
 fewer than 3 classified tokens.
 Example token mappings:
 - `recon`: `id`, `whoami`, `uname`, `cat`, `find`, `ls`, `ps`, `netstat`
 - `exfil`: `scp`, `curl`, `wget`, `base64`, `nc`, `rsync`
- `persistence`: `crontab`, `echo`, `tee`, `systemctl`, `rc.local`
+- `persistence`: `crontab`, `echo >> ~/.bashrc`, `systemctl enable`
 - `lateral`: `ssh`, `xfreerdp`, `psexec`, `wmiexec`
 - `destructive`: `rm`, `shred`, `dd`, `mkfs`, `kill`
-#### 31. `opsec_discipline`
+#### `opsec_discipline`
 Values: `careful` | `learning` | `careless`
-Presence of history-disabling tokens (`unset HISTFILE`, `HISTSIZE=0`,
+Presence of history-disabling tokens and cleanup activity.  Both →
-`history -c`) and cleanup activity in the session tail.  Both → `careful`.
+`careful`.  History only → `learning`.  Neither → `careless`.
 History-only → `learning` (knows to cover tracks but forgets cleanup).
 Neither → `careless`.
-#### 32. `cleanup_behavior`
+#### `cleanup_behavior`
 Values: `thorough` | `partial` | `none`
-Distinct cleanup tokens in the last 5 commands.  ≥3 → `thorough`,
+Distinct cleanup tokens in the session tail.  ≥3 → `thorough`,
-1–2 → `partial`, 0 → `none`.
+1–2 → `partial`.
-#### 33. `multi_actor_indicators`
+#### `multi_actor_indicators`
 Values: `solo` | `handoff_detected`
-Splits commands at the session's temporal midpoint and compares the median
+Splits commands at the session midpoint and compares median intra-IKI of
-intra-IKI of each half.  If the delta exceeds 50 % and both halves have
+each half.  Delta > 50 % with both halves having ≥4 commands →
-≥4 commands, `handoff_detected` is emitted — the session was likely shared
+`handoff_detected`.  Suggests the session was shared between two operators
-between two operators (e.g. initial access handed to a post-exploitation
+(initial access handed to a post-exploitation specialist, or a shared
-specialist).
+credential).
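A rough sketch of the midpoint comparison. Several details are assumptions because the text does not pin them down: the split uses the temporal midpoint of the session, each half's median is taken over the pooled intervals of its commands, the delta is measured relative to the first half, and anything short of a detected handoff is reported as `solo`.

```python
from statistics import median
from typing import List, Sequence, Tuple

# (command start timestamp, inter-keystroke intervals typed within it)
CommandTiming = Tuple[float, Sequence[float]]


def multi_actor_indicator(commands: Sequence[CommandTiming],
                          t_start: float, t_end: float) -> str:
    """Compare typing tempo between the two halves of a session."""
    midpoint = (t_start + t_end) / 2
    first = [ikis for ts, ikis in commands if ts < midpoint]
    second = [ikis for ts, ikis in commands if ts >= midpoint]
    if len(first) < 4 or len(second) < 4:
        return "solo"  # not enough commands in one half to compare
    pooled_a: List[float] = [x for ikis in first for x in ikis]
    pooled_b: List[float] = [x for ikis in second for x in ikis]
    if not pooled_a or not pooled_b:
        return "solo"
    m1, m2 = median(pooled_a), median(pooled_b)
    if m1 <= 0:
        return "solo"
    # Assumption: relative delta is measured against the first half.
    delta = abs(m2 - m1) / m1
    return "handoff_detected" if delta > 0.50 else "solo"
```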
 ---
-### Emotional valence (4) — stress and cognitive state
+### Emotional valence (4)
-These features have a hard confidence cap of **0.50** — they contribute to
+These primitives sit at the boundary of motor and stylometric signal.  They
-attribution but cannot dominate it.  They require ≥80 typed letters to emit.
+require ≥80 typed letters and carry a hard confidence cap of **0.50** —
 they contribute to attribution but cannot dominate it.
-#### 34. `valence`
+#### `valence`
 Values: `positive` | `neutral` | `negative`
-Lexical positive/negative token counts.  `positive` if positive count >
+Lexical positive/negative token counts.  `positive` requires positive count
-(negative + obscenity) and ≥2 positive tokens.
+> (negative + obscenity) with ≥2 positive tokens.
-#### 35. `arousal`
+#### `arousal`
 Values: `low_calm` | `medium_engaged` | `high_agitated`
 `high_agitated` if ≥5 consecutive caps, ≥3 consecutive `!`, or fastest
-IKI < 60 ms on ≥30 keystrokes.  `low_calm` if slowest IKI > 300 ms.
+IKI < 60 ms on ≥30 keystrokes.
 Otherwise `medium_engaged`.
-#### 36. `stress_response`
+#### `stress_response`
 Values: `none` | `eustress_positive` | `distress_negative`
-Post-error vs baseline typing speed ratio.  ≥1.20 → `eustress` (types
+Post-error vs baseline typing speed ratio.  ≥1.20 → `eustress` (experienced,
-faster under pressure — experienced).  ≤ 1/1.20 → `distress` (types
+types faster under pressure).  ≤ 1/1.20 → `distress`.
 slower — less experienced or genuinely stressed).
-#### 37. `frustration_venting`
+#### `frustration_venting`
 Values: `low` | `moderate` | `high`
-Post-error frustration token count plus obscenity count.
+Post-error frustration token count plus obscenity count.  A purely
 lexicometric signal.
 ---
-## Attribution state machine
+## Attribution
-Primitives feed a per-`(identity_uuid, primitive)` state machine in
+BEHAVE-SHELL does not define how observations are aggregated — that is the
-`decnet/correlation/attribution/aggregate.py`.
+responsibility of the implementing system's attribution engine.  The DECNET
 reference implementation uses a five-state machine per
 `(identity_uuid, primitive)`:
-### States
+| State | Condition |
 |---|---|
 | `unknown` | < 3 observations |
 | `stable` | Recent N agree, no drift from older N |
 | `drifting` | Recent N agree but differ from older N |
 | `conflicted` | Recent N are split |
 | `multi_actor` | `conflicted` + cross-session alternation |
-| State | Meaning | Condition |
+Window N = 5 for categorical primitives.  When ≥2 primitives independently
-|---|---|---|
+reach `multi_actor` for the same identity, the engine emits a
-| `unknown` | Insufficient data | < 3 observations |
+`multi_actor_suspected` signal — a strong indicator of a shared credential
-| `stable` | Consistent value | Recent N agree AND no drift from older N |
+or a compromised operator account.
 | `drifting` | Recently changed | Recent N agree BUT differ from older N |
 | `conflicted` | Contradictory values | Recent N are split (high CV) |
 | `multi_actor` | Multiple operators | `conflicted` + cross-session alternation |
 Window size N = 5 (categorical primitives).  EWMA is used for numeric
 primitives (Phase 3).
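The first four states in the table can be sketched for a categorical primitive as below. This is not DECNET's `aggregate()`: "agree" is read as unanimous, "older N" as the window immediately preceding the recent one, and drift as a different majority value in that older window. The `multi_actor` state is left out because it depends on cross-session alternation, which this sketch does not model.

```python
from collections import Counter
from typing import Sequence

MIN_OBSERVATIONS_FOR_STATE = 3
CATEGORICAL_WINDOW_N = 5


def aggregate_state(values: Sequence[str]) -> str:
    """Classify a primitive's history; `values` is ordered oldest first."""
    if len(values) < MIN_OBSERVATIONS_FOR_STATE:
        return "unknown"
    n = CATEGORICAL_WINDOW_N
    recent = list(values[-n:])
    older = list(values[-2 * n:-n])  # empty when history is shorter than n+1
    if len(set(recent)) > 1:
        return "conflicted"          # recent window is split
    if older and Counter(older).most_common(1)[0][0] != recent[0]:
        return "drifting"            # recent agrees but differs from older
    return "stable"
```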
 ### Multi-actor detection
 The attribution worker runs a `_multi_actor_tick` every 60 seconds.  For
 every `(identity, primitive)` pair in `conflicted` state, it checks whether
 the alternation pattern across sessions is consistent with a credential
 being shared between two distinct operators.  When ≥2 primitives
 independently flag `multi_actor` for the same identity, the bus emits:
 ```
 attribution.profile.multi_actor_suspected
  {identity_uuid, primitives: [...], evidence_summary, confidence, ts}
 ```
 `confidence` is capped at 0.60 — cross-primitive agreement is the real
 signal, but a hard cap prevents over-alarming on noisy primitives.
 ---
 ## Database tables
 ### `ObservationRow`
 One row per `(evidence_ref, primitive)`.  `evidence_ref` is the session
 shard identifier — the `UniqueConstraint` makes re-processing idempotent.
 | Column | Type | Description |
 |---|---|---|
 | `id` | UUID PK | |
 | `identity_uuid` | FK → `attacker_identities` | |
 | `attacker_uuid` | FK → `attackers` | Direct link for pre-clusterer path |
 | `evidence_ref` | TEXT | Shard ID |
 | `primitive` | TEXT | e.g. `keystroke_cadence` |
 | `value` | TEXT | Categorical label or serialised numeric |
 | `confidence` | FLOAT | 0.0–1.0 |
 | `observed_at` | DATETIME | Session end time |
 ### `AttributionStateRow`
 One row per `(identity_uuid, primitive)`.  Updated by the attribution
 worker each time a new observation arrives.
 | Column | Type | Description |
 |---|---|---|
 | `identity_uuid` | FK → `attacker_identities` | |
 | `primitive` | TEXT | |
 | `state` | TEXT | `unknown`/`stable`/`drifting`/`conflicted`/`multi_actor` |
 | `current_value` | TEXT | Most recent or EWMA value |
 | `confidence` | FLOAT | |
 | `observation_count` | INT | Total observations aggregated |
 | `last_observation_ts` | DATETIME | |
 ---
 ## Key thresholds
 All calibration constants live in `decnet/profiler/behave_shell/_thresholds.py`
 (416 lines).  The values below are the defaults; they can be overridden per
 deployment without touching feature code.
 | Constant | Value | Used by |
 |---|---|---|
 | `PASTE_MIN_CHARS_PER_EVENT` | 4 | Paste detection |
 | `PASTE_BURST_MAX_IAT_S` | 0.20 | Paste burst grouping |
 | `MODALITY_PASTED_MIN` | 0.40 | `input_modality` |
 | `CV_STEADY_MAX` | 0.45 | `keystroke_cadence` |
 | `TREMOR_FAST_FLOOR_S` | 0.030 | `motor_stability` |
 | `IKI_THINK_MAX_S` | 2.0 | Typing-burst split |
 | `INTER_CMD_INSTANT_MAX` | 0.30 s | `inter_command_latency_class` |
 | `INTER_CMD_LLM_LIGHTWEIGHT_MAX` | 8.0 s | LLM-assisted detection |
 | `INTER_CMD_LLM_HEAVYWEIGHT_MAX` | 30.0 s | LLM-assisted detection |
 | `BRANCH_DIVERSITY_LINEAR_MIN` | 0.70 | `command_branch_diversity` |
 | `FEEDBACK_CORRELATION_MIN` | 0.30 | `feedback_loop_engagement` |
 | `PAUSE_CV_METRONOMIC_MAX` | 0.40 | `inter_command_consistency` |
 | `PAUSE_CV_BIMODAL_MIN` | 1.50 | `inter_command_consistency` |
 | `SESSION_DURATION_SHORT_MAX` | 60 s | `session_duration` |
 | `SESSION_DURATION_MEDIUM_MAX` | 600 s | `session_duration` |
 | `SESSION_DURATION_LONG_MAX` | 3600 s | `session_duration` |
 | `MIN_OBSERVATIONS_FOR_STATE` | 3 | Attribution state machine |
 | `CATEGORICAL_WINDOW_N` | 5 | Attribution window |
 | `MULTI_ACTOR_TICK_SECS` | 60 | Multi-actor tick |
 | `EMOTIONAL_VALENCE_CONFIDENCE_CAP` | 0.50 | All `emotional_valence` features |
 ---
 ## Calibration
-The system was calibrated against five behavioural classes across 15 sessions
+The reference thresholds were calibrated against five behavioural classes
-(424 total observations):
+across 15 sessions (424 total observations):
 | Class | Sessions | Observations | Description |
 |---|---|---|---|
-| `HUMAN` | 1 | 34 | Human operator, no assistance |
+| `HUMAN` | 1 | 34 | Human operator, unassisted |
 | `YOU-sim` | 2 | 59 | Human-simulated scripted attacker |
 | `LW-sim` | 5 | 136 | Lightweight LLM-assisted operator |
-| `CLAUDE-FF` | 3 | 84 | Claude (fast/free tier) assisted |
+| `CLAUDE-FF` | 3 | 84 | Claude (fast) assisted |
-| `CLAUDE-CL` | 4 | 111 | Claude (standard tier) assisted |
+| `CLAUDE-CL` | 4 | 111 | Claude (standard) assisted |
-All classes emit ≥27 distinct primitives (pass threshold).
+All classes emit ≥27 distinct primitives.  The `inter_command_latency_class`
-
+LLM buckets are the primary discriminator between unassisted and
-The `inter_command_latency_class` thresholds `llm_lightweight` (≤8 s) and
+LLM-assisted operators in single-session analysis; cross-session attribution
-`llm_heavyweight` (≤30 s) were derived from timing measurements of these
+uses the full primitive set.
 sessions — DECNET can distinguish a human-with-fast-LLM from an unassisted
 human in a single session with moderate confidence, and with high confidence
 across 3+ sessions.
 ---
-## Testing
+## Key thresholds (reference implementation)
-```bash
+All constants live in `_thresholds.py`.
 # Offline smoke test — 5 shards, mock bus, must emit ≥27 distinct per class
 scripts/behave_shell/smoke.sh
-# Live round-trip — replay calibration shards through a running DECNET
+| Constant | Value |
-scripts/behave_shell/replay_calibration.py
+|---|---|
-```
+| `PASTE_MIN_CHARS_PER_EVENT` | 4 |
 | `PASTE_BURST_MAX_IAT_S` | 0.20 |
 | `IKI_THINK_MAX_S` | 2.0 (typing-burst split) |
 | `TREMOR_FAST_FLOOR_S` | 0.030 |
 | `CV_STEADY_MAX` | 0.45 |
 | `INTER_CMD_INSTANT_MAX` | 0.30 s |
 | `INTER_CMD_LLM_LIGHTWEIGHT_MAX` | 8.0 s |
 | `INTER_CMD_LLM_HEAVYWEIGHT_MAX` | 30.0 s |
 | `BRANCH_DIVERSITY_LINEAR_MIN` | 0.70 |
 | `FEEDBACK_CORRELATION_MIN` | 0.30 |
 | `PAUSE_CV_METRONOMIC_MAX` | 0.40 |
 | `PAUSE_CV_BIMODAL_MIN` | 1.50 |
 | `SESSION_DURATION_SHORT_MAX` | 60 s |
 | `SESSION_DURATION_MEDIUM_MAX` | 600 s |
 | `SESSION_DURATION_LONG_MAX` | 3600 s |
 | `EMOTIONAL_VALENCE_CONFIDENCE_CAP` | 0.50 |
 | `MIN_OBSERVATIONS_FOR_STATE` | 3 |
 | `CATEGORICAL_WINDOW_N` | 5 |
 ---
-## File reference
+## DECNET implementation
-```
+In DECNET, BEHAVE-SHELL extraction is invoked by the profiler worker on every
-decnet/profiler/behave_shell/
+`attacker.session.ended` bus event.  The worker reads the PTY shard from disk,
-  __init__.py               Public API: extract_session()
+runs `extract_session()`, and upserts one `ObservationRow` per primitive per
-  extract.py                Entry point — fans out to FEATURES registry (51 lines)
+session.  A `UniqueConstraint(evidence_ref, primitive)` makes re-processing
-  _ctx.py                   SessionContext builder (573 lines)
+idempotent.
  _parse.py                 Asciinema JSONL parsing (272 lines)
  _handler.py               Bus subscriber — disk I/O, persistence, publish (235 lines)
  _intent.py                Token → intent classification (115 lines)
  _thresholds.py            All calibration constants (416 lines)
  _features/
    __init__.py             FEATURES registry — list of 37 functions (104 lines)
    motor.py                Primitives 1–9 (422 lines)
    cognitive.py            Primitives 10–20 (593 lines)
    temporal.py             Primitives 21–24 (237 lines)
    environmental.py        Primitives 25–29 (352 lines)
    operational.py          Primitives 30–33 (218 lines)
    emotional_valence.py    Primitives 34–37 (223 lines)
-decnet/correlation/
+The attribution worker consumes `attacker.observation.*` bus events and
-  attribution_worker.py     Bus loop: consume observations, run tick
+maintains one `AttributionStateRow` per `(identity_uuid, primitive)`.
  attribution/
    aggregate.py            State machine: unknown→stable→drifting→conflicted→multi_actor
    _thresholds.py          Attribution-layer thresholds
-decnet/web/db/models/
+Source: `decnet/profiler/behave_shell/` (~3 868 lines across 12 files).
  observations.py           ObservationRow schema
  attribution_state.py      AttributionStateRow schema
 ```
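
The idempotent re-processing described for the DECNET implementation (one row per primitive per session, unique on `(evidence_ref, primitive)`) can be illustrated with a small stand-alone sketch. It uses the standard-library `sqlite3` module as a stand-in for DECNET's actual persistence layer; the table name, the `value` column, and the helper names `open_store` and `upsert_observation` are hypothetical and are not taken from `ObservationRow`.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create a toy observation table with the uniqueness rule from the text."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS observation (
            evidence_ref TEXT NOT NULL,
            primitive    TEXT NOT NULL,
            value        TEXT NOT NULL,  -- hypothetical payload column
            UNIQUE (evidence_ref, primitive)
        )
        """
    )
    return conn


def upsert_observation(conn, evidence_ref, primitive, value):
    """Insert a row, or overwrite the existing row for the same key.

    Re-running extraction for the same session therefore never creates
    duplicate rows. Requires SQLite 3.24 or newer for ON CONFLICT ... DO UPDATE.
    """
    conn.execute(
        "INSERT INTO observation (evidence_ref, primitive, value) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT (evidence_ref, primitive) "
        "DO UPDATE SET value = excluded.value",
        (evidence_ref, primitive, value),
    )
    conn.commit()
```

Calling `upsert_observation` twice with the same `evidence_ref` and `primitive` leaves a single row holding the most recent value, which is the property that makes replaying a session shard safe.
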
 ---
-## Related pages
+## See also
- [Fingerprinting](Fingerprinting) — all fingerprint layers, including the
+- **BEHAVE-TEXT** — sibling spec for written-text stylometry and lexicometry,
-  BEHAVE-SHELL summary
+  implemented by [EYENET](https://github.com/xmartlab/eyenet)
- [Identity-Resolution](Identity-Resolution) — how observations are clustered
+- [Fingerprinting](Fingerprinting) — all DECNET fingerprint layers
-  into attacker identities and how state machine transitions propagate
+- [Identity-Resolution](Identity-Resolution) — how observations feed the
- [Service-Personas](Service-Personas) — enabling session recording and
+  identity clusterer
  BEHAVE-SHELL per service