
anti 7f585027b3 docs: per-package READMEs with full primitive catalog and registry notes backfill

- core/README.md: envelope contract, field table, PII discipline, quickstart
- BEHAVE-SHELL/README.md: all 76 primitives documented across 9 categories;
  TLS/SSH/C2 fingerprint sections with [DRAFT — verify] markers on uncertain entries
- BEHAVE-TEXT/README.md: all 35 primitives across 6 categories; Rutify calibration
  notes inline; content.* layer marked EXPERIMENTAL throughout
- primitives.py (SHELL): backfilled notes for all previously undocumented primitives
- primitives.py (TEXT): backfilled notes for capitalization_habit, emoji_*, length,
  linebreak_style, sentence_complexity_class, question_formation_style,
  imperative_style, response_latency_class, message_burst_rate

License: CC-BY-SA-4.0 (prose) / GPL-3.0-or-later (code)

2026-05-10 08:33:02 -04:00


behave-shell


Shell-session behavioral observation registry. Defines what can be observed about an operator through their terminal interaction — typing mechanics, cognitive style, operational patterns, infrastructure fingerprints, and cultural timing signals.

BEHAVE-SHELL does not read command content. It measures how someone operates a terminal, not what they type. The observations are categorical labels, numeric aggregates, and cryptographic hashes — never raw keystrokes or command text.

Install

pip install -e ../core/ -e .
# development (pytest + ruff):
pip install -e ../core/ -e ".[dev]"

Quickstart

from behave_shell.spec import Observation, Window, event_topic_for, to_event_payload

obs = Observation(
    primitive="motor.keystroke_cadence",
    value="bursty",
    confidence=0.87,
    window=Window(start_ts=1714000000.0, end_ts=1714003600.0),
    source="behave/shell-sensor/timing.py",
)
# Serialize to an event bus topic + payload:
topic = event_topic_for("motor.keystroke_cadence")
# → "attacker.observation.shell.motor.keystroke_cadence"
payload = to_event_payload(obs)  # bus-ready dict

Public API (`behave_shell.spec`)

Symbol	Description
`Observation`	Registry-aware subclass of `behave_core.spec.Observation`. Validates `primitive` against `PRIMITIVE_REGISTRY` and `value` against the primitive's type spec.
`Window`	Re-exported from `behave_core` — measurement time window.
`ObservationValue`	Re-exported union type for valid value shapes.
`PRIMITIVE_REGISTRY`	`dict[str, ValueTypeSpec]` — the full primitive catalog (76 entries).
`ValueKind`	Enum: `CATEGORICAL`, `NUMERIC`, `HASH`, `ARRAY`, `FREE_STRING`, `BOOL`.
`ValueTypeSpec`	Pydantic model holding a primitive's kind, allowed values, bounds, and notes.
`is_known(primitive)`	`bool` — whether a primitive path is registered.
`get(primitive)`	Returns the `ValueTypeSpec` for a primitive; raises `KeyError` if unknown.
`TOPIC_PREFIX`	`"attacker.observation.shell"`
`event_topic_for(primitive)`	Returns the full event bus topic string.
`to_event_payload(obs)`	Serializes an `Observation` to a bus-ready `dict`.
`from_event_payload(payload)`	Reconstructs an `Observation` from a bus payload.
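
The topic helpers follow the pattern shown in the Quickstart: the topic is `TOPIC_PREFIX` followed by the primitive path. Below is a minimal stand-alone sketch of that mapping. It does not import `behave_shell`; the two-entry `TOY_REGISTRY` is hypothetical, and raising `KeyError` for unknown primitives in `event_topic_for` is an assumption, not confirmed behavior of the real function.

```python
# Stand-alone sketch; the real PRIMITIVE_REGISTRY maps to ValueTypeSpec objects.
TOPIC_PREFIX = "attacker.observation.shell"

# Hypothetical miniature registry: primitive path -> kind label.
TOY_REGISTRY = {
    "motor.keystroke_cadence": "categorical",
    "toolchain.c2.beacon_interval_ms": "numeric",
}


def is_known(primitive: str) -> bool:
    """Return True if the primitive path is registered."""
    return primitive in TOY_REGISTRY


def event_topic_for(primitive: str) -> str:
    """Join the topic prefix and the primitive path with a dot."""
    if not is_known(primitive):
        raise KeyError(primitive)  # assumed behavior for unknown primitives
    return f"{TOPIC_PREFIX}.{primitive}"


print(event_topic_for("motor.keystroke_cadence"))
# → attacker.observation.shell.motor.keystroke_cadence
```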

Primitives

76 primitives across 8 top-level categories. Each observation captures one measured value for one primitive over one time window. A behavioral profile is built by collecting many observations across many sessions.

`motor.*` — Physical typing mechanics (9 primitives)

Motor primitives capture the physical mechanics of keyboard interaction: rhythm, precision, and habitual movements that are hard to fake and stable across sessions even when operators change tools or objectives. These are the closest BEHAVE comes to biometrics — they exploit the fact that typing style is unconscious and consistent.

Primitive	Kind	Description
`motor.keystroke_cadence`	categorical	Overall rhythm of key input. `steady` = metronomic confident typist. `bursty` = fast bursts with thinking pauses. `hunt_and_peck` = search-first-type. `machine` = mechanically regular, suggesting scripted input.
`motor.motor_stability`	categorical	Consistency of key hold/flight times. `steady` = low variance. `variable` = high variance (cognitive load or unfamiliar keyboard). `tremor` = rhythmic instability distinct from load-induced variance.
`motor.error_correction`	categorical	Response to typing mistakes. `immediate` = backspace within ~1s (automatic monitoring). `deferred` = corrects after reading output. `absent` = proceeds despite errors (scripted behavior). `route_around` = uses history or rewrites rather than backspacing.
`motor.command_chunking`	categorical	Flow of command composition. `fluent` = typed in one pass from memory. `fragmented` = chunks with mid-command pauses (composing while typing). `single_command` = one complete command at a time, no inline pipelines.
`motor.paste_burst_rate`	categorical	Frequency of large clipboard-paste events relative to typed input. `habitual` = primarily works by pasting pre-prepared blocks.
`motor.input_modality`	categorical	Dominant input mode. `typed` = character-by-character. `pasted` = pre-prepared blocks. `mixed` = both substantially.
`motor.shell_mastery.tab_completion`	categorical	Tab completion usage. `habitual` = operator relies on it constantly (inferred from short pause then rapid continuation). Strong indicator of shell familiarity.
`motor.shell_mastery.shortcut_usage`	categorical	Use of shell shortcuts (Ctrl+R, Ctrl+A/E, Ctrl+L, Alt+.). `heavy` = deep shell muscle memory.
`motor.shell_mastery.pipe_chaining_depth`	categorical	Maximum pipeline depth (cmd \| cmd \| cmd). `shallow` = 0-1 pipes. `deep` = 4+. Reflects tool-composition fluency.
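
As an illustration of how a sensor might derive `motor.keystroke_cadence` from timing alone, here is a rough sketch that labels a run of inter-key intervals using their mean and coefficient of variation. The function name and every threshold are hypothetical; the registry defines only the labels, not how they are computed.

```python
from statistics import mean, pstdev
from typing import Sequence


def classify_cadence(intervals_ms: Sequence[float]) -> str:
    """Label inter-key intervals (milliseconds). Thresholds are illustrative guesses."""
    if len(intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    avg = mean(intervals_ms)
    if avg <= 0:
        raise ValueError("intervals must be positive")
    cv = pstdev(intervals_ms) / avg  # coefficient of variation
    if cv < 0.05:
        return "machine"        # mechanically regular input
    if avg > 600:
        return "hunt_and_peck"  # slow overall: searching before each key
    if cv < 0.5:
        return "steady"         # moderate, even rhythm
    return "bursty"             # fast runs separated by long pauses


print(classify_cadence([80, 90, 85, 1500, 80, 95, 1800, 90]))
```

A real sensor would likely work over sliding windows and attach a `confidence` to each label; this sketch returns only the category.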

`cognitive.*` — Decision-making and cognition (11 primitives)

Cognitive primitives capture how the operator thinks: their planning style, how they respond to uncertainty and failure, and whether their timing patterns are consistent with a human, a script, or an LLM agent. These are among the most attribution-relevant primitives — they're stable per-operator and hard to sustain as deliberate deception.

Primitive	Kind	Description
`cognitive.cognitive_load`	categorical	Inferred mental workload from timing, error rate, and inter-command variance. `high` = long pauses, frequent error-retry cycles, fragmented chunking. Composite feature for downstream attribution.
`cognitive.exploration_style`	categorical	Navigation style in unfamiliar environments. `methodical` = systematic enumeration (ls→cat→id→uname). `chaotic` = non-sequential jumps. `targeted` = straight to objective without exploring.
`cognitive.planning_depth`	categorical	Whether the operator works from a pre-formed plan. `deep` = visible logical sequence (recon→pivot→exfil). `shallow` = opportunistic. `reactive` = responds only to errors.
`cognitive.tool_vocabulary`	categorical	Breadth of tools used. `narrow` = fixed small toolset. `broad` = reaches for the best tool per subtask.
`cognitive.inter_command_latency_class`	categorical	Time between commands. `instant` (<200ms), `typing_speed` (200ms-2s), `deliberate` (2s), `llm_lightweight` (2-8s, small model agent), `llm_heavyweight` (8-30s, reasoning-class agent), `long` (>30s, human-supervised LLM).
`cognitive.inter_command_consistency`	categorical	Dispersion of inter-command pauses. `metronomic` = LLM-pure. `variable` = human. `bimodal` = LLM-assisted human (LLM-paced bursts + human thinking gaps).
`cognitive.command_branch_diversity`	categorical	Content-based script vs. adaptive discriminator. `linear_playbook` = low first-token repetition (each step uses a different tool). `adaptive_branching` = high repetition of the same tool with varying arguments (operator following a thread).
`cognitive.feedback_loop_engagement`	categorical	Whether pace correlates with output volume. `closed_loop` = pause grows with preceding output (reading before continuing). `fire_and_forget` = paces independently of output (scripted or unread). Cuts across the LLM/human axis.
`cognitive.error_resilience.retry_tactic`	categorical	Response to command failure. `rerun` = identical retry. `modify` = adjusts before retrying. `switch` = tries a different tool. `abort` = gives up on objective.
`cognitive.error_resilience.frustration_typing`	categorical	Speed/error spike immediately after failure. `high` = sharp burst post-failure. Strong human indicator; absent in scripts.
`cognitive.error_resilience.fallback_to_man`	categorical	Whether the operator invokes `man`/`--help` when stuck. `present` signals unfamiliarity with the specific tool.
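
The `cognitive.feedback_loop_engagement` description can be read as a correlation test between the volume of output a command produced and the pause before the next command. A minimal sketch under that reading, using a Pearson correlation; the function names and the `0.5` cut-off are assumptions made for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def feedback_loop_engagement(output_bytes: Sequence[float],
                             next_pause_s: Sequence[float],
                             threshold: float = 0.5) -> str:
    """Compare each command's output size with the pause that followed it."""
    if len(output_bytes) != len(next_pause_s) or len(output_bytes) < 3:
        raise ValueError("need at least three paired samples")
    r = pearson(output_bytes, next_pause_s)
    # Pauses that grow with output suggest the operator reads before continuing.
    return "closed_loop" if r >= threshold else "fire_and_forget"


print(feedback_loop_engagement([100, 2000, 50, 8000, 400],
                               [1.0, 4.0, 0.8, 12.0, 1.5]))
```

Note that only output sizes and timestamps are needed, which keeps the measurement content-blind.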

`temporal.*` — Session timing and lifecycle (7 primitives)

Temporal primitives capture when and how long an operator works. These signals are stable per-campaign and hard to fake consistently across many sessions, because they reflect biological and social rhythms (sleep, work hours, habits) rather than conscious technical choices.

Primitive	Kind	Description
`temporal.session_timing`	categorical	Hour-of-day distribution. `diurnal` = business-hours peaks. `nocturnal` = late-night peaks. `irregular` = no discernible daily pattern. Requires a known timezone from `cultural.*` to interpret.
`temporal.session_duration`	categorical	Typical session length. `short` <15min, `medium` 15-90min, `long` 90min-4hr, `marathon` >4hr. Stable per-operator characteristic.
`temporal.escalation_pattern`	categorical	Activity intensity across a session. `sustained` = constant rate. `bursty` = concentrated activity then silence (waiting for long-running processes). `erratic` = unpredictable spikes.
`temporal.persistence`	categorical	Cross-session return behavior. `hit_and_run` = few sessions then disappears. `return_visitor` = periodic return. `resident` = near-continuous presence.
`temporal.lifecycle_markers.landing_ritual`	categorical	Whether a recognizable start-of-session sequence is detected (whoami → id → uname → hostname → ip addr). `present` = fingerprinted checklist habit.
`temporal.lifecycle_markers.exit_behavior`	categorical	Session end pattern. `graceful` = explicit logout. `abrupt` = dropped connection. `cleanup` = deletes logs/tools before exiting — strongest opsec signal in this category.
`temporal.lifecycle_markers.idle_periodicity`	categorical	Whether in-session idle gaps (>30s) are statistically periodic or random. `periodic` = heartbeat-like — may indicate an LLM polling loop, an automated keepalive, or a human following a timed workflow.
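
The `temporal.session_duration` buckets map directly onto a small function. The sketch below uses the thresholds from the table; how exact boundary values are assigned (90 minutes counted as `medium`, 4 hours as `long`) is an assumption, since the table does not say which side is inclusive.

```python
def session_duration_class(start_ts: float, end_ts: float) -> str:
    """Bucket a session by length, given start/end as Unix timestamps in seconds."""
    minutes = (end_ts - start_ts) / 60.0
    if minutes < 0:
        raise ValueError("end_ts precedes start_ts")
    if minutes < 15:
        return "short"
    if minutes <= 90:    # boundary handling is an assumption
        return "medium"
    if minutes <= 240:   # 4 hours
        return "long"
    return "marathon"


# The Quickstart window spans 3600 seconds (60 minutes).
print(session_duration_class(1714000000.0, 1714003600.0))
# → medium
```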

`operational.*` — Mission and opsec (4 primitives)

Operational primitives are coarser inferences from command patterns — what the operator is trying to accomplish and how carefully they're hiding their footprint.

Primitive	Kind	Description
`operational.opsec_discipline`	categorical	Forensic footprint management. `careful` = history disabled, tools removed, proxy/VPN confirmed. `careless` = no precautions. `learning` = inconsistent and improving mid-campaign.
`operational.cleanup_behavior`	categorical	Artifact handling at session end. `thorough` = removes tools, temp files, bash history. `partial` = removes some but misses others. `none` = leaves everything.
`operational.objective`	categorical	Inferred mission from command patterns: `recon`, `exfil`, `persistence`, `lateral` (pivoting), `destructive`.
`operational.multi_actor_indicators`	categorical	Signs of multiple operators. `handoff_detected` = detectable style break mid-session. `team_coordinated` = multiple signatures interleaved or simultaneous.

`environmental.*` — Physical and software context (5 primitives)

Environmental primitives describe where the operator works from. Stable per-campaign; often reveals national origin or infrastructure choices.

Primitive	Kind	Description
`environmental.keyboard_layout`	categorical	Inferred layout from characteristic key-sequence errors. An AZERTY-trained typist on QWERTY makes specific substitutions (q↔a, z↔w, m→,) that are statistically distinguishable from random errors. Reliable when error volume is sufficient (>50 errors).
`environmental.locale`	free_string	BCP-47 tag (e.g. `en-US`, `pt-BR`). Inferred from layout, cultural timing, and command-line encoding artifacts. Free string — locale is not a closed enum.
`environmental.numpad_usage`	categorical	Numeric keypad use inferred from keycode patterns. `detected` signals a desktop keyboard.
`environmental.terminal_multiplexer`	categorical	Presence of tmux/screen, inferred from escape sequences (Ctrl+B / Ctrl+A prefixes) and window-switching patterns.
`environmental.shell_type`	categorical	Shell environment inferred from syntax (array syntax, quoting style, builtin names). `powershell`/`cmd.exe` immediately flags a Windows-native operator.

`cultural.*` — Social and biological rhythms (5 primitives)

Cultural primitives exploit the fact that human work patterns are shaped by local time, religion, and social convention. These signals are hard to sustain as deliberate deception across a long campaign because they reflect unconscious biological rhythms.

Primitive	Kind	Description
`cultural.meal_break_gaps`	categorical	Whether activity gaps align with regional meal times (`morning`, `midday`, `evening`, `late_night`). Requires a known timezone to interpret.
`cultural.periodic_micro_pauses`	categorical	Short rhythmic pauses of 5-15 min recurring at consistent intervals. May correspond to Salah prayer times (5 daily, spaced ~2-3hr), smoke breaks, or other cultural micro-rituals. `regular_intervals_detected` rejects the null hypothesis of random pauses at p<0.05.
`cultural.dst_behavior`	categorical	Whether the operator's active hours shift by 1 hour at DST transitions. `shifts_with_dst` = follows local civil time. `anchored_to_utc` = schedule is clock-fixed (automated infrastructure or deliberate counter-analysis).
`cultural.weekend_cadence`	categorical	Which two-day block is low-activity. `fri_sat` = Middle Eastern/Israeli pattern. `sat_sun` = Western/East Asian. Reliable national-origin signal across multiple weeks.
`cultural.holiday_gaps`	categorical	Whether multi-day inactivity gaps align with public holiday calendars. Requires a multi-session corpus spanning calendar events.
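
One way to approximate `cultural.weekend_cadence` is to count events per weekday and pick the adjacent two-day block with the least activity. This is a sketch, not the registry's method: the function name, the fixed `utc_offset_hours` parameter, and the `other` fallback label are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Sequence


def weekend_cadence(event_ts: Sequence[float], utc_offset_hours: float = 0.0) -> str:
    """Find the quietest adjacent two-day block in a list of Unix timestamps."""
    if not event_ts:
        raise ValueError("no events")
    tz = timezone(timedelta(hours=utc_offset_hours))
    # weekday(): Monday = 0 ... Sunday = 6
    counts = Counter(datetime.fromtimestamp(ts, tz).weekday() for ts in event_ts)
    pairs = [(d, (d + 1) % 7) for d in range(7)]
    quietest = min(pairs, key=lambda p: counts[p[0]] + counts[p[1]])
    if quietest == (4, 5):
        return "fri_sat"
    if quietest == (5, 6):
        return "sat_sun"
    return "other"  # hypothetical label for any other block


# Four weeks of one event per day at noon UTC, skipping Saturday and Sunday.
base = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)  # 2024-01-01 was a Monday
stamps = [(base + timedelta(days=d)).timestamp()
          for d in range(28) if (base + timedelta(days=d)).weekday() < 5]
print(weekend_cadence(stamps))
```

Because the result depends on the assumed UTC offset, a label like this is only as reliable as the timezone estimate behind it.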

`emotional_valence.*` — Affective state (4 primitives)

Emotional valence primitives infer affective state from typing dynamics — pace, error rate, and key-input aggression. BEHAVE-SHELL is content-blind; these observations are derived entirely from timing and motor signals, not from what was typed.

Primitive	Kind	Description
`emotional_valence.valence`	categorical	Overall affective tone: `positive` (fluent, low-error), `neutral`, `negative` (error-heavy, erratic). Coarse aggregate; see `arousal` and `stress_response` for finer breakdown.
`emotional_valence.arousal`	categorical	Activation level. `low_calm` = slow deliberate pace. `high_agitated` = fast error-prone bursts. Orthogonal to valence — a calm script and a calm professional are both `low_calm`.
`emotional_valence.stress_response`	categorical	Whether high arousal is positive (`eustress_positive` = speed-up with low error rate, operator in the zone) or negative (`distress_negative` = speed-up with rising errors, panic).
`emotional_valence.frustration_venting`	categorical	Transient outburst signal: sudden speed spike or rapid backspace/delete bursts after command failures. Absent in scripted runs; strong human indicator.

`toolchain.*` — Infrastructure fingerprints (31 primitives)

Toolchain primitives fingerprint the software stack the operator uses, from TLS handshake parameters to SSH key exchange preferences to C2 beaconing behavior. Even fully encrypted traffic leaves structural fingerprints that identify specific tools, libraries, and operator configurations.

`toolchain.tls.*` — TLS fingerprints (6)

TLS fingerprints identify the client and server stacks by their handshake parameters. Each tool, library, and OS produces recognizable fingerprints even when the payload is encrypted.

Primitive	Kind	Description
`toolchain.tls.ja3_client`	hash	MD5 hash of TLS ClientHello parameters (SSLVersion, Ciphers, Extensions, EllipticCurves, EllipticCurvePointFormats). Salesforce, 2017. Each tool stack (curl, Metasploit, Cobalt Strike) produces a distinct hash. Searchable against databases like ja3er.com. `[DRAFT — verify]`
`toolchain.tls.ja3s_server`	hash	MD5 hash of TLS ServerHello (SSLVersion, Cipher, Extensions). Fingerprints the server stack — useful for identifying C2 servers by TLS response even when IPs rotate. `[DRAFT — verify]`
`toolchain.tls.ja4_client`	hash	FoxIO JA4 (2023): human-readable format (e.g. `t13d1516h2_8daaf6152771_e5627efa2ab1`) robust to TLS extension order randomization. Encodes TLS version, cipher count, extension count, ALPN, cipher hash, extension hash. Preferred over JA3 for new sensors. `[DRAFT — verify]`
`toolchain.tls.ja4s_server`	hash	JA4 server-side: fingerprints ServerHello using chosen cipher, extension list, and ALPN. More stable than JA3S when cipher ordering is randomized server-side. `[DRAFT — verify]`
`toolchain.tls.jarm_server`	hash	62-char JARM hash (Salesforce, 2020). Actively probes the server with 10 crafted ClientHellos and hashes the responses. Reliably detects Cobalt Strike, Metasploit, and major C2 frameworks even with custom certificates.
`toolchain.tls.tls_cert_simhash`	hash	SHA-256 hex of the leaf certificate DER bytes. Tracks a specific certificate across infrastructure — useful for correlating C2 that reuses self-signed certs.
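
To make the `ja3_client` description concrete, here is a sketch of the JA3 construction as it is commonly documented: the five ClientHello fields are written as decimal values joined by `-`, the fields are joined by `,`, and the resulting string is MD5-hashed. GREASE values are dropped before hashing. The function names are illustrative, and parsing a real ClientHello is out of scope; the inputs are assumed to be already-extracted integer lists.

```python
import hashlib
from typing import Sequence


def is_grease(value: int) -> bool:
    """GREASE values (RFC 8701) have the form 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version: int, ciphers: Sequence[int], extensions: Sequence[int],
               curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """Build the comma-separated JA3 string from parsed ClientHello fields."""
    def field(values: Sequence[int]) -> str:
        return "-".join(str(v) for v in values if not is_grease(v))

    return ",".join([str(version), field(ciphers), field(extensions),
                     field(curves), field(point_formats)])


def ja3_hash(version: int, ciphers: Sequence[int], extensions: Sequence[int],
             curves: Sequence[int], point_formats: Sequence[int]) -> str:
    """MD5 hex digest of the JA3 string (32 hex characters)."""
    s = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(s.encode("ascii")).hexdigest()


print(ja3_string(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24], [0]))
# → 769,47-53,0-10-11,23-24,0
```

Because the extension list is hashed in wire order, clients that randomize extension order produce unstable JA3 values, which is the motivation the table gives for preferring JA4.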

`toolchain.transport.*` — Network stack fingerprints (3)

Primitive	Kind	Description
`toolchain.transport.tcp_stack`	free_string	p0f OS label (e.g. `Linux 5.x`). Inferred from TCP header quirks (TTL, window size, options order, DF bit). Identifies the connecting OS before any application protocol is visible.
`toolchain.transport.h2_akamai_fingerprint`	free_string	HTTP/2 SETTINGS + priority + pseudo-header order hash. Different HTTP/2 libraries emit distinct SETTINGS combinations (curl vs. Python requests vs. Go net/http). `status: planned`
`toolchain.transport.quic_client`	free_string	QUIC initial packet fingerprint from transport parameters and connection ID length. `status: planned`

`toolchain.ssh.*` — SSH fingerprints (4)

Primitive	Kind	Description
`toolchain.ssh.hassh_client`	hash	MD5 hash of SSH client KEX parameters (kex_algorithms, encryption_algorithms, mac_algorithms, compression_algorithms). Salesforce, 2018. Each SSH library (OpenSSH, PuTTY, Paramiko, Impacket) produces a distinct HASSH.
`toolchain.ssh.hassh_server`	hash	MD5 hash of SSH server KEX parameters. Fingerprints the SSH daemon — detects honeypots, implants, or non-standard servers. `status: partial`
`toolchain.ssh.ssh_client_banner`	free_string	RFC 4253 protocol version string (e.g. `SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6`). Often unmodified even in offensive tooling.
`toolchain.ssh.kex_algorithm_order`	array[free_string]	Ordered KEX algorithm list from the client's `SSH_MSG_KEXINIT` message. Different clients (OpenSSH, PuTTY, Paramiko) advertise distinct orderings, giving a secondary fingerprint beyond HASSH. `[DRAFT — verify]`
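
The HASSH construction can be sketched in a few lines: each of the four algorithm lists is joined with commas in the order the client advertised them, the four lists are joined with semicolons, and the result is MD5-hashed. The function name and the sample algorithm lists below are illustrative only and do not correspond to any specific client.

```python
import hashlib
from typing import Sequence


def hassh(kex: Sequence[str], enc: Sequence[str],
          mac: Sequence[str], comp: Sequence[str]) -> str:
    """MD5 hex digest over 'kex;enc;mac;comp', each a comma-joined ordered list."""
    hassh_str = ";".join(",".join(algos) for algos in (kex, enc, mac, comp))
    return hashlib.md5(hassh_str.encode("ascii")).hexdigest()


# Illustrative algorithm lists, in advertised order.
kex_order = ["curve25519-sha256", "diffie-hellman-group14-sha256"]
fingerprint = hassh(
    kex_order,
    ["chacha20-poly1305@openssh.com", "aes256-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
print(fingerprint)
```

Since order is preserved in the hash input, two clients offering the same algorithms in a different order yield different HASSH values; `kex_algorithm_order` keeps that ordering in readable form.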

`toolchain.http.*` — HTTP fingerprints (3)

Primitive	Kind	Description
`toolchain.http.user_agent_tool_class`	categorical	Tool class from User-Agent and HTTP behavior. Known offensive tools use default or absent User-Agents. Values: `nmap_nse`, `sqlmap`, `nuclei`, `masscan`, `curl`, `metasploit`, `ffuf`, `gobuster`, `feroxbuster`, `nikto`, `wpscan`, `evilwinrm`, `impacket`, `unknown`.
`toolchain.http.header_order_fingerprint`	free_string	Hash of HTTP request header name order. Different libraries emit distinct sequences. `status: planned`
`toolchain.http.body_oddities`	array[free_string]	Anomalous body characteristics (e.g. `multipart_boundary_static`, `json_key_order_fixed`). `status: planned`

`toolchain.c2.*` — C2 beaconing (6)

C2 primitives characterize implant beaconing behavior. Even fully encrypted C2 traffic leaves timing and structural fingerprints.

Primitive	Kind	Description
`toolchain.c2.beacon_family`	categorical	C2 framework identified from traffic fingerprints: `cobalt_strike`, `sliver`, `havoc`, `mythic`, `merlin` (planned), `brc4` (planned), `nighthawk` (planned), `unknown`.
`toolchain.c2.beacon_interval_ms`	numeric	Median IAT between callbacks, in milliseconds. Cobalt Strike default is 60000ms. Very short intervals (<1000ms) suggest an interactive shell, not a beacon.
`toolchain.c2.beacon_jitter_cv`	numeric	Coefficient of variation (std/mean) of beacon IATs. Higher CV = more randomized jitter. Cobalt Strike default jitter is 0% (CV≈0); operators who understand detection set it to 20-50%.
`toolchain.c2.sleep_skew`	categorical	Jitter type applied to sleep intervals. `none` = fixed (detectable). `gaussian` = normally distributed. `uniform` = flat random range. `walk` = random-walk drift. `status: partial`
`toolchain.c2.c2_callback_endpoint`	free_string	URL or `host:port` of the C2 callback endpoint.
`toolchain.c2.attack_software_id`	free_string	MITRE ATT&CK Software ID (e.g. `S0154` for Cobalt Strike).
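
`beacon_interval_ms` and `beacon_jitter_cv` are both derived from the inter-arrival times (IATs) between callbacks. A short sketch of the two aggregates as the table defines them (median IAT, and std/mean); the function name is hypothetical, and using the population standard deviation rather than the sample one is an assumption.

```python
from statistics import mean, median, pstdev
from typing import Sequence, Tuple


def beacon_stats(callback_ts_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (median IAT in ms, coefficient of variation of IATs)."""
    if len(callback_ts_ms) < 3:
        raise ValueError("need at least three callbacks")
    ts = sorted(callback_ts_ms)
    iats = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(iats)
    if avg <= 0:
        raise ValueError("callbacks must not all share one timestamp")
    return median(iats), pstdev(iats) / avg


# Fixed 60-second sleep with no jitter, then the same sleep with jitter.
print(beacon_stats([0, 60000, 120000, 180000]))
print(beacon_stats([0, 50000, 120000, 170000, 240000]))
```

The first call yields a CV of zero, the fixed-interval case the table describes as detectable; the second shows the same median interval with a non-zero CV.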

`toolchain.protocol_abuse.*` — Protocol abuse (6)

Non-standard or offensive use of standard protocols.

Primitive	Kind	Description
`toolchain.protocol_abuse.dns_exfil_tool`	categorical	DNS tunneling tool. `iodine` = base32-encoded data in subdomains with TYPE NULL queries. `dnscat2` = TYPE TXT queries with specific entropy patterns. `custom_high_entropy` = tunneling-consistent but no known-tool match. `status: planned`
`toolchain.protocol_abuse.smb_dialect`	categorical	SMB dialect negotiated by the client. SMB1 in 2024+ is a strong indicator of legacy tooling or deliberate EternalBlue-era downgrade. `status: planned`
`toolchain.protocol_abuse.kerberos_etype_offer`	hash	Hash of the Kerberos AS-REQ etype list. Clients offering RC4-HMAC (etype 23) alongside modern etypes are candidates for Kerberoasting (Rubeus, Impacket GetUserSPNs). `status: planned [DRAFT — verify]`
`toolchain.protocol_abuse.ldap_bind_pattern`	categorical	LDAP bind mechanism. `simple` = cleartext (immediately suspicious). `sasl_gssapi` = Kerberos-backed (normal). `ntlm`, `ntlmssp_v1`, `responder_like` = NTLM and Responder-class MITM. `status: partial`
`toolchain.protocol_abuse.responder_signature`	free_string	Responder detection. Convention: `'false'` or `'true:llmnr'` / `'true:nbtns'` / `'true:mdns'`. Responder poisons LLMNR/NBNS/mDNS broadcasts to capture Net-NTLMv2 hashes. `status: planned`
`toolchain.protocol_abuse.mitm6_signature`	bool	Whether mitm6 activity is detected. mitm6 answers DHCPv6 requests on IPv4 networks where IPv6 is enabled but unmanaged, installing itself as the IPv6 DNS server to hijack name resolution and enable credential relay attacks. `status: planned`
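
Because `responder_signature` packs a flag and a protocol into one free string, consumers need a small parser. A sketch of one, following the convention in the table (`'false'`, or `'true:'` plus one of `llmnr`, `nbtns`, `mdns`); the function name and the choice to raise `ValueError` on anything else are assumptions.

```python
from typing import Optional, Tuple

KNOWN_PROTOCOLS = {"llmnr", "nbtns", "mdns"}


def parse_responder_signature(value: str) -> Tuple[bool, Optional[str]]:
    """Split a responder_signature string into (detected, protocol)."""
    if value == "false":
        return (False, None)
    flag, sep, proto = value.partition(":")
    if flag == "true" and sep and proto in KNOWN_PROTOCOLS:
        return (True, proto)
    raise ValueError(f"unrecognized responder_signature: {value!r}")


print(parse_responder_signature("true:llmnr"))
# → (True, 'llmnr')
```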

`toolchain.payload.*` — Payload analysis (3)

Primitive	Kind	Description
`toolchain.payload.payload_simhash`	hash	64-bit SimHash of the payload binary/shellcode. Preserves near-duplicate relationships: payloads that are 90% similar have low Hamming distance (<4 bits on 64-bit), enabling family clustering despite minor obfuscation. 16-char hex.
`toolchain.payload.payload_entropy_class`	categorical	Shannon entropy of payload bytes. `packed` >7.2 bits/byte (UPX, encrypted shellcode, base64-compressed). `high` 6.5-7.2 (unencrypted compiled code). `low` <5.5 (scripts, plaintext). `status: planned`
`toolchain.payload.loader_family`	categorical	Shellcode/loader family from structural signatures. `donut` = Donut framework (TheWover), converts .NET/PE to PIC shellcode. `sgn` = Shikata-Ga-Nai XOR encoder (Metasploit), recognizable feedback register pattern. `pe2sh` = PE-to-shellcode. `nimcrypt` = Nim-based loader with AES-encrypted payload. `status: planned`
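
For `payload_entropy_class`, the underlying measurement is Shannon entropy in bits per byte. The sketch below computes it and applies the thresholds from the table. The table leaves the 5.5 to 6.5 range unlabeled, so the `unclassified` return value is a hypothetical placeholder, as are the function names.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def payload_entropy_class(data: bytes) -> str:
    """Map entropy to the labels in the table above."""
    h = shannon_entropy(data)
    if h > 7.2:
        return "packed"
    if h >= 6.5:
        return "high"
    if h < 5.5:
        return "low"
    return "unclassified"  # hypothetical: the table does not label 5.5-6.5


print(payload_entropy_class(bytes(range(256)) * 4))  # every byte value equally likely
print(payload_entropy_class(b"#!/bin/sh\necho hello\n"))
```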

Schema

Machine-readable JSON Schema for the observation envelope: json/observation.schema.json

Regenerate after model changes:

python scripts/generate_schema.py

Tests

pytest tests/

Attribution recipes

attribution-recipes.md — out-of-scope reference document describing how an external attribution engine might consume attacker.observation.shell.* topics to build operator profiles. Not part of the BEHAVE spec.

License

Code and schemas: GPL-3.0-or-later
Spec prose (this file, attribution-recipes.md): CC-BY-SA-4.0
