docs: per-package READMEs with full primitive catalog and registry notes backfill

- core/README.md: envelope contract, field table, PII discipline, quickstart
- BEHAVE-SHELL/README.md: all 76 primitives documented across 9 categories;
  TLS/SSH/C2 fingerprint sections with [DRAFT — verify] markers on uncertain entries
- BEHAVE-TEXT/README.md: all 35 primitives across 6 categories; Rutify calibration
  notes inline; content.* layer marked EXPERIMENTAL throughout
- primitives.py (SHELL): backfilled notes for all previously undocumented primitives
- primitives.py (TEXT): backfilled notes for capitalization_habit, emoji_*, length,
  linebreak_style, sentence_complexity_class, question_formation_style,
  imperative_style, response_latency_class, message_burst_rate

License: CC-BY-SA-4.0 (prose) / GPL-3.0-or-later (code)
2026-05-10 06:39:57 -04:00
parent 22b57307cf
commit 7f585027b3
5 changed files with 1159 additions and 91 deletions

BEHAVE-SHELL/README.md Normal file

@@ -0,0 +1,296 @@
<!-- SPDX-License-Identifier: CC-BY-SA-4.0 -->
# behave-shell
[← repo](../README.md)
Shell-session behavioral observation registry. Defines what can be observed
about an operator through their terminal interaction — typing mechanics, cognitive
style, operational patterns, infrastructure fingerprints, and cultural timing signals.
BEHAVE-SHELL does not read command content. It measures *how* someone operates a
terminal, not *what* they type. The observations are categorical labels, numeric
aggregates, and cryptographic hashes — never raw keystrokes or command text.
## Install
```bash
pip install -e ../core/ -e .
# development (pytest + ruff):
pip install -e ../core/ -e ".[dev]"
```
## Quickstart
```python
from behave_shell.spec import Observation, Window, TOPIC_PREFIX, event_topic_for
obs = Observation(
primitive="motor.keystroke_cadence",
value="bursty",
confidence=0.87,
window=Window(start_ts=1714000000.0, end_ts=1714003600.0),
source="behave/shell-sensor/timing.py",
)
# Serialize to an event bus topic + payload:
topic = event_topic_for("motor.keystroke_cadence")
# → "attacker.observation.shell.motor.keystroke_cadence"
```
## Public API (`behave_shell.spec`)
| Symbol | Description |
|---|---|
| `Observation` | Registry-aware subclass of `behave_core.spec.Observation`. Validates `primitive` against `PRIMITIVE_REGISTRY` and `value` against the primitive's type spec. |
| `Window` | Re-exported from `behave_core` — measurement time window. |
| `ObservationValue` | Re-exported union type for valid value shapes. |
| `PRIMITIVE_REGISTRY` | `dict[str, ValueTypeSpec]` — the full primitive catalog (76 entries). |
| `ValueKind` | Enum: `CATEGORICAL`, `NUMERIC`, `HASH`, `ARRAY`, `FREE_STRING`, `BOOL`. |
| `ValueTypeSpec` | Pydantic model holding a primitive's kind, allowed values, bounds, and notes. |
| `is_known(primitive)` | `bool` — whether a primitive path is registered. |
| `get(primitive)` | Returns the `ValueTypeSpec` for a primitive; raises `KeyError` if unknown. |
| `TOPIC_PREFIX` | `"attacker.observation.shell"` |
| `event_topic_for(primitive)` | Returns the full event bus topic string. |
| `to_event_payload(obs)` | Serializes an `Observation` to a bus-ready `dict`. |
| `from_event_payload(payload)` | Reconstructs an `Observation` from a bus payload. |
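
The topic naming in the table above can be sketched without the package installed. This is a hypothetical stand-in, not the `behave_shell` implementation: it assumes a topic is simply `TOPIC_PREFIX`, a dot, and the primitive path, as the Quickstart output suggests. `primitive_from_topic` is an invented helper name used only for illustration.

```python
# Hypothetical stand-in for illustration only; the real functions live in
# behave_shell.spec and may validate against PRIMITIVE_REGISTRY as well.
TOPIC_PREFIX = "attacker.observation.shell"


def event_topic_for(primitive: str) -> str:
    """Join the shared prefix and a dotted primitive path into a bus topic."""
    if not primitive or primitive.startswith(".") or primitive.endswith("."):
        raise ValueError(f"malformed primitive path: {primitive!r}")
    return f"{TOPIC_PREFIX}.{primitive}"


def primitive_from_topic(topic: str) -> str:
    """Inverse of event_topic_for (assumed helper, not part of the listed API)."""
    prefix = TOPIC_PREFIX + "."
    if not topic.startswith(prefix):
        raise ValueError(f"not a shell observation topic: {topic!r}")
    return topic[len(prefix):]


topic = event_topic_for("motor.keystroke_cadence")
print(topic)
```

A consumer subscribing to the bus could use the inverse mapping to recover the primitive path before looking it up with `get(primitive)`.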
## Primitives
76 primitives across 8 categories. Each observation captures one measured value
for one primitive over one time window. A behavioral profile is built by collecting
many observations across many sessions.
---
### `motor.*` — Physical typing mechanics (9 primitives)
Motor primitives capture the physical mechanics of keyboard interaction: rhythm,
precision, and habitual movements that are hard to fake and stable across sessions
even when operators change tools or objectives. These are the closest BEHAVE comes
to biometrics — they exploit the fact that typing style is unconscious and consistent.
| Primitive | Kind | Description |
|---|---|---|
| `motor.keystroke_cadence` | categorical | Overall rhythm of key input. `steady` = metronomic confident typist. `bursty` = fast bursts with thinking pauses. `hunt_and_peck` = search-first-type. `machine` = mechanically regular, suggesting scripted input. |
| `motor.motor_stability` | categorical | Consistency of key hold/flight times. `steady` = low variance. `variable` = high variance (cognitive load or unfamiliar keyboard). `tremor` = rhythmic instability distinct from load-induced variance. |
| `motor.error_correction` | categorical | Response to typing mistakes. `immediate` = backspace within ~1s (automatic monitoring). `deferred` = corrects after reading output. `absent` = proceeds despite errors (scripted behavior). `route_around` = uses history or rewrites rather than backspacing. |
| `motor.command_chunking` | categorical | Flow of command composition. `fluent` = typed in one pass from memory. `fragmented` = chunks with mid-command pauses (composing while typing). `single_command` = one complete command at a time, no inline pipelines. |
| `motor.paste_burst_rate` | categorical | Frequency of large clipboard-paste events relative to typed input. `habitual` = primarily works by pasting pre-prepared blocks. |
| `motor.input_modality` | categorical | Dominant input mode. `typed` = character-by-character. `pasted` = pre-prepared blocks. `mixed` = both substantially. |
| `motor.shell_mastery.tab_completion` | categorical | Tab completion usage. `habitual` = operator relies on it constantly (inferred from short pause then rapid continuation). Strong indicator of shell familiarity. |
| `motor.shell_mastery.shortcut_usage` | categorical | Use of shell shortcuts (Ctrl+R, Ctrl+A/E, Ctrl+L, Alt+.). `heavy` = deep shell muscle memory. |
| `motor.shell_mastery.pipe_chaining_depth` | categorical | Maximum pipeline depth (cmd \| cmd \| cmd). `shallow` = 0-1 pipes. `deep` = 4+. Reflects tool-composition fluency. |
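
As a rough illustration of how a sensor might bucket `motor.shell_mastery.pipe_chaining_depth`, the sketch below maps a pipe count to the three registry labels. The `moderate` band (2-3 pipes) comes from the registry notes in `primitives.py`; the function name and the assumption that the sensor already has a per-session maximum pipe count are hypothetical.

```python
def classify_pipe_depth(max_pipes: int) -> str:
    """Map the deepest pipeline seen in a session to a categorical label.

    max_pipes is the number of '|' separators in the deepest pipeline,
    so `cmd | cmd | cmd` counts as 2.
    """
    if max_pipes < 0:
        raise ValueError("pipe count cannot be negative")
    if max_pipes <= 1:
        return "shallow"   # 0-1 pipes
    if max_pipes <= 3:
        return "moderate"  # 2-3 pipes
    return "deep"          # 4+ pipes


for n in (0, 2, 5):
    print(n, classify_pipe_depth(n))
```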
---
### `cognitive.*` — Decision-making and cognition (11 primitives)
Cognitive primitives capture how the operator thinks: their planning style, how they
respond to uncertainty and failure, and whether their timing patterns are consistent
with a human, a script, or an LLM agent. These are among the most attribution-relevant
primitives: they are stable per operator, and faking them deliberately is hard to sustain.
| Primitive | Kind | Description |
|---|---|---|
| `cognitive.cognitive_load` | categorical | Inferred mental workload from timing, error rate, and inter-command variance. `high` = long pauses, frequent error-retry cycles, fragmented chunking. Composite feature for downstream attribution. |
| `cognitive.exploration_style` | categorical | Navigation style in unfamiliar environments. `methodical` = systematic enumeration (ls→cat→id→uname). `chaotic` = non-sequential jumps. `targeted` = straight to objective without exploring. |
| `cognitive.planning_depth` | categorical | Whether the operator works from a pre-formed plan. `deep` = visible logical sequence (recon→pivot→exfil). `shallow` = opportunistic. `reactive` = responds only to errors. |
| `cognitive.tool_vocabulary` | categorical | Breadth of tools used. `narrow` = fixed small toolset. `broad` = reaches for the best tool per subtask. |
| `cognitive.inter_command_latency_class` | categorical | Time between commands. `instant` (<200ms), `typing_speed` (200ms-2s), `deliberate` (2s), `llm_lightweight` (2-8s, small model agent), `llm_heavyweight` (8-30s, reasoning-class agent), `long` (>30s, human-supervised LLM). |
| `cognitive.inter_command_consistency` | categorical | Dispersion of inter-command pauses. `metronomic` = LLM-pure. `variable` = human. `bimodal` = LLM-assisted human (LLM-paced bursts + human thinking gaps). |
| `cognitive.command_branch_diversity` | categorical | Content-based discriminator between scripted and adaptive operation. `linear_playbook` = low first-token repetition (each step uses a different tool). `adaptive_branching` = high repetition of the same tool with varying arguments (operator following a thread). |
| `cognitive.feedback_loop_engagement` | categorical | Whether pace correlates with output volume. `closed_loop` = pause grows with preceding output (reading before continuing). `fire_and_forget` = paces independently of output (scripted or unread). Cuts across the LLM/human axis. |
| `cognitive.error_resilience.retry_tactic` | categorical | Response to command failure. `rerun` = identical retry. `modify` = adjusts before retrying. `switch` = tries a different tool. `abort` = gives up on objective. |
| `cognitive.error_resilience.frustration_typing` | categorical | Speed/error spike immediately after failure. `high` = sharp burst post-failure. Strong human indicator; absent in scripts. |
| `cognitive.error_resilience.fallback_to_man` | categorical | Whether the operator invokes `man`/`--help` when stuck. `present` signals unfamiliarity with the specific tool. |
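
One way to picture `cognitive.feedback_loop_engagement` is as a correlation between the size of each command's output and the pause that follows it. The sketch below is an assumption-laden illustration, not the sensor's method: the Pearson statistic, the `0.5` threshold, and the function names are all hypothetical choices.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def classify_feedback_loop(output_bytes: list[float],
                           next_pause_s: list[float],
                           threshold: float = 0.5) -> str:
    """closed_loop when pauses grow with preceding output volume.

    The threshold is an illustrative assumption, not a calibrated value.
    """
    r = pearson(output_bytes, next_pause_s)
    return "closed_loop" if r >= threshold else "fire_and_forget"


# An operator who pauses longer after larger outputs.
print(classify_feedback_loop([100, 2000, 50, 8000], [1.0, 4.0, 0.8, 12.0]))
```

Note that only output volume and timing are used here, which keeps the sketch consistent with the content-blind design described at the top of this README.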
---
### `temporal.*` — Session timing and lifecycle (7 primitives)
When and how long an operator works. These signals are stable per-campaign and
hard to fake consistently across many sessions, because they reflect biological and
social rhythms (sleep, work hours, habits) rather than conscious technical choices.
| Primitive | Kind | Description |
|---|---|---|
| `temporal.session_timing` | categorical | Hour-of-day distribution. `diurnal` = business-hours peaks. `nocturnal` = late-night peaks. `irregular` = no discernible daily pattern. Requires a known timezone from `cultural.*` to interpret. |
| `temporal.session_duration` | categorical | Typical session length. `short` <15min, `medium` 15-90min, `long` 90min-4hr, `marathon` >4hr. Stable per-operator characteristic. |
| `temporal.escalation_pattern` | categorical | Activity intensity across a session. `sustained` = constant rate. `bursty` = concentrated activity then silence (waiting for long-running processes). `erratic` = unpredictable spikes. |
| `temporal.persistence` | categorical | Cross-session return behavior. `hit_and_run` = few sessions then disappears. `return_visitor` = periodic return. `resident` = near-continuous presence. |
| `temporal.lifecycle_markers.landing_ritual` | categorical | Whether a recognizable start-of-session sequence is detected (whoami → id → uname → hostname → ip addr). `present` = fingerprinted checklist habit. |
| `temporal.lifecycle_markers.exit_behavior` | categorical | Session end pattern. `graceful` = explicit logout. `abrupt` = dropped connection. `cleanup` = deletes logs/tools before exiting — strongest opsec signal in this category. |
| `temporal.lifecycle_markers.idle_periodicity` | categorical | Whether in-session idle gaps (>30s) are statistically periodic or random. `periodic` = heartbeat-like — may indicate an LLM polling loop, an automated keepalive, or a human following a timed workflow. |
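
The `temporal.session_duration` bands translate directly into a small classifier. This is a minimal sketch: the function name is hypothetical, and how the exact boundaries (15, 90, and 240 minutes) are assigned is an assumption, since the table does not say which side they fall on.

```python
def classify_session_duration(start_ts: float, end_ts: float) -> str:
    """Bucket a session by length, given epoch-second timestamps."""
    minutes = (end_ts - start_ts) / 60.0
    if minutes < 0:
        raise ValueError("end_ts precedes start_ts")
    if minutes < 15:
        return "short"      # under 15 min
    if minutes < 90:
        return "medium"     # 15-90 min
    if minutes <= 240:
        return "long"       # 90 min to 4 hr (boundary handling assumed)
    return "marathon"       # over 4 hr


# The one-hour window from the Quickstart example.
print(classify_session_duration(1714000000.0, 1714003600.0))
```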
---
### `operational.*` — Mission and opsec (4 primitives)
Operational primitives are coarser inferences from command patterns — what the
operator is trying to accomplish and how carefully they're hiding their footprint.
| Primitive | Kind | Description |
|---|---|---|
| `operational.opsec_discipline` | categorical | Forensic footprint management. `careful` = history disabled, tools removed, proxy/VPN confirmed. `careless` = no precautions. `learning` = inconsistent and improving mid-campaign. |
| `operational.cleanup_behavior` | categorical | Artifact handling at session end. `thorough` = removes tools, temp files, bash history. `partial` = removes some but misses others. `none` = leaves everything. |
| `operational.objective` | categorical | Inferred mission from command patterns: `recon`, `exfil`, `persistence`, `lateral` (pivoting), `destructive`. |
| `operational.multi_actor_indicators` | categorical | Signs of multiple operators. `handoff_detected` = detectable style break mid-session. `team_coordinated` = multiple signatures interleaved or simultaneous. |
---
### `environmental.*` — Physical and software context (5 primitives)
Environmental primitives describe where the operator works from. They are stable per campaign
and often reveal national origin or infrastructure choices.
| Primitive | Kind | Description |
|---|---|---|
| `environmental.keyboard_layout` | categorical | Inferred layout from characteristic key-sequence errors. An AZERTY-trained typist on QWERTY makes specific substitutions (q↔a, z↔w, m→,) that are statistically distinguishable from random errors. Reliable when error volume is sufficient (>50 errors). |
| `environmental.locale` | free_string | BCP-47 tag (e.g. `en-US`, `pt-BR`). Inferred from layout, cultural timing, and command-line encoding artifacts. Free string — locale is not a closed enum. |
| `environmental.numpad_usage` | categorical | Numeric keypad use inferred from keycode patterns. `detected` signals a desktop keyboard. |
| `environmental.terminal_multiplexer` | categorical | Presence of tmux/screen, inferred from escape sequences (Ctrl+B / Ctrl+A prefixes) and window-switching patterns. |
| `environmental.shell_type` | categorical | Shell environment inferred from syntax (array syntax, quoting style, builtin names). `powershell`/`cmd.exe` immediately flags a Windows-native operator. |
---
### `cultural.*` — Social and biological rhythms (5 primitives)
Cultural primitives exploit the fact that human work patterns are shaped by local
time, religion, and social convention. These signals are hard to falsify consistently
across a long campaign because they reflect unconscious biological and social rhythms.
| Primitive | Kind | Description |
|---|---|---|
| `cultural.meal_break_gaps` | categorical | Whether activity gaps align with regional meal times (`morning`, `midday`, `evening`, `late_night`). Requires a known timezone to interpret. |
| `cultural.periodic_micro_pauses` | categorical | Short rhythmic pauses of 5-15 min recurring at consistent intervals. May correspond to Salah prayer times (5 daily, spaced ~2-3hr), smoke breaks, or other cultural micro-rituals. `regular_intervals_detected` rejects the null hypothesis of random pauses at p<0.05. |
| `cultural.dst_behavior` | categorical | Whether the operator's active hours shift by 1 hour at DST transitions. `shifts_with_dst` = follows local civil time. `anchored_to_utc` = schedule is clock-fixed (automated infrastructure or deliberate counter-analysis). |
| `cultural.weekend_cadence` | categorical | Which two-day block is low-activity. `fri_sat` = Middle Eastern/Israeli pattern. `sat_sun` = Western/East Asian. Reliable national-origin signal across multiple weeks. |
| `cultural.holiday_gaps` | categorical | Whether multi-day inactivity gaps align with public holiday calendars. Requires a multi-session corpus spanning calendar events. |
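
A sketch of how `cultural.weekend_cadence` might be derived from activity timestamps: count events per local weekday, then compare the Friday-Saturday block with the Saturday-Sunday block. The function names, the fixed UTC offset parameter, and returning `None` on a tie are hypothetical choices; a real sensor would also need enough weeks of data for the result to mean anything.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Optional


def weekday_counts(timestamps: list[float], utc_offset_hours: float = 0.0) -> list[int]:
    """Events per weekday (index 0 = Monday ... 6 = Sunday) in the given offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    counts = Counter(datetime.fromtimestamp(ts, tz).weekday() for ts in timestamps)
    return [counts.get(day, 0) for day in range(7)]


def classify_weekend_cadence(counts: list[int]) -> Optional[str]:
    """Pick the quieter of the two candidate two-day blocks; None on a tie."""
    fri_sat = counts[4] + counts[5]
    sat_sun = counts[5] + counts[6]
    if fri_sat == sat_sun:
        return None
    return "fri_sat" if fri_sat < sat_sun else "sat_sun"


# Busy Sunday through Thursday, quiet Friday and Saturday.
print(classify_weekend_cadence([10, 12, 11, 9, 1, 0, 8]))
```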
---
### `emotional_valence.*` — Affective state (4 primitives)
Emotional valence primitives infer affective state from **typing dynamics** — pace,
error rate, and key-input aggression. BEHAVE-SHELL is content-blind; these
observations are derived entirely from timing and motor signals, not from what
was typed.
| Primitive | Kind | Description |
|---|---|---|
| `emotional_valence.valence` | categorical | Overall affective tone: `positive` (fluent, low-error), `neutral`, `negative` (error-heavy, erratic). Coarse aggregate; see `arousal` and `stress_response` for finer breakdown. |
| `emotional_valence.arousal` | categorical | Activation level. `low_calm` = slow deliberate pace. `high_agitated` = fast error-prone bursts. Orthogonal to valence — a calm script and a calm professional are both `low_calm`. |
| `emotional_valence.stress_response` | categorical | Whether high arousal is positive (`eustress_positive` = speed-up with low error rate, operator in the zone) or negative (`distress_negative` = speed-up with rising errors, panic). |
| `emotional_valence.frustration_venting` | categorical | Transient outburst signal: sudden speed spike or rapid backspace/delete bursts after command failures. Absent in scripted runs; strong human indicator. |
---
### `toolchain.*` — Infrastructure fingerprints (31 primitives)
Toolchain primitives fingerprint the software stack the operator uses, from TLS
handshake parameters to SSH key exchange preferences to C2 beaconing behavior.
Even fully encrypted traffic leaves structural fingerprints that identify specific
tools, libraries, and operator configurations.
#### `toolchain.tls.*` — TLS fingerprints (6)
TLS fingerprints identify the client and server stacks by their handshake parameters.
Each tool, library, and OS produces recognizable fingerprints even when the payload
is encrypted.
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.tls.ja3_client` | hash | MD5 hash of TLS ClientHello parameters (SSLVersion, Ciphers, Extensions, EllipticCurves, EllipticCurvePointFormats). Salesforce, 2017. Each tool stack (curl, Metasploit, Cobalt Strike) produces a distinct hash. Searchable against databases like ja3er.com. `[DRAFT — verify]` |
| `toolchain.tls.ja3s_server` | hash | MD5 hash of TLS ServerHello (SSLVersion, Cipher, Extensions). Fingerprints the server stack — useful for identifying C2 servers by TLS response even when IPs rotate. `[DRAFT — verify]` |
| `toolchain.tls.ja4_client` | hash | FoxIO JA4 (2023): human-readable format (e.g. `t13d1516h2_8daaf6152771_e5627efa2ab1`) robust to TLS extension order randomization. Encodes TLS version, cipher count, extension count, ALPN, cipher hash, extension hash. Preferred over JA3 for new sensors. `[DRAFT — verify]` |
| `toolchain.tls.ja4s_server` | hash | JA4 server-side: fingerprints ServerHello using chosen cipher, extension list, and ALPN. More stable than JA3S when cipher ordering is randomized server-side. `[DRAFT — verify]` |
| `toolchain.tls.jarm_server` | hash | 62-char JARM hash (Salesforce, 2020). Actively probes the server with 10 crafted ClientHellos and hashes the responses. Reliably detects Cobalt Strike, Metasploit, and major C2 frameworks even with custom certificates. |
| `toolchain.tls.tls_cert_simhash` | hash | SHA-256 hex of the leaf certificate DER bytes. Tracks a specific certificate across infrastructure — useful for correlating C2 that reuses self-signed certs. |
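
For readers unfamiliar with JA3, the construction behind `toolchain.tls.ja3_client` can be sketched as follows: the five ClientHello fields are rendered as decimal values, joined with `-` inside a field and `,` between fields, and the resulting string is MD5-hashed. GREASE values (RFC 8701) are dropped before hashing. The function names are hypothetical and the input is assumed to be already parsed from the handshake; this is an illustration, not the sensor's code.

```python
import hashlib


def is_grease(value: int) -> bool:
    """True for RFC 8701 GREASE values (0x0a0a, 0x1a1a, ..., 0xfafa)."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version: int, ciphers: list[int], extensions: list[int],
               curves: list[int], point_formats: list[int]) -> str:
    """Build the comma/dash-delimited JA3 input string."""
    fields = [str(version)]
    for values in (ciphers, extensions, curves, point_formats):
        fields.append("-".join(str(v) for v in values if not is_grease(v)))
    return ",".join(fields)


def ja3_hash(version: int, ciphers: list[int], extensions: list[int],
             curves: list[int], point_formats: list[int]) -> str:
    """MD5 hex digest of the JA3 string (32 hex characters)."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


# Illustrative field values, not taken from a real capture.
print(ja3_string(771, [0x0A0A, 4865, 4866], [0, 10], [29, 23], [0]))
```

Because the extension list is hashed in wire order, clients that randomize extension order produce unstable JA3 values, which is the motivation the table gives for preferring JA4 in new sensors.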
#### `toolchain.transport.*` — Network stack fingerprints (3)
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.transport.tcp_stack` | free_string | p0f OS label (e.g. `Linux 5.x`). Inferred from TCP header quirks (TTL, window size, options order, DF bit). Identifies the connecting OS before any application protocol is visible. |
| `toolchain.transport.h2_akamai_fingerprint` | free_string | HTTP/2 SETTINGS + priority + pseudo-header order hash. Different HTTP/2 libraries emit distinct SETTINGS combinations (curl vs. Python requests vs. Go net/http). `status: planned` |
| `toolchain.transport.quic_client` | free_string | QUIC initial packet fingerprint from transport parameters and connection ID length. `status: planned` |
#### `toolchain.ssh.*` — SSH fingerprints (4)
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.ssh.hassh_client` | hash | MD5 hash of SSH client KEX parameters (kex_algorithms, encryption_algorithms, mac_algorithms, compression_algorithms). Salesforce, 2018. Each SSH library (OpenSSH, PuTTY, Paramiko, Impacket) produces a distinct HASSH. |
| `toolchain.ssh.hassh_server` | hash | MD5 hash of SSH server KEX parameters. Fingerprints the SSH daemon — detects honeypots, implants, or non-standard servers. `status: partial` |
| `toolchain.ssh.ssh_client_banner` | free_string | RFC 4253 protocol version string (e.g. `SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6`). Often unmodified even in offensive tooling. |
| `toolchain.ssh.kex_algorithm_order` | array[free_string] | Ordered KEX algorithm list from the client's `SSH_MSG_KEXINIT` message. Different clients (OpenSSH, PuTTY, Impacket smbexec) advertise distinct orderings, giving a secondary fingerprint beyond HASSH. `[DRAFT — verify]` |
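
The HASSH construction used by `toolchain.ssh.hassh_client` is similar in spirit to JA3. As a minimal sketch, assuming the four client-to-server algorithm lists have already been extracted from the key exchange init message: each list is comma-joined in advertised order, the four fields are joined with `;`, and the string is MD5-hashed. Function names and sample algorithm lists are illustrative.

```python
import hashlib


def hassh_string(kex: list[str], encryption: list[str],
                 mac: list[str], compression: list[str]) -> str:
    """Semicolon-joined fields, each a comma-joined list in advertised order."""
    return ";".join(",".join(field) for field in (kex, encryption, mac, compression))


def hassh(kex: list[str], encryption: list[str],
          mac: list[str], compression: list[str]) -> str:
    """MD5 hex digest of the HASSH string."""
    raw = hassh_string(kex, encryption, mac, compression)
    return hashlib.md5(raw.encode("ascii")).hexdigest()


kex = ["curve25519-sha256", "ecdh-sha2-nistp256"]
print(hassh_string(kex, ["aes128-ctr"], ["hmac-sha2-256"], ["none"]))
```

Since ordering is preserved, two clients that offer the same algorithms in a different order hash differently, which is why `kex_algorithm_order` works as a secondary signal.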
#### `toolchain.http.*` — HTTP fingerprints (3)
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.http.user_agent_tool_class` | categorical | Tool class from User-Agent and HTTP behavior. Known offensive tools use default or absent User-Agents. Values: `nmap_nse`, `sqlmap`, `nuclei`, `masscan`, `curl`, `metasploit`, `ffuf`, `gobuster`, `feroxbuster`, `nikto`, `wpscan`, `evilwinrm`, `impacket`, `unknown`. |
| `toolchain.http.header_order_fingerprint` | free_string | Hash of HTTP request header name order. Different libraries emit distinct sequences. `status: planned` |
| `toolchain.http.body_oddities` | array[free_string] | Anomalous body characteristics (e.g. `multipart_boundary_static`, `json_key_order_fixed`). `status: planned` |
#### `toolchain.c2.*` — C2 beaconing (6)
C2 primitives characterize implant beaconing behavior. Even fully encrypted C2
traffic leaves timing and structural fingerprints.
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.c2.beacon_family` | categorical | C2 framework identified from traffic fingerprints: `cobalt_strike`, `sliver`, `havoc`, `mythic`, `merlin` *(planned)*, `brc4` *(planned)*, `nighthawk` *(planned)*, `unknown`. |
| `toolchain.c2.beacon_interval_ms` | numeric | Median IAT between callbacks, in milliseconds. Cobalt Strike default is 60000ms. Very short intervals (<1000ms) suggest an interactive shell, not a beacon. |
| `toolchain.c2.beacon_jitter_cv` | numeric | Coefficient of variation (std/mean) of beacon IATs. Higher CV = more randomized jitter. Cobalt Strike default jitter is 0% (CV≈0); operators who understand detection set it to 20-50%. |
| `toolchain.c2.sleep_skew` | categorical | Jitter type applied to sleep intervals. `none` = fixed (detectable). `gaussian` = normally distributed. `uniform` = flat random range. `walk` = random-walk drift. `status: partial` |
| `toolchain.c2.c2_callback_endpoint` | free_string | URL or `host:port` of the C2 callback endpoint. |
| `toolchain.c2.attack_software_id` | free_string | MITRE ATT&CK Software ID (e.g. `S0154` for Cobalt Strike). |
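
The two numeric beacon primitives follow directly from their definitions: `beacon_interval_ms` is the median inter-arrival time (IAT) and `beacon_jitter_cv` is std/mean of the same IATs. The sketch below assumes callback timestamps in epoch seconds and uses the population standard deviation; both the function names and that choice of estimator are assumptions.

```python
from statistics import mean, median, pstdev


def inter_arrival_ms(callback_ts: list[float]) -> list[float]:
    """Gaps between consecutive callbacks, in milliseconds."""
    if len(callback_ts) < 2:
        raise ValueError("need at least two callbacks")
    ordered = sorted(callback_ts)
    return [(b - a) * 1000.0 for a, b in zip(ordered, ordered[1:])]


def beacon_interval_ms(callback_ts: list[float]) -> float:
    """Median IAT between callbacks."""
    return median(inter_arrival_ms(callback_ts))


def beacon_jitter_cv(callback_ts: list[float]) -> float:
    """Coefficient of variation (std / mean) of the IATs."""
    iats = inter_arrival_ms(callback_ts)
    avg = mean(iats)
    if avg == 0:
        raise ValueError("all callbacks share one timestamp")
    return pstdev(iats) / avg


fixed = [0.0, 60.0, 120.0, 180.0]      # fixed 60 s sleep, no jitter
jittered = [0.0, 50.0, 120.0, 170.0]   # same beacon with randomized sleep
print(beacon_interval_ms(fixed), beacon_jitter_cv(fixed))
print(beacon_interval_ms(jittered), round(beacon_jitter_cv(jittered), 3))
```

Using the median for the interval keeps a single missed callback from skewing the estimate, while the CV still reflects that outlier.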
#### `toolchain.protocol_abuse.*` — Protocol abuse (6)
Non-standard or offensive use of standard protocols.
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.protocol_abuse.dns_exfil_tool` | categorical | DNS tunneling tool. `iodine` = base32-encoded data in subdomains with TYPE NULL queries. `dnscat2` = TYPE TXT queries with specific entropy patterns. `custom_high_entropy` = tunneling-consistent but no known-tool match. `status: planned` |
| `toolchain.protocol_abuse.smb_dialect` | categorical | SMB dialect negotiated by the client. SMB1 in 2024+ is a strong indicator of legacy tooling or deliberate EternalBlue-era downgrade. `status: planned` |
| `toolchain.protocol_abuse.kerberos_etype_offer` | hash | Hash of the Kerberos AS-REQ etype list. Clients offering RC4-HMAC (etype 23) alongside modern etypes are candidates for Kerberoasting (Rubeus, Impacket GetUserSPNs). `status: planned [DRAFT — verify]` |
| `toolchain.protocol_abuse.ldap_bind_pattern` | categorical | LDAP bind mechanism. `simple` = cleartext (immediately suspicious). `sasl_gssapi` = Kerberos-backed (normal). `ntlm`, `ntlmssp_v1`, `responder_like` = NTLM and Responder-class MITM. `status: partial` |
| `toolchain.protocol_abuse.responder_signature` | free_string | Responder detection. Convention: `'false'` or `'true:llmnr'` / `'true:nbtns'` / `'true:mdns'`. Responder poisons LLMNR/NBNS/mDNS broadcasts to capture Net-NTLMv2 hashes. `status: planned` |
| `toolchain.protocol_abuse.mitm6_signature` | bool | Whether mitm6 activity is detected. mitm6 answers DHCPv6 requests on IPv4-only networks to install a rogue IPv6 DNS server, hijacking DNS and enabling credential relay attacks. `status: planned` |
#### `toolchain.payload.*` — Payload analysis (3)
| Primitive | Kind | Description |
|---|---|---|
| `toolchain.payload.payload_simhash` | hash | 64-bit SimHash of the payload binary/shellcode. Preserves near-duplicate relationships: payloads that are 90% similar have low Hamming distance (<4 bits on 64-bit), enabling family clustering despite minor obfuscation. 16-char hex. |
| `toolchain.payload.payload_entropy_class` | categorical | Shannon entropy of payload bytes. `packed` >7.2 bits/byte (UPX, encrypted shellcode, base64-compressed). `high` 6.5-7.2 (unencrypted compiled code). `low` <5.5 (scripts, plaintext). `status: planned` |
| `toolchain.payload.loader_family` | categorical | Shellcode/loader family from structural signatures. `donut` = Donut framework (TheWover), converts .NET/PE to PIC shellcode. `sgn` = Shikata-Ga-Nai XOR encoder (Metasploit), recognizable feedback register pattern. `pe2sh` = PE-to-shellcode. `nimcrypt` = Nim-based loader with AES-encrypted payload. `status: planned` |
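
To make the `toolchain.payload.payload_entropy_class` thresholds concrete, here is a minimal sketch that computes Shannon entropy in bits per byte and applies the bands listed in the table. The function names are hypothetical. The table does not name a label for the 5.5-6.5 band, so the sketch returns a placeholder `unclassified` there rather than guessing the registry value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def classify_payload_entropy(data: bytes) -> str:
    """Apply the README's entropy bands to a payload."""
    h = shannon_entropy(data)
    if h > 7.2:
        return "packed"
    if h >= 6.5:
        return "high"
    if h < 5.5:
        return "low"
    return "unclassified"  # placeholder: band not specified in the table


print(classify_payload_entropy(bytes(range(256))))   # uniform byte distribution
print(classify_payload_entropy(b"ls -la /tmp\n" * 20))  # repetitive plaintext
```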
---
## Schema
Machine-readable JSON Schema for the observation envelope:
[`json/observation.schema.json`](json/observation.schema.json)
Regenerate after model changes:
```bash
python scripts/generate_schema.py
```
## Tests
```bash
pytest tests/
```
## Attribution recipes
[`attribution-recipes.md`](attribution-recipes.md) — out-of-scope reference document
describing how an external attribution engine might consume `attacker.observation.shell.*`
topics to build operator profiles. Not part of the BEHAVE spec.
## License
Code and schemas: [GPL-3.0-or-later](../LICENSE)
Spec prose (this file, attribution-recipes.md): [CC-BY-SA-4.0](../LICENSE.docs)


@@ -2,13 +2,11 @@
"""BEHAVE primitive registry.
Source-of-truth for what `Observation.primitive` may be and what `Observation.value`
must look like. Mirrors every row in the primitive tables of `scratchpad.md`.
must look like.
Adding a new primitive is a deliberate registry edit. Sensors are expected to fail
loudly if they construct an `Observation` with an unknown primitive — that is by
design. Drift between this registry and `scratchpad.md` is a bug; v0.1 keeps the
registry hand-written so PR review catches drift, v0.2 may auto-extract from the
markdown if drift becomes a maintenance issue.
design.
PII discipline: the value-type specs here describe the SHAPE of the value, not
its content. Sensors are still bound by the rules in `spec/envelope.py`'s module
@@ -114,30 +112,115 @@ def _array(of: ValueKind, notes: Optional[str] = None) -> ValueTypeSpec:
# ─── The registry ───────────────────────────────────────────────────────────
#
# Mirrors scratchpad.md row-for-row. If you edit one, edit the other.
PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
# ── motor.* ────────────────────────────────────────────────────────────
"motor.keystroke_cadence": _cat("steady", "bursty", "hunt_and_peck", "machine"),
"motor.motor_stability": _cat("steady", "variable", "tremor"),
"motor.error_correction": _cat("immediate", "deferred", "absent", "route_around"),
"motor.command_chunking": _cat("fluent", "fragmented", "single_command"),
"motor.paste_burst_rate": _cat("none", "occasional", "habitual"),
# Motor primitives capture the physical mechanics of keyboard interaction —
# rhythm, precision, and habitual movements that are hard to fake and stable
# across sessions even when operators change tools or objectives.
"motor.keystroke_cadence": _cat(
"steady", "bursty", "hunt_and_peck", "machine",
notes="Rhythm of raw key input across the session. steady=metronomic rate "
"matching a confident typist. bursty=fast bursts separated by thinking "
"pauses. hunt_and_peck=search-first-then-type characteristic of unfamiliar "
"keyboard layout or low typing skill. machine=mechanically regular cadence "
"suggesting scripted or pasted input rather than live typing.",
),
"motor.motor_stability": _cat(
"steady", "variable", "tremor",
notes="Consistency of individual key hold and flight times (dwell/flight). "
"steady=low variance, typical of a confident touch-typist. variable=high "
"variance, common under cognitive load or on an unfamiliar keyboard. "
"tremor=rhythmic instability distinct from cognitive-load variance — may "
"indicate physical condition or a non-human input device.",
),
"motor.error_correction": _cat(
"immediate", "deferred", "absent", "route_around",
notes="How the operator corrects typing mistakes. immediate=backspace within ~1s "
"of the error (automatic self-monitoring, muscle memory). deferred=correction "
"after pausing to read output. absent=no correction — operator proceeds "
"despite errors, typical of scripts or operators who know the shell will "
"fail loudly. route_around=operator avoids retyping by using history recall "
"or rewriting the command differently.",
),
"motor.command_chunking": _cat(
"fluent", "fragmented", "single_command",
notes="Whether commands are typed in a single continuous flow or as fragments. "
"fluent=typed in one pass from memory with no mid-command pauses. "
"fragmented=typed in chunks with mid-command pauses — operator is composing "
"while typing, common when adapting a remembered skeleton to the current "
"context. single_command=operator runs exactly one complete command at a "
"time and never constructs pipelines inline.",
),
"motor.paste_burst_rate": _cat(
"none", "occasional", "habitual",
notes="Frequency of large clipboard-paste events relative to typed input. "
"Distinguishes an operator driving a terminal interactively from a script "
"feeding one. habitual=operator primarily works by pasting pre-prepared "
"command blocks; none=entirely typed.",
),
"motor.input_modality": _cat(
"typed", "pasted", "mixed",
notes="dominant input modality across the session — first-class promotion of the paste-vs-type axis",
notes="Dominant input modality across the session — first-class promotion of "
"the paste-vs-type axis. typed=operator types commands character by "
"character. pasted=operator pastes pre-prepared blocks. mixed=substantial "
"use of both.",
),
# motor.shell_mastery.*
"motor.shell_mastery.tab_completion": _cat("none", "occasional", "habitual"),
"motor.shell_mastery.shortcut_usage": _cat("none", "moderate", "heavy"),
"motor.shell_mastery.pipe_chaining_depth": _cat("shallow", "moderate", "deep"),
"motor.shell_mastery.tab_completion": _cat(
"none", "occasional", "habitual",
notes="Tab key completion usage across the session. habitual=operator relies on "
"it constantly (inferred from the latency pattern: short pause then rapid "
"continuation after a partial path or command). none=operator types full "
"paths and commands without completion. Strong indicator of shell familiarity.",
),
"motor.shell_mastery.shortcut_usage": _cat(
"none", "moderate", "heavy",
notes="Use of shell keyboard shortcuts (Ctrl+R for history search, Ctrl+A/E for "
"line navigation, Ctrl+L for clear, Alt+. for last argument, etc.). Heavy "
"usage indicates deep shell muscle memory, reliably stable across sessions.",
),
"motor.shell_mastery.pipe_chaining_depth": _cat(
"shallow", "moderate", "deep",
notes="Maximum depth of pipeline chains observed (cmd | cmd | cmd...). shallow=0-1 "
"pipes, moderate=2-3, deep=4+. Reflects preference for composing Unix tools "
"rather than running one-off commands. Correlates with cognitive.tool_vocabulary.",
),
# ── cognitive.* ────────────────────────────────────────────────────────
"cognitive.cognitive_load": _cat("low", "medium", "high"),
"cognitive.exploration_style": _cat("methodical", "chaotic", "targeted"),
"cognitive.planning_depth": _cat("deep", "shallow", "reactive"),
"cognitive.tool_vocabulary": _cat("narrow", "moderate", "broad"),
# Cognitive primitives capture how the operator thinks and makes decisions —
# their planning style, how they respond to uncertainty, and signs that they
# are human vs. automated.
"cognitive.cognitive_load": _cat(
"low", "medium", "high",
notes="Inferred mental workload derived from timing patterns, error rate, and "
"inter-command variance. high=long pauses before and after commands, "
"frequent error-retry cycles, fragmented command chunking. Collapses "
"multiple temporal and motor signals into a holistic load estimate. "
"Useful as a composite feature for downstream attribution rather than "
"a standalone signal.",
),
"cognitive.exploration_style": _cat(
"methodical", "chaotic", "targeted",
notes="How the operator navigates an unfamiliar environment. methodical=systematic "
"enumeration (ls→cat→id→uname in a logical sequence). chaotic=non-sequential "
"jumps between unrelated commands with no visible thread. targeted=operator "
"knows exactly what they want and goes straight for it without exploring.",
),
"cognitive.planning_depth": _cat(
"deep", "shallow", "reactive",
notes="Whether the operator works from a pre-formed plan. deep=commands follow a "
"visible logical sequence (recon→pivot→exfil) with little backtracking. "
"shallow=opportunistic — follows each output where it leads. reactive=operator "
"responds only to errors or surprises rather than driving toward an objective.",
),
"cognitive.tool_vocabulary": _cat(
"narrow", "moderate", "broad",
notes="Breadth of distinct tools and commands used across the session. narrow=operator "
"relies on a small fixed toolset (e.g. only curl, grep, ls). broad=operator "
"reaches for the best tool for each subtask, suggesting deep familiarity with "
"the Unix ecosystem or the target environment.",
),
"cognitive.inter_command_latency_class": _cat(
"instant", "typing_speed", "deliberate",
"llm_lightweight", "llm_heavyweight", "long",
@@ -150,7 +233,7 @@ PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
),
"cognitive.inter_command_consistency": _cat(
"metronomic", "variable", "bimodal",
notes="dispersion (CV) of inter-command pauses; metronomic = LLM-pure, "
notes="Dispersion (CV) of inter-command pauses; metronomic = LLM-pure, "
"variable = human, bimodal = LLM-assisted human (LLM-paced bursts + "
"human-thinking gaps). v0.1 uses CV thresholds; true bimodal "
"detection (Hartigan dip / two-peak detection) is v0.2.",
@@ -184,108 +267,460 @@ PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
"more honest framing.",
),
# cognitive.error_resilience.*
"cognitive.error_resilience.retry_tactic": _cat("rerun", "modify", "switch", "abort"),
"cognitive.error_resilience.frustration_typing": _cat("low", "moderate", "high"),
"cognitive.error_resilience.fallback_to_man": _cat("absent", "present"),
"cognitive.error_resilience.retry_tactic": _cat(
"rerun", "modify", "switch", "abort",
notes="What the operator does when a command fails. rerun=identical retry with "
"no changes (hoping transient error clears). modify=adjusts the command "
"before retrying (flags, paths, arguments). switch=abandons the tool and "
"tries a different one for the same goal. abort=gives up on that objective "
"and moves on.",
),
"cognitive.error_resilience.frustration_typing": _cat(
"low", "moderate", "high",
notes="Elevated typing speed or error rate immediately after a command failure, "
"indicating an emotional response to the setback. high=sharp speed spike "
"and error burst post-failure. A behavioral tell that separates emotionally "
"reactive humans from scripted operators or composed professionals.",
),
"cognitive.error_resilience.fallback_to_man": _cat(
"absent", "present",
notes="Whether the operator invokes man, --help, or -h when stuck. present is a "
"tell for unfamiliarity with the specific tool in use — an operator who "
"knows their tools cold rarely needs to. Absent in scripted runs.",
),
# ── temporal.* ─────────────────────────────────────────────────────────
"temporal.session_timing": _cat("diurnal", "nocturnal", "irregular"),
"temporal.session_duration": _cat("short", "medium", "long", "marathon"),
"temporal.escalation_pattern": _cat("sustained", "erratic", "bursty"),
"temporal.persistence": _cat("hit_and_run", "return_visitor", "resident"),
# Temporal primitives characterize WHEN and HOW LONG an operator works.
# Stable across sessions; hard to fake consistently over a campaign.
"temporal.session_timing": _cat(
"diurnal", "nocturnal", "irregular",
notes="Hour-of-day distribution of the operator's activity. diurnal=activity "
"peaks align with local business hours (09:00-18:00). nocturnal=peaks in "
"local night hours (22:00-06:00). irregular=no discernible daily pattern. "
"The local timezone must be established separately (see cultural.*) to "
"interpret diurnal/nocturnal meaningfully.",
),
"temporal.session_duration": _cat(
"short", "medium", "long", "marathon",
notes="Typical duration of a single continuous session. short=<15min, "
"medium=15-90min, long=90min-4hr, marathon=>4hr. Stable individual "
"characteristic — some operators always work in short sprints, others "
"in long unbroken stretches.",
),
"temporal.escalation_pattern": _cat(
"sustained", "erratic", "bursty",
notes="How activity intensity changes across a session. sustained=constant "
"command rate throughout. erratic=unpredictable spikes and lulls. "
"bursty=concentrated activity followed by extended quiet — common when "
"an operator waits for a long-running process before continuing.",
),
"temporal.persistence": _cat(
"hit_and_run", "return_visitor", "resident",
notes="Cross-session return behavior. hit_and_run=one or very few sessions then "
"disappears. return_visitor=returns periodically (e.g. weekly maintenance). "
"resident=near-continuous presence, behaves as if the compromised host is "
"a persistent workstation.",
),
# temporal.lifecycle_markers.*
"temporal.lifecycle_markers.landing_ritual": _cat("present", "absent"),
"temporal.lifecycle_markers.exit_behavior": _cat("graceful", "abrupt", "cleanup"),
"temporal.lifecycle_markers.idle_periodicity": _cat("random", "periodic"),
"temporal.lifecycle_markers.landing_ritual": _cat(
"present", "absent",
notes="Whether the operator runs a recognizable sequence of commands at session "
"start (e.g. whoami → id → uname -a → hostname → ip addr). present=a "
"fingerprinted landing ritual is detected, suggesting established habit or "
"a pre-written checklist. absent=operator jumps straight to objective work.",
),
"temporal.lifecycle_markers.exit_behavior": _cat(
"graceful", "abrupt", "cleanup",
notes="How the session ends. graceful=explicit logout or exit command. "
"abrupt=connection drops without cleanup (killed, network failure, or "
"scripted timeout). cleanup=operator deletes logs, tools, or temp files "
"before exiting — the strongest opsec signal in this category.",
),
"temporal.lifecycle_markers.idle_periodicity": _cat(
"random", "periodic",
notes="Whether intra-session pauses (idle gaps >30s) occur at statistically "
"regular intervals or at random. periodic=heartbeat-like idle pattern — "
"may indicate an LLM polling loop, an automated keepalive, or a human "
"following a timed workflow. random=human thinking pauses with no "
"detectable rhythm.",
),
# ── operational.* ──────────────────────────────────────────────────────
"operational.opsec_discipline": _cat("careful", "careless", "learning"),
"operational.cleanup_behavior": _cat("thorough", "partial", "none"),
"operational.objective": _cat("recon", "exfil", "persistence", "lateral", "destructive"),
"operational.multi_actor_indicators": _cat("solo", "handoff_detected", "team_coordinated"),
# Operational primitives describe WHAT the operator is trying to do and HOW
# carefully they're hiding it. These are coarser inferences from command patterns
# rather than direct measurements.
"operational.opsec_discipline": _cat(
"careful", "careless", "learning",
notes="How carefully the operator minimizes their forensic footprint. "
"careful=history disabled (HISTFILE=/dev/null), tools removed after use, "
"proxy/VPN confirmed, log entries tampered. careless=no precautions — "
"history on, tools left in /tmp, no timestamp cover. learning=inconsistent "
"and improving across sessions, characteristic of an operator developing "
"their craft mid-campaign.",
),
"operational.cleanup_behavior": _cat(
"thorough", "partial", "none",
notes="What the operator does with artifacts (uploaded tools, compiled binaries, "
"temp files) at session end. thorough=removes everything explicitly, "
"including bash history. partial=removes some artifacts but misses others "
"(common). none=leaves all artifacts — operator either trusts the implant "
"to cover or does not expect forensic review.",
),
"operational.objective": _cat(
"recon", "exfil", "persistence", "lateral", "destructive",
notes="Inferred mission objective from command-pattern analysis. recon=enumeration "
"and data collection without exfiltration. exfil=active data transfer out "
"of scope. persistence=installing mechanisms to survive reboot or session "
"end (cron, systemd, ssh key). lateral=pivoting to adjacent hosts. "
"destructive=wipe, encrypt, or sabotage commands.",
),
"operational.multi_actor_indicators": _cat(
"solo", "handoff_detected", "team_coordinated",
notes="Whether the session shows signs of more than one person operating. "
"handoff_detected=a detectable style break mid-session (motor cadence, "
"vocabulary, or latency class changes sharply at a point in time). "
"team_coordinated=multiple style signatures interleaved or simultaneous "
"activity from the same account across sessions.",
),
# ── environmental.* ────────────────────────────────────────────────────
"environmental.keyboard_layout": _cat("qwerty", "azerty", "qwertz", "other"),
"environmental.locale": _str(notes="BCP-47 tag (e.g. 'en-US', 'pt-BR'); free string by deliberate choice"),
"environmental.numpad_usage": _cat("detected", "not_detected"),
"environmental.terminal_multiplexer": _cat("none", "tmux", "screen"),
"environmental.shell_type": _cat("bash", "zsh", "fish", "cmd.exe", "powershell"),
# Environmental primitives describe the physical and software context the
# operator works from. Stable per-campaign; often reveals national origin
# or infrastructure choices.
"environmental.keyboard_layout": _cat(
"qwerty", "azerty", "qwertz", "other",
notes="Inferred keyboard layout from characteristic key-sequence errors. An "
"AZERTY-trained typist on a QWERTY keyboard makes specific substitutions "
"(q↔a, z↔w, m→,) that are statistically distinguishable from random "
"errors. Reliable when error volume is sufficient (typically >50 errors "
"in the session).",
),
"environmental.locale": _str(
notes="BCP-47 tag (e.g. 'en-US', 'pt-BR'); free string by deliberate choice — "
"locale is not a closed enum. Inferred from keyboard layout, cultural "
"timing patterns, and command-line character encoding artifacts.",
),
"environmental.numpad_usage": _cat(
"detected", "not_detected",
notes="Whether the operator uses a numeric keypad for digit entry, inferred from "
"keycode patterns. detected signals a desktop keyboard rather than a laptop, "
"which narrows the physical environment.",
),
"environmental.terminal_multiplexer": _cat(
"none", "tmux", "screen",
notes="Presence of tmux or screen, inferred from keybinding escape sequences "
"(Ctrl+B or Ctrl+A prefixes) and window-switching patterns. Multiplexer use "
"suggests a persistent, organized working style.",
),
"environmental.shell_type": _cat(
"bash", "zsh", "fish", "cmd.exe", "powershell",
notes="Shell environment, inferred from syntax patterns (array syntax, string "
"quoting style, builtin names). powershell and cmd.exe immediately flag a "
"Windows-native operator, which constrains the likely toolchain.",
),
# ── cultural.* ─────────────────────────────────────────────────────────
"cultural.meal_break_gaps": _cat("none_detected", "morning", "midday", "evening", "late_night"),
"cultural.periodic_micro_pauses": _cat("none_detected", "regular_intervals_detected"),
"cultural.dst_behavior": _cat("shifts_with_dst", "anchored_to_utc", "unknown"),
"cultural.weekend_cadence": _cat("fri_sat", "sat_sun", "no_weekend", "irregular"),
"cultural.holiday_gaps": _cat("none_detected", "specific_dates_detected"),
# Cultural primitives exploit the fact that human work patterns are shaped by
# local time, religion, and social convention. These signals are hard to sustain
# as deception across a long campaign.
"cultural.meal_break_gaps": _cat(
"none_detected", "morning", "midday", "evening", "late_night",
notes="Whether activity gaps align with regional meal times. morning=09:00-10:00 "
"local, midday=12:00-14:00, evening=19:00-21:00, late_night=00:00-02:00. "
"Absent if the operator works through typical meal windows. Requires "
"environmental.locale or a known timezone to interpret.",
),
"cultural.periodic_micro_pauses": _cat(
"none_detected", "regular_intervals_detected",
notes="Short, rhythmic pauses of 5-15 minutes recurring at consistent intervals "
"within a session. May correspond to prayer times (Salah — 5 daily, "
"spaced ~2-3hr in active hours), smoke breaks, or other cultural micro-"
"rituals. regular_intervals_detected means the null hypothesis of random "
"pauses is rejected at p<0.05.",
),
"cultural.dst_behavior": _cat(
"shifts_with_dst", "anchored_to_utc", "unknown",
notes="Whether the operator's active-hours window shifts by 1 hour at daylight "
"saving transitions. shifts_with_dst=schedule follows local civil time "
"(the operator lives there). anchored_to_utc=schedule is clock-fixed, "
"suggesting automated infrastructure or an operator who deliberately anchors "
"to UTC to defeat this analysis.",
),
"cultural.weekend_cadence": _cat(
"fri_sat", "sat_sun", "no_weekend", "irregular",
notes="Which two-day block the operator treats as a weekend (low-activity days). "
"fri_sat=Middle Eastern / Israeli weekend pattern. sat_sun=Western / "
"East Asian pattern. no_weekend=operator works 7 days at uniform intensity. "
"A reliable national-origin signal when observed across multiple weeks.",
),
"cultural.holiday_gaps": _cat(
"none_detected", "specific_dates_detected",
notes="Whether unexplained multi-day inactivity gaps align with known public "
"holiday calendars. specific_dates_detected triggers when a gap of >=2 days "
"falls within ±1 day of a public holiday in at least one candidate locale. "
"Requires a multi-session corpus spanning calendar events.",
),
# ── emotional_valence.* ────────────────────────────────────────────────
"emotional_valence.valence": _cat("positive", "neutral", "negative"),
"emotional_valence.arousal": _cat("low_calm", "medium_engaged", "high_agitated"),
"emotional_valence.stress_response": _cat("none", "eustress_positive", "distress_negative"),
"emotional_valence.frustration_venting": _cat("none", "detected"),
# Emotional valence primitives infer affective state from TYPING DYNAMICS —
# pace, error rate, and aggression in key input. They do NOT read message
# content; BEHAVE-SHELL is content-blind.
"emotional_valence.valence": _cat(
"positive", "neutral", "negative",
notes="Overall affective tone inferred from typing dynamics across the session. "
"positive=fluent, low-error, engaged pace. negative=error-heavy, erratic, "
"showing markers of frustration or stress. This is a coarse aggregate; "
"see arousal and stress_response for finer-grained breakdown.",
),
"emotional_valence.arousal": _cat(
"low_calm", "medium_engaged", "high_agitated",
notes="How energized or activated the operator appears. low_calm=slow, deliberate "
"pace with long inter-command gaps. high_agitated=fast, error-prone bursts "
"with short pauses. This dimension is orthogonal to valence: a calm "
"professional and a calm automated script are both low_calm.",
),
"emotional_valence.stress_response": _cat(
"none", "eustress_positive", "distress_negative",
notes="Whether detected high arousal reflects positive challenge or negative overload. "
"eustress_positive=speed-up with low error rate (operator in the zone, engaged "
"problem-solving). distress_negative=speed-up accompanied by rising error rate "
"and frustration-venting markers (overloaded, panicking). none=arousal is "
"insufficient to classify.",
),
"emotional_valence.frustration_venting": _cat(
"none", "detected",
notes="Detectable outburst signal: a sudden spike in typing speed or rapid-fire "
"backspace/delete keys immediately following a string of command failures. "
"Distinct from sustained high arousal — this is a transient, failure-triggered "
"event. Absent in scripted runs; strong human indicator.",
),
# ── toolchain.tls.* ────────────────────────────────────────────────────
"toolchain.tls.ja3_client": _hash(),
"toolchain.tls.ja3s_server": _hash(),
"toolchain.tls.ja4_client": _hash(),
"toolchain.tls.ja4s_server": _hash(),
"toolchain.tls.jarm_server": _hash(notes="62-char JARM hash"),
"toolchain.tls.tls_cert_simhash": _hash(notes="SHA-256 hex of leaf cert"),
# TLS fingerprints identify the client and server stacks by their handshake
# parameters. Each tool, library, and OS tends to produce a recognizable
# fingerprint even when the payload is encrypted.
"toolchain.tls.ja3_client": _hash(
notes="MD5 hash of TLS ClientHello parameters: SSLVersion, Ciphers, Extensions, "
"EllipticCurves, EllipticCurvePointFormats (Salesforce, 2017). Fingerprints "
"the client TLS stack — curl, OpenSSL, Metasploit, Cobalt Strike, and most "
"offensive tools each produce a distinct hash. Searchable against public "
"databases (e.g. ja3er.com). [DRAFT — verify]",
),
"toolchain.tls.ja3s_server": _hash(
notes="MD5 hash of TLS ServerHello parameters: SSLVersion, Cipher, Extensions. "
"Fingerprints the server TLS stack. Useful for identifying C2 servers by "
"their TLS response even when IP addresses rotate — the server library "
"version (e.g. OpenSSL vs. WolfSSL) is often stable. [DRAFT — verify]",
),
"toolchain.tls.ja4_client": _hash(
notes="JA4 fingerprint (FoxIO, 2023): replaces JA3 with a sortable, "
"human-readable format (e.g. t13d1516h2_8daaf6152771_e5627efa2ab1) that "
"is more robust to TLS extension order randomization. Encodes TLS version, "
"cipher count, extension count, ALPN, cipher hash, and extension hash in "
"three underscore-separated fields. Preferred over JA3 for new sensors. "
"[DRAFT — verify]",
),
"toolchain.tls.ja4s_server": _hash(
notes="JA4 server-side fingerprint (JA4S): encodes the TLS version, extension "
"count, ALPN, the single cipher chosen in the ServerHello, and a truncated "
"hash of the ServerHello extension list. A ServerHello carries one selected "
"cipher, so there is no cipher list to sort. "
"[DRAFT — verify]",
),
"toolchain.tls.jarm_server": _hash(
notes="62-char JARM hash (Salesforce, 2020). Actively probes the server by "
"sending 10 specially crafted TLS ClientHellos and hashing the ServerHello "
"responses. Fingerprints the server TLS stack at a deeper level than JA3S — "
"detects Cobalt Strike, Metasploit, and major C2 frameworks reliably even "
"when they use custom certificates.",
),
"toolchain.tls.tls_cert_simhash": _hash(
notes="SHA-256 hex of the leaf certificate's DER-encoded bytes. Tracks the "
"specific certificate in use, not just the stack. Useful for correlating "
"C2 infrastructure that reuses self-signed certs across campaigns.",
),
# ── toolchain.transport.* ──────────────────────────────────────────────
"toolchain.transport.tcp_stack": _str(notes="p0f label, e.g. 'Linux 5.x'"),
"toolchain.transport.h2_akamai_fingerprint": _str(notes="HTTP/2 SETTINGS+priority+pseudo-header order hash; status: planned"),
"toolchain.transport.quic_client": _str(notes="QUIC initial packet fingerprint; status: planned"),
"toolchain.transport.tcp_stack": _str(
notes="p0f label for the TCP/IP stack (e.g. 'Linux 5.x', 'Windows 10'). Inferred "
"from TCP header field quirks (TTL, window size, options order, DF bit). "
"Reveals the OS of the connecting host even before any application-layer "
"protocol is seen.",
),
"toolchain.transport.h2_akamai_fingerprint": _str(
notes="HTTP/2 SETTINGS frame + priority frame + pseudo-header order hash. "
"Different HTTP/2 client libraries produce distinct SETTINGS and priority "
"combinations (curl vs. Python requests vs. Go net/http). "
"status: planned",
),
"toolchain.transport.quic_client": _str(
notes="QUIC initial packet fingerprint derived from transport parameters and "
"connection ID length patterns. Fingerprints the QUIC library in use. "
"status: planned",
),
# ── toolchain.ssh.* ────────────────────────────────────────────────────
"toolchain.ssh.hassh_client": _hash(notes="md5"),
"toolchain.ssh.hassh_server": _hash(notes="md5; status: partial"),
"toolchain.ssh.ssh_client_banner": _str(notes="RFC 4253 banner string"),
"toolchain.ssh.kex_algorithm_order": _array(ValueKind.FREE_STRING),
"toolchain.ssh.hassh_client": _hash(
notes="MD5 hash of SSH client KEX parameters: kex_algorithms, encryption_algorithms, "
"mac_algorithms, compression_algorithms (Salesforce, 2018). Each SSH client "
"library (OpenSSH, PuTTY, libssh, Paramiko) produces a distinct "
"HASSH. Stable across versions within a major release.",
),
"toolchain.ssh.hassh_server": _hash(
notes="MD5 hash of SSH server KEX parameters (same field set as HASSH client). "
"Fingerprints the SSH daemon — useful for identifying honeypots, implants, "
"or non-standard SSH servers. status: partial",
),
"toolchain.ssh.ssh_client_banner": _str(
notes="RFC 4253 protocol version string sent by the SSH client (e.g. "
"'SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6'). Often unmodified even in "
"offensive tooling, providing an easy first-pass fingerprint.",
),
"toolchain.ssh.kex_algorithm_order": _array(
ValueKind.FREE_STRING,
notes="Ordered list of key-exchange algorithms offered in the client's "
"SSH_MSG_KEXINIT message "
"(e.g. ['curve25519-sha256', 'ecdh-sha2-nistp256', 'diffie-hellman-group14-sha256']). "
"Different clients (OpenSSH, PuTTY, Paramiko) advertise "
"distinct KEX orderings, providing a secondary fingerprint beyond HASSH. "
"[DRAFT — verify]",
),
# ── toolchain.http.* ───────────────────────────────────────────────────
"toolchain.http.user_agent_tool_class": _cat(
"nmap_nse", "sqlmap", "nuclei", "masscan", "curl", "metasploit",
"ffuf", "gobuster", "feroxbuster", "nikto", "wpscan", "evilwinrm",
"impacket", "unknown",
notes="Tool classification from User-Agent string and HTTP behavior fingerprint. "
"Known offensive tools typically use default User-Agent strings or omit the "
"header entirely, making them trivially classifiable. unknown=no match in "
"the known-tool list.",
),
"toolchain.http.header_order_fingerprint": _str(
notes="Hash of the HTTP request header name order. Different HTTP client libraries "
"emit headers in distinct sequences (Host first vs. last, Accept-Encoding "
"presence, etc.). Fingerprints the underlying HTTP library independently of "
"the User-Agent. status: planned",
),
"toolchain.http.body_oddities": _array(
ValueKind.FREE_STRING,
notes="List of anomalous body characteristics (e.g. 'multipart_boundary_static', "
"'json_key_order_fixed', 'soap_envelope_namespace_style'). Captures "
"tool-specific body serialization tics. status: planned",
),
"toolchain.http.header_order_fingerprint": _str(notes="status: planned"),
"toolchain.http.body_oddities": _array(ValueKind.FREE_STRING, notes="status: planned"),
# ── toolchain.c2.* ─────────────────────────────────────────────────────
# C2 (Command and Control) primitives characterize the beaconing and callback
# behavior of implants. Even encrypted C2 traffic leaves timing and structural
# fingerprints.
"toolchain.c2.beacon_family": _cat(
"cobalt_strike", "sliver", "havoc", "mythic",
"merlin", "brc4", "nighthawk", "unknown",
notes="last 3 = status: planned",
notes="C2 framework identified from beacon timing, traffic shape, and protocol "
"fingerprints. cobalt_strike, sliver, havoc, mythic=well-characterized "
"open-source or widely-used commercial frameworks. merlin, brc4, "
"nighthawk=status: planned (less common; less training data).",
),
"toolchain.c2.beacon_interval_ms": _num(
min_val=0,
notes="Median inter-arrival time (IAT) between beacon callbacks, in milliseconds. "
"Cobalt Strike default is 60000ms (60s). Operators often lower this for "
"interactivity. Very short intervals (<1000ms) suggest an interactive shell "
"rather than a true beacon.",
),
"toolchain.c2.beacon_jitter_cv": _num(
min_val=0,
notes="Coefficient of variation (std/mean) of beacon IATs. Higher CV means more "
"randomized jitter — a deliberate evasion technique to defeat fixed-interval "
"detection. Cobalt Strike's default jitter is 0% (CV≈0); operators who "
"understand detection set it to 20-50%.",
),
"toolchain.c2.sleep_skew": _cat(
"none", "gaussian", "uniform", "walk",
notes="Type of jitter applied to beacon sleep intervals. none=fixed interval "
"(detectable by timing analysis). gaussian=normally-distributed jitter. "
"uniform=flat random range. "
"walk=random-walk drift (each sleep shifts from the previous). "
"status: partial",
),
"toolchain.c2.c2_callback_endpoint": _str(
notes="URL or host:port of the C2 callback endpoint observed in traffic. "
"Plain string — do not store post-decryption content here.",
),
"toolchain.c2.attack_software_id": _str(
notes="MITRE ATT&CK Software ID (e.g. 'S0154' for Cobalt Strike). Provides a "
"stable cross-reference to the MITRE knowledge base for attribution reporting.",
),
"toolchain.c2.beacon_interval_ms": _num(min_val=0, notes="median IAT in milliseconds"),
"toolchain.c2.beacon_jitter_cv": _num(min_val=0, notes="coefficient of variation"),
"toolchain.c2.sleep_skew": _cat("none", "gaussian", "uniform", "walk", notes="status: partial"),
"toolchain.c2.c2_callback_endpoint": _str(notes="url or host:port"),
"toolchain.c2.attack_software_id": _str(notes="MITRE Software ID, e.g. 'S0154'"),
# ── toolchain.protocol_abuse.* ─────────────────────────────────────────
# Protocol abuse primitives capture non-standard or offensive use of standard
# protocols — DNS tunneling, SMB negotiation quirks, Kerberos downgrade attempts,
# and LLMNR/NBNS poisoning tools.
"toolchain.protocol_abuse.dns_exfil_tool": _cat(
"iodine", "dnscat2", "custom_high_entropy", "none", notes="status: planned",
"iodine", "dnscat2", "custom_high_entropy", "none",
notes="DNS tunneling tool identified from query patterns. iodine=base32-encoded "
"data in subdomains with TYPE NULL queries. dnscat2=TYPE TXT queries with "
"specific length/entropy patterns. custom_high_entropy=high-entropy "
"subdomains consistent with tunneling but not matching a known tool signature. "
"status: planned",
),
"toolchain.protocol_abuse.smb_dialect": _cat(
"SMB1", "SMB2.0.2", "SMB2.1", "SMB3.0", "SMB3.0.2", "SMB3.1.1",
notes="status: planned",
notes="SMB protocol dialect negotiated by the client. SMB1 use in 2024+ is a "
"strong indicator of legacy tooling or deliberate downgrade (EternalBlue-era "
"exploits require SMB1). SMB3.1.1 with pre-auth integrity check is the "
"modern hardened default. status: planned",
),
"toolchain.protocol_abuse.kerberos_etype_offer": _hash(
notes="Hash of the set of encryption types offered in the Kerberos AS-REQ etype "
"list. Clients that offer RC4-HMAC (etype 23) alongside modern etypes are "
"candidates for AS-REP roasting or Kerberoasting tooling (Rubeus, Impacket "
"GetUserSPNs). The hash captures the exact etype combination without "
"storing the cleartext list. status: planned [DRAFT — verify]",
),
"toolchain.protocol_abuse.kerberos_etype_offer": _hash(notes="status: planned — hash of supported etypes"),
"toolchain.protocol_abuse.ldap_bind_pattern": _cat(
"simple", "sasl_gssapi", "ntlm", "ntlmssp_v1", "responder_like",
notes="status: partial",
notes="LDAP bind mechanism used by the client. simple=cleartext credentials "
"(dangerous, immediately suspicious in modern environments). "
"sasl_gssapi=Kerberos-backed GSSAPI (normal). ntlm=NTLM challenge-response. "
"ntlmssp_v1=downgraded NTLMv1 (Responder target). responder_like=sequence "
"of binds matching Responder or similar MITM tools. status: partial",
),
"toolchain.protocol_abuse.responder_signature": _str(
notes="bool + variant; convention: 'false' or 'true:llmnr', 'true:nbtns', etc.; status: planned",
notes="Boolean + variant string indicating whether Responder (or a compatible tool) "
"was detected. Convention: 'false' if absent; 'true:llmnr', 'true:nbtns', "
"'true:mdns' for the poisoning protocol detected. Responder poisons LLMNR, "
"NBNS, and mDNS broadcasts to capture Net-NTLMv2 hashes. status: planned",
),
"toolchain.protocol_abuse.mitm6_signature": _bool(
notes="Whether mitm6 (Fox-IT tool) activity is detected. mitm6 answers DHCPv6 "
"requests on predominantly IPv4 networks to hand Windows hosts an "
"attacker-controlled DNS server, enabling credential relay attacks. "
"status: planned",
),
"toolchain.protocol_abuse.mitm6_signature": _bool(notes="status: planned"),
# ── toolchain.payload.* ────────────────────────────────────────────────
"toolchain.payload.payload_simhash": _hash(notes="64-bit SimHash, hex string"),
"toolchain.payload.payload_entropy_class": _cat("low", "medium", "high", "packed", notes="status: planned"),
"toolchain.payload.loader_family": _cat("donut", "sgn", "pe2sh", "nimcrypt", "unknown", notes="status: planned"),
"toolchain.payload.payload_simhash": _hash(
notes="64-bit SimHash of the observed payload binary or shellcode. SimHash "
"preserves near-duplicate relationships: near-identical payloads tend to "
"have a low Hamming distance (a few differing bits out of 64), "
"enabling family clustering even when the operator applies minor obfuscation. "
"Stored as a 16-char hex string.",
),
"toolchain.payload.payload_entropy_class": _cat(
"low", "medium", "high", "packed",
notes="Shannon entropy class of the payload bytes. packed=entropy >7.2 bits/byte, "
"characteristic of UPX or custom packing, encrypted shellcode, or "
"compressed payloads (base64 text tops out at 6 bits/byte, so it cannot "
"reach this band). high=6.5-7.2, typical of unencrypted compiled code. "
"low=<5.5, typical of scripts or plaintext. status: planned",
),
"toolchain.payload.loader_family": _cat(
"donut", "sgn", "pe2sh", "nimcrypt", "unknown",
notes="Shellcode/loader family identified from structural signatures. donut=Donut "
"framework (TheWover) — converts .NET assemblies and PE files to position-"
"independent shellcode with a recognizable header. sgn=Shikata-Ga-Nai encoder "
"(Metasploit) — polymorphic XOR encoder with a distinct feedback register "
"pattern. pe2sh=PE-to-shellcode conversion. nimcrypt=Nim-based loader with "
"AES-encrypted payload. status: planned",
),
}
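
The notes for `toolchain.c2.beacon_interval_ms` and `toolchain.c2.beacon_jitter_cv` describe a median inter-arrival time and a coefficient of variation (std/mean). A minimal sketch of both, assuming callback timestamps are given in seconds; the function names and the use of the population standard deviation are illustrative assumptions, not the sensor's actual implementation:

```python
from statistics import mean, median, pstdev


def inter_arrival_times(callback_ts: list[float]) -> list[float]:
    """Gaps in seconds between consecutive callbacks, in time order."""
    ordered = sorted(callback_ts)
    if len(ordered) < 2:
        raise ValueError("need at least two callbacks to form an interval")
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def beacon_interval_ms(callback_ts: list[float]) -> float:
    """Median inter-arrival time, converted from seconds to milliseconds."""
    return median(inter_arrival_times(callback_ts)) * 1000.0


def beacon_jitter_cv(callback_ts: list[float]) -> float:
    """Coefficient of variation (std/mean) of the inter-arrival times.

    Population standard deviation is an assumption made for this sketch.
    """
    iats = inter_arrival_times(callback_ts)
    mu = mean(iats)
    if mu <= 0:
        raise ValueError("mean interval must be positive")
    return pstdev(iats) / mu
```

A fixed 60-second beacon yields a CV of zero, while any jitter pushes the CV above zero, which is the property the registry notes rely on.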

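The `toolchain.payload.payload_entropy_class` notes bucket payloads by Shannon entropy in bits per byte. A hedged sketch of that classification: the `packed`, `high`, and `low` thresholds come from the notes, while the `medium` band (5.5 to 6.5) is an assumption that fills the gap between them, and both function names are hypothetical:

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_class(data: bytes) -> str:
    """Map entropy to a registry label; the medium band is assumed."""
    h = shannon_entropy(data)
    if h > 7.2:
        return "packed"
    if h >= 6.5:
        return "high"
    if h >= 5.5:
        return "medium"  # assumed: the gap between the stated low and high bands
    return "low"
```

A buffer in which every byte value appears equally often reaches the 8 bits/byte ceiling and classifies as `packed`; a buffer of one repeated byte has zero entropy and classifies as `low`.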
196
BEHAVE-TEXT/README.md Normal file

@@ -0,0 +1,196 @@
<!-- SPDX-License-Identifier: CC-BY-SA-4.0 -->
# behave-text
[← repo](../README.md)
Text/messaging-domain behavioral observation registry. Defines what can be observed
about an actor through their written messaging activity — stylometric fingerprints,
lexical patterns, interaction rhythms, and governance-role signals.
BEHAVE-TEXT operates on **derived features, not raw text**. Sensors hash, aggregate,
and classify before emitting — the raw message content never enters a BEHAVE
observation. This is a tighter constraint than BEHAVE-SHELL because the source
signal *is* text content; the PII risk is higher.
The topic prefix is `actor.observation.text` (not `attacker.`) because chat groups
include non-attacker roles — admins, buyers, sellers, bots, lurkers. The framing
is deliberately neutral: BEHAVE-TEXT observes actors, not adversaries.
## Install
```bash
pip install -e ../core/ -e .
# development:
pip install -e ../core/ -e ".[dev]"
```
## Quickstart
```python
from behave_text.spec import Observation, Window, TOPIC_PREFIX, event_topic_for
obs = Observation(
primitive="stylometric.capitalization_habit",
value="lowercase",
confidence=0.91,
window=Window(start_ts=1714000000.0, end_ts=1714086400.0),
source="behave/text-sensor/stylometry.py",
)
topic = event_topic_for("stylometric.capitalization_habit")
# → "actor.observation.text.stylometric.capitalization_habit"
```
## Public API (`behave_text.spec`)
| Symbol | Description |
|---|---|
| `Observation` | Registry-aware subclass of `behave_core.spec.Observation`. Validates `primitive` and `value` against `PRIMITIVE_REGISTRY`. |
| `Window` | Re-exported from `behave_core`. |
| `ObservationValue` | Re-exported union type. |
| `PRIMITIVE_REGISTRY` | `dict[str, ValueTypeSpec]` — the full primitive catalog (35 entries). |
| `ValueKind` | Enum: `CATEGORICAL`, `NUMERIC`, `HASH`, `ARRAY`, `FREE_STRING`, `BOOL`. |
| `ValueTypeSpec` | Pydantic model: kind, allowed values, bounds, notes. |
| `is_known(primitive)` | `bool` — whether a primitive path is registered. |
| `get(primitive)` | Returns the `ValueTypeSpec`; raises `KeyError` if unknown. |
| `TOPIC_PREFIX` | `"actor.observation.text"` |
| `event_topic_for(primitive)` | Returns the full event bus topic string. |
Note: `to_event_payload` / `from_event_payload` (full round-trip helpers) are
present in `behave-shell` but not yet implemented here — `status: planned`.
## Primitives
35 primitives across 6 categories.
---
### `stylometric.*` — Writing style fingerprints (12 primitives)
Stylometric primitives capture the unconscious writing habits that distinguish
one author from another. The field goes back to the Mosteller-Wallace Federalist
Papers study (1963): function-word frequencies alone can attribute authorship
with high accuracy in long-form English text. BEHAVE-TEXT adapts these methods
to short-form Spanish chat, which introduces domain-specific challenges (short
messages, informal register, code-switching, emoji). Calibration results from
the Rutify corpus are noted inline where they affect interpretation.
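
Several primitives in the table below are 64-bit SimHashes compared by Hamming distance. The sketch below shows the general SimHash technique applied to a function-word frequency vector. It is not the project's sensor code: the ten-word `FUNCTION_WORDS` list, the SHA-256-based feature hash, and whitespace tokenization are all assumptions chosen to keep the example self-contained.

```python
import hashlib
from collections import Counter

# Tiny stand-in list (assumption). A real sensor would use the 50 or 200
# most common Spanish function words.
FUNCTION_WORDS = ("de", "la", "que", "el", "en", "y", "a", "los", "se", "no")


def _hash64(feature: str) -> int:
    """Map a feature name to a stable 64-bit integer."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash64(weights: dict[str, float]) -> int:
    """Classic SimHash: each feature votes +weight or -weight on every bit."""
    acc = [0.0] * 64
    for feature, weight in weights.items():
        h = _hash64(feature)
        for bit in range(64):
            acc[bit] += weight if (h >> bit) & 1 else -weight
    fingerprint = 0
    for bit in range(64):
        if acc[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def function_word_simhash(messages: list[str]) -> int:
    """Fingerprint the relative frequencies of function words in `messages`."""
    counts = Counter(
        token
        for message in messages
        for token in message.lower().split()
        if token in FUNCTION_WORDS
    )
    total = sum(counts.values()) or 1
    return simhash64({word: count / total for word, count in counts.items()})


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Similar frequency vectors tend to yield fingerprints with a small Hamming distance, which is why the calibration notes in the table report within-author and cross-author distances in bits.
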
| Primitive | Kind | Description |
|---|---|---|
| `stylometric.punctuation_style` | hash | Canonical punctuation-pattern fingerprint hash. Captures the author's consistent punctuation tics (double spaces, comma habits, no-period endings) as a searchable signature. |
| `stylometric.capitalization_habit` | categorical | Dominant capitalization rule. `lowercase` = no capitals except after sentence breaks. `proper` = standard sentence/title case. `random_caps` = no consistent rule. `mixed_i` = the author consistently writes 'i' in lowercase, even mid-sentence; the registry notes describe this as common in Spanish chat, where the habit carries over from Spanish's lowercase 'yo'. |
| `stylometric.emoji_usage` | categorical | Rate of emoji use. `none`, `occasional`, `frequent`, `exclusive` (messages rarely without emoji). Captures tone and register. |
| `stylometric.emoji_placement` | categorical | Emoji position relative to sentence-ending punctuation. `pre_punctuation` = 'Hola 😊.' `post_punctuation` = 'Hola. 😊' Individual authors are strikingly consistent in this micro-habit. |
| `stylometric.message_length_class` | categorical | Median message length bucket: `short` 1-5 words, `medium` 6-20, `long` 21-50, `paragraph` >50. See also `message_length_variance_class` for distribution shape. |
| `stylometric.message_length_variance_class` | categorical | Distribution shape of per-message word counts, bucketed by coefficient of variation (CV = standard deviation / mean). `tight` CV<0.5 (always 1-3 words). `varied` 0.5≤CV<1.5 (normal mix). `bimodal` CV≥1.5 (mostly short with occasional rants). Two authors can share the same median length but have wildly different variance. |
| `stylometric.linebreak_style` | categorical | Whether the author sends one complete thought per message or breaks a statement into several short sequential messages. `single_thought` = one complete thought per message. `multi_line` = habitual 3-5 short messages per turn. `wall_of_text` = dense blocks, rarely uses line breaks. Captures a stylistic rhythm that is hard to consciously alter. |
| `stylometric.typo_signature` | hash | SHA-256 of the canonical persistent-typo set — the specific recurring errors the author makes consistently (e.g. always writes `tener` as `tenet`, or `porque` as `xq`). Persistent typos are strong authorship signals because they reflect keyboard-motor habits. |
| `stylometric.function_word_distribution_top50` | hash | 64-bit SimHash over the frequency vector of the 50 most common Spanish function words. Based on the Mosteller-Wallace method. **Calibration note (2026-05-02, Rutify corpus):** within-author and cross-author Hamming distance distributions overlap (within median 8 bits, cross median 10 bits) in short-message chat, so this primitive alone cannot discriminate authors. Engines should weight it low and combine it with character n-grams and distinctive vocabulary. Kept in v0 for calibration grids. |
| `stylometric.function_word_distribution_top200` | hash | 64-bit SimHash over the 200 most common Spanish function words. The wider list reaches into the long tail (rare-but-individual words like `tampoco`, `aunque`, `mientras`) that carry more discriminating signal in short-message corpora. Not yet emitted by v0 prototype — populated in v0.2. |
| `stylometric.character_ngram_simhash` | hash | 64-bit SimHash over character n-gram frequencies (default n=3), lowercased. Orthogonal to function-word distributions: captures punctuation tics, accent-stripping habits, typo patterns, and idiom fragments that survive paraphrase. Accents are preserved because accent-stripping is itself a stylistic tic. Source label declares n size (e.g. `#char3gram`). |
| `stylometric.distinctive_vocabulary_signature` | hash | 64-bit SimHash over a TF-IDF-weighted top-K rare-word vector. Captures the author's distinctive lexicon — words they use that other authors in the same corpus do not. Complementary to function-word distributions: where `function_word_*` captures common-word style, this captures individual lexical choice. Requires the full corpus for IDF computation. Source label declares top-K and corpus tag (e.g. `#tfidf-top50`). |
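
The two message-length primitives above reduce a list of per-message word counts to categorical labels. A minimal sketch of that bucketing follows, using the thresholds from the table. The function names are hypothetical, and two details are assumptions the table does not settle: a fractional median (for example 5.5) falls into the higher bucket, and the CV uses the population standard deviation.

```python
import statistics


def message_length_class(word_counts: list[int]) -> str:
    """Bucket the median words-per-message: 1-5, 6-20, 21-50, >50."""
    if not word_counts:
        raise ValueError("need at least one message")
    median = statistics.median(word_counts)
    if median <= 5:
        return "short"
    if median <= 20:
        return "medium"
    if median <= 50:
        return "long"
    return "paragraph"


def message_length_variance_class(word_counts: list[int]) -> str:
    """Bucket the coefficient of variation (stdev / mean) of word counts."""
    if not word_counts:
        raise ValueError("need at least one message")
    mean = statistics.fmean(word_counts)
    if mean == 0:
        raise ValueError("word counts must not all be zero")
    cv = statistics.pstdev(word_counts) / mean
    if cv < 0.5:
        return "tight"
    if cv < 1.5:
        return "varied"
    return "bimodal"


# Two authors with the same median length but different distribution shape.
steady = [3, 3, 3, 3, 3]
ranter = [1, 1, 3, 3, 40]
```

Here `steady` and `ranter` both have a median of 3 words, so they share `message_length_class`, while their variance classes differ. That is the case the variance primitive was added to separate.
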
---
### `lexical.*` — Vocabulary and linguistic patterns (8 primitives)
Lexical primitives characterize *what* and *how* an actor writes at the word and
sentence level. Where stylometric primitives fingerprint unconscious micro-habits,
lexical primitives capture deliberate linguistic choices — vocabulary richness,
how questions are formed, register.
| Primitive | Kind | Description |
|---|---|---|
| `lexical.vocabulary_richness` | numeric [0,1] | Moving-Average Type-Token Ratio (MATTR) over a sliding window (default 50 tokens). Volume-independent: each window contributes its own unique/total ratio, the value is the mean. Avoids the standard TTR bias where larger corpora mechanically score lower. Source label declares window size. |
| `lexical.slang_density` | numeric [0,1] | Rate of slang terms per message, against a locale-tuned slang corpus. |
| `lexical.code_switching_rate` | numeric [0,1] | Language switches per N tokens (Solorio & Liu metric). A speaker who switches between Spanish and English, or Spanish and lunfardo/caló, will have a higher rate than a monolingual writer. |
| `lexical.code_switching_matrix_language` | free_string | BCP-47 tag of the dominant (matrix) language in code-switching texts (e.g. `es-CL`, `es-AR`). The matrix language is the grammatical scaffold; embedded languages appear as inserts. |
| `lexical.code_switching_embedded_languages` | array[free_string] | BCP-47 list of non-matrix languages observed in the actor's messages. |
| `lexical.sentence_complexity_class` | categorical | Dominant clause structure. `simple` = single-clause. `compound` = two independent clauses joined by coordinating conjunctions (pero, y, o). `complex` = dependent clauses and subordination (aunque, porque, cuando). Reflects education level and cognitive investment. |
| `lexical.question_formation_style` | categorical | How questions are formed. `punctuation_only` = question mark appended without interrogative words ('Mañana?'); very common in Spanish chat. `lexical` = explicit interrogatives (¿qué, cómo, cuándo). `formal` = inverted subject-verb or formal register. |
| `lexical.imperative_style` | categorical | How commands and requests are framed. `informal_directive` = tú/vos imperative (dame, hazlo). `formal_directive` = usted imperative (hágame el favor). `polite` = conditional/modal softening (¿podría...?). Stable per-author trait in hierarchical contexts. |
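
The MATTR computation behind `lexical.vocabulary_richness` can be sketched as follows. This is an illustrative implementation of the general Moving-Average Type-Token Ratio, not the project's sensor. The function name `mattr` is hypothetical, and falling back to a plain type-token ratio when the input is shorter than the window is an assumption.

```python
def mattr(tokens: list[str], window: int = 50) -> float:
    """Mean of unique/total ratios over every sliding window of `window` tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    if window < 1:
        raise ValueError("window must be positive")
    if len(tokens) <= window:
        # Assumption: too few tokens for one full window, use plain TTR.
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

For the token list `["a", "b", "a", "b"]` with `window=2`, every window holds two distinct tokens, so the result is 1.0, whereas a plain type-token ratio over the whole list would be 0.5. This is the length bias the table row describes.
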
---
### `temporal_evolution.*` — Behavioral change over time (1 primitive)
| Primitive | Kind | Description |
|---|---|---|
| `temporal_evolution.lifecycle_phase` | categorical | Auto-classified lifecycle stage from windowed within-corpus analysis. `arrival_burst` = first 24hr, first-window volume dominates (empirically validated against OxPayload's first 12 hours in Rutify). `stable_member` = low drift across the full tenure. `fluctuating_member` = tenure ≥24hr with median drift between stable and inflection thresholds — established noisy regulars (e.g. lamarabitch). `inflection_member` = long-tenure actor with a real behavioral shift in at least one window-pair. `declining_member` = monotonically decreasing per-window message counts. `unknown` = insufficient data. Window size adapts to tenure: <24hr → 2h, <7d → 12h, <30d → 1d, otherwise 7d. |
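
The tenure-adaptive window size described in the row above maps directly onto a small lookup. A minimal sketch, assuming tenure is available as a `timedelta`; the function name `lifecycle_window` is hypothetical.

```python
from datetime import timedelta


def lifecycle_window(tenure: timedelta) -> timedelta:
    """Pick the analysis window: <24hr -> 2h, <7d -> 12h, <30d -> 1d, else 7d."""
    if tenure < timedelta(hours=24):
        return timedelta(hours=2)
    if tenure < timedelta(days=7):
        return timedelta(hours=12)
    if tenure < timedelta(days=30):
        return timedelta(days=1)
    return timedelta(days=7)
```
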
---
### `network.*` — Governance and role signals (2 primitives)
Network primitives capture the actor's *structural role* in the group — inferred
from interaction patterns rather than content — and a bot detector. These are
heuristic composites built from other primitives; treat them as candidate signals,
not verdicts.
| Primitive | Kind | Description |
|---|---|---|
| `network.is_likely_bot` | categorical | Heuristic bot detector. `likely_bot` when `conversation_initiation_rate` ≥ 0.95 AND `attention_pattern` = `broadcast` AND `vocabulary_richness` < 0.65. Validated (2026-05-03) against SangMata_beta_bot (caught) vs 11 high-volume humans (no false positives). Low-volume bots (e.g. QuotLyBot, 9 messages) sit below the fingerprint threshold. Source label declares heuristic version (e.g. `#bot-heuristic-v1`). |
| `network.governance_role_signal` | categorical | Heuristic role shape from interaction primitives + lifecycle. `admin_pattern` = `conversation_initiation_rate` ≥ 0.80, `attention_pattern` = `reciprocal`, non-bot, and not in the `arrival_burst` lifecycle phase. `responder_pattern` = `conversation_initiation_rate` ≤ 0.45, `attention_pattern` = `reciprocal`. `bot_pattern` = matches `is_likely_bot`. `regular` = everything else above volume threshold. Empirically caught 4/4 high-volume Rutify admins, sebaImlI as responder, SangMata as bot. NOT a ground-truth admin label. |
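
The two heuristics above are threshold rules over other primitives, and can be sketched as plain functions. This is an illustration under stated assumptions rather than the shipped heuristic: the function names are hypothetical, the bot check returns a `bool` instead of a registry label, the order in which rules are tested is assumed, and the caller is assumed to have already excluded actors below the volume threshold.

```python
def is_likely_bot(init_rate: float, attention: str, vocab_richness: float) -> bool:
    """True when all three documented bot conditions hold."""
    return (
        init_rate >= 0.95
        and attention == "broadcast"
        and vocab_richness < 0.65
    )


def governance_role_signal(
    init_rate: float,
    attention: str,
    vocab_richness: float,
    lifecycle_phase: str,
) -> str:
    """Classify an actor that is already above the volume threshold."""
    if is_likely_bot(init_rate, attention, vocab_richness):
        return "bot_pattern"
    if (
        init_rate >= 0.80
        and attention == "reciprocal"
        and lifecycle_phase != "arrival_burst"
    ):
        return "admin_pattern"
    if init_rate <= 0.45 and attention == "reciprocal":
        return "responder_pattern"
    return "regular"
```

Checking the bot rule first keeps `admin_pattern` restricted to non-bot actors, as the table requires. The outputs remain candidate signals, not verdicts.
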
---
### `interaction.*` — Messaging behavior (6 primitives)
Interaction primitives characterize *how* the actor participates in conversations —
timing, initiation rate, and attention patterns.
| Primitive | Kind | Description |
|---|---|---|
| `interaction.response_latency_class` | categorical | How quickly the actor responds to messages directed at them. `immediate` <30s (suggests active monitoring or automation). `fast` 30s-5min. `normal` 5-60min. `slow` 1-24hr. `sporadic` = no consistent pattern. |
| `interaction.conversation_initiation_rate` | numeric [0,1] | Thread-starting messages / total messages. High rate = the actor drives conversations. |
| `interaction.message_burst_rate` | categorical | Whether the actor sends multiple messages per turn. `single` = almost always one message per turn. `occasional` = bursts some of the time. `habitual` = almost always bursts (3+ messages before any reply). Correlates with the `multi_line` value of `stylometric.linebreak_style`. |
| `interaction.active_hours_class` | free_string | UTC active-hours window summary (e.g. `05:00-14:00 UTC`). Free string — the window shape varies by actor and doesn't fit a closed enum. |
| `interaction.session_duration_class` | categorical | Typical session length: `short` <15min, `medium` 15-90min, `long` 90min-4hr, `marathon` >4hr. Shares the enum with `behave_shell`'s `temporal.session_duration`. |
| `interaction.attention_pattern` | categorical | Reply-graph centrality shape. `broadcast` = sends to many, replies to few (one-to-many). `focused` = concentrates on a small set of interlocutors. `reciprocal` = balanced give-and-take. |
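
As an example of how a timing aggregate becomes a categorical label, the sketch below buckets `interaction.session_duration_class` from a list of session lengths. The function name is hypothetical. Taking the median as the "typical" session, and treating exactly 15 and 90 minutes as the start of the next bucket, are assumptions, since the table leaves both open.

```python
import statistics


def session_duration_class(session_minutes: list[float]) -> str:
    """Bucket the median session length: <15min, 15-90min, 90min-4hr, >4hr."""
    if not session_minutes:
        raise ValueError("need at least one session")
    typical = statistics.median(session_minutes)
    if typical < 15:
        return "short"
    if typical < 90:
        return "medium"
    if typical <= 240:
        return "long"
    return "marathon"
```
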
---
### `content.*` — Content-derived signals, EXPERIMENTAL (6 primitives)
Content primitives are derived from message text through classifiers rather than
structural/timing analysis. They carry the highest risk of false positives, are
brittle to vocabulary drift, and are locale-specific. An attribution engine may
choose to weight these at zero until field-validated against labeled data.
| Primitive | Kind | Description |
|---|---|---|
| `content.role_signal` | categorical | Locale-tuned role-vocabulary classifier. Values: `admin`, `seller`, `buyer`, `lurker`, `newbie`. May be moved to a separate IOC/keyword-detection layer after Rutify testing. `EXPERIMENTAL` |
| `content.transactional_language` | numeric [0,1] | Rate of transactional terms per message. Locale-specific; brittle to vocabulary drift. `EXPERIMENTAL` |
| `content.opsec_awareness` | numeric [0,1] | Rate of security-conscious phrases. **HIGH FALSE-POSITIVE RISK** on casual conversation about deleting files/messages. `EXPERIMENTAL` |
| `content.targeting_language` | array[free_string] | IOC-shaped target patterns (bank names, government portals, RUT ranges). Consider moving to a dedicated IOC layer. `EXPERIMENTAL` |
| `content.boasting_pattern` | categorical | Success-claim frequency: `none`, `occasional`, `frequent`. Corpus-dependent regex. `EXPERIMENTAL` |
| `content.conflict_style` | categorical | Dispute-tone classification: `aggressive`, `defusing`, `appellate`. Needs labelled training data. `EXPERIMENTAL` |
---
## Schema
Machine-readable JSON Schema:
[`json/observation.schema.json`](json/observation.schema.json)
Regenerate after model changes:
```bash
python scripts/generate_schema.py
```
## Tests
```bash
pytest tests/
```
## Attribution recipes
[`attribution-recipes.md`](attribution-recipes.md) — placeholder document sketching
how an external attribution engine would consume `actor.observation.text.*` topics
to build actor profiles (`credential_broker`, `low_skill_buyer`, `group_admin`, etc.).
**Not populated yet** — awaiting Rutify corpus calibration. Not part of the BEHAVE spec.
## License
Code and schemas: [GPL-3.0-or-later](../LICENSE)
Spec prose (this file, attribution-recipes.md): [CC-BY-SA-4.0](../LICENSE.docs)


@@ -114,10 +114,32 @@ def _array(of: ValueKind, notes: Optional[str] = None) -> ValueTypeSpec:
PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
# ── stylometric.* (motor analog — 8) ──────────────────────────────────
"stylometric.punctuation_style": _hash(notes="canonical punctuation-pattern fingerprint"),
"stylometric.capitalization_habit": _cat("lowercase", "proper", "random_caps", "mixed_i"),
"stylometric.emoji_usage": _cat("none", "occasional", "frequent", "exclusive"),
"stylometric.emoji_placement": _cat("pre_punctuation", "post_punctuation", "no_punctuation", "mixed"),
"stylometric.message_length_class": _cat("short", "medium", "long", "paragraph"),
"stylometric.capitalization_habit": _cat(
"lowercase", "proper", "random_caps", "mixed_i",
notes="Dominant capitalization rule the author applies. lowercase=no capitals except "
"after sentence breaks. proper=standard title/sentence case. random_caps=no "
"consistent rule. mixed_i=author consistently writes 'i' in lowercase even "
"mid-sentence — common in Spanish chat where 'I' is not a standalone word "
"but the habit transfers from the native language's lowercase 'yo'.",
),
"stylometric.emoji_usage": _cat(
"none", "occasional", "frequent", "exclusive",
notes="Rate of emoji use per message. exclusive=messages rarely contain text without "
"emoji. This captures tone and register — heavy emoji use in a criminal-market "
"context is a distinct style trait worth preserving.",
),
"stylometric.emoji_placement": _cat(
"pre_punctuation", "post_punctuation", "no_punctuation", "mixed",
notes="Where emojis appear relative to sentence-ending punctuation. "
"pre_punctuation='Hola 😊.' post_punctuation='Hola. 😊' "
"Individual authors are strikingly consistent in this micro-habit.",
),
"stylometric.message_length_class": _cat(
"short", "medium", "long", "paragraph",
notes="Median message length bucket: short=1-5 words, medium=6-20 words, "
"long=21-50 words, paragraph=>50 words. See also "
"stylometric.message_length_variance_class for the distribution shape.",
),
"stylometric.message_length_variance_class": _cat(
"tight", "varied", "bimodal",
notes="Coefficient of variation of per-message word counts. Captures "
@@ -129,7 +151,14 @@ PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
"rants). Added in v0.2 after Rutify calibration found median-only "
"bucketing discarded most of the per-author variance signal.",
),
"stylometric.linebreak_style": _cat("single_thought", "multi_line", "wall_of_text"),
"stylometric.linebreak_style": _cat(
"single_thought", "multi_line", "wall_of_text",
notes="Whether the author sends one complete thought per message or breaks a single "
"statement into multiple sequential short messages. multi_line=habitual "
"message-burst style (sends 3-5 short messages in rapid succession instead "
"of one composed message). wall_of_text=rarely uses line breaks, sends dense "
"blocks. Captures a stylistic rhythm that is hard to consciously alter.",
),
"stylometric.typo_signature": _hash(notes="sha256 of canonical persistent-typo set"),
"stylometric.function_word_distribution_top50": _hash(
notes="64-bit simhash over the 50-most-common Spanish function-word frequency "
@@ -188,9 +217,31 @@ PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
"lexical.code_switching_matrix_language": _str(notes="BCP-47 of dominant language"),
"lexical.code_switching_embedded_languages": _array(ValueKind.FREE_STRING,
notes="BCP-47 list of non-matrix languages observed"),
"lexical.sentence_complexity_class": _cat("simple", "compound", "complex"),
"lexical.question_formation_style": _cat("punctuation_only", "lexical", "formal"),
"lexical.imperative_style": _cat("informal_directive", "formal_directive", "polite"),
"lexical.sentence_complexity_class": _cat(
"simple", "compound", "complex",
notes="Dominant clause structure. simple=single-clause messages (no conjunctions "
"or subordination). compound=two independent clauses joined by coordinating "
"conjunctions (pero, y, o, ni). complex=dependent clauses and subordination "
"(aunque, porque, cuando, que + verb). Reflects education level and "
"cognitive investment in message composition.",
),
"lexical.question_formation_style": _cat(
"punctuation_only", "lexical", "formal",
notes="How questions are formed. punctuation_only=question mark appended without "
"interrogative words ('¿Cuánto?' or 'Mañana?') — very common in Spanish "
"chat. lexical=explicit interrogatives (¿qué, cómo, cuándo, dónde). "
"formal=inverted subject-verb order or formal register ('¿Podría usted...'). "
"Captures register and education level.",
),
"lexical.imperative_style": _cat(
"informal_directive", "formal_directive", "polite",
notes="How commands and requests are framed. informal_directive=tú/vos imperative "
"('dame', 'hazlo', 'mándame'). formal_directive=usted imperative "
"('hágame el favor', 'envíeme'). polite=conditional or modal softening "
"('¿podría...?', 'me gustaría...'). Stable per-author trait in criminal "
"market contexts where hierarchical and peer relationships are expressed "
"through register choice.",
),
# ── temporal_evolution.* (lifecycle / change-over-time — 1) ───────────
"temporal_evolution.lifecycle_phase": _cat(
@@ -247,10 +298,22 @@ PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = {
),
# ── interaction.* (temporal analog — 6) ───────────────────────────────
"interaction.response_latency_class": _cat("immediate", "fast", "normal", "slow", "sporadic"),
"interaction.response_latency_class": _cat(
"immediate", "fast", "normal", "slow", "sporadic",
notes="How quickly the actor responds to messages directed at them. "
"immediate=<30s (suggests active monitoring or automated response). "
"fast=30s-5min. normal=5-60min (typical async chat). slow=1-24hr. "
"sporadic=no consistent response latency — appears and disappears.",
),
"interaction.conversation_initiation_rate": _num(min_val=0.0, max_val=1.0,
notes="thread-starting messages / total"),
"interaction.message_burst_rate": _cat("single", "occasional", "habitual"),
"interaction.message_burst_rate": _cat(
"single", "occasional", "habitual",
notes="Whether the actor sends multiple messages in rapid sequence within a "
"conversation turn. habitual=almost always bursts (sends 3+ messages "
"before any reply). single=almost always one message per turn. Tied to "
"stylometric.linebreak_style multi_line.",
),
"interaction.active_hours_class": _str(notes="UTC active-hours window summary"),
"interaction.session_duration_class": _cat("short", "medium", "long", "marathon",
notes="REUSED enum from BEHAVE-SHELL temporal.session_duration"),

78
core/README.md Normal file

@@ -0,0 +1,78 @@
<!-- SPDX-License-Identifier: CC-BY-SA-4.0 -->
# behave-core
[← repo](../README.md)
The shared observation envelope for BEHAVE. Defines the wire format that
`behave-shell` and `behave-text` serialize all behavioral observations into.
Every sensor in the BEHAVE ecosystem emits the same `Observation` structure —
the domain-specific meaning lives in `primitive` and `value`; the envelope
provides identity, provenance, time window, and schema versioning.
## What it provides
| Symbol | Type | Description |
|---|---|---|
| `OBSERVATION_SCHEMA_VERSION` | `int` | Envelope schema version (currently `1`). Bumped when field shapes change; federation gossip receivers reject mismatched versions. |
| `Observation` | Pydantic model | One behavioral observation: a single primitive measured over a time window. The core class is registry-agnostic — it does not validate `primitive` or `value` against any specific domain. Use the registry-aware subclasses in `behave-shell` or `behave-text` for full validation. |
| `ObservationValue` | `Union[str, int, float, bool, list[str], list[int], list[float], dict]` | Type alias covering all valid value shapes. |
| `Window` | Pydantic model | The measurement window: `start_ts` and `end_ts` in epoch seconds. Distinct from `Observation.ts` (the emission time) — a sensor may compute an observation over a past window and emit it later. |
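
The version rule in the table (receivers reject mismatched envelope versions) can be sketched as a guard on an incoming payload. This is a hypothetical receiver-side check, not code from `behave_core`: `accept_payload` and `SchemaVersionMismatch` are invented names, and the payload is assumed to be a plain `dict` carrying the `v` field.

```python
OBSERVATION_SCHEMA_VERSION = 1  # mirrors the value documented above


class SchemaVersionMismatch(ValueError):
    """Raised when a payload's envelope version differs from ours."""


def accept_payload(payload: dict) -> dict:
    """Return the payload unchanged if its `v` matches, otherwise raise."""
    version = payload.get("v")
    if version != OBSERVATION_SCHEMA_VERSION:
        raise SchemaVersionMismatch(
            f"expected v={OBSERVATION_SCHEMA_VERSION}, got v={version!r}"
        )
    return payload
```

Rejecting on any mismatch, rather than only on newer versions, is the strict reading of the rule; a real receiver might log and drop instead of raising.
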
## `Observation` fields
| Field | Type | Required | Description |
|---|---|---|---|
| `primitive` | `str` | ✓ | Fully-qualified primitive path, e.g. `motor.keystroke_cadence` |
| `value` | `ObservationValue` | ✓ | The measured value; shape validated by the domain registry |
| `confidence` | `float [0,1]` | ✓ | Sensor's confidence in this measurement (not in any attribution verdict) |
| `window` | `Window` | ✓ | Measurement time window |
| `source` | `str` | ✓ | Canonical sensor identifier, e.g. `behave/sniffer/timing.py` |
| `evidence_ref` | `str \| None` | — | Pointer to underlying raw evidence (session tape, pcap). **Never** the evidence itself — see PII note below. |
| `identity_ref` | `str \| None` | — | AttackerIdentity UUID if the observation is pre-attributed |
| `ts` | `float` | auto | Emission timestamp, epoch seconds |
| `id` | `str` | auto | UUID hex for deduplication |
| `v` | `int` | auto | Envelope schema version (= `OBSERVATION_SCHEMA_VERSION`) |
## PII discipline (non-negotiable)
BEHAVE observations carry **categorical labels, timing aggregates, and hashes only**.
They must never carry:
- Raw keystroke content or command arguments
- Passwords, tokens, session keys, or any authentication material
- File contents or payload bytes
- Raw message text (especially in `behave-text`)
`evidence_ref` is a **pointer** to underlying evidence held elsewhere. Never the evidence itself.
## Install
```bash
pip install -e .
# or, as a dependency of behave-shell / behave-text:
pip install -e ../core/
```
## Quickstart
```python
from behave_core.spec import Observation, Window, OBSERVATION_SCHEMA_VERSION
obs = Observation(
primitive="motor.keystroke_cadence",
value="bursty",
confidence=0.82,
window=Window(start_ts=1714000000.0, end_ts=1714003600.0),
source="behave/shell-sensor/timing.py",
)
print(obs.model_dump_json())
```
## Tests
```bash
pytest tests/
```
## License
Code: [GPL-3.0-or-later](../LICENSE)