Shell-session behavioral observation registry layered on core. SPDX: GPL-3.0-or-later (code) / CC-BY-SA-4.0 (attribution-recipes.md).
<!-- SPDX-License-Identifier: CC-BY-SA-4.0 -->

# BEHAVE Attribution Recipes

> **This document is not part of BEHAVE.** BEHAVE (`scratchpad.md`) defines the observation taxonomy and emission envelope. It does **not** assert who an actor is, link sessions, or assign profiles. Those are attribution-engine concerns.
>
> This document collects **reference patterns** for an attribution engine that consumes BEHAVE observations. The patterns are illustrative, not authoritative. A real engine may use any of these recipes, none of them, or its own.

---

## Engine Interface

An attribution engine is a process with the following topic contract.

### Consumes

- **`attacker.observation.*`**: BEHAVE observation streams (the entire taxonomy from `scratchpad.md`).
- **`identity.label.*`**: manual ground-truth labels applied by users (e.g. "this session was our internal red team").
- **`identity.engagement.*`**: authorized-engagement registry (red-team scopes-of-work, bug-bounty windows, scheduled pentest dates).

### Emits

- **`attribution.profile.candidate`**: one or more profiles whose pattern an identity's observations partially match, each with a confidence score. Emitted continuously as observations accumulate.
- **`attribution.profile.current`**: the engine's current best aggregate verdict for an identity. A view, not a fact.
- **`attribution.profile.changed`**: fired when `attribution.profile.current` shifts.
- **`attribution.linkage.proposed`**: the engine proposes linking two identities or sessions, with a confidence score. The user or clusterer accepts or rejects the proposal.
- **`attribution.confidence.delta`**: per-identity confidence trajectory, suitable for time-series visualization.
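
How the emitted topics relate to one another can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of BEHAVE or of any reference engine: the function name `update_current`, the in-memory `current` dict, and every payload field (`identity`, `profile`, `confidence`, `previous`, `delta`) are hypothetical. Only the topic names come from the list above. The sketch also assumes the "current" verdict is simply the highest-confidence candidate in the latest batch; a real engine may aggregate differently.

```python
def update_current(current: dict, identity: str, candidates: list) -> list:
    """Fold a batch of candidates into the current verdict.

    Returns a list of (topic, payload) tuples to publish. Payload shapes are
    hypothetical; only the topic names are taken from the document.
    """
    events = []
    if not candidates:
        return events

    # Every partial match is reported as a candidate.
    for cand in candidates:
        events.append(("attribution.profile.candidate", {"identity": identity, **cand}))

    # Assumption: the current verdict is the best candidate of this batch.
    best = max(candidates, key=lambda c: c["confidence"])
    previous = current.get(identity)
    current[identity] = best
    events.append(("attribution.profile.current", {"identity": identity, **best}))

    # "changed" fires only when the winning profile differs from the last one.
    if previous is None or previous["profile"] != best["profile"]:
        events.append((
            "attribution.profile.changed",
            {
                "identity": identity,
                "previous": previous["profile"] if previous else None,
                "profile": best["profile"],
            },
        ))

    # The delta stream carries the confidence trajectory for plotting.
    prev_conf = previous["confidence"] if previous else 0.0
    events.append((
        "attribution.confidence.delta",
        {"identity": identity, "delta": best["confidence"] - prev_conf},
    ))
    return events
```

Because `attribution.profile.current` is "a view, not a fact", the sketch overwrites it freely and keeps no history; the delta stream is where a consumer would reconstruct the trajectory.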

### Does not emit

- Anything in `attacker.observation.*` (BEHAVE-owned).
- Anything in `identity.label.*` or `identity.engagement.*` (user-owned).
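
The three lists above amount to a topic-ownership contract, which an engine could enforce with a small guard before publishing or subscribing. The following is a sketch only: the helper names `may_consume` and `may_emit` are hypothetical, and treating `*` as a shell-style wildcard (via the standard-library `fnmatch` module) is an assumption about how topic patterns are matched.

```python
from fnmatch import fnmatchcase

# Patterns copied from the Consumes / Emits lists; the wildcard semantics
# (shell-style "*") are an assumption made for this sketch.
CONSUMED = ("attacker.observation.*", "identity.label.*", "identity.engagement.*")
EMITTED = (
    "attribution.profile.candidate",
    "attribution.profile.current",
    "attribution.profile.changed",
    "attribution.linkage.proposed",
    "attribution.confidence.delta",
)


def may_consume(topic: str) -> bool:
    """True if the engine is allowed to subscribe to this topic."""
    return any(fnmatchcase(topic, pattern) for pattern in CONSUMED)


def may_emit(topic: str) -> bool:
    """True if the engine is allowed to publish on this topic."""
    return topic in EMITTED


# The engine reads BEHAVE and user-owned topics but never writes to them.
print(may_consume("attacker.observation.motor.keystroke_cadence"))
print(may_emit("identity.label.applied"))
```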

### Replaceability

The engine is a **separate package** from BEHAVE. A BEHAVE deployment without an engine still produces useful observation streams; downstream consumers may aggregate them however they wish. A reference engine implementation may ship alongside BEHAVE for demos and bootstrap, but it is not BEHAVE.

---

## Profile Recipes

Profiles are organized by **motive + engagement model + skill tier + tradecraft discipline**, the categories that intel teams (Mandiant, CrowdStrike, ENISA, ATT&CK Groups) use.

Each recipe defines:

- **`dominant_observations`**: observations whose presence (over a session window) raises confidence in this profile. Each carries a weight in `[0.0, 1.0]`.
- **`necessary_observations`**: observations that *must* appear in the window for the profile to be eligible. If any is absent, confidence is forced to zero.
- **`incompatible_observations`**: observations whose presence excludes this profile.
- **`exemplars`**: MITRE ATT&CK Group IDs (`G####`) or community-named groups that exemplify the profile.
- **`min_confidence`**: floor below which the engine should not emit `attribution.profile.candidate` for this profile.

Engines are free to ignore weights, replace this scoring model, or learn their own from labeled data.
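
One way to read the five recipe fields as a scoring procedure is sketched below. The recipes do not prescribe a formula, so everything here is an assumption: the function names `matches`, `score`, and `candidate` are hypothetical, observations are reduced to a flat `primitive -> value` dict, only the `value_in` and `value_eq` matchers are handled, and confidence is taken to be the matched share of the total dominant weight. The example recipe is a trimmed copy of `automated_scanner_bot`.

```python
def matches(clause: dict, observed: dict) -> bool:
    """Check one recipe clause against a flat primitive -> value mapping."""
    value = observed.get(clause["primitive"])
    if value is None:
        return False
    if "value_eq" in clause:
        return value == clause["value_eq"]
    if "value_in" in clause:
        return value in clause["value_in"]
    return False  # other matcher keys (match, matches_session, ...) are out of scope


def score(recipe: dict, observed: dict) -> float:
    """Assumed scoring: matched dominant weight divided by total dominant weight."""
    if any(matches(c, observed) for c in recipe.get("incompatible_observations", [])):
        return 0.0  # an incompatible observation excludes the profile
    if not all(matches(c, observed) for c in recipe.get("necessary_observations", [])):
        return 0.0  # a missing necessary observation forces confidence to zero
    dominant = recipe.get("dominant_observations", [])
    total = sum(c["weight"] for c in dominant)
    if total == 0:
        return 0.0
    return sum(c["weight"] for c in dominant if matches(c, observed)) / total


def candidate(recipe: dict, observed: dict):
    """Return a candidate payload, or None when below the min_confidence floor."""
    confidence = score(recipe, observed)
    if confidence < recipe["min_confidence"]:
        return None
    return {"profile": recipe["profile"], "confidence": confidence}


# Trimmed copy of the automated_scanner_bot recipe, for illustration only.
SCANNER = {
    "profile": "automated_scanner_bot",
    "necessary_observations": [
        {"primitive": "motor.keystroke_cadence", "value_eq": "machine"},
    ],
    "dominant_observations": [
        {"primitive": "motor.error_correction", "value_in": ["absent"], "weight": 0.5},
        {"primitive": "operational.objective", "value_in": ["recon"], "weight": 0.5},
    ],
    "min_confidence": 0.8,
}

observed = {
    "motor.keystroke_cadence": "machine",
    "motor.error_correction": "absent",
    "operational.objective": "recon",
}
print(candidate(SCANNER, observed))
```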

---

### `opportunistic_crimeware_operator`

Volume-game commodity-malware operator. Buys or rents stealers (Raccoon, RedLine, Lumma, Vidar). Sloppy when forced to be manual.

```yaml
profile: opportunistic_crimeware_operator
dominant_observations:
  - {primitive: motor.keystroke_cadence, value_in: [bursty, hunt_and_peck], weight: 0.5}
  - {primitive: motor.error_correction, value_in: [immediate], weight: 0.4}
  - {primitive: cognitive.cognitive_load, value_in: [high], weight: 0.5}
  - {primitive: cognitive.tool_vocabulary, value_in: [narrow], weight: 0.6}
  - {primitive: cognitive.error_resilience.retry_tactic, value_in: [rerun], weight: 0.4}
  - {primitive: temporal.session_duration, value_in: [short], weight: 0.4}
  - {primitive: temporal.persistence, value_in: [hit_and_run], weight: 0.5}
  - {primitive: operational.opsec_discipline, value_in: [careless], weight: 0.6}
  - {primitive: toolchain.tls.ja3_client, match: common_default, weight: 0.3}
incompatible_observations:
  - {primitive: motor.keystroke_cadence, value_eq: machine}
exemplars: []
notes: "Tell vs. nearest neighbor (initial_access_broker): lacks validation discipline; does not test creds across services before exiting."
min_confidence: 0.55
```

---

### `initial_access_broker`

Distinct profession in the criminal economy. Gets in, validates, sells. No post-exploitation.

```yaml
profile: initial_access_broker
dominant_observations:
  - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5}
  - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.5}
  - {primitive: cognitive.exploration_style, value_in: [targeted], weight: 0.7}
  - {primitive: cognitive.planning_depth, value_in: [shallow], weight: 0.4}
  - {primitive: temporal.session_duration, value_in: [short], weight: 0.5}
  - {primitive: temporal.persistence, value_in: [return_visitor], weight: 0.5}
  - {primitive: operational.objective, value_in: [recon], weight: 0.6}
  - {primitive: toolchain.http.user_agent_tool_class, value_in: [evilwinrm, impacket], weight: 0.5}
incompatible_observations:
  - {primitive: operational.objective, value_in: [destructive]}
exemplars: ["UNC2465", "UNC2596"]
notes: "Tell vs. ransomware_affiliate: escalation absent; validates AD reachability and exits, never deploys payload."
min_confidence: 0.6
```

---

### `ransomware_affiliate`

Post-exploitation hands-on actor running a RaaS playbook (LockBit, ALPHV/BlackCat, Akira, Play, Medusa).

```yaml
profile: ransomware_affiliate
dominant_observations:
  - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5}
  - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.5}
  - {primitive: cognitive.exploration_style, value_in: [methodical], weight: 0.7}
  - {primitive: temporal.escalation_pattern, value_in: [bursty], weight: 0.5}
  - {primitive: temporal.session_duration, value_in: [long, marathon], weight: 0.5}
  - {primitive: toolchain.c2.beacon_family, value_in: [cobalt_strike, sliver, havoc], weight: 0.8}
necessary_observations:
  - {primitive: operational.objective, value_in: [destructive], within_window: engagement}
incompatible_observations:
  - {primitive: identity.engagement.authorized, matches_session: true}  # excludes red-team
exemplars: ["G1015", "G1040", "G0102"]
notes: "Tell vs. state_aligned_espionage_operator: dwell is days, not months; exfil-then-encrypt closes the engagement loudly."
min_confidence: 0.65
```

---

### `state_aligned_espionage_operator`

APT tradecraft. Disciplined, patient, custom tooling, careful opsec, long dwell.

```yaml
profile: state_aligned_espionage_operator
dominant_observations:
  - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5}
  - {primitive: motor.motor_stability, value_in: [steady], weight: 0.4}
  - {primitive: motor.error_correction, value_in: [route_around], weight: 0.5}
  - {primitive: cognitive.cognitive_load, value_in: [low], weight: 0.5}
  - {primitive: cognitive.tool_vocabulary, value_in: [broad], weight: 0.6}
  - {primitive: cognitive.planning_depth, value_in: [deep], weight: 0.6}
  - {primitive: temporal.persistence, value_in: [resident], weight: 0.7}
  - {primitive: operational.opsec_discipline, value_in: [careful], weight: 0.7}
  - {primitive: operational.cleanup_behavior, value_in: [thorough], weight: 0.6}
  - {primitive: toolchain.c2.beacon_family, value_in: [unknown], weight: 0.4}  # custom implants
incompatible_observations:
  - {primitive: operational.objective, value_in: [destructive], dominant_in_window: true}
  - {primitive: identity.engagement.authorized, matches_session: true}
exemplars: ["G0007", "G0016", "G0050", "G0096"]
notes: |
  Tell vs. authorized_red_teamer: objective trends to long-term collection; no engagement-bounded dwell.
  Tell vs. ransomware_affiliate: encryption never fires.
min_confidence: 0.7
```

---

### `authorized_red_teamer`

Pentester or red-team engagement. Legally scoped. **Critical to distinguish: the most common attribution failure is treating a friendly as hostile.**

```yaml
profile: authorized_red_teamer
necessary_observations:
  - {primitive: identity.engagement.authorized, matches_session: true}  # without registry hit, profile cannot apply
dominant_observations:
  - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.4}
  - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.4}
  - {primitive: cognitive.tool_vocabulary, value_in: [broad], weight: 0.5}
  - {primitive: cognitive.exploration_style, value_in: [methodical], weight: 0.5}
  - {primitive: temporal.session_timing, value_in: [diurnal], weight: 0.4}
  - {primitive: toolchain.c2.beacon_family, value_in: [cobalt_strike, sliver], weight: 0.5}
exemplars: []
notes: |
  The necessary_observation on identity.engagement.authorized is load-bearing. Without an authoritative
  engagement registry hit, the profile must not apply; otherwise red-teamers collapse onto
  ransomware_affiliate. C2 watermark resolution against known commercial license keys is a secondary
  signal but not enforced in this recipe.
min_confidence: 0.7
```

---

### `malicious_insider` *(aspirational; requires per-identity baselining)*

Already authenticated. Knows the environment. No exploitation phase. **Not yet operational**: it depends on per-identity historical baselining, an engine feature that does not exist yet.

```yaml
profile: malicious_insider
status: aspirational
necessary_observations:
  - {primitive: identity.label.applied, contains: insider_baseline_exists}  # gate: can only apply if baseline exists
dominant_observations:
  - {primitive: cognitive.tool_vocabulary, value_in: [narrow], context: environment_specific, weight: 0.4}
  - {primitive: temporal.session_timing, deviation_from: identity_baseline, weight: 0.6}
  - {primitive: operational.objective, value_in: [exfil, destructive], no_recon_phase: true, weight: 0.6}
exemplars: []
notes: |
  Detectable only as DEVIATION FROM SELF, not from population. Requires per-identity historical
  baseline (NOT YET IMPLEMENTED). Cross-references HR/UEBA out-of-band. Until baselining ships,
  the engine should not emit candidates for this profile.
min_confidence: 0.7
```
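
The instruction that the engine "should not emit candidates for this profile" can be honored by filtering on the recipe's `status` field before scoring. The snippet below is a minimal sketch: the function name `emittable` is hypothetical, and treating a missing `status` key as operational is an assumption, since most recipes in this document carry no `status` at all.

```python
def emittable(recipes: list) -> list:
    """Drop recipes flagged aspirational so no candidates are emitted for them."""
    return [r for r in recipes if r.get("status") != "aspirational"]


recipes = [
    {"profile": "malicious_insider", "status": "aspirational"},
    {"profile": "ai_assisted_operator", "status": "empirically_calibrated"},
    {"profile": "automated_scanner_bot"},  # no status key: assumed operational
]
print([r["profile"] for r in emittable(recipes)])
# ['ai_assisted_operator', 'automated_scanner_bot']
```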

---

### `automated_scanner_bot`

Mass scanners (Shodan, Censys, internetdb), exploit-as-a-service worms (Mirai descendants, Mozi, RondoDox), opportunistic CVE chasers. **No human present.**

```yaml
profile: automated_scanner_bot
necessary_observations:
  - {primitive: motor.keystroke_cadence, value_eq: machine}
dominant_observations:
  - {primitive: motor.error_correction, value_in: [absent], weight: 0.5}
  - {primitive: temporal.lifecycle_markers.idle_periodicity, value_in: [periodic], weight: 0.6}
  - {primitive: temporal.escalation_pattern, value_in: [sustained], weight: 0.5}
  - {primitive: operational.objective, value_in: [recon], weight: 0.5}
  - {primitive: toolchain.http.user_agent_tool_class, value_in: [masscan, nuclei, unknown], weight: 0.5}
exemplars: []
notes: "Tell vs. opportunistic_crimeware_operator: no human latency, no error correction, no command sequencing."
min_confidence: 0.8
```

---

### `ai_assisted_operator` *(empirically calibrated 2026-05-02; "YOU-sim" signature)*

Operator working alongside an LLM: typing some commands, pasting others, and pacing at typing speed because the LLM suggests next moves while the human is still in the chair making decisions. **The most operationally important class to detect**: this is the realistic 2026 adversary, neither pure human nor pure agent. They inherit *some* mechanical signatures from the LLM (clean pastes, no typos, scripted-feeling commands) and *some* human signatures from the operator (variable paste rate, shorter pauses than a pure LLM, real intent driving the recon flow). On the 5-point calibration grid this profile sits **between** `HUMAN` and `LW-sim`, sharing primitives with both, which is exactly why it is hard to spot and worth modelling explicitly.

```yaml
profile: ai_assisted_operator
status: empirically_calibrated
calibration_session: "46434173-82ee-4b3b-bfcd-c954607050a2"  # YOU-sim, sessions-2026-05-02-with-llm.jsonl
dominant_observations:
  - {primitive: motor.input_modality, value_in: [pasted], weight: 0.6}
  - {primitive: motor.paste_burst_rate, value_in: [occasional], weight: 0.7}  # NOT habitual; that's pure-LLM
  - {primitive: motor.error_correction, value_in: [absent], weight: 0.5}
  - {primitive: motor.shell_mastery.tab_completion, value_in: [none], weight: 0.4}
  - {primitive: cognitive.inter_command_latency_class, value_in: [typing_speed], weight: 0.7}  # FASTER than llm_lightweight
  - {primitive: cognitive.inter_command_consistency, value_in: [metronomic], weight: 0.6}
  - {primitive: cognitive.command_branch_diversity, value_in: [linear_playbook], weight: 0.4}
  - {primitive: cognitive.feedback_loop_engagement, value_in: [fire_and_forget], weight: 0.3}
incompatible_observations:
  - {primitive: motor.input_modality, value_eq: typed}  # rules out pure human
exemplars: []
notes: |
  Hybrid signature sitting between HUMAN (typed + bimodal + closed_loop + instant)
  and LW-sim (pasted + habitual + llm_lightweight + linear_playbook + fire_and_forget).

  Distinguishing tells from neighbors on the calibration grid:
    vs HUMAN:     pasted (not typed); absent error correction; metronomic (not bimodal); no tab use
    vs LW-sim:    paste rate is OCCASIONAL not HABITUAL (operator types some commands);
                  pauses sit in TYPING_SPEED band not LLM_LIGHTWEIGHT (faster: human is the
                  bottleneck, not the model)
    vs CLAUDE-FF: same as LW-sim plus pause band difference; the heavyweight pause band cleanly
                  excludes this profile

  The "occasional paste rate + typing_speed pauses" combination is the load-bearing fingerprint.
  Pure-LLM operators paste habitually; pure humans don't paste at all; LLM-assisted operators
  paste SOMETIMES (when copying an LLM suggestion verbatim) and type the rest, AND their pauses
  are dominated by operator decision time (typing-speed) rather than model round-trip
  (llm_lightweight or slower). This is the empirical signature that emerged from the 2026-05-02
  calibration grid, replacing the v0.1 speculative definition.

  CALIBRATION CAVEAT: the YOU-sim session that calibrated this profile was a human deliberately
  pacing themselves to mimic an LLM-assisted operator (paste-and-pause uniformly). A REAL
  LLM-assisted threat actor in the wild may show MORE variability (mixing typed and pasted
  within a session, variable pause distributions); the metronomic-paste-uniform signature here
  is the IDEALIZED form. Real-world detection should weight the joint signature loosely until
  field-validated against actual incident data.
min_confidence: 0.65
```
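
The "distinguishing tells" in the notes above reduce to a small decision on three primitives. The sketch below encodes only the tells the recipe states explicitly; the function name `grid_neighbor`, the flat observation dict, and the `"undetermined"` fallback are hypothetical. The CLAUDE-FF grid point is deliberately left out because its latency-class value is not named in this document. In line with the calibration caveat, a real engine would likely treat this as a soft signal rather than a hard classifier.

```python
def grid_neighbor(obs: dict) -> str:
    """Place a session on the calibration grid using the stated tells only."""
    modality = obs.get("motor.input_modality")
    paste_rate = obs.get("motor.paste_burst_rate")
    latency = obs.get("cognitive.inter_command_latency_class")

    # Typed input is listed as incompatible with ai_assisted_operator.
    if modality == "typed":
        return "HUMAN"
    # Load-bearing fingerprint: occasional pastes + typing-speed pauses.
    if paste_rate == "occasional" and latency == "typing_speed":
        return "ai_assisted_operator"
    # Pure-LLM neighbor: habitual pastes + model round-trip pauses.
    if paste_rate == "habitual" and latency == "llm_lightweight":
        return "LW-sim"
    return "undetermined"


session = {
    "motor.input_modality": "pasted",
    "motor.paste_burst_rate": "occasional",
    "cognitive.inter_command_latency_class": "typing_speed",
}
print(grid_neighbor(session))
# ai_assisted_operator
```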

---

## Linkage Rules

These rules consume observations from two identities (or two sessions of one identity) and emit `attribution.linkage.proposed` events. The clusterer (or a human) accepts or rejects each proposal.

Confidence is a number in `[0.0, 1.0]`. Action thresholds are engine-configurable; reasonable defaults are listed below.

| Correlation | Confidence | Suggested Action |
| :--- | :--- | :--- |
| Same motor profile + same toolchain | `>= 0.9` | Propose link / merge |
| Same motor profile + different toolchain | `0.75 - 0.9` | Propose link as tool rotation; flag for review |
| Different motor profile + same toolchain | `< 0.4` | Propose **shared infrastructure** marker; do NOT merge identities |
| Same motor profile + different IP/creds | `0.8 - 0.95` | Propose link; behavioral match overrides network identity |
| Environmental signals conflict with motor (e.g. layout/locale shift mid-session) | `0.5 - 0.7` | Flag for review; possible red team or proxied access |

"Same motor profile" here means an aggregate over the motor observation streams; the engine decides how to compute similarity (vector distance over feature space, learned embedding, etc.). BEHAVE provides the streams; the engine provides the metric.

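
The default table can be read as a lookup from correlation to confidence band and action. The sketch below is illustrative only: `Proposal`, `propose`, and the action labels are hypothetical names, the boolean inputs stand in for whatever similarity metric the engine actually uses, and the precedence (environmental conflict checked first) is an assumption. The "different IP/creds" row is omitted because it needs a network-identity input that this sketch does not model.

```python
from typing import NamedTuple, Optional


class Proposal(NamedTuple):
    low: float    # lower bound of the confidence band from the table
    high: float   # upper bound of the confidence band
    action: str   # hypothetical action label


def propose(same_motor: bool, same_toolchain: bool,
            env_conflict: bool = False) -> Optional[Proposal]:
    """Map a correlation to the table's confidence band and suggested action."""
    if env_conflict:
        # Layout or locale shifts mid-session: a human should look at it.
        return Proposal(0.5, 0.7, "flag_for_review")
    if same_motor and same_toolchain:
        return Proposal(0.9, 1.0, "propose_link")
    if same_motor:
        return Proposal(0.75, 0.9, "propose_link_as_tool_rotation")
    if same_toolchain:
        # Shared tooling alone never merges identities.
        return Proposal(0.0, 0.4, "mark_shared_infrastructure")
    return None  # nothing in common: no proposal


print(propose(same_motor=True, same_toolchain=False))
```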
---

## User-Owned Topic Schemas

These topics are NOT BEHAVE-owned and NOT engine-emitted. Users publish to them; the engine consumes them. Schemas are listed here for engine-implementer reference.

### `identity.label.applied`

Manual ground-truth label on an identity.

```
{
  identity_ref: "uuid-...",        # AttackerIdentity UUID
  label: "ransomware_affiliate",   # may match a profile name OR be free-form
  source: "analyst:asamuel",       # who applied the label
  confidence: 0.95,                # the labeler's confidence
  evidence: "incident-4471",       # optional pointer to evidence (ticket, IR report, etc.)
  ts: 1714521661.001,
  id: "uuid-...",
  v: 1
}
```
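
An engine consuming this topic will usually want to validate payloads before trusting them as ground truth. The following is a minimal sketch, not a normative schema: the `Label` dataclass and `parse_label` function are hypothetical, the schema above marks only `evidence` as optional so all other fields are assumed required, and bounding `confidence` to `[0.0, 1.0]` is an assumption carried over from the engine's own confidence range.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed required: every field the schema does not mark as optional.
REQUIRED = ("identity_ref", "label", "source", "confidence", "ts", "id", "v")


@dataclass(frozen=True)
class Label:
    identity_ref: str
    label: str
    source: str
    confidence: float
    ts: float
    id: str
    v: int
    evidence: Optional[str] = None  # optional pointer to a ticket or IR report


def parse_label(payload: dict) -> Label:
    """Validate a decoded identity.label.applied payload and wrap it."""
    missing = [key for key in REQUIRED if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not 0.0 <= payload["confidence"] <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    fields = {key: payload[key] for key in REQUIRED}
    return Label(evidence=payload.get("evidence"), **fields)


label = parse_label({
    "identity_ref": "uuid-...",
    "label": "ransomware_affiliate",
    "source": "analyst:asamuel",
    "confidence": 0.95,
    "evidence": "incident-4471",
    "ts": 1714521661.001,
    "id": "uuid-...",
    "v": 1,
})
print(label.label, label.confidence)
```

Because `label` may be free-form, the sketch does not check it against the list of profile names.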

### `identity.engagement.authorized`

Registry entry for an authorized engagement (red team, pentest, bug bounty window).

```
{
  engagement_id: "engagement-2026-q2-redteam-acme",
  scope: {
    networks: ["10.0.0.0/8"],
    domains: ["acme-test.example"],
    accounts: ["redteam-svc-*"],
    c2_watermarks: ["acme-cs-license-7f3a"],   # known consultancy license keys
  },
  window: {
    start_ts: 1714521600,
    end_ts: 1717113600,
  },
  consultancy: "ACME Red Team Inc.",
  contact: "redteam@acme-rt.example",
  ts: 1714521661.001,
  id: "uuid-...",
  v: 1
}
```

The `authorized_red_teamer` profile recipe consumes this topic via its `necessary_observations` clause. Without a matching engagement registry entry, the profile does not apply.
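
What "a matching engagement registry entry" means is left to the engine. One plausible reading, sketched below, checks the session against the entry's time window, network scope, and account patterns. Everything beyond the field names is an assumption: the function name `session_in_engagement` and its parameters are hypothetical, requiring both the network and the account to match is a design choice rather than a rule from this document, and `domains` and `c2_watermarks` are ignored for brevity.

```python
import ipaddress
from fnmatch import fnmatchcase


def session_in_engagement(engagement: dict, src_ip: str, account: str, ts: float) -> bool:
    """True if a session falls inside the engagement's window and scope."""
    window = engagement["window"]
    if not (window["start_ts"] <= ts <= window["end_ts"]):
        return False  # outside the authorized time window

    scope = engagement["scope"]
    addr = ipaddress.ip_address(src_ip)
    in_network = any(
        addr in ipaddress.ip_network(net) for net in scope.get("networks", [])
    )
    # Account patterns such as "redteam-svc-*" are treated as shell-style globs.
    in_accounts = any(
        fnmatchcase(account, pattern) for pattern in scope.get("accounts", [])
    )
    # Assumption: both network and account must be in scope.
    return in_network and in_accounts


engagement = {
    "engagement_id": "engagement-2026-q2-redteam-acme",
    "scope": {
        "networks": ["10.0.0.0/8"],
        "accounts": ["redteam-svc-*"],
    },
    "window": {"start_ts": 1714521600, "end_ts": 1717113600},
}
print(session_in_engagement(engagement, "10.1.2.3", "redteam-svc-01", 1714600000))
# True
```

A stricter engine might additionally require a C2 watermark match; a looser one might accept either the network or the account. Whichever rule is chosen, it decides whether a session is scored as `authorized_red_teamer` or falls through to `ransomware_affiliate`.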