BEHAVE/BEHAVE-SHELL/attribution-recipes.md
anti bf654b9aed feat(shell): initial decnet_behave_shell spec + tests
Shell-session behavioral observation registry layered on core.
SPDX: GPL-3.0-or-later (code) / CC-BY-SA-4.0 (attribution-recipes.md).
2026-05-10 06:17:28 -04:00


BEHAVE Attribution Recipes

This document is not part of BEHAVE. BEHAVE (scratchpad.md) defines the observation taxonomy and emission envelope. It does not assert who an actor is, link sessions, or assign profiles. Those are attribution-engine concerns.

This document collects reference patterns for an attribution engine that consumes BEHAVE observations. The patterns are illustrative, not authoritative. A real engine may use any of these recipes, none of them, or its own.


Engine Interface

An attribution engine is a process that:

Consumes

  • attacker.observation.* — BEHAVE observation streams (the entire taxonomy from scratchpad.md).
  • identity.label.* — manual ground-truth labels applied by users (e.g. "this session was our internal red team").
  • identity.engagement.* — authorized-engagement registry (red-team scopes-of-work, bug-bounty windows, scheduled pentest dates).

Emits

  • attribution.profile.candidate — one or more profiles whose pattern an identity's observations partially match, each with a confidence score. Emitted continuously as observations accumulate.
  • attribution.profile.current — the engine's current best aggregate verdict for an identity. A view, not a fact.
  • attribution.profile.changed — fired when attribution.profile.current shifts.
  • attribution.linkage.proposed — engine proposes linking two identities or sessions, with a confidence score. The user / clusterer accepts or rejects.
  • attribution.confidence.delta — per-identity confidence trajectory, suitable for time-series visualization.

Does not emit

  • Anything in attacker.observation.* (BEHAVE-owned).
  • Anything in identity.label.* or identity.engagement.* (user-owned).
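
The interface contract above can be sketched in a few lines. This is a minimal, non-authoritative sketch: the topic names come from the lists above, while `engine_may_consume`, `engine_may_emit`, `CurrentProfileTracker`, the event payload fields, and the "highest-confidence candidate wins" aggregation are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Topic namespaces taken from the Consumes / Emits lists above.
ENGINE_CONSUMES = (
    "attacker.observation.*",
    "identity.label.*",
    "identity.engagement.*",
)
ENGINE_EMITS = frozenset({
    "attribution.profile.candidate",
    "attribution.profile.current",
    "attribution.profile.changed",
    "attribution.linkage.proposed",
    "attribution.confidence.delta",
})


def engine_may_consume(topic: str) -> bool:
    """True if the topic falls in a namespace the engine reads from."""
    return any(fnmatchcase(topic, pattern) for pattern in ENGINE_CONSUMES)


def engine_may_emit(topic: str) -> bool:
    """True only for attribution.* topics; BEHAVE- and user-owned topics are refused."""
    return topic in ENGINE_EMITS


class CurrentProfileTracker:
    """Hypothetical helper: remembers the current verdict per identity and reports shifts."""

    def __init__(self) -> None:
        self._current: dict[str, Optional[str]] = {}

    def update(self, identity_ref: str, candidates: dict[str, float]) -> Optional[dict]:
        # Assumed aggregation: the highest-confidence candidate is the current verdict.
        new = max(candidates, key=candidates.get) if candidates else None
        old = self._current.get(identity_ref)
        self._current[identity_ref] = new
        if new == old:
            return None  # verdict unchanged, so no attribution.profile.changed event
        return {
            "topic": "attribution.profile.changed",
            "identity_ref": identity_ref,
            "previous": old,
            "current": new,
        }
```

A caller would publish the returned event only when `update` returns a value, which keeps `attribution.profile.changed` tied to actual shifts in the current verdict.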

Replaceability

The engine is a separate package from BEHAVE. A BEHAVE deployment without an engine still produces useful observation streams; downstream consumers may aggregate them however they wish. A reference engine implementation may ship alongside BEHAVE for demos and bootstrap, but it is not BEHAVE.


Profile Recipes

Profiles are organized by motive, engagement model, skill tier, and tradecraft discipline: the categories used by intel teams and catalogs (Mandiant, CrowdStrike, ENISA, MITRE ATT&CK Groups).

Each recipe defines:

  • dominant_observations — observations whose presence (over a session window) raises confidence in this profile. Each carries a weight [0.0, 1.0].
  • necessary_observations — observations that must appear in the window for the profile to be eligible. If absent, confidence is capped at zero.
  • incompatible_observations — observations whose presence excludes this profile.
  • exemplars — MITRE ATT&CK Group IDs (G####) or community-named groups that exemplify the profile.
  • min_confidence — floor below which the engine should not emit attribution.profile.candidate for this profile.

Engines are free to ignore weights, replace this scoring model, or learn their own from labeled data.
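
One way to read the recipe fields above is as a gated, weighted score. The sketch below is illustrative only: the normalization (matched dominant weight divided by total dominant weight) is an assumption, not something this document prescribes, and only `value_in` / `value_eq` clauses are evaluated. Matchers such as `match`, `matches_session`, or `deviation_from` would need engine-specific handling. The function names and the flat observation dictionary are hypothetical.

```python
from typing import Any

Observations = dict[str, Any]  # primitive name -> value observed in the window
Clause = dict[str, Any]        # one entry of a recipe's observation lists


def clause_matches(clause: Clause, obs: Observations) -> bool:
    """Evaluate a value_in / value_eq clause; other matchers are not handled here."""
    value = obs.get(clause["primitive"])
    if value is None:
        return False
    if "value_in" in clause:
        return value in clause["value_in"]
    if "value_eq" in clause:
        return value == clause["value_eq"]
    return False


def score_profile(recipe: dict[str, Any], obs: Observations) -> float:
    """Assumed scoring model: gates first, then matched weight over total weight."""
    if any(clause_matches(c, obs) for c in recipe.get("incompatible_observations", [])):
        return 0.0  # an incompatible observation excludes the profile
    if not all(clause_matches(c, obs) for c in recipe.get("necessary_observations", [])):
        return 0.0  # a missing necessary observation caps confidence at zero
    dominant = recipe.get("dominant_observations", [])
    total = sum(c["weight"] for c in dominant)
    if total == 0:
        return 0.0
    matched = sum(c["weight"] for c in dominant if clause_matches(c, obs))
    return matched / total


def should_emit_candidate(recipe: dict[str, Any], obs: Observations) -> bool:
    """Respect the recipe's min_confidence floor."""
    return score_profile(recipe, obs) >= recipe["min_confidence"]


# The automated_scanner_bot recipe from this document, expressed as a Python dict.
scanner = {
    "profile": "automated_scanner_bot",
    "necessary_observations": [
        {"primitive": "motor.keystroke_cadence", "value_eq": "machine"},
    ],
    "dominant_observations": [
        {"primitive": "motor.error_correction", "value_in": ["absent"], "weight": 0.5},
        {"primitive": "temporal.lifecycle_markers.idle_periodicity", "value_in": ["periodic"], "weight": 0.6},
        {"primitive": "temporal.escalation_pattern", "value_in": ["sustained"], "weight": 0.5},
        {"primitive": "operational.objective", "value_in": ["recon"], "weight": 0.5},
        {"primitive": "toolchain.http.user_agent_tool_class", "value_in": ["masscan", "nuclei", "unknown"], "weight": 0.5},
    ],
    "min_confidence": 0.8,
}

# Hypothetical session window: four of the five dominant observations are present.
session = {
    "motor.keystroke_cadence": "machine",
    "motor.error_correction": "absent",
    "temporal.lifecycle_markers.idle_periodicity": "periodic",
    "temporal.escalation_pattern": "sustained",
    "operational.objective": "recon",
}

print(round(score_profile(scanner, session), 2))  # 2.1 / 2.6, prints 0.81
```

Under this assumed model, the same session with a `steady` keystroke cadence scores zero, because the necessary observation is no longer met.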


opportunistic_crimeware_operator

Volume-game commodity-malware operator. Buys/rents stealers (Raccoon, RedLine, Lumma, Vidar). Sloppy when forced to be manual.

profile: opportunistic_crimeware_operator
dominant_observations:
  - {primitive: motor.keystroke_cadence,         value_in: [bursty, hunt_and_peck], weight: 0.5}
  - {primitive: motor.error_correction,          value_in: [immediate],             weight: 0.4}
  - {primitive: cognitive.cognitive_load,        value_in: [high],                  weight: 0.5}
  - {primitive: cognitive.tool_vocabulary,       value_in: [narrow],                weight: 0.6}
  - {primitive: cognitive.error_resilience.retry_tactic, value_in: [rerun],         weight: 0.4}
  - {primitive: temporal.session_duration,       value_in: [short],                 weight: 0.4}
  - {primitive: temporal.persistence,            value_in: [hit_and_run],           weight: 0.5}
  - {primitive: operational.opsec_discipline,    value_in: [careless],              weight: 0.6}
  - {primitive: toolchain.tls.ja3_client,        match: common_default,             weight: 0.3}
incompatible_observations:
  - {primitive: motor.keystroke_cadence,         value_eq: machine}
exemplars: []
notes: "Tell vs. nearest neighbor (initial_access_broker): lacks validation discipline — does not test creds across services before exiting."
min_confidence: 0.55

initial_access_broker

Distinct profession in the criminal economy. Gets in, validates, sells. No post-exploitation.

profile: initial_access_broker
dominant_observations:
  - {primitive: motor.keystroke_cadence,         value_in: [steady],                weight: 0.5}
  - {primitive: motor.command_chunking,          value_in: [fluent],                weight: 0.5}
  - {primitive: cognitive.exploration_style,     value_in: [targeted],              weight: 0.7}
  - {primitive: cognitive.planning_depth,        value_in: [shallow],               weight: 0.4}
  - {primitive: temporal.session_duration,       value_in: [short],                 weight: 0.5}
  - {primitive: temporal.persistence,            value_in: [return_visitor],        weight: 0.5}
  - {primitive: operational.objective,           value_in: [recon],                 weight: 0.6}
  - {primitive: toolchain.http.user_agent_tool_class, value_in: [evilwinrm, impacket], weight: 0.5}
incompatible_observations:
  - {primitive: operational.objective,           value_in: [destructive]}
exemplars: ["UNC2465", "UNC2596"]
notes: "Tell vs. ransomware_affiliate: escalation absent — validates AD reachability and exits, never deploys payload."
min_confidence: 0.6

ransomware_affiliate

Post-exploitation hands-on actor running a RaaS playbook (LockBit, ALPHV/BlackCat, Akira, Play, Medusa).

profile: ransomware_affiliate
dominant_observations:
  - {primitive: motor.keystroke_cadence,         value_in: [steady],                weight: 0.5}
  - {primitive: motor.command_chunking,          value_in: [fluent],                weight: 0.5}
  - {primitive: cognitive.exploration_style,     value_in: [methodical],            weight: 0.7}
  - {primitive: temporal.escalation_pattern,     value_in: [bursty],                weight: 0.5}
  - {primitive: temporal.session_duration,       value_in: [long, marathon],        weight: 0.5}
  - {primitive: toolchain.c2.beacon_family,      value_in: [cobalt_strike, sliver, havoc], weight: 0.8}
necessary_observations:
  - {primitive: operational.objective,           value_in: [destructive], within_window: engagement}
incompatible_observations:
  - {primitive: identity.engagement.authorized,  matches_session: true}   # excludes red-team
exemplars: ["G1015", "G1040", "G0102"]
notes: "Tell vs. state_aligned_espionage_operator: dwell is days, not months; exfil-then-encrypt closes the engagement loudly."
min_confidence: 0.65

state_aligned_espionage_operator

APT tradecraft. Disciplined, patient, custom tooling, careful opsec, long dwell.

profile: state_aligned_espionage_operator
dominant_observations:
  - {primitive: motor.keystroke_cadence,         value_in: [steady],                weight: 0.5}
  - {primitive: motor.motor_stability,           value_in: [steady],                weight: 0.4}
  - {primitive: motor.error_correction,          value_in: [route_around],          weight: 0.5}
  - {primitive: cognitive.cognitive_load,        value_in: [low],                   weight: 0.5}
  - {primitive: cognitive.tool_vocabulary,       value_in: [broad],                 weight: 0.6}
  - {primitive: cognitive.planning_depth,        value_in: [deep],                  weight: 0.6}
  - {primitive: temporal.persistence,            value_in: [resident],              weight: 0.7}
  - {primitive: operational.opsec_discipline,    value_in: [careful],               weight: 0.7}
  - {primitive: operational.cleanup_behavior,    value_in: [thorough],              weight: 0.6}
  - {primitive: toolchain.c2.beacon_family,      value_in: [unknown],               weight: 0.4}  # custom implants
incompatible_observations:
  - {primitive: operational.objective,           value_in: [destructive], dominant_in_window: true}
  - {primitive: identity.engagement.authorized,  matches_session: true}
exemplars: ["G0007", "G0016", "G0050", "G0096"]
notes: |
  Tell vs. authorized_red_teamer: objective trends to long-term collection; no engagement-bounded dwell.
  Tell vs. ransomware_affiliate: encryption never fires.
min_confidence: 0.7

authorized_red_teamer

Pentester or red-team engagement. Legally scoped. Critical to distinguish: the most common attribution failure is treating a friendly as hostile.

profile: authorized_red_teamer
necessary_observations:
  - {primitive: identity.engagement.authorized,  matches_session: true}   # without registry hit, profile cannot apply
dominant_observations:
  - {primitive: motor.keystroke_cadence,         value_in: [steady],                weight: 0.4}
  - {primitive: motor.command_chunking,          value_in: [fluent],                weight: 0.4}
  - {primitive: cognitive.tool_vocabulary,       value_in: [broad],                 weight: 0.5}
  - {primitive: cognitive.exploration_style,     value_in: [methodical],            weight: 0.5}
  - {primitive: temporal.session_timing,         value_in: [diurnal],               weight: 0.4}
  - {primitive: toolchain.c2.beacon_family,      value_in: [cobalt_strike, sliver], weight: 0.5}
exemplars: []
notes: |
  The necessary_observation on identity.engagement.authorized is load-bearing. Without an authoritative
  engagement registry hit, the profile must not apply — otherwise red-teamers collapse onto
  ransomware_affiliate. C2 watermark resolution against known commercial license keys is a secondary
  signal but not enforced in this recipe.
min_confidence: 0.7

malicious_insider (aspirational — requires per-identity baselining)

Already authenticated. Knows the environment. No exploitation phase. Not yet operational — depends on per-identity historical baselining, which is an engine feature that does not exist yet.

profile: malicious_insider
status: aspirational
necessary_observations:
  - {primitive: identity.label.applied,          contains: insider_baseline_exists}  # gate: can only apply if baseline exists
dominant_observations:
  - {primitive: cognitive.tool_vocabulary,       value_in: [narrow], context: environment_specific, weight: 0.4}
  - {primitive: temporal.session_timing,         deviation_from: identity_baseline,  weight: 0.6}
  - {primitive: operational.objective,           value_in: [exfil, destructive], no_recon_phase: true, weight: 0.6}
exemplars: []
notes: |
  Detectable only as DEVIATION FROM SELF, not from population. Requires per-identity historical
  baseline (NOT YET IMPLEMENTED). Cross-references HR/UEBA out-of-band. Until baselining ships,
  the engine should not emit candidates for this profile.
min_confidence: 0.7
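
The rule that the engine should not emit candidates for this profile until baselining ships could be enforced before any scoring happens. A minimal sketch, assuming recipes are loaded as dictionaries; the recipe does carry a `status: aspirational` field, but using that field as the gate (and the function name `scorable_recipes`) is an assumption:

```python
def scorable_recipes(recipes: list[dict]) -> list[dict]:
    """Drop recipes marked aspirational so no candidates are emitted for them."""
    return [r for r in recipes if r.get("status") != "aspirational"]


recipes = [
    {"profile": "malicious_insider", "status": "aspirational", "min_confidence": 0.7},
    {"profile": "ai_assisted_operator", "status": "empirically_calibrated", "min_confidence": 0.65},
    {"profile": "automated_scanner_bot", "min_confidence": 0.8},  # no status field
]
print([r["profile"] for r in scorable_recipes(recipes)])
```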

automated_scanner_bot

Mass scanners (Shodan, Censys, internetdb), exploit-as-a-service worms (Mirai descendants, Mozi, RondoDox), opportunistic CVE chasers. No human present.

profile: automated_scanner_bot
necessary_observations:
  - {primitive: motor.keystroke_cadence,         value_eq: machine}
dominant_observations:
  - {primitive: motor.error_correction,          value_in: [absent],                weight: 0.5}
  - {primitive: temporal.lifecycle_markers.idle_periodicity, value_in: [periodic],  weight: 0.6}
  - {primitive: temporal.escalation_pattern,     value_in: [sustained],             weight: 0.5}
  - {primitive: operational.objective,           value_in: [recon],                 weight: 0.5}
  - {primitive: toolchain.http.user_agent_tool_class, value_in: [masscan, nuclei, unknown], weight: 0.5}
exemplars: []
notes: "Tell vs. opportunistic_crimeware_operator: no human latency, no error correction, no command sequencing."
min_confidence: 0.8

ai_assisted_operator (empirically calibrated 2026-05-02 — "YOU-sim" signature)

Operator working alongside an LLM: typing some commands, pasting others, and pacing at typing speed because the LLM suggests next moves while the human stays in the chair making decisions. This is the most operationally important class to detect, the realistic 2026 adversary that is neither pure human nor pure agent. Such operators inherit some mechanical signatures from the LLM (clean pastes, no typos, scripted-feeling commands) and some human signatures from the operator (variable paste rate, shorter pauses than a pure LLM, real intent driving the recon flow). On the 5-point calibration grid this profile sits between HUMAN and LW-sim and shares primitives with both, which is exactly why it is hard to spot and worth modelling explicitly.

profile: ai_assisted_operator
status: empirically_calibrated
calibration_session: "46434173-82ee-4b3b-bfcd-c954607050a2"  # YOU-sim, sessions-2026-05-02-with-llm.jsonl
dominant_observations:
  - {primitive: motor.input_modality,            value_in: [pasted],                weight: 0.6}
  - {primitive: motor.paste_burst_rate,          value_in: [occasional],            weight: 0.7}  # NOT habitual — that's pure-LLM
  - {primitive: motor.error_correction,          value_in: [absent],                weight: 0.5}
  - {primitive: motor.shell_mastery.tab_completion, value_in: [none],               weight: 0.4}
  - {primitive: cognitive.inter_command_latency_class, value_in: [typing_speed],    weight: 0.7}  # FASTER than llm_lightweight
  - {primitive: cognitive.inter_command_consistency, value_in: [metronomic],        weight: 0.6}
  - {primitive: cognitive.command_branch_diversity, value_in: [linear_playbook],    weight: 0.4}
  - {primitive: cognitive.feedback_loop_engagement, value_in: [fire_and_forget],    weight: 0.3}
incompatible_observations:
  - {primitive: motor.input_modality,            value_eq: typed}                   # rules out pure human
exemplars: []
notes: |
  Hybrid signature sitting between HUMAN (typed + bimodal + closed_loop + instant)
  and LW-sim (pasted + habitual + llm_lightweight + linear_playbook + fire_and_forget).

  Distinguishing tells from neighbors on the calibration grid:
    vs HUMAN:        pasted (not typed); absent error correction; metronomic (not bimodal); no tab use
    vs LW-sim:       paste rate is OCCASIONAL not HABITUAL (operator types some commands);
                     pauses sit in TYPING_SPEED band not LLM_LIGHTWEIGHT (faster — human is the bottleneck,
                     not the model)
    vs CLAUDE-FF:    same as LW-sim plus pause band difference; the heavyweight pause band cleanly excludes
                     this profile

  The "occasional paste rate + typing_speed pauses" combination is the load-bearing fingerprint.
  Pure-LLM operators paste habitually; pure humans don't paste at all; LLM-assisted operators
  paste SOMETIMES (when copying an LLM suggestion verbatim) and type the rest, AND their pauses
  are dominated by operator decision time (typing-speed) rather than model round-trip
  (llm_lightweight or slower). This is the empirical signature that emerged from the 2026-05-02
  calibration grid, replacing the v0.1 speculative definition.

  CALIBRATION CAVEAT: the YOU-sim session that calibrated this profile was a human deliberately
  pacing themselves to mimic an LLM-assisted operator (paste-and-pause uniformly). A REAL LLM-
  assisted threat actor in the wild may show MORE variability (mixing typed and pasted within a
  session, variable pause distributions) — the metronomic-paste-uniform signature here is the
  IDEALIZED form. Real-world detection should weight the joint signature loosely until field-
  validated against actual incident data.
min_confidence: 0.65
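
The fingerprint described in the notes above (occasional paste rate plus typing-speed pauses) can be checked directly. A minimal sketch, assuming observations arrive as a flat dictionary keyed by primitive name; the function name is hypothetical, and only the two neighbors whose paste and pause values the notes spell out are distinguished:

```python
from typing import Optional


def nearest_grid_point(obs: dict) -> Optional[str]:
    """Separate ai_assisted_operator from LW-sim using paste rate and pause band only."""
    paste = obs.get("motor.paste_burst_rate")
    latency = obs.get("cognitive.inter_command_latency_class")
    if paste == "occasional" and latency == "typing_speed":
        return "ai_assisted_operator"  # the operator is the bottleneck, not the model
    if paste == "habitual" and latency == "llm_lightweight":
        return "LW-sim"                # pure-LLM signature from the calibration notes
    return None                        # neither fingerprint; leave it to the full recipe


print(nearest_grid_point({
    "motor.paste_burst_rate": "occasional",
    "cognitive.inter_command_latency_class": "typing_speed",
}))
```

Given the calibration caveat, a check like this is better treated as one weak signal feeding the recipe than as a verdict on its own.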

Linkage Rules

These rules consume observations from two identities (or two sessions of one identity) and emit attribution.linkage.proposed events. The clusterer (or a human) accepts or rejects each proposal.

Confidence is a number in [0.0, 1.0]. Action thresholds are engine-configurable; reasonable defaults are listed below.

| Correlation | Confidence | Suggested Action |
|---|---|---|
| Same motor profile + same toolchain | >= 0.9 | Propose link / merge |
| Same motor profile + different toolchain | 0.75 - 0.9 | Propose link as tool rotation; flag for review |
| Different motor profile + same toolchain | < 0.4 | Propose shared infrastructure marker; do NOT merge identities |
| Same motor profile + different IP/creds | 0.8 - 0.95 | Propose link; behavioral match overrides network identity |
| Environmental signals conflict with motor (e.g. layout/locale shift mid-session) | 0.5 - 0.7 | Flag for review; possible red team or proxied access |
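
The first three rows of the table can be sketched as a small decision function. This is a minimal sketch under stated assumptions: cosine similarity stands in for whatever motor metric the engine chooses, `SAME_MOTOR_THRESHOLD` is a hypothetical cutoff, and the function returns the table's confidence band, not a single score. The rows on IP/credential changes and environmental conflicts need extra inputs and are left out.

```python
import math
from typing import Optional, Sequence

SAME_MOTOR_THRESHOLD = 0.9  # assumption: similarity at or above this counts as "same motor profile"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_linkage(
    motor_a: Sequence[float],
    motor_b: Sequence[float],
    toolchain_a: str,
    toolchain_b: str,
) -> Optional[tuple[tuple[float, float], str]]:
    """Map a pair of identities to (confidence band, suggested action), or None."""
    same_motor = cosine_similarity(motor_a, motor_b) >= SAME_MOTOR_THRESHOLD
    same_toolchain = toolchain_a == toolchain_b
    if same_motor and same_toolchain:
        return (0.9, 1.0), "propose link / merge"
    if same_motor:
        return (0.75, 0.9), "propose link as tool rotation; flag for review"
    if same_toolchain:
        return (0.0, 0.4), "propose shared infrastructure marker; do not merge identities"
    return None  # nothing in common: no proposal


# Hypothetical motor feature vectors for two identities using different toolchains.
print(suggest_linkage([0.8, 0.1, 0.6], [0.79, 0.12, 0.61], "sliver", "havoc"))
```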

"Same motor profile" here means an aggregate over the motor observation streams — the engine decides how to compute similarity (vector distance over feature space, learned embedding, etc.). BEHAVE provides the streams; the engine provides the metric.


User-Owned Topic Schemas

These topics are NOT BEHAVE-owned and NOT engine-emitted. Users publish to them; the engine consumes them. Schemas are listed here for engine-implementer reference.

identity.label.applied

Manual ground-truth label on an identity.

{
  identity_ref: "uuid-...",                  # AttackerIdentity UUID
  label:        "ransomware_affiliate",      # may match a profile name OR be free-form
  source:       "analyst:asamuel",           # who applied the label
  confidence:   0.95,                        # the labeler's confidence
  evidence:     "incident-4471",             # optional pointer to evidence (ticket, IR report, etc.)
  ts:           1714521661.001,
  id:           "uuid-...",
  v:            1
}
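
An engine consuming this topic might parse messages into a typed record before use. A minimal sketch: the field names follow the schema above, `evidence` is treated as optional as the schema comment states, and the [0.0, 1.0] bound on the labeler's confidence is an assumption carried over from the engine's own confidence range. `AppliedLabel` and `parse_label` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppliedLabel:
    identity_ref: str
    label: str
    source: str
    confidence: float
    ts: float
    id: str
    v: int
    evidence: Optional[str] = None  # optional pointer to a ticket, IR report, etc.

    def __post_init__(self) -> None:
        # Assumed bound; the schema itself only shows an example value of 0.95.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence}")


def parse_label(message: dict) -> AppliedLabel:
    """Build a record from a decoded message; unknown or missing keys raise TypeError."""
    return AppliedLabel(**message)


label = parse_label({
    "identity_ref": "uuid-...",
    "label": "ransomware_affiliate",
    "source": "analyst:asamuel",
    "confidence": 0.95,
    "evidence": "incident-4471",
    "ts": 1714521661.001,
    "id": "uuid-...",
    "v": 1,
})
print(label.label, label.confidence)
```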

identity.engagement.authorized

Registry entry for an authorized engagement (red team, pentest, bug bounty window).

{
  engagement_id:  "engagement-2026-q2-redteam-acme",
  scope: {
    networks:     ["10.0.0.0/8"],
    domains:      ["acme-test.example"],
    accounts:     ["redteam-svc-*"],
    c2_watermarks: ["acme-cs-license-7f3a"],   # known consultancy license keys
  },
  window: {
    start_ts:     1714521600,
    end_ts:       1717113600,
  },
  consultancy:    "ACME Red Team Inc.",
  contact:        "redteam@acme-rt.example",
  ts:             1714521661.001,
  id:             "uuid-...",
  v:              1
}

The authorized_red_teamer profile recipe consumes this topic via its necessary_observations clause. Without a matching engagement registry entry, the profile does not apply.
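
How a session is matched against a registry entry is left to the engine. The sketch below shows one possible reading, assuming a session record with hypothetical `ts`, `ip`, and `account` fields, and assuming that a match requires the time window plus both the network and account scopes. Domain and C2-watermark checks are omitted.

```python
from fnmatch import fnmatchcase
from ipaddress import ip_address, ip_network


def session_matches_engagement(session: dict, engagement: dict) -> bool:
    """True if the session falls inside the engagement window and scope (assumed rule)."""
    window = engagement["window"]
    if not window["start_ts"] <= session["ts"] <= window["end_ts"]:
        return False
    scope = engagement["scope"]
    address = ip_address(session["ip"])
    in_network = any(address in ip_network(net) for net in scope.get("networks", []))
    # Account patterns such as "redteam-svc-*" are treated as shell-style globs.
    account_ok = any(fnmatchcase(session["account"], pat) for pat in scope.get("accounts", []))
    return in_network and account_ok


engagement = {
    "engagement_id": "engagement-2026-q2-redteam-acme",
    "scope": {
        "networks": ["10.0.0.0/8"],
        "accounts": ["redteam-svc-*"],
    },
    "window": {"start_ts": 1714521600, "end_ts": 1717113600},
}
session = {"ts": 1715000000, "ip": "10.1.2.3", "account": "redteam-svc-01"}
print(session_matches_engagement(session, engagement))
```

A result of `True` here is what would satisfy the `matches_session: true` clause in the recipes; a stricter engine could also require a C2 watermark hit.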