diff --git a/BEHAVE-SHELL/attribution-recipes.md b/BEHAVE-SHELL/attribution-recipes.md new file mode 100644 index 0000000..073c2e5 --- /dev/null +++ b/BEHAVE-SHELL/attribution-recipes.md @@ -0,0 +1,338 @@ + + +# BEHAVE Attribution Recipes + +> **This document is not part of BEHAVE.** BEHAVE (`scratchpad.md`) defines the observation taxonomy and emission envelope. It does **not** assert who an actor is, link sessions, or assign profiles. Those are attribution-engine concerns. +> +> This document collects **reference patterns** for an attribution engine that consumes BEHAVE observations. The patterns are illustrative, not authoritative. A real engine may use any of these recipes, none of them, or its own. + +--- + +## Engine Interface + +An attribution engine is a process that: + +### Consumes +- **`attacker.observation.*`** — BEHAVE observation streams (the entire taxonomy from `scratchpad.md`). +- **`identity.label.*`** — manual ground-truth labels applied by users (e.g. "this session was our internal red team"). +- **`identity.engagement.*`** — authorized-engagement registry (red-team scopes-of-work, bug-bounty windows, scheduled pentest dates). + +### Emits +- **`attribution.profile.candidate`** — one or more profiles whose pattern an identity's observations partially match, each with a confidence score. Emitted continuously as observations accumulate. +- **`attribution.profile.current`** — the engine's current best aggregate verdict for an identity. A view, not a fact. +- **`attribution.profile.changed`** — fired when `attribution.profile.current` shifts. +- **`attribution.linkage.proposed`** — engine proposes linking two identities or sessions, with a confidence score. The user / clusterer accepts or rejects. +- **`attribution.confidence.delta`** — per-identity confidence trajectory, suitable for time-series visualization. + +### Does not emit +- Anything in `attacker.observation.*` (BEHAVE-owned). 
+- Anything in `identity.label.*` or `identity.engagement.*` (user-owned). + +### Replaceability +The engine is a **separate package** from BEHAVE. A BEHAVE deployment without an engine still produces useful observation streams; downstream consumers may aggregate them however they wish. A reference engine implementation may ship alongside BEHAVE for demos and bootstrap, but it is not BEHAVE. + +--- + +## Profile Recipes + +Profiles are organized by **motive + engagement model + skill tier + tradecraft discipline** — the categories that intel teams (Mandiant, CrowdStrike, ENISA, ATT&CK Groups) use. + +Each recipe defines: + +- **`dominant_observations`** — observations whose presence (over a session window) raises confidence in this profile. Each carries a weight `[0.0, 1.0]`. +- **`necessary_observations`** — observations that *must* appear in the window for the profile to be eligible. If absent, confidence is capped at zero. +- **`incompatible_observations`** — observations whose presence excludes this profile. +- **`exemplars`** — MITRE ATT&CK Group IDs (`G####`) or community-named groups that exemplify the profile. +- **`min_confidence`** — floor below which the engine should not emit `attribution.profile.candidate` for this profile. + +Engines are free to ignore weights, replace this scoring model, or learn their own from labeled data. + +--- + +### `opportunistic_crimeware_operator` + +Volume-game commodity-malware operator. Buys/rents stealers (Raccoon, RedLine, Lumma, Vidar). Sloppy when forced to be manual. 
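The recipe fields defined under "Profile Recipes" can be sketched as a small scoring routine. This is a minimal illustration, not the reference engine: the `Clause` and `Recipe` classes and the `score` and `should_emit` functions are hypothetical names, and the normalized weighted sum is an assumed combining rule, since the document leaves the scoring model open. The example recipe is a trimmed three-clause subset of `opportunistic_crimeware_operator`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Clause:
    """One recipe clause: a primitive and the values that count as a match."""

    primitive: str
    values: frozenset[str]
    weight: float = 0.0  # only meaningful for dominant_observations

    def matches(self, observed: dict[str, str]) -> bool:
        # A primitive that was never observed cannot match.
        return observed.get(self.primitive) in self.values


@dataclass
class Recipe:
    profile: str
    min_confidence: float
    dominant: list[Clause] = field(default_factory=list)
    necessary: list[Clause] = field(default_factory=list)
    incompatible: list[Clause] = field(default_factory=list)


def score(recipe: Recipe, observed: dict[str, str]) -> float:
    """Return a confidence in [0.0, 1.0]; 0.0 means the profile is ineligible."""
    # Any incompatible observation excludes the profile outright.
    if any(c.matches(observed) for c in recipe.incompatible):
        return 0.0
    # Every necessary observation must be present, else confidence is zero.
    if not all(c.matches(observed) for c in recipe.necessary):
        return 0.0
    total = sum(c.weight for c in recipe.dominant)
    if total == 0:
        return 0.0
    # ASSUMPTION: confidence = matched weight / total weight.
    matched = sum(c.weight for c in recipe.dominant if c.matches(observed))
    return matched / total


def should_emit(recipe: Recipe, observed: dict[str, str]) -> bool:
    """Gate attribution.profile.candidate emission on the min_confidence floor."""
    return score(recipe, observed) >= recipe.min_confidence


# Trimmed subset of the opportunistic_crimeware_operator recipe (illustrative).
recipe = Recipe(
    profile="opportunistic_crimeware_operator",
    min_confidence=0.55,
    dominant=[
        Clause("cognitive.tool_vocabulary", frozenset({"narrow"}), 0.6),
        Clause("operational.opsec_discipline", frozenset({"careless"}), 0.6),
        Clause("temporal.persistence", frozenset({"hit_and_run"}), 0.5),
    ],
    incompatible=[Clause("motor.keystroke_cadence", frozenset({"machine"}))],
)

human_session = {
    "cognitive.tool_vocabulary": "narrow",
    "operational.opsec_discipline": "careless",
    "temporal.persistence": "resident",
}
bot_session = {**human_session, "motor.keystroke_cadence": "machine"}
```

Under these assumptions `human_session` matches two of the three dominant clauses and clears the floor, while `bot_session` is excluded by the incompatible clause regardless of its other observations. An engine that learns weights from labeled data would replace `score` and could keep the eligibility checks.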
+ +```yaml +profile: opportunistic_crimeware_operator +dominant_observations: + - {primitive: motor.keystroke_cadence, value_in: [bursty, hunt_and_peck], weight: 0.5} + - {primitive: motor.error_correction, value_in: [immediate], weight: 0.4} + - {primitive: cognitive.cognitive_load, value_in: [high], weight: 0.5} + - {primitive: cognitive.tool_vocabulary, value_in: [narrow], weight: 0.6} + - {primitive: cognitive.error_resilience.retry_tactic, value_in: [rerun], weight: 0.4} + - {primitive: temporal.session_duration, value_in: [short], weight: 0.4} + - {primitive: temporal.persistence, value_in: [hit_and_run], weight: 0.5} + - {primitive: operational.opsec_discipline, value_in: [careless], weight: 0.6} + - {primitive: toolchain.tls.ja3_client, match: common_default, weight: 0.3} +incompatible_observations: + - {primitive: motor.keystroke_cadence, value_eq: machine} +exemplars: [] +notes: "Tell vs. nearest neighbor (initial_access_broker): lacks validation discipline — does not test creds across services before exiting." +min_confidence: 0.55 +``` + +--- + +### `initial_access_broker` + +Distinct profession in the criminal economy. Gets in, validates, sells. No post-exploitation. 
+ +```yaml +profile: initial_access_broker +dominant_observations: + - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5} + - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.5} + - {primitive: cognitive.exploration_style, value_in: [targeted], weight: 0.7} + - {primitive: cognitive.planning_depth, value_in: [shallow], weight: 0.4} + - {primitive: temporal.session_duration, value_in: [short], weight: 0.5} + - {primitive: temporal.persistence, value_in: [return_visitor], weight: 0.5} + - {primitive: operational.objective, value_in: [recon], weight: 0.6} + - {primitive: toolchain.http.user_agent_tool_class, value_in: [evilwinrm, impacket], weight: 0.5} +incompatible_observations: + - {primitive: operational.objective, value_in: [destructive]} +exemplars: ["UNC2465", "UNC2596"] +notes: "Tell vs. ransomware_affiliate: escalation absent — validates AD reachability and exits, never deploys payload." +min_confidence: 0.6 +``` + +--- + +### `ransomware_affiliate` + +Post-exploitation hands-on actor running a RaaS playbook (LockBit, ALPHV/BlackCat, Akira, Play, Medusa). + +```yaml +profile: ransomware_affiliate +dominant_observations: + - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5} + - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.5} + - {primitive: cognitive.exploration_style, value_in: [methodical], weight: 0.7} + - {primitive: temporal.escalation_pattern, value_in: [bursty], weight: 0.5} + - {primitive: temporal.session_duration, value_in: [long, marathon], weight: 0.5} + - {primitive: toolchain.c2.beacon_family, value_in: [cobalt_strike, sliver, havoc], weight: 0.8} +necessary_observations: + - {primitive: operational.objective, value_in: [destructive], within_window: engagement} +incompatible_observations: + - {primitive: identity.engagement.authorized, matches_session: true} # excludes red-team +exemplars: ["G1015", "G1040", "G0102"] +notes: "Tell vs. 
state_aligned_espionage_operator: dwell is days, not months; exfil-then-encrypt closes the engagement loudly." +min_confidence: 0.65 +``` + +--- + +### `state_aligned_espionage_operator` + +APT tradecraft. Disciplined, patient, custom tooling, careful opsec, long dwell. + +```yaml +profile: state_aligned_espionage_operator +dominant_observations: + - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.5} + - {primitive: motor.motor_stability, value_in: [steady], weight: 0.4} + - {primitive: motor.error_correction, value_in: [route_around], weight: 0.5} + - {primitive: cognitive.cognitive_load, value_in: [low], weight: 0.5} + - {primitive: cognitive.tool_vocabulary, value_in: [broad], weight: 0.6} + - {primitive: cognitive.planning_depth, value_in: [deep], weight: 0.6} + - {primitive: temporal.persistence, value_in: [resident], weight: 0.7} + - {primitive: operational.opsec_discipline, value_in: [careful], weight: 0.7} + - {primitive: operational.cleanup_behavior, value_in: [thorough], weight: 0.6} + - {primitive: toolchain.c2.beacon_family, value_in: [unknown], weight: 0.4} # custom implants +incompatible_observations: + - {primitive: operational.objective, value_in: [destructive], dominant_in_window: true} + - {primitive: identity.engagement.authorized, matches_session: true} +exemplars: ["G0007", "G0016", "G0050", "G0096"] +notes: | + Tell vs. authorized_red_teamer: objective trends to long-term collection; no engagement-bounded dwell. + Tell vs. ransomware_affiliate: encryption never fires. +min_confidence: 0.7 +``` + +--- + +### `authorized_red_teamer` + +Pentester or red-team engagement. Legally scoped. 
**Critical to distinguish: the most common attribution failure is treating a friendly as hostile.** + +```yaml +profile: authorized_red_teamer +necessary_observations: + - {primitive: identity.engagement.authorized, matches_session: true} # without registry hit, profile cannot apply +dominant_observations: + - {primitive: motor.keystroke_cadence, value_in: [steady], weight: 0.4} + - {primitive: motor.command_chunking, value_in: [fluent], weight: 0.4} + - {primitive: cognitive.tool_vocabulary, value_in: [broad], weight: 0.5} + - {primitive: cognitive.exploration_style, value_in: [methodical], weight: 0.5} + - {primitive: temporal.session_timing, value_in: [diurnal], weight: 0.4} + - {primitive: toolchain.c2.beacon_family, value_in: [cobalt_strike, sliver], weight: 0.5} +exemplars: [] +notes: | + The necessary_observation on identity.engagement.authorized is load-bearing. Without an authoritative + engagement registry hit, the profile must not apply — otherwise red-teamers collapse onto + ransomware_affiliate. C2 watermark resolution against known commercial license keys is a secondary + signal but not enforced in this recipe. +min_confidence: 0.7 +``` + +--- + +### `malicious_insider` *(aspirational — requires per-identity baselining)* + +Already authenticated. Knows the environment. No exploitation phase. **Not yet operational** — depends on per-identity historical baselining, which is an engine feature that does not exist yet. 
+ +```yaml +profile: malicious_insider +status: aspirational +necessary_observations: + - {primitive: identity.label.applied, contains: insider_baseline_exists} # gate: can only apply if baseline exists +dominant_observations: + - {primitive: cognitive.tool_vocabulary, value_in: [narrow], context: environment_specific, weight: 0.4} + - {primitive: temporal.session_timing, deviation_from: identity_baseline, weight: 0.6} + - {primitive: operational.objective, value_in: [exfil, destructive], no_recon_phase: true, weight: 0.6} +exemplars: [] +notes: | + Detectable only as DEVIATION FROM SELF, not from population. Requires per-identity historical + baseline (NOT YET IMPLEMENTED). Cross-references HR/UEBA out-of-band. Until baselining ships, + the engine should not emit candidates for this profile. +min_confidence: 0.7 +``` + +--- + +### `automated_scanner_bot` + +Mass scanners (Shodan, Censys, internetdb), exploit-as-a-service worms (Mirai descendants, Mozi, RondoDox), opportunistic CVE chasers. **No human present.** + +```yaml +profile: automated_scanner_bot +necessary_observations: + - {primitive: motor.keystroke_cadence, value_eq: machine} +dominant_observations: + - {primitive: motor.error_correction, value_in: [absent], weight: 0.5} + - {primitive: temporal.lifecycle_markers.idle_periodicity, value_in: [periodic], weight: 0.6} + - {primitive: temporal.escalation_pattern, value_in: [sustained], weight: 0.5} + - {primitive: operational.objective, value_in: [recon], weight: 0.5} + - {primitive: toolchain.http.user_agent_tool_class, value_in: [masscan, nuclei, unknown], weight: 0.5} +exemplars: [] +notes: "Tell vs. opportunistic_crimeware_operator: no human latency, no error correction, no command sequencing." 
+min_confidence: 0.8 +``` + +--- + +### `ai_assisted_operator` *(empirically calibrated 2026-05-02 — "YOU-sim" signature)* + +Operator working alongside an LLM — typing some commands, pasting others, pacing themselves at typing-speed because the LLM is suggesting next moves but the human is still in the chair making decisions. **The most operationally important class to detect**: this is the realistic 2026 adversary, neither pure human nor pure agent. They inherit *some* mechanical signatures from the LLM (clean pastes, no typos, scripted-feeling commands) and *some* human signatures from the operator (variable paste rate, faster pauses than pure LLM, real intent driving the recon flow). On the 5-point calibration grid this profile sits **between** `HUMAN` and `LW-sim`, sharing primitives with both — which is exactly why it's hard to spot and worth modelling explicitly. + +```yaml +profile: ai_assisted_operator +status: empirically_calibrated +calibration_session: "46434173-82ee-4b3b-bfcd-c954607050a2" # YOU-sim, sessions-2026-05-02-with-llm.jsonl +dominant_observations: + - {primitive: motor.input_modality, value_in: [pasted], weight: 0.6} + - {primitive: motor.paste_burst_rate, value_in: [occasional], weight: 0.7} # NOT habitual — that's pure-LLM + - {primitive: motor.error_correction, value_in: [absent], weight: 0.5} + - {primitive: motor.shell_mastery.tab_completion, value_in: [none], weight: 0.4} + - {primitive: cognitive.inter_command_latency_class, value_in: [typing_speed], weight: 0.7} # FASTER than llm_lightweight + - {primitive: cognitive.inter_command_consistency, value_in: [metronomic], weight: 0.6} + - {primitive: cognitive.command_branch_diversity, value_in: [linear_playbook], weight: 0.4} + - {primitive: cognitive.feedback_loop_engagement, value_in: [fire_and_forget], weight: 0.3} +incompatible_observations: + - {primitive: motor.input_modality, value_eq: typed} # rules out pure human +exemplars: [] +notes: | + Hybrid signature sitting between HUMAN 
(typed + bimodal + closed_loop + instant) + and LW-sim (pasted + habitual + llm_lightweight + linear_playbook + fire_and_forget). + + Distinguishing tells from neighbors on the calibration grid: + vs HUMAN: pasted (not typed); absent error correction; metronomic (not bimodal); no tab use + vs LW-sim: paste rate is OCCASIONAL not HABITUAL (operator types some commands); + pauses sit in TYPING_SPEED band not LLM_LIGHTWEIGHT (faster — human is the bottleneck, + not the model) + vs CLAUDE-FF: same as LW-sim plus pause band difference; the heavyweight pause band cleanly excludes + this profile + + The "occasional paste rate + typing_speed pauses" combination is the load-bearing fingerprint. + Pure-LLM operators paste habitually; pure humans don't paste at all; LLM-assisted operators + paste SOMETIMES (when copying an LLM suggestion verbatim) and type the rest, AND their pauses + are dominated by operator decision time (typing-speed) rather than model round-trip + (llm_lightweight or slower). This is the empirical signature that emerged from the 2026-05-02 + calibration grid, replacing the v0.1 speculative definition. + + CALIBRATION CAVEAT: the YOU-sim session that calibrated this profile was a human deliberately + pacing themselves to mimic an LLM-assisted operator (paste-and-pause uniformly). A REAL LLM- + assisted threat actor in the wild may show MORE variability (mixing typed and pasted within a + session, variable pause distributions) — the metronomic-paste-uniform signature here is the + IDEALIZED form. Real-world detection should weight the joint signature loosely until field- + validated against actual incident data. +min_confidence: 0.65 +``` + +--- + +## Linkage Rules + +These rules consume observations from two identities (or two sessions of one identity) and emit `attribution.linkage.proposed` events. The clusterer (or a human) accepts or rejects each proposal. + +Confidence is numeric `[0.0, 1.0]`. 
Action thresholds are engine-configurable; reasonable defaults below. + +| Correlation | Confidence | Suggested Action | +| :--- | :--- | :--- | +| Same motor profile + same toolchain | `>= 0.9` | Propose link / merge | +| Same motor profile + different toolchain | `0.75 - 0.9` | Propose link as tool rotation; flag for review | +| Different motor profile + same toolchain | `< 0.4` | Propose **shared infrastructure** marker; do NOT merge identities | +| Same motor profile + different IP/creds | `0.8 - 0.95` | Propose link; behavioral match overrides network identity | +| Environmental signals conflict with motor (e.g. layout/locale shift mid-session) | `0.5 - 0.7` | Flag for review; possible red team or proxied access | + +"Same motor profile" here means an aggregate over the motor observation streams — the engine decides how to compute similarity (vector distance over feature space, learned embedding, etc.). BEHAVE provides the streams; the engine provides the metric. + +--- + +## User-Owned Topic Schemas + +These topics are NOT BEHAVE-owned and NOT engine-emitted. Users publish to them; the engine consumes them. Schemas are listed here for engine-implementer reference. + +### `identity.label.applied` + +Manual ground-truth label on an identity. + +``` +{ + identity_ref: "uuid-...", # AttackerIdentity UUID + label: "ransomware_affiliate", # may match a profile name OR be free-form + source: "analyst:asamuel", # who applied the label + confidence: 0.95, # the labeler's confidence + evidence: "incident-4471", # optional pointer to evidence (ticket, IR report, etc.) + ts: 1714521661.001, + id: "uuid-...", + v: 1 +} +``` + +### `identity.engagement.authorized` + +Registry entry for an authorized engagement (red team, pentest, bug bounty window). 
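A minimal sketch of how an engine might test a session against one registry entry, using the `window` and `scope` fields of this topic's schema. The function name `session_matches_engagement` is hypothetical, and the matching semantics are assumptions: the document does not define how `matches_session` is resolved, so this sketch requires the session to fall inside the window and to match both a scoped network and a scoped account, and it treats account entries as shell-style globs.

```python
from fnmatch import fnmatchcase
from ipaddress import ip_address, ip_network


def session_matches_engagement(
    entry: dict, session_ts: float, session_ip: str, account: str
) -> bool:
    """Return True if the session plausibly belongs to the authorized engagement.

    ASSUMPTIONS (not specified by the schema): scope.networks is compared
    against the session's IP, scope.accounts entries are glob patterns, and
    both must match in addition to the time window.
    """
    window = entry["window"]
    if not window["start_ts"] <= session_ts <= window["end_ts"]:
        return False

    scope = entry["scope"]
    addr = ip_address(session_ip)
    in_network = any(addr in ip_network(net) for net in scope.get("networks", []))
    in_accounts = any(fnmatchcase(account, pat) for pat in scope.get("accounts", []))
    return in_network and in_accounts


# Example entry reusing the values from the schema for this topic.
entry = {
    "engagement_id": "engagement-2026-q2-redteam-acme",
    "scope": {
        "networks": ["10.0.0.0/8"],
        "accounts": ["redteam-svc-*"],
    },
    "window": {"start_ts": 1714521600, "end_ts": 1717113600},
}
```

A stricter engine might also require a C2 watermark match from `c2_watermarks`. This sketch leaves that out because the `authorized_red_teamer` recipe describes watermark resolution as a secondary signal that is not enforced.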
+ +``` +{ + engagement_id: "engagement-2026-q2-redteam-acme", + scope: { + networks: ["10.0.0.0/8"], + domains: ["acme-test.example"], + accounts: ["redteam-svc-*"], + c2_watermarks: ["acme-cs-license-7f3a"], # known consultancy license keys + }, + window: { + start_ts: 1714521600, + end_ts: 1717113600, + }, + consultancy: "ACME Red Team Inc.", + contact: "redteam@acme-rt.example", + ts: 1714521661.001, + id: "uuid-...", + v: 1 +} +``` + +The `authorized_red_teamer` profile recipe consumes this topic via its `necessary_observations` clause. Without a matching engagement registry entry, the profile does not apply. diff --git a/BEHAVE-SHELL/decnet_behave_shell/__init__.py b/BEHAVE-SHELL/decnet_behave_shell/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/BEHAVE-SHELL/decnet_behave_shell/spec/__init__.py b/BEHAVE-SHELL/decnet_behave_shell/spec/__init__.py new file mode 100644 index 0000000..3686037 --- /dev/null +++ b/BEHAVE-SHELL/decnet_behave_shell/spec/__init__.py @@ -0,0 +1,37 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""BEHAVE observation envelope and primitive registry — DECNET-aligned. + +Public API: + + from spec import Observation, Window, OBSERVATION_SCHEMA_VERSION + from spec import PRIMITIVE_REGISTRY, ValueKind, ValueTypeSpec + from spec import event_topic_for, to_event_payload, from_event_payload + +See ``spec.envelope`` for the central PII-discipline statement that binds every +sensor emitting BEHAVE observations. 
+""" + +from .envelope import OBSERVATION_SCHEMA_VERSION, Observation, ObservationValue, Window +from .event_adapter import ( + TOPIC_PREFIX, + event_topic_for, + from_event_payload, + to_event_payload, +) +from .primitives import PRIMITIVE_REGISTRY, ValueKind, ValueTypeSpec, get, is_known + +__all__ = [ + "OBSERVATION_SCHEMA_VERSION", + "Observation", + "ObservationValue", + "Window", + "PRIMITIVE_REGISTRY", + "ValueKind", + "ValueTypeSpec", + "is_known", + "get", + "TOPIC_PREFIX", + "event_topic_for", + "to_event_payload", + "from_event_payload", +] diff --git a/BEHAVE-SHELL/decnet_behave_shell/spec/envelope.py b/BEHAVE-SHELL/decnet_behave_shell/spec/envelope.py new file mode 100644 index 0000000..785cee1 --- /dev/null +++ b/BEHAVE-SHELL/decnet_behave_shell/spec/envelope.py @@ -0,0 +1,57 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""BEHAVE-SHELL Observation envelope (registry-aware subclass). + +The base envelope (`Observation`, `Window`, `OBSERVATION_SCHEMA_VERSION`, +`ObservationValue`) lives in `decnet-behave-core`; it enforces only structural +invariants (window ordering, confidence bounds, schema version, no extras). + +This module subclasses the core `Observation` to add registry-aware validation +against `BEHAVE-SHELL`'s `PRIMITIVE_REGISTRY`. The subclass is exported under +the same name `Observation` so existing imports (``from spec.envelope import +Observation``) continue to resolve to the registry-validated form without +consumer changes. + +PII discipline (lifted from DECNET ``attackers.py:268-285,308-311``) — see the +core envelope module docstring for the binding statement. 
+""" + +from __future__ import annotations + +from pydantic import model_validator + +from decnet_behave_core.spec.envelope import ( + OBSERVATION_SCHEMA_VERSION, + ObservationValue, + Window, +) +from decnet_behave_core.spec.envelope import Observation as _BaseObservation + +from .primitives import PRIMITIVE_REGISTRY + + +class Observation(_BaseObservation): + """Shell-domain Observation: base envelope + BEHAVE-SHELL registry check.""" + + @model_validator(mode="after") + def _validate_against_shell_registry(self) -> "Observation": + spec = PRIMITIVE_REGISTRY.get(self.primitive) + if spec is None: + raise ValueError( + f"unknown primitive {self.primitive!r}; " + f"add it to spec/primitives.py:PRIMITIVE_REGISTRY first" + ) + try: + spec.validate_value(self.value) + except ValueError as exc: + raise ValueError( + f"value invalid for primitive {self.primitive!r}: {exc}" + ) from None + return self + + +__all__ = [ + "OBSERVATION_SCHEMA_VERSION", + "Observation", + "ObservationValue", + "Window", +] diff --git a/BEHAVE-SHELL/decnet_behave_shell/spec/event_adapter.py b/BEHAVE-SHELL/decnet_behave_shell/spec/event_adapter.py new file mode 100644 index 0000000..888e7e4 --- /dev/null +++ b/BEHAVE-SHELL/decnet_behave_shell/spec/event_adapter.py @@ -0,0 +1,58 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""DECNET bus interop. Aligns BEHAVE Observation with DECNET Event payload shape. + +DECNET's Event (decnet/bus/base.py:26) carries ``(topic, payload, type, v, ts, id)``. +A BEHAVE Observation maps onto that envelope as follows: + + topic = "attacker.observation." + observation.primitive + payload = observation.model_dump(exclude={"id", "ts", "v"}) + type = observation.primitive + v = observation.v + ts = observation.ts + id = observation.id + +The publisher must set ``topic`` from the primitive when calling ``bus.publish()``; +DECNET's bus does not trust topic from the wire (anti-spoofing, base.py:60-76). + +This module does NOT import DECNET. 
The adapter speaks dicts; consumers wire it +to their own bus. +""" + +from __future__ import annotations + +from typing import Any + +from .envelope import Observation + +TOPIC_PREFIX: str = "attacker.observation" + + +def event_topic_for(primitive: str) -> str: + """Return the canonical DECNET bus topic for a BEHAVE primitive.""" + return f"{TOPIC_PREFIX}.{primitive}" + + +def to_event_payload(obs: Observation) -> dict[str, Any]: + """Project an Observation into a dict suitable for ``Event.payload``. + + Excludes ``id``, ``ts``, and ``v`` because those are carried at the Event + envelope level by DECNET, not in the payload body. + """ + return obs.model_dump(exclude={"id", "ts", "v"}, mode="json") + + +def from_event_payload(primitive: str, payload: dict[str, Any]) -> Observation: + """Reconstruct an Observation from ``(topic-derived primitive, Event.payload)``. + + The ``primitive`` argument is the trailing segment of the bus topic, NOT a + field read from the payload — relying on the wire-side ``primitive`` field + would let a misbehaving publisher spoof observations on topics they don't + actually publish to. This mirrors DECNET's ``Event.from_dict`` discipline + (decnet/bus/base.py:60-76). + """ + if "primitive" in payload and payload["primitive"] != primitive: + raise ValueError( + f"payload.primitive ({payload['primitive']!r}) does not match " + f"topic-derived primitive ({primitive!r}); refusing to reconstruct" + ) + return Observation.model_validate({**payload, "primitive": primitive}) diff --git a/BEHAVE-SHELL/decnet_behave_shell/spec/primitives.py b/BEHAVE-SHELL/decnet_behave_shell/spec/primitives.py new file mode 100644 index 0000000..6edd896 --- /dev/null +++ b/BEHAVE-SHELL/decnet_behave_shell/spec/primitives.py @@ -0,0 +1,298 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""BEHAVE primitive registry. + +Source-of-truth for what `Observation.primitive` may be and what `Observation.value` +must look like. 
Mirrors every row in the primitive tables of `scratchpad.md`. + +Adding a new primitive is a deliberate registry edit. Sensors are expected to fail +loudly if they construct an `Observation` with an unknown primitive — that is by +design. Drift between this registry and `scratchpad.md` is a bug; v0.1 keeps the +registry hand-written so PR review catches drift, v0.2 may auto-extract from the +markdown if drift becomes a maintenance issue. + +PII discipline: the value-type specs here describe the SHAPE of the value, not +its content. Sensors are still bound by the rules in `spec/envelope.py`'s module +docstring — never put raw keystrokes, command bodies, credentials, or payload +bytes into a value, regardless of what shape this registry permits. +""" + +from __future__ import annotations + +from enum import Enum +from typing import Any, Optional + +from pydantic import BaseModel, Field + + +class ValueKind(str, Enum): + """Discriminator for the shape an `Observation.value` must take.""" + + CATEGORICAL = "categorical" # str, must appear in `allowed` + NUMERIC = "numeric" # int | float, optional min/max bounds + HASH = "hash" # str — hex / base64 / fingerprint string + ARRAY = "array" # list, element shape given by `array_of` + FREE_STRING = "free_string" # arbitrary string (e.g. BCP-47 locale, p0f label) + BOOL = "bool" # plain boolean + + +class ValueTypeSpec(BaseModel): + """Per-primitive value-type spec. + + Only the fields relevant to ``kind`` should be populated; the rest stay None. + Validation in ``Observation`` consults this spec to accept or reject a value + for a given primitive. 
+ """ + + kind: ValueKind + allowed: Optional[list[str]] = Field( + default=None, description="CATEGORICAL only — enum of valid string values" + ) + min_val: Optional[float] = Field(default=None, description="NUMERIC lower bound (inclusive)") + max_val: Optional[float] = Field(default=None, description="NUMERIC upper bound (inclusive)") + array_of: Optional[ValueKind] = Field( + default=None, description="ARRAY only — kind of each element" + ) + notes: Optional[str] = Field(default=None, description="Free-form note for registry readers") + + def validate_value(self, value: Any) -> None: + """Raise ``ValueError`` if *value* does not conform to this spec.""" + if self.kind is ValueKind.CATEGORICAL: + if not isinstance(value, str): + raise ValueError(f"expected categorical string, got {type(value).__name__}") + if self.allowed is not None and value not in self.allowed: + raise ValueError( + f"value {value!r} not in allowed set {self.allowed!r}" + ) + elif self.kind is ValueKind.NUMERIC: + if isinstance(value, bool) or not isinstance(value, (int, float)): + raise ValueError(f"expected numeric, got {type(value).__name__}") + if self.min_val is not None and value < self.min_val: + raise ValueError(f"value {value} below min_val {self.min_val}") + if self.max_val is not None and value > self.max_val: + raise ValueError(f"value {value} above max_val {self.max_val}") + elif self.kind is ValueKind.HASH: + if not isinstance(value, str) or not value: + raise ValueError("expected non-empty hash string") + elif self.kind is ValueKind.FREE_STRING: + if not isinstance(value, str): + raise ValueError(f"expected string, got {type(value).__name__}") + elif self.kind is ValueKind.BOOL: + if not isinstance(value, bool): + raise ValueError(f"expected bool, got {type(value).__name__}") + elif self.kind is ValueKind.ARRAY: + if not isinstance(value, list): + raise ValueError(f"expected array, got {type(value).__name__}") + if self.array_of is None: + return + element_spec = 
ValueTypeSpec(kind=self.array_of) + for i, element in enumerate(value): + try: + element_spec.validate_value(element) + except ValueError as exc: + raise ValueError(f"array element [{i}]: {exc}") from None + + +# ─── Convenience constructors (keep the registry table readable) ──────────── + +def _cat(*allowed: str, notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.CATEGORICAL, allowed=list(allowed), notes=notes) + +def _num(min_val: Optional[float] = None, max_val: Optional[float] = None, notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.NUMERIC, min_val=min_val, max_val=max_val, notes=notes) + +def _hash(notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.HASH, notes=notes) + +def _str(notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.FREE_STRING, notes=notes) + +def _bool(notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.BOOL, notes=notes) + +def _array(of: ValueKind, notes: Optional[str] = None) -> ValueTypeSpec: + return ValueTypeSpec(kind=ValueKind.ARRAY, array_of=of, notes=notes) + + +# ─── The registry ─────────────────────────────────────────────────────────── +# +# Mirrors scratchpad.md row-for-row. If you edit one, edit the other. 
+ +PRIMITIVE_REGISTRY: dict[str, ValueTypeSpec] = { + # ── motor.* ──────────────────────────────────────────────────────────── + "motor.keystroke_cadence": _cat("steady", "bursty", "hunt_and_peck", "machine"), + "motor.motor_stability": _cat("steady", "variable", "tremor"), + "motor.error_correction": _cat("immediate", "deferred", "absent", "route_around"), + "motor.command_chunking": _cat("fluent", "fragmented", "single_command"), + "motor.paste_burst_rate": _cat("none", "occasional", "habitual"), + "motor.input_modality": _cat( + "typed", "pasted", "mixed", + notes="dominant input modality across the session — first-class promotion of the paste-vs-type axis", + ), + # motor.shell_mastery.* + "motor.shell_mastery.tab_completion": _cat("none", "occasional", "habitual"), + "motor.shell_mastery.shortcut_usage": _cat("none", "moderate", "heavy"), + "motor.shell_mastery.pipe_chaining_depth": _cat("shallow", "moderate", "deep"), + + # ── cognitive.* ──────────────────────────────────────────────────────── + "cognitive.cognitive_load": _cat("low", "medium", "high"), + "cognitive.exploration_style": _cat("methodical", "chaotic", "targeted"), + "cognitive.planning_depth": _cat("deep", "shallow", "reactive"), + "cognitive.tool_vocabulary": _cat("narrow", "moderate", "broad"), + "cognitive.inter_command_latency_class": _cat( + "instant", "typing_speed", "deliberate", + "llm_lightweight", "llm_heavyweight", "long", + notes="llm_lightweight = 2-8s (orchestrated agents w/ small models or terse " + "prompts); llm_heavyweight = 8-30s (reasoning-class agents in tool " + "loops with text generation between calls); long = >30s (likely " + "human-supervised LLM workflow). 
The two LLM bands are the v0.2 " + "split of the original llm_roundtrip 2-8s band, which conflated " + "lightweight and reasoning-class operators.", + ), + "cognitive.inter_command_consistency": _cat( + "metronomic", "variable", "bimodal", + notes="dispersion (CV) of inter-command pauses; metronomic = LLM-pure, " + "variable = human, bimodal = LLM-assisted human (LLM-paced bursts + " + "human-thinking gaps). v0.1 uses CV thresholds; true bimodal " + "detection (Hartigan dip / two-peak detection) is v0.2.", + ), + "cognitive.command_branch_diversity": _cat( + "linear_playbook", "adaptive_branching", "unknown", + notes="Content-based (not timing-based) discriminator between scripted " + "playbook execution and adaptive branching. Computed from the " + "set of first-token binaries in the session: low repetition " + "(unique/total ratio near 1) = linear_playbook (each step a " + "different canonical recon command). High repetition (multiple " + "invocations of the same tool with different args) = adaptive_" + "branching (operator iterating on a tool to follow up on a " + "finding). Empirically (CLAUDE-FF vs CLAUDE-CL on 2026-05-02): " + "fire-and-forget runs 10 distinct tools, closed-loop runs 5-6 " + "tools with curl repeated as the operator chases a thread.", + ), + "cognitive.feedback_loop_engagement": _cat( + "closed_loop", "fire_and_forget", "unknown", + notes="Whether the operator's pace correlates with the volume of output " + "they observed before issuing the next command. closed_loop = " + "positive Pearson r between preceding output bytes and subsequent " + "pause (pause grows with output to read/ingest). fire_and_forget = " + "no correlation (operator paces independently of output, e.g. " + "scripted recon, prerecorded playbook). unknown = insufficient " + "samples to compute. 
CUTS ACROSS the LLM/human axis: humans reading " + "real output are closed_loop, scripted humans and fire-and-forget " + "LLM agents are fire_and_forget, closed-loop LLM agents (true plan-" + "execute-observe) are closed_loop. Replaces the v0.1 " + "output_pause_correlation primitive — same underlying measurement, " + "more honest framing.", + ), + # cognitive.error_resilience.* + "cognitive.error_resilience.retry_tactic": _cat("rerun", "modify", "switch", "abort"), + "cognitive.error_resilience.frustration_typing": _cat("low", "moderate", "high"), + "cognitive.error_resilience.fallback_to_man": _cat("absent", "present"), + + # ── temporal.* ───────────────────────────────────────────────────────── + "temporal.session_timing": _cat("diurnal", "nocturnal", "irregular"), + "temporal.session_duration": _cat("short", "medium", "long", "marathon"), + "temporal.escalation_pattern": _cat("sustained", "erratic", "bursty"), + "temporal.persistence": _cat("hit_and_run", "return_visitor", "resident"), + # temporal.lifecycle_markers.* + "temporal.lifecycle_markers.landing_ritual": _cat("present", "absent"), + "temporal.lifecycle_markers.exit_behavior": _cat("graceful", "abrupt", "cleanup"), + "temporal.lifecycle_markers.idle_periodicity": _cat("random", "periodic"), + + # ── operational.* ────────────────────────────────────────────────────── + "operational.opsec_discipline": _cat("careful", "careless", "learning"), + "operational.cleanup_behavior": _cat("thorough", "partial", "none"), + "operational.objective": _cat("recon", "exfil", "persistence", "lateral", "destructive"), + "operational.multi_actor_indicators": _cat("solo", "handoff_detected", "team_coordinated"), + + # ── environmental.* ──────────────────────────────────────────────────── + "environmental.keyboard_layout": _cat("qwerty", "azerty", "qwertz", "other"), + "environmental.locale": _str(notes="BCP-47 tag (e.g. 
'en-US', 'pt-BR'); free string by deliberate choice"), + "environmental.numpad_usage": _cat("detected", "not_detected"), + "environmental.terminal_multiplexer": _cat("none", "tmux", "screen"), + "environmental.shell_type": _cat("bash", "zsh", "fish", "cmd.exe", "powershell"), + + # ── cultural.* ───────────────────────────────────────────────────────── + "cultural.meal_break_gaps": _cat("none_detected", "morning", "midday", "evening", "late_night"), + "cultural.periodic_micro_pauses": _cat("none_detected", "regular_intervals_detected"), + "cultural.dst_behavior": _cat("shifts_with_dst", "anchored_to_utc", "unknown"), + "cultural.weekend_cadence": _cat("fri_sat", "sat_sun", "no_weekend", "irregular"), + "cultural.holiday_gaps": _cat("none_detected", "specific_dates_detected"), + + # ── emotional_valence.* ──────────────────────────────────────────────── + "emotional_valence.valence": _cat("positive", "neutral", "negative"), + "emotional_valence.arousal": _cat("low_calm", "medium_engaged", "high_agitated"), + "emotional_valence.stress_response": _cat("none", "eustress_positive", "distress_negative"), + "emotional_valence.frustration_venting": _cat("none", "detected"), + + # ── toolchain.tls.* ──────────────────────────────────────────────────── + "toolchain.tls.ja3_client": _hash(), + "toolchain.tls.ja3s_server": _hash(), + "toolchain.tls.ja4_client": _hash(), + "toolchain.tls.ja4s_server": _hash(), + "toolchain.tls.jarm_server": _hash(notes="62-char JARM hash"), + "toolchain.tls.tls_cert_simhash": _hash(notes="SHA-256 hex of leaf cert"), + + # ── toolchain.transport.* ────────────────────────────────────────────── + "toolchain.transport.tcp_stack": _str(notes="p0f label, e.g. 
'Linux 5.x'"), + "toolchain.transport.h2_akamai_fingerprint": _str(notes="HTTP/2 SETTINGS+priority+pseudo-header order hash; status: planned"), + "toolchain.transport.quic_client": _str(notes="QUIC initial packet fingerprint; status: planned"), + + # ── toolchain.ssh.* ──────────────────────────────────────────────────── + "toolchain.ssh.hassh_client": _hash(notes="md5"), + "toolchain.ssh.hassh_server": _hash(notes="md5; status: partial"), + "toolchain.ssh.ssh_client_banner": _str(notes="RFC 4253 banner string"), + "toolchain.ssh.kex_algorithm_order": _array(ValueKind.FREE_STRING), + + # ── toolchain.http.* ─────────────────────────────────────────────────── + "toolchain.http.user_agent_tool_class": _cat( + "nmap_nse", "sqlmap", "nuclei", "masscan", "curl", "metasploit", + "ffuf", "gobuster", "feroxbuster", "nikto", "wpscan", "evilwinrm", + "impacket", "unknown", + ), + "toolchain.http.header_order_fingerprint": _str(notes="status: planned"), + "toolchain.http.body_oddities": _array(ValueKind.FREE_STRING, notes="status: planned"), + + # ── toolchain.c2.* ───────────────────────────────────────────────────── + "toolchain.c2.beacon_family": _cat( + "cobalt_strike", "sliver", "havoc", "mythic", + "merlin", "brc4", "nighthawk", "unknown", + notes="last 3 = status: planned", + ), + "toolchain.c2.beacon_interval_ms": _num(min_val=0, notes="median IAT in milliseconds"), + "toolchain.c2.beacon_jitter_cv": _num(min_val=0, notes="coefficient of variation"), + "toolchain.c2.sleep_skew": _cat("none", "gaussian", "uniform", "walk", notes="status: partial"), + "toolchain.c2.c2_callback_endpoint": _str(notes="url or host:port"), + "toolchain.c2.attack_software_id": _str(notes="MITRE Software ID, e.g. 
'S0154'"), + + # ── toolchain.protocol_abuse.* ───────────────────────────────────────── + "toolchain.protocol_abuse.dns_exfil_tool": _cat( + "iodine", "dnscat2", "custom_high_entropy", "none", notes="status: planned", + ), + "toolchain.protocol_abuse.smb_dialect": _cat( + "SMB1", "SMB2.0.2", "SMB2.1", "SMB3.0", "SMB3.0.2", "SMB3.1.1", + notes="status: planned", + ), + "toolchain.protocol_abuse.kerberos_etype_offer": _hash(notes="status: planned — hash of supported etypes"), + "toolchain.protocol_abuse.ldap_bind_pattern": _cat( + "simple", "sasl_gssapi", "ntlm", "ntlmssp_v1", "responder_like", + notes="status: partial", + ), + "toolchain.protocol_abuse.responder_signature": _str( + notes="bool + variant; convention: 'false' or 'true:llmnr', 'true:nbtns', etc.; status: planned", + ), + "toolchain.protocol_abuse.mitm6_signature": _bool(notes="status: planned"), + + # ── toolchain.payload.* ──────────────────────────────────────────────── + "toolchain.payload.payload_simhash": _hash(notes="64-bit SimHash, hex string"), + "toolchain.payload.payload_entropy_class": _cat("low", "medium", "high", "packed", notes="status: planned"), + "toolchain.payload.loader_family": _cat("donut", "sgn", "pe2sh", "nimcrypt", "unknown", notes="status: planned"), +} + + +def is_known(primitive: str) -> bool: + return primitive in PRIMITIVE_REGISTRY + + +def get(primitive: str) -> ValueTypeSpec: + """Return the value-type spec for *primitive*; raise KeyError if unknown.""" + return PRIMITIVE_REGISTRY[primitive] diff --git a/BEHAVE-SHELL/json/observation.schema.json b/BEHAVE-SHELL/json/observation.schema.json new file mode 100644 index 0000000..7e9af0a --- /dev/null +++ b/BEHAVE-SHELL/json/observation.schema.json @@ -0,0 +1,144 @@ +{ + "$defs": { + "Window": { + "description": "Measurement window. For point observations, ``start_ts == end_ts``.\n\nBoth fields are epoch seconds (float). 
Distinct from ``Observation.ts``\n(the emission time), because a sensor may compute an observation over\na window in the past and emit it later.", + "properties": { + "end_ts": { + "description": "Window end, epoch seconds (>= start_ts)", + "title": "End Ts", + "type": "number" + }, + "start_ts": { + "description": "Window start, epoch seconds", + "title": "Start Ts", + "type": "number" + } + }, + "required": [ + "start_ts", + "end_ts" + ], + "title": "Window", + "type": "object" + } + }, + "$id": "https://behave.local/schema/observation/v1.json", + "$schema": "https://json-schema.org/draft/2020-12/schema", + "additionalProperties": false, + "description": "Shell-domain Observation: base envelope + BEHAVE-SHELL registry check.", + "properties": { + "confidence": { + "description": "Sensor's confidence in this measurement (not in any downstream verdict)", + "maximum": 1.0, + "minimum": 0.0, + "title": "Confidence", + "type": "number" + }, + "evidence_ref": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "description": "Pointer to underlying raw evidence; NEVER the evidence itself", + "title": "Evidence Ref" + }, + "id": { + "description": "UUID for dedup", + "title": "Id", + "type": "string" + }, + "identity_ref": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "description": "AttackerIdentity UUID if the observation is pre-attributed", + "title": "Identity Ref" + }, + "primitive": { + "description": "Fully-qualified primitive path, e.g. 'motor.keystroke_cadence'", + "title": "Primitive", + "type": "string" + }, + "source": { + "description": "Canonical sensor identifier, e.g. 
'decnet/sniffer/timing.py'", + "minLength": 1, + "title": "Source", + "type": "string" + }, + "ts": { + "description": "Emission timestamp, epoch seconds", + "title": "Ts", + "type": "number" + }, + "v": { + "default": 1, + "description": "Envelope schema version", + "title": "V", + "type": "integer" + }, + "value": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "integer" + }, + { + "type": "number" + }, + { + "type": "boolean" + }, + { + "items": { + "type": "string" + }, + "type": "array" + }, + { + "items": { + "type": "integer" + }, + "type": "array" + }, + { + "items": { + "type": "number" + }, + "type": "array" + }, + { + "additionalProperties": true, + "type": "object" + } + ], + "description": "Value typed by the primitive's registry entry; see spec.primitives", + "title": "Value" + }, + "window": { + "$ref": "#/$defs/Window", + "description": "Measurement window" + } + }, + "required": [ + "primitive", + "value", + "confidence", + "window", + "source" + ], + "title": "Observation", + "type": "object" +} diff --git a/BEHAVE-SHELL/pyproject.toml b/BEHAVE-SHELL/pyproject.toml new file mode 100644 index 0000000..770db7d --- /dev/null +++ b/BEHAVE-SHELL/pyproject.toml @@ -0,0 +1,33 @@ +[build-system] +requires = ["setuptools>=68", "wheel"] +build-backend = "setuptools.build_meta" + +[project] +name = "decnet-behave-shell" +version = "0.1.0" +description = "BEHAVE-SHELL — shell-session behavioral observation registry, layered on decnet-behave-core" +requires-python = ">=3.11" +license = { text = "GPL-3.0-or-later" } +authors = [{ name = "ANTI" }] +dependencies = ["pydantic>=2.6", "decnet-behave-core>=0.1.0"] + +[project.optional-dependencies] +dev = ["pytest>=8", "pytest-cov", "ruff"] + +[project.urls] +"Source" = "https://git.resacachile.cl/anti/BEHAVE" + +[tool.setuptools.packages.find] +include = ["decnet_behave_shell*"] + +[tool.ruff] +line-length = 100 +target-version = "py311" + +[tool.ruff.lint] +select = ["E", "F", "I", "B", "UP"] +ignore = 
["E501"] + +[tool.pytest.ini_options] +testpaths = ["tests"] +addopts = "-q --import-mode=importlib" diff --git a/BEHAVE-SHELL/scripts/generate_schema.py b/BEHAVE-SHELL/scripts/generate_schema.py new file mode 100644 index 0000000..23bc816 --- /dev/null +++ b/BEHAVE-SHELL/scripts/generate_schema.py @@ -0,0 +1,42 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""Regenerate ``json/observation.schema.json`` from the Pydantic source of truth. + +Idempotent. CI can gate on ``git diff --quiet json/observation.schema.json`` after +running this — a non-empty diff means someone changed the model without +regenerating the JSON Schema artifact. +""" + +from __future__ import annotations + +import json +import sys +from pathlib import Path + +# Allow running this script directly without installing the package. +_REPO_ROOT = Path(__file__).resolve().parent.parent +if str(_REPO_ROOT) not in sys.path: + sys.path.insert(0, str(_REPO_ROOT)) + +from decnet_behave_shell.spec.envelope import OBSERVATION_SCHEMA_VERSION, Observation # noqa: E402 + + +def build_schema() -> dict: + schema = Observation.model_json_schema() + schema["$id"] = ( + f"https://behave.local/schema/observation/v{OBSERVATION_SCHEMA_VERSION}.json" + ) + schema["$schema"] = "https://json-schema.org/draft/2020-12/schema" + return schema + + +def main() -> int: + schema = build_schema() + out = _REPO_ROOT / "json" / "observation.schema.json" + out.parent.mkdir(parents=True, exist_ok=True) + out.write_text(json.dumps(schema, indent=2, sort_keys=True) + "\n") + print(f"wrote {out}") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/BEHAVE-SHELL/tests/test_envelope.py b/BEHAVE-SHELL/tests/test_envelope.py new file mode 100644 index 0000000..d39d5b8 --- /dev/null +++ b/BEHAVE-SHELL/tests/test_envelope.py @@ -0,0 +1,107 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""Registry-aware envelope tests for BEHAVE-SHELL. 
+ +Structural envelope tests (window, confidence bounds, schema version, etc.) +live in `decnet-behave-core`'s test suite. This file exercises the SHELL- +SPECIFIC validation: that BEHAVE-SHELL's Observation subclass rejects +primitives not in the shell registry and rejects values that violate the +per-primitive ValueTypeSpec. +""" + +from __future__ import annotations + +import pytest +from pydantic import ValidationError + +from decnet_behave_shell.spec import Observation, Window + + +def _make(primitive: str = "motor.keystroke_cadence", value="steady", **kwargs) -> Observation: + base = dict( + primitive=primitive, + value=value, + confidence=0.8, + window=Window(start_ts=1.0, end_ts=2.0), + source="test/sensor", + ) + base.update(kwargs) + return Observation(**base) + + +def test_unknown_primitive_rejected(): + with pytest.raises(ValidationError) as exc_info: + _make(primitive="motor.nonexistent", value="whatever") + assert "unknown primitive" in str(exc_info.value) + + +def test_categorical_value_outside_allowed_rejected(): + with pytest.raises(ValidationError) as exc_info: + _make(primitive="motor.keystroke_cadence", value="not_a_real_value") + assert "not in allowed set" in str(exc_info.value) + + +def test_categorical_wrong_type_rejected(): + with pytest.raises(ValidationError): + _make(primitive="motor.keystroke_cadence", value=42) + + +def test_numeric_min_bound_enforced(): + with pytest.raises(ValidationError): + _make(primitive="toolchain.c2.beacon_interval_ms", value=-1) + + +def test_numeric_accepts_valid(): + obs = _make(primitive="toolchain.c2.beacon_interval_ms", value=60_000) + assert obs.value == 60_000 + + +def test_numeric_rejects_bool(): + # bool is a subclass of int — must be rejected explicitly. 
+ with pytest.raises(ValidationError): + _make(primitive="toolchain.c2.beacon_interval_ms", value=True) + + +def test_hash_requires_nonempty_string(): + with pytest.raises(ValidationError): + _make(primitive="toolchain.tls.ja3_client", value="") + + +def test_array_validates_elements(): + obs = _make( + primitive="toolchain.ssh.kex_algorithm_order", + value=["curve25519-sha256", "ecdh-sha2-nistp256"], + ) + assert isinstance(obs.value, list) + + +def test_array_rejects_non_list(): + with pytest.raises(ValidationError): + _make(primitive="toolchain.ssh.kex_algorithm_order", value="not a list") + + +def test_bool_primitive_accepts_bool(): + obs = _make(primitive="toolchain.protocol_abuse.mitm6_signature", value=True) + assert obs.value is True + + +def test_bool_primitive_rejects_int(): + with pytest.raises(ValidationError): + _make(primitive="toolchain.protocol_abuse.mitm6_signature", value=1) + + +def test_free_string_primitive_accepts_arbitrary_string(): + obs = _make(primitive="environmental.locale", value="pt-BR") + assert obs.value == "pt-BR" + + +def test_extra_fields_still_forbidden_via_subclass(): + # Inherited from base — the subclass shouldn't relax this. 
+ with pytest.raises(ValidationError): + Observation( + primitive="motor.keystroke_cadence", + value="steady", + confidence=0.5, + window=Window(start_ts=1.0, end_ts=2.0), + source="test/sensor", + unknown_field="oops", + ) diff --git a/BEHAVE-SHELL/tests/test_event_adapter.py b/BEHAVE-SHELL/tests/test_event_adapter.py new file mode 100644 index 0000000..705434f --- /dev/null +++ b/BEHAVE-SHELL/tests/test_event_adapter.py @@ -0,0 +1,89 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""DECNET interop tests for the event adapter.""" + +from __future__ import annotations + +import pytest + +from decnet_behave_shell.spec import ( + Observation, + Window, + event_topic_for, + from_event_payload, + to_event_payload, +) + + +def _obs(**kwargs) -> Observation: + base = dict( + primitive="motor.keystroke_cadence", + value="steady", + confidence=0.8, + window=Window(start_ts=1.0, end_ts=2.0), + source="test/sensor", + ) + base.update(kwargs) + return Observation(**base) + + +def test_topic_derivation_uses_attacker_observation_prefix(): + topic = event_topic_for("motor.keystroke_cadence") + assert topic == "attacker.observation.motor.keystroke_cadence" + + +def test_topic_handles_deeply_nested_primitive(): + topic = event_topic_for("toolchain.protocol_abuse.smb_dialect") + assert topic == "attacker.observation.toolchain.protocol_abuse.smb_dialect" + + +def test_payload_excludes_envelope_level_fields(): + obs = _obs() + payload = to_event_payload(obs) + # These fields ride at the DECNET Event envelope, not in the payload body. + assert "id" not in payload + assert "ts" not in payload + assert "v" not in payload + # These remain in the payload body. 
+ assert payload["primitive"] == "motor.keystroke_cadence" + assert payload["value"] == "steady" + assert payload["confidence"] == 0.8 + assert payload["source"] == "test/sensor" + + +def test_round_trip_through_event_payload(): + obs = _obs( + evidence_ref="session_X/keystrokes[0:42]", + identity_ref="00000000000000000000000000000001", + ) + payload = to_event_payload(obs) + reconstructed = from_event_payload("motor.keystroke_cadence", payload) + + # id and ts will differ (auto-generated on reconstruct), v defaults match. + assert reconstructed.primitive == obs.primitive + assert reconstructed.value == obs.value + assert reconstructed.confidence == obs.confidence + assert reconstructed.window == obs.window + assert reconstructed.source == obs.source + assert reconstructed.evidence_ref == obs.evidence_ref + assert reconstructed.identity_ref == obs.identity_ref + assert reconstructed.v == obs.v + + +def test_from_event_payload_rejects_topic_payload_mismatch(): + obs = _obs() + payload = to_event_payload(obs) + # payload still carries primitive="motor.keystroke_cadence"; reconstructing + # under a different topic-derived primitive must refuse rather than silently + # adopt the wire-side value (see decnet/bus/base.py:60-76 for the same anti- + # spoofing discipline). + with pytest.raises(ValueError, match="does not match"): + from_event_payload("toolchain.tls.ja3_client", payload) + + +def test_payload_is_json_serializable(): + import json + obs = _obs(primitive="toolchain.ssh.kex_algorithm_order", value=["a", "b"]) + payload = to_event_payload(obs) + serialized = json.dumps(payload) + deserialized = json.loads(serialized) + assert deserialized["value"] == ["a", "b"] diff --git a/BEHAVE-SHELL/tests/test_primitives.py b/BEHAVE-SHELL/tests/test_primitives.py new file mode 100644 index 0000000..42ab025 --- /dev/null +++ b/BEHAVE-SHELL/tests/test_primitives.py @@ -0,0 +1,134 @@ +# SPDX-License-Identifier: GPL-3.0-or-later +"""Registry coverage tests. 
+ +Asserts that every primitive listed in scratchpad.md's tables has exactly one +entry in PRIMITIVE_REGISTRY. Drift-detector — failing this test means +scratchpad.md and the registry have diverged. +""" + +from __future__ import annotations + +import re +from pathlib import Path + +from decnet_behave_shell.spec import PRIMITIVE_REGISTRY, ValueKind + +# Primitive paths expected by scratchpad.md (hand-extracted; v0.1). +EXPECTED_PRIMITIVES = { + # motor.* + "motor.keystroke_cadence", + "motor.motor_stability", + "motor.error_correction", + "motor.command_chunking", + "motor.paste_burst_rate", + "motor.input_modality", + "motor.shell_mastery.tab_completion", + "motor.shell_mastery.shortcut_usage", + "motor.shell_mastery.pipe_chaining_depth", + # cognitive.* + "cognitive.cognitive_load", + "cognitive.exploration_style", + "cognitive.planning_depth", + "cognitive.tool_vocabulary", + "cognitive.inter_command_latency_class", + "cognitive.inter_command_consistency", + "cognitive.command_branch_diversity", + "cognitive.feedback_loop_engagement", + "cognitive.error_resilience.retry_tactic", + "cognitive.error_resilience.frustration_typing", + "cognitive.error_resilience.fallback_to_man", + # temporal.* + "temporal.session_timing", + "temporal.session_duration", + "temporal.escalation_pattern", + "temporal.persistence", + "temporal.lifecycle_markers.landing_ritual", + "temporal.lifecycle_markers.exit_behavior", + "temporal.lifecycle_markers.idle_periodicity", + # operational.* + "operational.opsec_discipline", + "operational.cleanup_behavior", + "operational.objective", + "operational.multi_actor_indicators", + # environmental.* + "environmental.keyboard_layout", + "environmental.locale", + "environmental.numpad_usage", + "environmental.terminal_multiplexer", + "environmental.shell_type", + # cultural.* + "cultural.meal_break_gaps", + "cultural.periodic_micro_pauses", + "cultural.dst_behavior", + "cultural.weekend_cadence", + "cultural.holiday_gaps", + # emotional_valence.* 
+ "emotional_valence.valence", + "emotional_valence.arousal", + "emotional_valence.stress_response", + "emotional_valence.frustration_venting", + # toolchain.tls.* + "toolchain.tls.ja3_client", + "toolchain.tls.ja3s_server", + "toolchain.tls.ja4_client", + "toolchain.tls.ja4s_server", + "toolchain.tls.jarm_server", + "toolchain.tls.tls_cert_simhash", + # toolchain.transport.* + "toolchain.transport.tcp_stack", + "toolchain.transport.h2_akamai_fingerprint", + "toolchain.transport.quic_client", + # toolchain.ssh.* + "toolchain.ssh.hassh_client", + "toolchain.ssh.hassh_server", + "toolchain.ssh.ssh_client_banner", + "toolchain.ssh.kex_algorithm_order", + # toolchain.http.* + "toolchain.http.user_agent_tool_class", + "toolchain.http.header_order_fingerprint", + "toolchain.http.body_oddities", + # toolchain.c2.* + "toolchain.c2.beacon_family", + "toolchain.c2.beacon_interval_ms", + "toolchain.c2.beacon_jitter_cv", + "toolchain.c2.sleep_skew", + "toolchain.c2.c2_callback_endpoint", + "toolchain.c2.attack_software_id", + # toolchain.protocol_abuse.* + "toolchain.protocol_abuse.dns_exfil_tool", + "toolchain.protocol_abuse.smb_dialect", + "toolchain.protocol_abuse.kerberos_etype_offer", + "toolchain.protocol_abuse.ldap_bind_pattern", + "toolchain.protocol_abuse.responder_signature", + "toolchain.protocol_abuse.mitm6_signature", + # toolchain.payload.* + "toolchain.payload.payload_simhash", + "toolchain.payload.payload_entropy_class", + "toolchain.payload.loader_family", +} + + +def test_registry_covers_expected_primitives_exactly(): + registry_keys = set(PRIMITIVE_REGISTRY.keys()) + missing = EXPECTED_PRIMITIVES - registry_keys + extra = registry_keys - EXPECTED_PRIMITIVES + assert not missing, f"registry missing: {sorted(missing)}" + assert not extra, f"registry has unexpected entries: {sorted(extra)}" + + +def test_every_primitive_has_a_valid_spec(): + for primitive, spec in PRIMITIVE_REGISTRY.items(): + if spec.kind is ValueKind.CATEGORICAL: + assert spec.allowed, 
f"{primitive}: categorical must define `allowed`" + assert all(isinstance(v, str) for v in spec.allowed) + elif spec.kind is ValueKind.ARRAY: + assert spec.array_of is not None, f"{primitive}: array must define `array_of`" + assert spec.array_of is not ValueKind.ARRAY, ( + f"{primitive}: nested arrays not supported in v0.1" + ) + + +def test_primitive_paths_are_dotted_lowercase(): + pattern = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$") + for primitive in PRIMITIVE_REGISTRY: + assert pattern.match(primitive), f"malformed primitive path: {primitive!r}"
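The registry notes for `cognitive.command_branch_diversity` describe a content-based ratio (unique first-token binaries over total commands) without fixing thresholds. The following is a minimal sketch of how a sensor might compute it, not the project's implementation. The names `first_token` and `classify_branch_diversity`, the `min_commands` and `linear_threshold` values, and the use of `shlex` for tokenization are all assumptions made for illustration.

```python
from __future__ import annotations

import shlex


def first_token(command: str) -> str | None:
    """Return the first token of a command line, or None if the line is empty."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()
    return tokens[0] if tokens else None


def classify_branch_diversity(
    commands: list[str],
    min_commands: int = 4,          # hypothetical floor below which we report "unknown"
    linear_threshold: float = 0.8,  # hypothetical cut-off for "ratio near 1"
) -> str:
    """Map a session's commands onto the categorical values from the registry."""
    binaries = [tok for tok in map(first_token, commands) if tok is not None]
    if len(binaries) < min_commands:
        return "unknown"
    # unique/total ratio: near 1 means every step used a different tool.
    ratio = len(set(binaries)) / len(binaries)
    return "linear_playbook" if ratio >= linear_threshold else "adaptive_branching"


if __name__ == "__main__":
    recon = ["whoami", "id", "uname -a", "ps aux", "netstat -tlnp"]
    chase = ["curl http://a", "curl http://a/admin", "curl -I http://a", "ls", "curl http://b"]
    print(classify_branch_diversity(recon))
    print(classify_branch_diversity(chase))
```

The returned strings match the allowed set declared for the primitive, so a result could be passed as `value` to an `Observation` without tripping the categorical check. A real sensor would likely need to normalize wrappers such as `sudo` or environment-variable prefixes before taking the first token; this sketch does not attempt that.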