Pins the evidence shape for IPv6 link-local leakage findings. All fields
optional (total=False) so partial observation (passive sniffer vs active
solicitation) fills whatever the vector provides. Lifter lands in a
subsequent commit.
asyncio_default_fixture_loop_scope was 'module', so all async tests in
a module share one event loop. test_lifespan_startup_and_shutdown patched
log_ingestion_worker/log_collector_worker/attacker_profile_worker but not
tarpit_watcher_worker — the real while-True coroutine was created as an
asyncio task on the shared loop and never cancelled. The xdist worker ran
for 4+ hours (confirmed via py-spy + etime=04:48) consuming 15+ GB before
OOM-kill.
Fixes:
- Patch tarpit_watcher_worker in both TestLifespan tests
- Change asyncio_default_fixture_loop_scope to 'function' so each test
gets its own loop; tasks cannot outlive their test
- Add loop_scope='module' to precision_engine which legitimately needs
a module-scoped event loop
- test_evidence_shape.py: replace broken (command, BehavioralLifter)
pairing with correct (http_fingerprint, HttpFingerprintLifter) case;
expand _LIFTER_CASES to 5-tuples with per-lifter payloads and rule
factories; wire StubRuleStore + _index.install() per lifter; remove
xfail marker — all 4 parametrized cases now pass
- factory.py: add _span() helper gated on _telemetry._ENABLED; wrap
each per-lifter dispatch in _tag_one() that opens a
ttp.lifter.{name} child span per call
- http_fingerprint_lifter.py: add missing name = "http_fingerprint"
- test_tracing.py: replace pytest.fail() stubs in
test_lifter_child_spans_emitted and test_no_pii_canary_in_span_attributes
with real test bodies; remove xfail markers
Removes the E.3.14b xfail marker and writes the test body:
- _StubRepo gains get_attacker_intel_row_by_uuid(uuid) backed by an
optional intel_rows dict; existing tests pass None (no catch-up, no
change to their behaviour).
- The test drives a session.ended event with NO intel.enriched published,
injects an AttackerIntel row into the stub repo, and asserts the
tagger is called with source_kind='intel' carrying the correct payload
fields (abuseipdb_score, greynoise_classification).
- Pins the asymmetry contract: email.received has no catch-up path
(sibling test already green); intel does.
Replace pytest.fail() stub with actual test body: constructs IntelLifter
with R0054, feeds score=30 payload, asserts confidence=0.21 (0.70×0.30)
which is below CONFIDENCE_FLOOR. xfail marker removed.
Corrects docstring: R0054 T1110 base_conf=0.70, not 0.85 as originally written.
- TolerantTagger.tag validates evidence keys against EVIDENCE_SCHEMA TypedDicts;
TypeError (programmer error) propagates instead of being swallowed
- IntelEvidence and EmailEvidence expanded from stubs to full per-provider
key sets (total=False); IntelEvidence old stub fields replaced wholesale
- EVIDENCE_SCHEMA map added to models/ttp.py and imported by base.py
- TTPTag __table_args__ gains confidence [0,1] CheckConstraint (DB-enforced)
- xfail removed from test_confidence_outside_range_rejected_at_insert and
test_evidence_shape_violation_propagates_as_typeerror — both now pass
- TypeError removed from _SWALLOWED_EXCS fuzz list; test_intel_evidence_keys
updated to assert the real provider key set
Remaining files from the fingerprint-bounties + characterizes-SRO commit:
misp_export, repository, bounties mixin, all 4 router endpoints, and test suite
updates. Prerequisite: previous commit added _extract_fingerprint_bounty_data
and the stix_export changes.
Surfaces the intrusion-set reverse index from the loaded ATT&CK
bundle: given a technique, returns the list of groups MITRE has
documented as using it. Read-only — explicitly NOT an attribution
claim about a DECNET attacker. The frontend pulls this lazily when
the operator expands a technique panel; payload-size cost on every
TTPTagDetailRow makes embedding wasteful for techniques with 50+
documented groups.
- decnet/web/router/ttp/api_get_groups_for_technique.py exposes
GET /api/v1/ttp/techniques/{technique_id}/groups, response_model
list[GroupRef]. Same JWT-viewer auth gating as the rest of the
TTP router. 404 when the technique_id doesn't resolve in the
bundle.
- Sub-techniques are queried directly (no auto-union with parent)
to match ATT&CK Navigator semantics; callers that want a broader
view query the parent themselves.
- tests/ttp/test_groups_for_technique.py covers happy path, 404,
sub-technique attribution independence, empty-list-on-zero-groups,
and that responses include mitre_url + aliases.
- tests/web/test_api_attackers.py: fix pre-existing fixture drift
introduced by a2a61b63 — three TestGetAttackerDetail cases were
missing AsyncMock for repo.latest_observation_per_primitive,
causing TypeError on await of MagicMock. The new groups endpoint
doesn't share code with attacker_detail; this is a drive-by fix
surfaced by the same suite run.
Phase 2 attached mitre_url to intel-emitted tags' evidence JSON;
Phase 3 promotes it to a real column populated for *every* tag —
intel, credential, behavioral, canary, identity, email, rule-engine —
from one source. Pre-v1, so the SQLModel field is added directly
without an Alembic migration.
- TTPTag gains mitre_url: Optional[str] (not indexed — derived
deeplink, not a query target; technique_id is already indexed).
- _emit.py and rule_engine._evaluate_rules both populate mitre_url
via attack_stix.mitre_url_for(sub_technique_id or technique_id).
Sub-technique URL when present, else parent. The two construction
sites stay separate because the rule_engine path carries per-emit
span instrumentation that emit_tags() can't preserve without
threading a span object through; minimal-change beats forced
refactor here.
- intel_lifter strips mitre_url from evidence_extra in all four
decision functions. The column is canonical now; duplicating in
the JSON column would drift when the bundle moves. The unused
TechniqueEmission import + tracking dicts removed too.
- IdentityTechniqueRow / TechniqueRollupRow / TTPTagDetailRow /
CampaignTechniqueRow gain mitre_url: Optional[str].
- sqlmodel_repo/ttp.py:_mitre_url_for added; the 5 row-builder sites
pass mitre_url=_mitre_url_for(sub_technique_id or technique_id)
alongside the existing technique_name resolution.
- api_get_tag_details.py needs no change — list_tags_by_scope_and
_technique already returns model_dump() rows that flow the new
column through **row spread to TTPTagDetailRow.
- tests/ttp/test_emit_attaches_mitre_url.py covers both construction
paths (top-level, sub-tech, unknown, multi-emit) and a regression
test that intel_lifter evidence dicts no longer contain mitre_url.
Two reusable bundle-derived lookups that the next two commits build
on:
- mitre_url_for(tid) returns the canonical attack.mitre.org URL by
reading external_references on the cached attack-pattern. Backed
by the existing lru-cached _attack_pattern_by_id so per-call cost
is constant. Handles top-level techniques and sub-techniques
(T1059.004 -> .../techniques/T1059/004).
- GroupRef + groups_using_technique(tid) surface the intrusion-set
reverse index from the loaded bundle: given a technique, return
the MITRE-tracked groups documented as using it. Sorted by
group_id for deterministic responses; lru-cached. Sub-technique
semantics match ATT&CK Navigator (do NOT auto-union with parent).
- decnet/ttp/data/intel_loader._mitre_url_for collapses to a thin
re-export of attack_stix.mitre_url_for; the loader keeps mitre_url
on TechniqueEmission for the eventual STIX export.
- tests/ttp/test_attack_url.py covers both helpers: top-level + sub
URLs, unknown -> None / (), GroupRef immutability + hashability,
deterministic ordering, sub-technique distinct from parent.
The four provider→technique tables (AbuseIPDB cat→techniques,
GreyNoise tag→techniques, ThreatFox threat_type→techniques, plus
the Feodo binary-listed signal) used to live as Final[dict] constants
in intel_lifter.py. Two real problems with that:
1. Drift between rules/ttp/R0054.yaml..R0058.yaml (which declare
the full slate per provider) and the Python dicts (which decide
which slate-member fires per signal). The v2 audit comment in
intel_lifter.py documented that they had silently drifted.
2. No ATT&CK provenance on emissions — the loaded STIX bundle has
rich external_references (canonical attack.mitre.org URLs) that
never surfaced because the lifter had no path back to them.
Mappings now live as YAML at decnet/ttp/data/intel/{provider}.yaml,
validated at load against the loaded ATT&CK bundle, with each entry
enriched by attack_stix._attack_pattern_by_id to attach the canonical
MITRE URL to every emission.
- decnet/ttp/data/intel_loader.py: pydantic-validated schema +
ProviderMapping/Signal/TechniqueEmission frozen dataclasses +
load_provider_mapping(provider) lru-cached.
- Per-technique high_score_threshold inlined into YAML
(collapses the separate _ABUSEIPDB_HIGH_SCORE_GATED dict).
- external_reference field follows the STIX 2.1 external-reference
shape (source_name + url + optional external_id) so the future
STIX/MISP exporter is a direct translation.
- intel_lifter.py: dicts deleted, decision functions read from
ProviderMapping accessors. Decision-flow constants (T1071/T1595
bare-classification fallbacks in _greynoise_decisions) stay in
code — they're not table rows.
- Each emit slot's evidence_extra now carries mitre_url for any
technique resolved in the bundle (every one in practice).
- tests/ttp/test_intel_mappings.py: snapshot equivalence vs the
legacy dicts, high-score gate behavior, every-signal-has-an-
external-reference, every-emission-has-a-mitre-url, negative
paths (unknown technique_id raises AttackBundleError, mismatched
provider field rejected, dir listing matches expected providers).
The YAML schema + mitre_url enrichment lays groundwork for the
future STIX exporter; this commit does NOT build that exporter.
MITRE's ATT&CK Terms of Use require reproducing their copyright +
license alongside any cached copy of ATT&CK data. Today we ship the
bundle but not the license — this commit closes that compliance gap.
- attack_version.py pins ATTACK_LICENSE_URL +
ATTACK_LICENSE_SHA256 + ATTACK_LICENSE_FILENAME, sourced from the
same attack-stix-data repo as the bundle.
- attack_stix.py:_fetch_license downloads LICENSE.txt next to the
bundle. License sha mismatch is logged + refreshed (license text
gets occasional formatting tweaks; not a security event), unlike
the bundle which stays fail-closed.
- _ensure_license is the compliance ratchet: resolve_bundle_path
refuses to return without LICENSE.txt on disk. Override-mode
(DECNET_ATTACK_BUNDLE) checks for a sibling LICENSE.txt first,
then DECNET_ATTACK_LICENSE, then the cache dir.
- python -m decnet.ttp.attack_stix license prints the cached license
to stdout for operator audit.
- loaded_license_path() exposes the active license path read-only.
- tests/ttp/test_attack_license.py covers happy paths (sibling +
explicit env), refusal when DECNET_ATTACK_LICENSE points at a
missing file, the CLI subcommand, and the pinned-sha shape.
Drift between the technique/tactic IDs hardcoded in the lifters and
what the loaded ATT&CK STIX bundle actually contains is silent in the
status quo: a renamed-or-retired technique just stops being tagged.
Every emission point now has an explicit validator that asserts its
IDs resolve in the loaded bundle, called once at TTP-worker boot.
- intel_lifter.all_emitted_technique_ids() collects every technique
the four provider tables (AbuseIPDB / GreyNoise / Feodo / ThreatFox)
plus the decision-flow constants in _greynoise_decisions and
_feodo_decisions can emit. validate_against_attack_bundle() runs it
through attack_stix.assert_known_technique_ids().
- ukc.validate_against_attack_bundle() asserts every key in
ATTACK_TACTIC_TO_UKC resolves, with TA0100..TA0106 documented as
_NON_ENTERPRISE_TACTICS (lives in the ICS bundle, not the
enterprise bundle DECNET loads).
- decnet/ttp/worker.py:run_ttp_worker_loop calls both validators
before subscribing to the bus. A bundle-vs-code mismatch refuses
to start the worker rather than silently mistagging events.
- tests/ttp/test_attack_bundle_validation.py covers the happy path
for both validators, the negative path (injected bogus tactic ID
raises AttackBundleError), the ICS exemption, and the lone T1078
reference in credential_lifter.
Replace the hand-maintained TECHNIQUE_NAMES dict (pinned to v15.1) with
a runtime loader that reads the official enterprise-attack-N.json STIX
bundle. Version bumps now require only updating attack_version.py;
sub-technique parents, tactic IDs, and kill-chain phases all come from
MITRE's published data.
- decnet/ttp/attack_version.py pins version 19.0 + sha256 + URL
- decnet/ttp/attack_stix.py is the lazy STIX loader. Resolution order:
DECNET_ATTACK_BUNDLE env -> ~/.cache/decnet/attack/ -> fetch from
the pinned MITRE GitHub URL. SHA-256 verified before parse;
mismatch fails closed.
- decnet/ttp/attack_catalog.py collapses to a shim re-exporting
technique_name() so the ~9 router/repo call sites don't churn.
- python -m decnet.ttp.attack_stix fetch warms the cache and can
print sha256 for version-bump workflows.
- test_attack_catalog.py now asserts every rule-emitted ID resolves
in the loaded bundle (same contract, real source) and exercises
the SHA-256-mismatch fail-closed path.
Destructive half of BEHAVE-INTEGRATION.md Phase 1. SessionProfile +
its kd_* columns + the dialect ALTER TABLE migration helpers are
deleted outright; pre-v1, the table shipped empty, no migration
ceremony required (per the no-new-_migrate_-pre-v1 memory rule).
DEBT-036 closes via DEBT-050 supersedure. AttackerDetail's
``observations`` field is wired to the new ``observations`` table
and returns an empty list until the BEHAVE-SHELL extractor (DEBT-050
Phase 2) starts emitting.
decnet/web/db/models/attackers.py — SessionProfile class deleted
(~135 lines), KD_PAUSE_*/KD_START_OF_ACTION_IDLE_S module constants
deleted, module docstring updated to point at the observations
table. AttackerIdentity.kd_digraph_simhash is KEPT — it's the v2
federation centroid hook, not a SessionProfile field; docstring
repointed to the BEHAVE primitive that will populate it.
decnet/web/db/sqlmodel_repo/attackers/sessions.py — DELETED.
SessionProfilesMixin dropped from the AttackersMixin MRO.
decnet/web/db/repository.py — abstract upsert_session_profile +
get_session_profile removed.
decnet/web/db/sqlite/repository.py + mysql/repository.py —
_migrate_session_profile_table helpers and their initialize() calls
removed. mysql initialize() now goes attackers → column_types →
admin (no session_profile step).
decnet/web/db/models/__init__.py — SessionProfile re-export gone.
decnet/web/db/models/attacker_intel.py — docstring cross-reference
to SessionProfile.schema_version retargeted to AttackerIdentity.
decnet/web/router/attackers/api_get_attacker_detail.py — adds
``observations: []`` to the response by calling
``repo.latest_observation_per_primitive(uuid)`` and projecting to a
list sorted by primitive path. Empty until the extractor lands;
shape matches BEHAVE-INTEGRATION.md §"AttackerDetail consumer".
tests/profiler/test_session_profile.py — DELETED (56 lines).
tests/db/test_base_repo.py — DummyRepo loses upsert_session_profile
and get_session_profile overrides.
tests/db/mysql/test_mysql_migration.py — initialize-call-order
assertion updated; session_profile step removed from the expected
sequence; docstring records why.
tests/ttp/test_lifter_absence.py — docstring "no SessionProfile" →
"no ObservationRow".
R0047 (BEC) and the encoded-payload predicate substring-match against
the email body. Shipping raw body text on the abstracted service bus
is the wrong privacy stance — the bus transport may swap from UNIX
socket to networked at any time, and "loopback today" is not a license
to put PII on the wire.
EmailLifter now opens the .eml lazily from
/var/lib/decnet/artifacts/{decky_id}/smtp/{stored_as} when a body-aware
predicate runs and parses the body in-process via stdlib email +
policy.default. The decoded body is memoized into the payload dict so
multiple body-aware predicates on the same event open the file once.
Bus envelope only carries the artifact pointer (decky_id + stored_as);
raw body bytes never cross the host disk boundary on the agent → master
hop. Filesystem access on agents is unblocked by DEBT-035 (setgid +
group-readable artifacts root, paid 2026-05-02).
The legacy inline body_text path is preserved — when the producer ships
body_text on the bus the helper short-circuits without opening the file.
Three bug classes uncovered by the 2026-05-02 ship-time audit:
* AbuseIPDB code/name mismatch in v1: cat 10 was treated as DDoS (it's
Web Spam — DDoS is cat 4, intentionally unmapped per A.10) and cat 17
as VPN IP (it's Spoofing — VPN IP is cat 13). Both typos mirrored in
code AND the design doc Appendix A.10. Code now matches the AbuseIPDB
taxonomy exactly; cat 17 retargets to T1566 (email-spoofing as a
phishing precursor), and cats 7 (Phishing) and 16 (SQL Injection)
pick up T1566 / T1190 emissions that v1 didn't cover.
* ThreatFox dispatch keyed on `ioc_type` in v1, but `ioc_type` is the
indicator format (url / domain / hash variants) and carries no ATT&CK
signal. The canonical taxonomy field per ThreatFox's API is
`threat_type` (botnet_cc / payload_delivery / payload / cc_skimming).
Repoint dispatch through the new `threatfox_threat_types` payload
field; `ioc_type` rides as evidence only. Also adds the missing
cc_skimming -> T1056 (Input Capture) mapping and registers T1056 in
attack_catalog.py.
* GreyNoise bare-malicious lane: a `classification == "malicious"` row
with no recognised tag used to emit nothing. Now lights T1071 at a
half multiplier, suppressed when a tag already fires T1071 to avoid
double-stamping at conflicting confidence levels.
The inspector was dumping the whole `CMD uid=0 user=root src=… pwd=…
cmd=nmap -p- 192.168.1.0/24` syslog body into a single ``command_text``
blob. ANTI: "I'd like to separate the fields." Done — three layers
work together:
1. Collector session aggregator: new `_parse_cmd_msg` splits the bash
PROMPT_COMMAND msg into `{uid, user, src, pwd, command}`. The
session-ended envelope's per-command dict now carries the
structured fields, with `command_text` set to just the cmd= value
(preserving embedded whitespace — `nmap -p- 1.2.3.0/24` etc.).
2. Rule engine: per-source_kind auxiliary evidence list
(`_AUX_EVIDENCE_FIELDS`). For `command` events the engine
automatically promotes uid/user/src/pwd into the persisted
`evidence` dict on top of the rule's explicit `evidence_fields`.
Engine-controlled, not per-rule — adding a new aux field is one
line here, not a 30-rule YAML sweep, and rule authors can't
accidentally drop it.
3. TTPInspector frontend: evidence renders as a structured
`kvs` grid (UID / USER / SRC / PWD / CMD rows) instead of
pretty-printed JSON. Primary-order list keeps shell fields at
the top; everything else falls below alphabetically so unfamiliar
evidence shapes still surface predictably.
Tests:
- session_aggregator pins the structured-fields emit (uid/user/src/
pwd/command_text without "CMD" prefix, embedded whitespace
preserved).
- rule_engine_tagger pins the aux-field auto-promotion + the
no-`None`-leakage path when payload doesn't carry an aux key.
"T1595" alone is opaque; "T1595 — Active Scanning" tells you the
story at a glance. The names come from a backend-side static catalogue
pinned to the same ATT&CK release as the rule engine
(_ATTACK_RELEASE = "v15.1") — names are the canonical MITRE labels,
not author-supplied strings on rules, so a rule author can't typo a
name and the entire fleet sees the typo.
- New `decnet/ttp/attack_catalog.py` with `TECHNIQUE_NAMES` covering
every technique_id + sub_technique_id emitted by `rules/ttp/`
(R0001..R0058 → 69 IDs in the v0 pack).
- `IdentityTechniqueRow` / `TechniqueRollupRow` / `CampaignTechniqueRow`
/ `TTPTagDetailRow` gain optional `technique_name` /
`sub_technique_name` fields. Repo + router populate them from the
catalogue at row-construction time. None when an ID isn't in the
catalogue — UI falls back to the bare ID.
- Coverage test (`tests/ttp/test_attack_catalog.py`) walks every
YAML rule and asserts every emitted ID has a catalogue entry, so
a future rule author who forgets to update the catalogue gets a
loud failure rather than a silent UI fallback.
Frontend:
- `TTPsObservedSection` shows "T1595.002 — Active Scanning:
Vulnerability Scanning" instead of just the ID, with overflow
ellipsis + tooltip for narrow viewports. Inspector header /
TECHNIQUE row also surface the names.
The collector's `attacker.session.ended` envelope carries
`attacker_uuid: null` and `attacker_ip: <ip>` because the collector
doesn't talk to the DB. The TTP worker passed that null straight
through, and `TTPTag.__init__` raised the documented invariant:
ValueError: ttp_tag requires at least one of attacker_uuid /
identity_uuid; both NULL is not a valid anchor.
The worker now resolves `attacker_uuid` from `attacker_ip` via
`BaseRepository.get_attacker_uuid_by_ip` before fanning out the
event. When the IP isn't in the DB yet (profiler hasn't ingested
the row), the event is dropped with one log line — better than
exploding mid-tag.
- New `get_attacker_uuid_by_ip(ip) -> str | None` on the repo
(BaseRepository abstract + AttackersCoreMixin impl).
- `_resolve_attacker_uuid` helper in `decnet/ttp/worker.py` runs
before `_build_events`. Short-circuits when the payload already
has either anchor; drops the event when neither anchor is
resolvable.
- Tests pin: short-circuit on existing uuid/identity, repo lookup,
drop on unknown IP, drop on "Unknown" sentinel, drop on
no-anchor payload, drop on repo failure.
The endpoint was a contract-phase stub returning `[]` even though the
RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty
table; operators couldn't tell whether anything was wired up.
- `api_list_rules` now calls `get_rule_store().load_compiled()` and
serializes each CompiledRule + its operational state into a
RuleCatalogueRow. Sorted by rule_id for stable golden snapshots.
- Add `description: str` to RuleSchema (pydantic) and CompiledRule
(NamedTuple, defaulted) + propagate through `_compile_one` so the
catalogue surfaces the human-readable YAML description, not just
the slug-style `name`.
- Update `tests/ttp/test_rule_engine.py` _fields assertion for the
new column; new `tests/api/ttp/test_rules_catalogue.py` pins the
catalogue contents (R0001/R0014 presence, row shape, sort order).
Worker behaviour is unchanged: it was already loading rules
correctly. This is purely a read-side wiring fix on the operator API.
The TTP worker entry moved out of decnet/cli/workers.py into its own
module so the TTP CLI surface (worker + admin verbs) is colocated,
mirroring decnet/cli/canary.py / webhook.py / swarm.py.
- New `decnet/cli/ttp.py` with `decnet ttp` (worker, ExecStart-stable
for decnet-ttp.service) and `decnet ttp-backfill --since-days N`.
- `decnet ttp-backfill` walks Attacker.commands and CanaryTrigger
history, dispatches each row through the live CompositeTagger,
persists tags via repo.insert_tags (idempotent INSERT OR IGNORE).
--dry-run / --source command|canary|all / --batch-size supported.
- Backfill deliberately bypasses bus publish — historical replay
must not re-trigger SIEM/webhook fan-out per TTP_TAGGING.md
§"Bus topics" loop-prevention invariant.
- Added `iter_attacker_commands_since` / `iter_canary_triggers_since`
read-only iterators on TTPMixin + abstract bindings on
BaseRepository.
- Master-only via gating; both `ttp` and `ttp-backfill` listed in
MASTER_ONLY_COMMANDS.
The canonical rule-based engine from §"Tagging engines, layered §1"
of TTP_TAGGING.md was fully implemented but never instantiated as a
composite child — pure pattern rules (R0014/R0017/R0023/... 23 rules
total) had no tagger to dispatch them.
- Add `RuleEngineTagger(Tagger)` adapter in rule_engine.py wrapping
`RuleEngine.evaluate()`. `HANDLES = {command, http_request,
auth_attempt, payload}` — the source kinds whose rules typically
live outside any per-source lifter.
- Adapter's `watch_store()` filters via `_is_engine_owned` so the
engine's dispatch index excludes lifter-claimed rules
(`match.kind: lifter:*`) and stays disjoint from per-lifter ownership.
- Prepend `RuleEngineTagger` to the `CompositeTagger` lifter list so
generic pattern rules dispatch before per-source cross-event logic.
- Composes with E.3.18a (worker hydrates `watch_store`) and E.3.18b
(worker fans session payloads into per-`command` events) — together
these three commits make R0001–R0030 actually fire at runtime.
R0001–R0030 declare `applies_to: [command]` and match per command, not
per session. The worker now translates one `attacker.session.ended`
payload carrying a `commands: list` into:
- one source_kind="session" event (behavioral / cross-event lifters)
- one source_kind="command" event per command (RuleEngineTagger)
Both string and dict command shapes are accepted; dicts contribute
their `id` / `uuid` / `command_id` as the per-command source_id so
the deterministic `compute_tag_uuid` keeps replays idempotent. Tags
from session + per-command dispatch are aggregated into a single
`ttp.tagged` envelope per upstream session.
Each per-source lifter holds its own RuleIndex and exposes an
`async watch_store()` that loads the corpus and drains store change
events forever. Until this commit nothing called `watch_store()` in
production — every dispatch index stayed empty and no rule fired.
- Add `WatchableTagger` runtime-checkable Protocol in `decnet.ttp.base`.
- `CompositeTagger.iter_watchables()` yields lifters that satisfy it.
- `run_ttp_worker_loop` fans out one task per watchable, cancelled
and awaited alongside pump/heartbeat/control in the existing finally.
- Watch failures log and exit the watch task without taking the
worker down — mirrors the pump-task tolerance contract.
Inner loop drains a per-process asyncio.Queue populated by one pump
task per topic in _TOPICS, dispatches each event through
CompositeTagger, persists via repo.insert_tags(), and publishes
ttp.tagged + per-technique ttp.rule.fired.<id> only when the insert
returned a non-zero rowcount.
CompositeTagger seeded with all six lifters (Behavioral, Intel,
CanaryFingerprint, Email, Identity, Credential).
Loop-prevention invariant from TTP_TAGGING.md §"Bus topics" enforced:
N replays of the same upstream event publish exactly one ttp.tagged
event. test_worker_bus covers both the direct invocation path and
the idempotency replay path.
Intel catch-up via attacker.session.ended is intentionally deferred
to E.3.14b — needs a session→intel join the repo doesn't expose yet.
IdentityLifter owns lifter:identity_* — currently R0003 (password
spraying). CredentialLifter owns lifter:credential_* — R0001 generic
auth brute, R0002 password guessing, R0004 credential reuse, R0005
valid-account use, R0006 default credentials.
YAMLs R0001/R0002/R0003/R0005/R0006 had their match.kind normalised
to fit the lifter prefix scheme — the design doc's promised "YAMLs
normalised in a separate refactor commit" lands here.
Identity-rollup tags null out attacker_uuid on emit so the worked-
example invariant holds (the tag belongs to the Identity, never to
one member IP).
Tests: test_identity_lifter.py + test_credential_lifter.py cover
each predicate's positive/negative path, state modulation
(disabled/clipped/expired), source-kind gating, and idempotent
replay. test_lifter_absence and test_lifters updated for the new
ctor signature.
SMTP message-level technique tagger per Appendix A.6: open relay abuse
(rcpt_count + foreign From), mass phishing (rcpt_count + body simhash),
phishing-kit X-Mailer, IDN/punycode URL, sender masquerade composite
(From/Return-Path/DKIM/SPF), malicious attachment (macro/.lnk/.iso/.img/
hash match), BEC subject+body composite, encoded payload in body.
PII discipline (TTP_TAGGING.md §'Hard parts §6') is enforced at the
lifter layer via _filter_evidence(): emitted TTPTag.evidence is
restricted to the EmailEvidence-allowed allowlist (body_sha256,
matched_headers — names only, rcpt_domain_set — domains only,
attachment_sha256s, rcpt_count) plus PII-safe match discriminators
(matched_kit, matched_trigger, matched_url_host, etc). Raw addresses,
raw body bytes, full URLs, and decoded base64 previews NEVER appear in
evidence — defense-in-depth over the YAML evidence_fields hint.
Tests: tests/ttp/test_email_lifter.py per-rule positive + negative +
PII allowlist guard + state modulation. tests/ttp/rule_precision/
test_email_rules.py xfail flipped to real precision (R0041-R0048
H-band ≥95%). Corpus rows updated to acknowledge that R0045 (masquerade)
co-fires with R0041 / R0047 when the sender-masquerade signals are
present alongside open-relay or BEC patterns — overlap is by design,
not a precision bug.
Browser-payload derivations per Appendix A.9: navigator.webdriver flag,
canvas/audio/WebGL automation hash matches (Puppeteer/Playwright/
Selenium/curl-impersonate), WebRTC IP leak, TZ/language vs source-IP
geo mismatch, navigator.platform vs userAgent vs WebGL renderer
inconsistency.
Evidence shape pinned to CanaryFingerprintEvidence (metric +
matched_signature) — raw fingerprint blobs (canvas hashes, full UAs,
navigator.platform values) explicitly NOT carried into TTPTag.evidence
per TTP_TAGGING.md §'Hard parts §7' (enrichment vs tag boundary). The
identity-merge guard rail is preserved: composite fp.id matches across
IPs are NOT a TTP, so no rule fires on the bare hash.
Tests: tests/ttp/test_canary_fingerprint_lifter.py per-rule positive +
negative + evidence-shape guard + state modulation.
tests/ttp/rule_precision/test_canary_rules.py xfail flipped to real
precision (R0049/R0050/R0051/R0053 H-band ≥95%; R0052 M-band ≥80%).
Per-provider verdict translator for AbuseIPDB, GreyNoise, Feodo Tracker,
and ThreatFox per Appendix A.10. Each rule's predicate inspects payload
fields produced by the enrich worker (no DB I/O, no decnet.intel.*
imports — E.2.7 decoupling guard preserved). AbuseIPDB confidence is
scaled by abuse_confidence_score / 100; categories drive per-technique
fan-out. R0058 aggregate-bump is a no-op in v0 (cross-tag bump deferred
to E.3.14 worker bootstrap).
Per-provider null tolerance is the steady state — a missing provider
column produces zero tags from that rule, never an error.
Tests:
- tests/ttp/test_intel_lifter.py — per-provider positive + negative +
state modulation + decoupling source-import guard.
- tests/ttp/rule_precision/test_intel_rules.py — xfail flipped, real
precision driven over seed_intel.jsonl (R0054-R0057 H-band ≥95%;
R0058 skipped as bump-only).
- tests/ttp/test_lifter_absence.py — IntelLifter all-populated test
flipped from xfail-strict to real assertion with realistic payload.
- tests/ttp/test_lifters.py — partial-null xfail flipped to real
assertion.
Reads pre-shaped session aggregates from TaggerEvent.payload and emits
techniques per Appendix A behavior tables. Per-rule predicates dispatch
on match.kind (lifter:behavioral_<name>); the lifter holds its own
RuleIndex watching the same RuleStore as the engine, so disable / clip /
TTL state reaches lifter-bound rules through the same atomic-swap path.
R0032/R0036/R0037/R0040 YAMLs had over-escaped regex strings (\\
instead of \\) — fixed in place.
Factory wired so default get_tagger() returns CompositeTagger with
BehavioralLifter shipped; remaining three lifters (E.3.10-E.3.12) land
in subsequent commits.
E.2.6 contract preserved via TolerantTagger: empty payload steady-state
yields [] with zero ERROR records. Disabled / clipped / expired state
verified.
E.3.9.0 prerequisite for the per-source lifters (E.3.9-E.3.13). The
dispatch index, install/evict/apply_change atomic-swap protocol, and
state-modulation helpers (is_active / apply_ceiling) move out of
rule_engine.py into _rule_index.py and _state.py. RuleEngine wraps a
RuleIndex; back-compat shims preserve _by_kind / _by_rule / _install
attribute access for tests poking at the dispatch internals.
Lifters in E.3.9-E.3.12 will each hold their own RuleIndex, watching
the same RuleStore via subscribe_changes() fan-out. Hot-reload
semantics (disable / clip / TTL via set_state API) now reach
lifter-bound rules through the same atomic-swap path the engine uses,
not a future composite-rebuild compromise.
5 YAMLs for the intel-verdict cohort per Appendix B / A.10:
AbuseIPDB category mapping, GreyNoise classification, Feodo
Tracker hit, ThreatFox IOC type, aggregate-malicious bump-only.
IntelLifter (E.3.10) consumes by rule_id and tolerates absence
silently (null provider column → no tag).
R0058 is the meta bump-only rule — emits a single confidence=0.0
sentinel so it validates and surfaces in the catalogue, but the
repository's sub-0.3 drop ensures no fresh tag persists if the
fanout fires accidentally. test_intel_rules.py pins that
zero-confidence invariant.
Marks E.3.8 done in development/TTP_TAGGING.md with the cohort-
split summary.
5 YAMLs for the canary-fingerprint cohort per Appendix B / A.9:
navigator.webdriver flag, automation canvas/audio/WebGL hash match,
WebRTC IP leak, TZ/lang vs geo mismatch, platform inconsistency.
CanaryFingerprintLifter (E.3.11) consumes by rule_id.
test_canary_rules.py: YAML-present + inert-in-v0 + xfail(strict)
gated on E.3.11.
10 YAMLs for the behavioral / cross-event cohort per Appendix B:
beaconing, data destruction, ransom note, web exfil, DB mass-read,
credentials-in-files, k8s SA token harvest, Docker host escape,
LLMNR poisoning, TFTP router-config retrieval.
Every rule is lifter-bound (BehavioralLifter / IdentityLifter) —
the v0 RuleEngine cannot count, aggregate, or compose cross-event
signals, so these YAMLs declare the technique mappings the lifter
will consume by rule_id at E.3.9. Their match specs use a
'kind: lifter:*' shape inert to the regex matcher.
test_behavioral_rules.py asserts each YAML compiles, none fire
from the v0 engine (FP regression guard against a YAML drifting
into a regex), and an xfail(strict=True, reason='impl phase E.3.9')
precision case that will flip green when the lifter lands.
30 YAMLs for the shell/command rule cohort per Appendix B (rules/ttp/).
Splits into engine-active (R0007-R0029, regex on command_text /
raw_url / user_agent) and lifter-bound (R0001-R0006, R0030 — the
v0 RuleEngine cannot count auth attempts, do identity rollups, or
parse fingerprint blobs; the BehavioralLifter / IdentityLifter /
CredentialLifter consume them by rule_id at E.3.9 / E.3.13).
test_command_rules.py asserts:
- every R000N has a YAML that compiles
- lifter-bound rules NEVER fire from the v0 engine (regression
guard against a YAML drifting into a regex match.spec)
- engine-active rules meet their Appendix-C precision target
against the seed corpus (≥0.95 high-conf, ≥0.80 medium)
Conftest fixes: precision_engine moved to module-scope so module-
scope precomputed dispatch fixture (fired_by_label) can request it;
_RULES_DIR path bumped from parents[2] to parents[3] so the loader
resolves the project root regardless of pytest cwd; make_event
synthesizes attacker_uuid so TTPTag's anchor invariant is satisfied.
Seed corpus broadened: positive examples for every regex rule plus
6 negative examples across innocuous shell verbs (ls, echo, cd, ps,
df, free) so FPs surface in precision rather than passing vacuously.
Sub-step preceding the rule-pack commits per TTP_TAGGING.md:2967.
Adds the per-rule precision suite scaffolding under
tests/ttp/rule_precision/:
- conftest.py: precision_engine fixture (RuleEngine populated from
./rules/ttp/), corpus_loader (real → seed → empty fallback),
precision_for() helper for TP/FP accounting.
- _build_corpus.py: extractor for a real prod corpus pull. Mandatory
--exclude-ip / DECNET_TTP_CORPUS_EXCLUDE_IPS — operator IPs never
end up in the committed exclusion list. Pulls both 'command' and
'unknown_command' event types.
- corpus/seed_*.jsonl: synthetic seed rows for each cohort so the
harness exercises in clean checkouts.
- corpus/*.jsonl (operator-built) is gitignored.
- test_corpus_loads.py: sentinel that every seed file parses.
Implements the rule engine body left empty at contract phase: evaluate()
dispatches by source_kind through self._by_kind, runs the rule's match
spec against event.payload, and emits one TTPTag per emits entry.
watch_store() loads the initial corpus from RuleStore.load_compiled,
then drains subscribe_changes, applying definition changes via
single-statement dict assignment (atomic swap, GIL-atomic to readers)
and state changes via NamedTuple._replace on the existing CompiledRule.
Why: with the FS + DB stores in place (E.3.5/E.3.6), the engine is the
last piece of the rule plane. Lifters (E.3.9–E.3.13) consume the
engine; the worker bootstrap (E.3.14) wires watch_store into the
asyncio event loop. After this commit a CompositeTagger constructed
with a RuleEngine + a populated rules dir will produce real tags.
Notes:
- CompiledRule.emits extended to 4-tuple
(technique_id, sub_technique_id, tactic, confidence). Tactic + confidence
ride per-emit so a single rule can carry multiple precision targets
(the "one event maps to many techniques" property). Compile helpers in
both backends extract them from the YAML emits dict; missing tactic
or confidence is a deploy-time error.
- v0 match operator is "pattern" (regex). The field defaults per
source_kind (command_text / raw_url / subject / verdict / …) and is
overridable via match.field. Future ops (contains, equals, in_set)
extend _match_event without touching the engine surface.
- Confidence model: rules with state="clipped" + confidence_max set
cap the per-emit confidence downward; clipped is a soft suppress, not
a hard skip. Disabled rules are skipped wholly; expires_at past is
re-checked at evaluate as defense-in-depth (the store auto-reverts,
but a racing read between expiry and revert must not fire the rule).
- _span(name, **attrs) helper in engine + both stores short-circuits on
decnet.telemetry._ENABLED — matches the project's @traced /
wrap_repository zero-overhead-when-disabled pattern instead of relying
solely on the no-op tracer indirection.
- Late-bound tracer (telemetry.get_tracer called per-span, not at
module load) so test_tracing's monkeypatch reaches the production
code path.
xfails flipped: tests/ttp/test_rule_engine.py multi-emit fan-out +
rule_version-collision-via-engine; tests/ttp/test_multi_mapping.py
N×M engine fan-out + idempotent replay; tests/ttp/test_tracing.py
ttp.eval span hierarchy + ttp.rule.fire span attributes.
Tests: 214 passed, 19 xfailed (gated on E.3.8 lifters / rule pack /
worker bootstrap).
mypy: clean on prod code; pre-existing test-stub arg-type warnings
unchanged.
Implements the DB-backed rule store body left empty at contract phase:
load_compiled reads from ttp_rule + ttp_rule_state; get_state /
set_state hit ttp_rule_state with the same expires_at auto-revert and
bus-event semantics as the FS backend; subscribe_changes returns a
per-subscriber queue. State persists across process restarts — the
swarm property the FS backend deliberately doesn't have.
Also lands two swarm-mode helpers:
- sync_from_filesystem(fs_store) — master-side, subscribes to a
FilesystemRuleStore and projects each RuleChange onto a ttp_rule
upsert/delete.
- tail_db(poll_interval) — worker-side, watermark poll over
ttp_rule.updated_at; emits RuleChange("definition", ...) for each
row that moved.
Why: swarm mode needs rule definitions and operator state to
propagate across hosts. The filesystem backend (E.3.5) was the
single-host-dev variant; this one survives restart and serves N
workers from a shared DB.
Notes:
- DatabaseRuleStore() with no args lazy-inits an in-memory SQLite
repo so the conformance fixture works without test plumbing. In
production the worker bootstrap (E.3.14) passes an explicit repo.
- The conftest.py rule_store fixture became async (pytest_asyncio),
per-backend creates/initializes a SQLite repo for the DB run.
- Adds a `seed_rule(store, rule_id, yaml)` helper to bridge backend
semantics: drop a YAML file (FS) vs insert a ttp_rule row (DB).
Used by the parametrized load_compiled conformance test.
- Late-bound _tracer() in both backends (was module-level get_tracer
binding) so test_tracing's monkeypatch of decnet.telemetry.get_tracer
actually affects span output.
xfails flipped: tests/ttp/store/test_database.py set_state-writes-to-
ttp_rule_state + filesystem-to-DB sync; tests/ttp/store/test_conformance.py
DB-side load_compiled / set_state isolation / round-trip / per-rule
fan-out / expired-state revert / set_state failure / get_state default
(was xfail-only-on-DB); tests/ttp/test_tracing.py set_state span
hierarchy.
Tests: 208 passed, 25 xfailed (gated on E.3.7 + lifters).
mypy: clean on all touched files.
Implements the filesystem-backed rule store body left empty at contract
phase: YAML parse + Pydantic validation, asyncinotify watch over
./rules/ttp/, in-process state cache with auto-revert on expires_at,
and a subscribe_changes() async iterator yielding one RuleChange per
per-rule edit. Bus topic builders ttp_rule_reloaded / ttp_rule_state
ship alongside.
Why: the rule plane needed a store before the engine (E.3.7) could
consume RuleChange events and atomically swap compiled rules into its
dispatch index.
Notes:
- Linux-only by construction (asyncinotify wheel gated by sys_platform
marker; FilesystemRuleStore.__init__ raises on non-Linux).
- Filename allowlist is the FIRST check on every inotify event.
- Content-hash dedup so a single write firing IN_CREATE + IN_CLOSE_WRITE
produces exactly one RuleChange.
- All compile work serializes on a single asyncio.Lock.
- Subscribers register their queue eagerly so events fired between
subscribe_changes() and the first __anext__() are buffered.
xfails flipped: per-save-style + filter-ordering + atomic-swap in
test_filesystem.py; load_compiled / set_state isolation / round-trip /
per-rule fan-out / expired-state revert / set_state failure semantics
in test_conformance.py (FS side; DB side stays xfail until E.3.6);
malformed-YAML compile-time check in test_rule_engine.py.
Tests: 197 passed, 35 xfailed (gated on E.3.6 / E.3.7 / lifters).
mypy + bandit: clean on all touched files.
Wiki update for the per-rule reload + state-change topics lands in a
matching wiki-checkout/Service-Bus.md edit (separate repo).
Dialect-split: portable rollup queries on TTPMixin; bulk insert with
ON CONFLICT DO NOTHING / INSERT IGNORE in the per-dialect repos.
Confidence-floor (< 0.3) drop applied at mixin layer before the
dialect hook. BaseRepository now declares the six TTP methods abstract.
Tests in tests/web/db/test_ttp_repo.py flipped from pytest.fail stubs
to real dual-backend behavioral tests; tests/ttp/test_confidence.py
drop-below-floor xfail removed.
Session-scoped autouse fixture in tests/ttp/conftest.py sets
DECNET_DEVELOPER_TRACING=true and forces decnet.telemetry._ENABLED
so the no-op tracer doesn't silently swallow emitted spans. The
span_exporter fixture also monkeypatches decnet.telemetry.get_tracer
so production code under test lands spans in the in-memory
exporter. Tracing tests skip when DECNET_OTEL_ENDPOINT (default
localhost:4317) isn't reachable so the dev loop stays green
without lying about coverage.
In-memory span exporter fixture wired to a per-test TracerProvider
(OTEL global is locked once set, so each test gets its own).
ttp.eval / ttp.lifter.{name} / ttp.rule.fire / ttp.rule.state.change
hierarchy + no-PII canary battery xfail-gated behind E.3.5–E.3.13.
Hypothesis property: N rule_ids × M technique_ids on one event yield
N×M distinct tag UUIDs. Worked example pinned: one rule emitting
(T1110, None) and (T1078, None) → two distinct UUIDs. Engine-level
fan-out + replay xfail-gated behind E.3.7.