E.3.9.0 prerequisite for the per-source lifters (E.3.9-E.3.13). The
dispatch index, install/evict/apply_change atomic-swap protocol, and
state-modulation helpers (is_active / apply_ceiling) move out of
rule_engine.py into _rule_index.py and _state.py. RuleEngine wraps a
RuleIndex; back-compat shims preserve _by_kind / _by_rule / _install
attribute access for tests poking at the dispatch internals.
Lifters in E.3.9-E.3.12 will each hold their own RuleIndex, watching
the same RuleStore via subscribe_changes() fan-out. Hot-reload
semantics (disable / clip / TTL via set_state API) now reach
lifter-bound rules through the same atomic-swap path the engine uses,
not a future composite-rebuild compromise.
5 YAMLs for the intel-verdict cohort per Appendix B / A.10:
AbuseIPDB category mapping, GreyNoise classification, Feodo
Tracker hit, ThreatFox IOC type, aggregate-malicious bump-only.
IntelLifter (E.3.10) consumes by rule_id and tolerates absence
silently (null provider column → no tag).
R0058 is the meta bump-only rule — emits a single confidence=0.0
sentinel so it validates and surfaces in the catalogue, but the
repository's sub-0.3 drop ensures no fresh tag persists if the
fanout fires accidentally. test_intel_rules.py pins that
zero-confidence invariant.
Marks E.3.8 done in development/TTP_TAGGING.md with the cohort-
split summary.
5 YAMLs for the canary-fingerprint cohort per Appendix B / A.9:
navigator.webdriver flag, automation canvas/audio/WebGL hash match,
WebRTC IP leak, TZ/lang vs geo mismatch, platform inconsistency.
CanaryFingerprintLifter (E.3.11) consumes by rule_id.
test_canary_rules.py: YAML-present + inert-in-v0 + xfail(strict)
gated on E.3.11.
10 YAMLs for the behavioral / cross-event cohort per Appendix B:
beaconing, data destruction, ransom note, web exfil, DB mass-read,
credentials-in-files, k8s SA token harvest, Docker host escape,
LLMNR poisoning, TFTP router-config retrieval.
Every rule is lifter-bound (BehavioralLifter / IdentityLifter) —
the v0 RuleEngine cannot count, aggregate, or compose cross-event
signals, so these YAMLs declare the technique mappings the lifter
will consume by rule_id at E.3.9. Their match specs use a
'kind: lifter:*' shape inert to the regex matcher.
test_behavioral_rules.py asserts each YAML compiles, none fire
from the v0 engine (FP regression guard against a YAML drifting
into a regex), and an xfail(strict=True, reason='impl phase E.3.9')
precision case that will flip green when the lifter lands.
30 YAMLs for the shell/command rule cohort per Appendix B (rules/ttp/).
Splits into engine-active (R0007-R0029, regex on command_text /
raw_url / user_agent) and lifter-bound (R0001-R0006, R0030 — the
v0 RuleEngine cannot count auth attempts, do identity rollups, or
parse fingerprint blobs; the BehavioralLifter / IdentityLifter /
CredentialLifter consume them by rule_id at E.3.9 / E.3.13).
test_command_rules.py asserts:
- every R000N has a YAML that compiles
- lifter-bound rules NEVER fire from the v0 engine (regression
guard against a YAML drifting into a regex match.spec)
- engine-active rules meet their Appendix-C precision target
against the seed corpus (≥0.95 high-conf, ≥0.80 medium)
Conftest fixes: precision_engine moved to module-scope so module-
scope precomputed dispatch fixture (fired_by_label) can request it;
_RULES_DIR path bumped from parents[2] to parents[3] so the loader
resolves the project root regardless of pytest cwd; make_event
synthesizes attacker_uuid so TTPTag's anchor invariant is satisfied.
Seed corpus broadened: positive examples for every regex rule plus
6 negative examples across innocuous shell verbs (ls, echo, cd, ps,
df, free) so FPs surface in precision rather than passing vacuously.
Sub-step preceding the rule-pack commits per TTP_TAGGING.md:2967.
Adds the per-rule precision suite scaffolding under
tests/ttp/rule_precision/:
- conftest.py: precision_engine fixture (RuleEngine populated from
./rules/ttp/), corpus_loader (real → seed → empty fallback),
precision_for() helper for TP/FP accounting.
- _build_corpus.py: extractor for a real prod corpus pull. Mandatory
--exclude-ip / DECNET_TTP_CORPUS_EXCLUDE_IPS — operator IPs never
end up in the committed exclusion list. Pulls both 'command' and
'unknown_command' event types.
- corpus/seed_*.jsonl: synthetic seed rows for each cohort so the
harness exercises in clean checkouts.
- corpus/*.jsonl (operator-built) is gitignored.
- test_corpus_loads.py: sentinel that every seed file parses.
Implements the rule engine body left empty at contract phase: evaluate()
dispatches by source_kind through self._by_kind, runs the rule's match
spec against event.payload, and emits one TTPTag per emits entry.
watch_store() loads the initial corpus from RuleStore.load_compiled,
then drains subscribe_changes, applying definition changes via
single-statement dict assignment (atomic swap, GIL-atomic to readers)
and state changes via NamedTuple._replace on the existing CompiledRule.
Why: with the FS + DB stores in place (E.3.5/E.3.6), the engine is the
last piece of the rule plane. Lifters (E.3.9–E.3.13) consume the
engine; the worker bootstrap (E.3.14) wires watch_store into the
asyncio event loop. After this commit a CompositeTagger constructed
with a RuleEngine + a populated rules dir will produce real tags.
Notes:
- CompiledRule.emits extended to 4-tuple
(technique_id, sub_technique_id, tactic, confidence). Tactic + confidence
ride per-emit so a single rule can carry multiple precision targets
(the "one event maps to many techniques" property). Compile helpers in
both backends extract them from the YAML emits dict; missing tactic
or confidence is a deploy-time error.
- v0 match operator is "pattern" (regex). The field defaults per
source_kind (command_text / raw_url / subject / verdict / …) and is
overridable via match.field. Future ops (contains, equals, in_set)
extend _match_event without touching the engine surface.
- Confidence model: rules with state="clipped" + confidence_max set
cap the per-emit confidence downward; clipped is a soft suppress, not
a hard skip. Disabled rules are skipped wholly; expires_at past is
re-checked at evaluate as defense-in-depth (the store auto-reverts,
but a racing read between expiry and revert must not fire the rule).
- _span(name, **attrs) helper in engine + both stores short-circuits on
decnet.telemetry._ENABLED — matches the project's @traced /
wrap_repository zero-overhead-when-disabled pattern instead of relying
solely on the no-op tracer indirection.
- Late-bound tracer (telemetry.get_tracer called per-span, not at
module load) so test_tracing's monkeypatch reaches the production
code path.
xfails flipped: tests/ttp/test_rule_engine.py multi-emit fan-out +
rule_version-collision-via-engine; tests/ttp/test_multi_mapping.py
N×M engine fan-out + idempotent replay; tests/ttp/test_tracing.py
ttp.eval span hierarchy + ttp.rule.fire span attributes.
Tests: 214 passed, 19 xfailed (gated on E.3.8 lifters / rule pack /
worker bootstrap).
mypy: clean on prod code; pre-existing test-stub arg-type warnings
unchanged.
Implements the DB-backed rule store body left empty at contract phase:
load_compiled reads from ttp_rule + ttp_rule_state; get_state /
set_state hit ttp_rule_state with the same expires_at auto-revert and
bus-event semantics as the FS backend; subscribe_changes returns a
per-subscriber queue. State persists across process restarts — the
swarm property the FS backend deliberately doesn't have.
Also lands two swarm-mode helpers:
- sync_from_filesystem(fs_store) — master-side, subscribes to a
FilesystemRuleStore and projects each RuleChange onto a ttp_rule
upsert/delete.
- tail_db(poll_interval) — worker-side, watermark poll over
ttp_rule.updated_at; emits RuleChange("definition", ...) for each
row that moved.
Why: swarm mode needs rule definitions and operator state to
propagate across hosts. The filesystem backend (E.3.5) was the
single-host-dev variant; this one survives restart and serves N
workers from a shared DB.
Notes:
- DatabaseRuleStore() with no args lazy-inits an in-memory SQLite
repo so the conformance fixture works without test plumbing. In
production the worker bootstrap (E.3.14) passes an explicit repo.
- The conftest.py rule_store fixture became async (pytest_asyncio),
per-backend creates/initializes a SQLite repo for the DB run.
- Adds a `seed_rule(store, rule_id, yaml)` helper to bridge backend
semantics: drop a YAML file (FS) vs insert a ttp_rule row (DB).
Used by the parametrized load_compiled conformance test.
- Late-bound _tracer() in both backends (was module-level get_tracer
binding) so test_tracing's monkeypatch of decnet.telemetry.get_tracer
actually affects span output.
xfails flipped: tests/ttp/store/test_database.py set_state-writes-to-
ttp_rule_state + filesystem-to-DB sync; tests/ttp/store/test_conformance.py
DB-side load_compiled / set_state isolation / round-trip / per-rule
fan-out / expired-state revert / set_state failure / get_state default
(was xfail-only-on-DB); tests/ttp/test_tracing.py set_state span
hierarchy.
Tests: 208 passed, 25 xfailed (gated on E.3.7 + lifters).
mypy: clean on all touched files.
Implements the filesystem-backed rule store body left empty at contract
phase: YAML parse + Pydantic validation, asyncinotify watch over
./rules/ttp/, in-process state cache with auto-revert on expires_at,
and a subscribe_changes() async iterator yielding one RuleChange per
per-rule edit. Bus topic builders ttp_rule_reloaded / ttp_rule_state
ship alongside.
Why: the rule plane needed a store before the engine (E.3.7) could
consume RuleChange events and atomically swap compiled rules into its
dispatch index.
Notes:
- Linux-only by construction (asyncinotify wheel gated by sys_platform
marker; FilesystemRuleStore.__init__ raises on non-Linux).
- Filename allowlist is the FIRST check on every inotify event.
- Content-hash dedup so a single write firing IN_CREATE + IN_CLOSE_WRITE
produces exactly one RuleChange.
- All compile work serializes on a single asyncio.Lock.
- Subscribers register their queue eagerly so events fired between
subscribe_changes() and the first __anext__() are buffered.
xfails flipped: per-save-style + filter-ordering + atomic-swap in
test_filesystem.py; load_compiled / set_state isolation / round-trip /
per-rule fan-out / expired-state revert / set_state failure semantics
in test_conformance.py (FS side; DB side stays xfail until E.3.6);
malformed-YAML compile-time check in test_rule_engine.py.
Tests: 197 passed, 35 xfailed (gated on E.3.6 / E.3.7 / lifters).
mypy + bandit: clean on all touched files.
Wiki update for the per-rule reload + state-change topics lands in a
matching wiki-checkout/Service-Bus.md edit (separate repo).
Dialect-split: portable rollup queries on TTPMixin; bulk insert with
ON CONFLICT DO NOTHING / INSERT IGNORE in the per-dialect repos.
Confidence-floor (< 0.3) drop applied at mixin layer before the
dialect hook. BaseRepository now declares the six TTP methods abstract.
Tests in tests/web/db/test_ttp_repo.py flipped from pytest.fail stubs
to real dual-backend behavioral tests; tests/ttp/test_confidence.py
drop-below-floor xfail removed.
Session-scoped autouse fixture in tests/ttp/conftest.py sets
DECNET_DEVELOPER_TRACING=true and forces decnet.telemetry._ENABLED
so the no-op tracer doesn't silently swallow emitted spans. The
span_exporter fixture also monkeypatches decnet.telemetry.get_tracer
so production code under test lands spans in the in-memory
exporter. Tracing tests skip when DECNET_OTEL_ENDPOINT (default
localhost:4317) isn't reachable so the dev loop stays green
without lying about coverage.
In-memory span exporter fixture wired to a per-test TracerProvider
(OTEL global is locked once set, so each test gets its own).
ttp.eval / ttp.lifter.{name} / ttp.rule.fire / ttp.rule.state.change
hierarchy + no-PII canary battery xfail-gated behind E.3.5–E.3.13.
Hypothesis property: N rule_ids × M technique_ids on one event yield
N×M distinct tag UUIDs. Worked example pinned: one rule emitting
(T1110, None) and (T1078, None) → two distinct UUIDs. Engine-level
fan-out + replay xfail-gated behind E.3.7.
Third and fourth TTP-tagging contract commits, plus a scoped subset
of the E.2.4 conformance tests covering the contract surface shipped
here (full hypothesis-fuzz suite still lands with E.2.4).
E.1.3 — decnet/ttp/base.py
- TaggerEvent NamedTuple: source_kind, source_id, attacker_uuid,
identity_uuid, session_id, decky_id, opaque payload.
- Tagger(ABC) with abstract async tag(); class-level name and
HANDLES: frozenset[str] (default empty so a misconfigured subclass
is loudly idle, not loudly noisy).
- TolerantTagger(Tagger): concrete tag() wraps abstract _tag_impl()
in try/except Exception (deliberately not BaseException — so
KeyboardInterrupt / SystemExit / asyncio.CancelledError propagate
and the worker can shut down cleanly). Swallowed exceptions log
at WARNING with exc_info, never ERROR — absence is the steady
state, not a bug. Subclasses override _tag_impl, never tag — the
tolerance contract is enforced in the base class, not on trust.
- KNOWN_SOURCE_KINDS: Final[frozenset[str]] enumerating every
source_kind a producer is allowed to emit. Closed-by-enumeration
at the runtime layer; the composite tagger keys its WARNING/INFO
bridge off this constant to surface the silent-drop trap from
the design doc (lines 160–195).
E.1.4 — decnet/ttp/factory.py
- get_tagger() reads DECNET_TTP_TAGGER_TYPE (default 'composite');
unknown values raise ValueError with the known-list. Mirrors
decnet.intel.factory and decnet.clustering.factory.
- _KNOWN = ('composite',). Per-lifter classes (E.1.6) are children
of the composite, not standalone tagger types.
- CompositeTagger(Tagger): pre-computes a dict[str, list[Tagger]]
dispatch index from each lifter's HANDLES; fans events out
concurrently with asyncio.gather and concatenates results.
Empty lifters=[] is the legal contract-phase state — E.1.6
wires the real lifters in.
- Unhandled-event observability: source_kind in KNOWN_SOURCE_KINDS
but no lifter claims it -> WARNING once per kind per process
(missed E.1.6 update). Unknown kind -> INFO once per kind per
process (future-feature telemetry, by design). Per-process dedup
via plain set; E.1.6 may swap in a proper rate-limiter once
production traffic shapes are known.
Tests — tests/ttp/test_base.py, tests/ttp/test_factory.py
- Tagger / TolerantTagger abstractness, missing-tag-impl rejection,
WARNING-not-ERROR log level, propagation of KeyboardInterrupt /
SystemExit / asyncio.CancelledError.
- Factory env-var routing, unknown-name ValueError, dispatch-index
correctness, only-claiming-lifter invocation, WARNING-once for
known-but-unclaimed kinds, INFO-once for unknown kinds, result
concatenation across lifters.
Mypy clean under .311/bin/mypy --ignore-missing-imports.