Ipv6LeakLifter subscribes to source_kind="ipv6_leak" events from both
the passive sniffer and active prober. Emits T1090 (Proxy) under TA0011
(C2) when fe80:: source address is observed — the attacker's VPN only
tunnels IPv4 so their link-local IID leaks their NIC identity.
Rule R0059 sets base confidence 0.85; iid_kind in the evidence carries
the per-observation strength (eui64 = MAC-derived, deterministic;
stable_privacy = RFC 7217; temporary = RFC 4941).
- test_evidence_shape.py: replace broken (command, BehavioralLifter)
pairing with correct (http_fingerprint, HttpFingerprintLifter) case;
expand _LIFTER_CASES to 5-tuples with per-lifter payloads and rule
factories; wire StubRuleStore + _index.install() per lifter; remove
xfail marker — all 4 parametrized cases now pass
- factory.py: add _span() helper gated on _telemetry._ENABLED; wrap
each per-lifter dispatch in _tag_one() that opens a
ttp.lifter.{name} child span per call
- http_fingerprint_lifter.py: add missing name = "http_fingerprint"
- test_tracing.py: replace pytest.fail() stubs in
test_lifter_child_spans_emitted and test_no_pii_canary_in_span_attributes
with real test bodies; remove xfail markers
Phase 2 attached mitre_url to intel-emitted tags' evidence JSON;
Phase 3 promotes it to a real column populated for *every* tag —
intel, credential, behavioral, canary, identity, email, rule-engine —
from one source. Pre-v1, so the SQLModel field is added directly
without an Alembic migration.
- TTPTag gains mitre_url: Optional[str] (not indexed — derived
deeplink, not a query target; technique_id is already indexed).
- _emit.py and rule_engine._evaluate_rules both populate mitre_url
via attack_stix.mitre_url_for(sub_technique_id or technique_id).
Sub-technique URL when present, else parent. The two construction
sites stay separate because the rule_engine path carries per-emit
span instrumentation that emit_tags() can't preserve without
threading a span object through; minimal-change beats forced
refactor here.
- intel_lifter strips mitre_url from evidence_extra in all four
decision functions. The column is canonical now; duplicating in
the JSON column would drift when the bundle moves. The unused
TechniqueEmission import + tracking dicts removed too.
- IdentityTechniqueRow / TechniqueRollupRow / TTPTagDetailRow /
CampaignTechniqueRow gain mitre_url: Optional[str].
- sqlmodel_repo/ttp.py:_mitre_url_for added; the 5 row-builder sites
pass mitre_url=_mitre_url_for(sub_technique_id or technique_id)
alongside the existing technique_name resolution.
- api_get_tag_details.py needs no change — list_tags_by_scope_and
_technique already returns model_dump() rows that flow the new
column through **row spread to TTPTagDetailRow.
- tests/ttp/test_emit_attaches_mitre_url.py covers both construction
paths (top-level, sub-tech, unknown, multi-emit) and a regression
test that intel_lifter evidence dicts no longer contain mitre_url.
The four provider→technique tables (AbuseIPDB cat→techniques,
GreyNoise tag→techniques, ThreatFox threat_type→techniques, plus
the Feodo binary-listed signal) used to live as Final[dict] constants
in intel_lifter.py. Two real problems with that:
1. Drift between rules/ttp/R0054.yaml..R0058.yaml (which declare
the full slate per provider) and the Python dicts (which decide
which slate-member fires per signal). The v2 audit comment in
intel_lifter.py documented that they had silently drifted.
2. No ATT&CK provenance on emissions — the loaded STIX bundle has
rich external_references (canonical attack.mitre.org URLs) that
never surfaced because the lifter had no path back to them.
Mappings now live as YAML at decnet/ttp/data/intel/{provider}.yaml,
validated at load against the loaded ATT&CK bundle, with each entry
enriched by attack_stix._attack_pattern_by_id to attach the canonical
MITRE URL to every emission.
- decnet/ttp/data/intel_loader.py: pydantic-validated schema +
ProviderMapping/Signal/TechniqueEmission frozen dataclasses +
load_provider_mapping(provider) lru-cached.
- Per-technique high_score_threshold inlined into YAML
(collapses the separate _ABUSEIPDB_HIGH_SCORE_GATED dict).
- external_reference field follows the STIX 2.1 external-reference
shape (source_name + url + optional external_id) so the future
STIX/MISP exporter is a direct translation.
- intel_lifter.py: dicts deleted, decision functions read from
ProviderMapping accessors. Decision-flow constants (T1071/T1595
bare-classification fallbacks in _greynoise_decisions) stay in
code — they're not table rows.
- Each emit slot's evidence_extra now carries mitre_url for any
technique resolved in the bundle (every one in practice).
- tests/ttp/test_intel_mappings.py: snapshot equivalence vs the
legacy dicts, high-score gate behavior, every-signal-has-an-
external-reference, every-emission-has-a-mitre-url, negative
paths (unknown technique_id raises AttackBundleError, mismatched
provider field rejected, dir listing matches expected providers).
The YAML schema + mitre_url enrichment lays groundwork for the
future STIX exporter; this commit does NOT build that exporter.
Drift between the technique/tactic IDs hardcoded in the lifters and
what the loaded ATT&CK STIX bundle actually contains is silent in the
status quo: a renamed-or-retired technique just stops being tagged.
Every emission point now has an explicit validator that asserts its
IDs resolve in the loaded bundle, called once at TTP-worker boot.
- intel_lifter.all_emitted_technique_ids() collects every technique
the four provider tables (AbuseIPDB / GreyNoise / Feodo / ThreatFox)
plus the decision-flow constants in _greynoise_decisions and
_feodo_decisions can emit. validate_against_attack_bundle() runs it
through attack_stix.assert_known_technique_ids().
- ukc.validate_against_attack_bundle() asserts every key in
ATTACK_TACTIC_TO_UKC resolves, with TA0100..TA0106 documented as
_NON_ENTERPRISE_TACTICS (lives in the ICS bundle, not the
enterprise bundle DECNET loads).
- decnet/ttp/worker.py:run_ttp_worker_loop calls both validators
before subscribing to the bus. A bundle-vs-code mismatch refuses
to start the worker rather than silently mistagging events.
- tests/ttp/test_attack_bundle_validation.py covers the happy path
for both validators, the negative path (injected bogus tactic ID
raises AttackBundleError), the ICS exemption, and the lone T1078
reference in credential_lifter.
R0047 (BEC) and the encoded-payload predicate substring-match against
the email body. Shipping raw body text on the abstracted service bus
is the wrong privacy stance — the bus transport may swap from UNIX
socket to networked at any time, and "loopback today" is not a license
to put PII on the wire.
EmailLifter now opens the .eml lazily from
/var/lib/decnet/artifacts/{decky_id}/smtp/{stored_as} when a body-aware
predicate runs and parses the body in-process via stdlib email +
policy.default. The decoded body is memoized into the payload dict so
multiple body-aware predicates on the same event open the file once.
Bus envelope only carries the artifact pointer (decky_id + stored_as);
raw body bytes never cross the host disk boundary on the agent → master
hop. Filesystem access on agents is unblocked by DEBT-035 (setgid +
group-readable artifacts root, paid 2026-05-02).
The legacy inline body_text path is preserved — when the producer ships
body_text on the bus the helper short-circuits without opening the file.
A previous agent (and several of my own commits) wrote to a top-level
DEBT.md without seeing the existing development/DEBT.md — the
canonical register since DEBT-001. Resulted in two parallel files,
inconsistent numbering schemes, and references that resolved to the
wrong place.
Migrate the six entries that landed in the rogue file into the
canonical register as DEBT-044 through DEBT-049, preserving their
status (resolved / partial / open) and cross-references. The
TTP_TAGGING.md references to "DEBT.md" already resolve to
development/DEBT.md by virtue of being in the same directory; only
the comment in decnet/ttp/impl/intel_lifter.py needed disambiguation
to "development/DEBT.md DEBT-048".
* DEBT-044 — `attacker.email.received` producer wiring (✅ RESOLVED 2026-05-02)
* DEBT-045 — EmailLifter heavyweight feature extraction (PARTIAL PAID 2026-05-02)
* DEBT-046 — EmailLifter mal-hash feed integration (open)
* DEBT-047 — EmailLifter R0047 BEC unblock (open, gated on DEBT-035)
* DEBT-048 — TTP intel provider mapping review (recurring quarterly)
* DEBT-049 — TTP Sigma adapter — post-v1 (open)
Summary table extended; "Remaining open" line updated; root file
removed. The DEBT-047 entry now explicitly cross-references DEBT-035
as the gating dependency for the R0047 BEC unblock.
Three bug classes uncovered by the 2026-05-02 ship-time audit:
* AbuseIPDB code/name mismatch in v1: cat 10 was treated as DDoS (it's
Web Spam — DDoS is cat 4, intentionally unmapped per A.10) and cat 17
as VPN IP (it's Spoofing — VPN IP is cat 13). Both typos mirrored in
code AND the design doc Appendix A.10. Code now matches the AbuseIPDB
taxonomy exactly; cat 17 retargets to T1566 (email-spoofing as a
phishing precursor), and cats 7 (Phishing) and 16 (SQL Injection)
pick up T1566 / T1190 emissions that v1 didn't cover.
* ThreatFox dispatch keyed on `ioc_type` in v1, but `ioc_type` is the
indicator format (url / domain / hash variants) and carries no ATT&CK
signal. The canonical taxonomy field per ThreatFox's API is
`threat_type` (botnet_cc / payload_delivery / payload / cc_skimming).
Repoint dispatch through the new `threatfox_threat_types` payload
field; `ioc_type` rides as evidence only. Also adds the missing
cc_skimming -> T1056 (Input Capture) mapping and registers T1056 in
attack_catalog.py.
* GreyNoise bare-malicious lane: a `classification == "malicious"` row
with no recognised tag used to emit nothing. Now lights T1071 at a
half multiplier, suppressed when a tag already fires T1071 to avoid
double-stamping at conflicting confidence levels.
The inspector was dumping the whole `CMD uid=0 user=root src=… pwd=…
cmd=nmap -p- 192.168.1.0/24` syslog body into a single ``command_text``
blob. ANTI: "I'd like to separate the fields." Done — three layers
work together:
1. Collector session aggregator: new `_parse_cmd_msg` splits the bash
PROMPT_COMMAND msg into `{uid, user, src, pwd, command}`. The
session-ended envelope's per-command dict now carries the
structured fields, with `command_text` set to just the cmd= value
(preserving embedded whitespace — `nmap -p- 1.2.3.0/24` etc.).
2. Rule engine: per-source_kind auxiliary evidence list
(`_AUX_EVIDENCE_FIELDS`). For `command` events the engine
automatically promotes uid/user/src/pwd into the persisted
`evidence` dict on top of the rule's explicit `evidence_fields`.
Engine-controlled, not per-rule — adding a new aux field is one
line here, not a 30-rule YAML sweep, and rule authors can't
accidentally drop it.
3. TTPInspector frontend: evidence renders as a structured
`kvs` grid (UID / USER / SRC / PWD / CMD rows) instead of
pretty-printed JSON. Primary-order list keeps shell fields at
the top; everything else falls below alphabetically so unfamiliar
evidence shapes still surface predictably.
Tests:
- session_aggregator pins the structured-fields emit (uid/user/src/
pwd/command_text without "CMD" prefix, embedded whitespace
preserved).
- rule_engine_tagger pins the aux-field auto-promotion + the
no-`None`-leakage path when payload doesn't carry an aux key.
The endpoint was a contract-phase stub returning `[]` even though the
RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty
table; operators couldn't tell whether anything was wired up.
- `api_list_rules` now calls `get_rule_store().load_compiled()` and
serializes each CompiledRule + its operational state into a
RuleCatalogueRow. Sorted by rule_id for stable golden snapshots.
- Add `description: str` to RuleSchema (pydantic) and CompiledRule
(NamedTuple, defaulted) + propagate through `_compile_one` so the
catalogue surfaces the human-readable YAML description, not just
the slug-style `name`.
- Update `tests/ttp/test_rule_engine.py` _fields assertion for the
new column; new `tests/api/ttp/test_rules_catalogue.py` pins the
catalogue contents (R0001/R0014 presence, row shape, sort order).
Worker behaviour is unchanged: it was already loading rules
correctly. This is purely a read-side wiring fix on the operator API.
The canonical rule-based engine from §"Tagging engines, layered §1"
of TTP_TAGGING.md was fully implemented but never instantiated as a
composite child — pure pattern rules (R0014/R0017/R0023/... 23 rules
total) had no tagger to dispatch them.
- Add `RuleEngineTagger(Tagger)` adapter in rule_engine.py wrapping
`RuleEngine.evaluate()`. `HANDLES = {command, http_request,
auth_attempt, payload}` — the source kinds whose rules typically
live outside any per-source lifter.
- Adapter's `watch_store()` filters via `_is_engine_owned` so the
engine's dispatch index excludes lifter-claimed rules
(`match.kind: lifter:*`) and stays disjoint from per-lifter ownership.
- Prepend `RuleEngineTagger` to the `CompositeTagger` lifter list so
generic pattern rules dispatch before per-source cross-event logic.
- Composes with E.3.18a (worker hydrates `watch_store`) and E.3.18b
(worker fans session payloads into per-`command` events) — together
these three commits make R0001–R0030 actually fire at runtime.
IdentityLifter owns lifter:identity_* — currently R0003 (password
spraying). CredentialLifter owns lifter:credential_* — R0001 generic
auth brute, R0002 password guessing, R0004 credential reuse, R0005
valid-account use, R0006 default credentials.
YAMLs R0001/R0002/R0003/R0005/R0006 had their match.kind normalised
to fit the lifter prefix scheme — the design doc's promised "YAMLs
normalised in a separate refactor commit" lands here.
Identity-rollup tags null out attacker_uuid on emit so the worked-
example invariant holds (the tag belongs to the Identity, never to
one member IP).
Tests: test_identity_lifter.py + test_credential_lifter.py cover
each predicate's positive/negative path, state modulation
(disabled/clipped/expired), source-kind gating, and idempotent
replay. test_lifter_absence and test_lifters updated for the new
ctor signature.
SMTP message-level technique tagger per Appendix A.6: open relay abuse
(rcpt_count + foreign From), mass phishing (rcpt_count + body simhash),
phishing-kit X-Mailer, IDN/punycode URL, sender masquerade composite
(From/Return-Path/DKIM/SPF), malicious attachment (macro/.lnk/.iso/.img/
hash match), BEC subject+body composite, encoded payload in body.
PII discipline (TTP_TAGGING.md §'Hard parts §6') is enforced at the
lifter layer via _filter_evidence(): emitted TTPTag.evidence is
restricted to the EmailEvidence-allowed allowlist (body_sha256,
matched_headers — names only, rcpt_domain_set — domains only,
attachment_sha256s, rcpt_count) plus PII-safe match discriminators
(matched_kit, matched_trigger, matched_url_host, etc). Raw addresses,
raw body bytes, full URLs, and decoded base64 previews NEVER appear in
evidence — defense-in-depth over the YAML evidence_fields hint.
Tests: tests/ttp/test_email_lifter.py per-rule positive + negative +
PII allowlist guard + state modulation. tests/ttp/rule_precision/
test_email_rules.py xfail flipped to real precision (R0041-R0048
H-band ≥95%). Corpus rows updated to acknowledge that R0045 (masquerade)
co-fires with R0041 / R0047 when the sender-masquerade signals are
present alongside open-relay or BEC patterns — overlap is by design,
not a precision bug.
Browser-payload derivations per Appendix A.9: navigator.webdriver flag,
canvas/audio/WebGL automation hash matches (Puppeteer/Playwright/
Selenium/curl-impersonate), WebRTC IP leak, TZ/language vs source-IP
geo mismatch, navigator.platform vs userAgent vs WebGL renderer
inconsistency.
Evidence shape pinned to CanaryFingerprintEvidence (metric +
matched_signature) — raw fingerprint blobs (canvas hashes, full UAs,
navigator.platform values) explicitly NOT carried into TTPTag.evidence
per TTP_TAGGING.md §'Hard parts §7' (enrichment vs tag boundary). The
identity-merge guard rail is preserved: composite fp.id matches across
IPs are NOT a TTP, so no rule fires on the bare hash.
Tests: tests/ttp/test_canary_fingerprint_lifter.py per-rule positive +
negative + evidence-shape guard + state modulation.
tests/ttp/rule_precision/test_canary_rules.py xfail flipped to real
precision (R0049/R0050/R0051/R0053 H-band ≥95%; R0052 M-band ≥80%).
Per-provider verdict translator for AbuseIPDB, GreyNoise, Feodo Tracker,
and ThreatFox per Appendix A.10. Each rule's predicate inspects payload
fields produced by the enrich worker (no DB I/O, no decnet.intel.*
imports — E.2.7 decoupling guard preserved). AbuseIPDB confidence is
scaled by abuse_confidence_score / 100; categories drive per-technique
fan-out. R0058 aggregate-bump is a no-op in v0 (cross-tag bump deferred
to E.3.14 worker bootstrap).
Per-provider null tolerance is the steady state — a missing provider
column produces zero tags from that rule, never an error.
Tests:
- tests/ttp/test_intel_lifter.py — per-provider positive + negative +
state modulation + decoupling source-import guard.
- tests/ttp/rule_precision/test_intel_rules.py — xfail flipped, real
precision driven over seed_intel.jsonl (R0054-R0057 H-band ≥95%;
R0058 skipped as bump-only).
- tests/ttp/test_lifter_absence.py — IntelLifter all-populated test
flipped from xfail-strict to real assertion with realistic payload.
- tests/ttp/test_lifters.py — partial-null xfail flipped to real
assertion.
Reads pre-shaped session aggregates from TaggerEvent.payload and emits
techniques per Appendix A behavior tables. Per-rule predicates dispatch
on match.kind (lifter:behavioral_<name>); the lifter holds its own
RuleIndex watching the same RuleStore as the engine, so disable / clip /
TTL state reaches lifter-bound rules through the same atomic-swap path.
R0032/R0036/R0037/R0040 YAMLs had over-escaped regex strings (\\
instead of \\) — fixed in place.
Factory wired so default get_tagger() returns CompositeTagger with
BehavioralLifter shipped; remaining three lifters (E.3.10-E.3.12) land
in subsequent commits.
E.2.6 contract preserved via TolerantTagger: empty payload steady-state
yields [] with zero ERROR records. Disabled / clipped / expired state
verified.
E.3.9.0 prerequisite for the per-source lifters (E.3.9-E.3.13). The
dispatch index, install/evict/apply_change atomic-swap protocol, and
state-modulation helpers (is_active / apply_ceiling) move out of
rule_engine.py into _rule_index.py and _state.py. RuleEngine wraps a
RuleIndex; back-compat shims preserve _by_kind / _by_rule / _install
attribute access for tests poking at the dispatch internals.
Lifters in E.3.9-E.3.12 will each hold their own RuleIndex, watching
the same RuleStore via subscribe_changes() fan-out. Hot-reload
semantics (disable / clip / TTL via set_state API) now reach
lifter-bound rules through the same atomic-swap path the engine uses,
not a future composite-rebuild compromise.
Implements the rule engine body left empty at contract phase: evaluate()
dispatches by source_kind through self._by_kind, runs the rule's match
spec against event.payload, and emits one TTPTag per emits entry.
watch_store() loads the initial corpus from RuleStore.load_compiled,
then drains subscribe_changes, applying definition changes via
single-statement dict assignment (atomic swap, GIL-atomic to readers)
and state changes via NamedTuple._replace on the existing CompiledRule.
Why: with the FS + DB stores in place (E.3.5/E.3.6), the engine is the
last piece of the rule plane. Lifters (E.3.9–E.3.13) consume the
engine; the worker bootstrap (E.3.14) wires watch_store into the
asyncio event loop. After this commit a CompositeTagger constructed
with a RuleEngine + a populated rules dir will produce real tags.
Notes:
- CompiledRule.emits extended to 4-tuple
(technique_id, sub_technique_id, tactic, confidence). Tactic + confidence
ride per-emit so a single rule can carry multiple precision targets
(the "one event maps to many techniques" property). Compile helpers in
both backends extract them from the YAML emits dict; missing tactic
or confidence is a deploy-time error.
- v0 match operator is "pattern" (regex). The field defaults per
source_kind (command_text / raw_url / subject / verdict / …) and is
overridable via match.field. Future ops (contains, equals, in_set)
extend _match_event without touching the engine surface.
- Confidence model: rules with state="clipped" + confidence_max set
cap the per-emit confidence downward; clipped is a soft suppress, not
a hard skip. Disabled rules are skipped wholly; expires_at past is
re-checked at evaluate as defense-in-depth (the store auto-reverts,
but a racing read between expiry and revert must not fire the rule).
- _span(name, **attrs) helper in engine + both stores short-circuits on
decnet.telemetry._ENABLED — matches the project's @traced /
wrap_repository zero-overhead-when-disabled pattern instead of relying
solely on the no-op tracer indirection.
- Late-bound tracer (telemetry.get_tracer called per-span, not at
module load) so test_tracing's monkeypatch reaches the production
code path.
xfails flipped: tests/ttp/test_rule_engine.py multi-emit fan-out +
rule_version-collision-via-engine; tests/ttp/test_multi_mapping.py
N×M engine fan-out + idempotent replay; tests/ttp/test_tracing.py
ttp.eval span hierarchy + ttp.rule.fire span attributes.
Tests: 214 passed, 19 xfailed (gated on E.3.8 lifters / rule pack /
worker bootstrap).
mypy: clean on prod code; pre-existing test-stub arg-type warnings
unchanged.