Switch burst deque from monotonic() to time.time() (wall-clock, serializable).
Add DNS_STATE_PATH env var: on startup _load_state() reads {src:[ts,...]} JSON
and prunes entries older than the burst window. _flush_state() writes a temp
file and renames it atomically; the _state_flusher() coroutine flushes every 5s
when dirty. Detection of
the 5th event also triggers an immediate flush. No-op when DNS_STATE_PATH is
unset, so the default deployment is unchanged.
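A minimal sketch of the load/prune and write-then-rename steps described above. The function names drop the leading underscore, BURST_WINDOW is an assumed value, and the temp-file prefix is hypothetical; only the {src:[ts,...]} JSON shape comes from this commit.

```python
import json
import os
import tempfile
import time

BURST_WINDOW = 10.0  # seconds; assumed value, not taken from the commit


def load_state(path, now=None, window=BURST_WINDOW):
    """Read {src: [ts, ...]} and drop timestamps older than the window."""
    now = time.time() if now is None else now
    try:
        with open(path, "r", encoding="utf-8") as fh:
            raw = json.load(fh)
    except (OSError, ValueError):
        return {}  # missing or corrupt file: start with empty state
    if not isinstance(raw, dict):
        return {}
    state = {}
    for src, stamps in raw.items():
        fresh = [ts for ts in stamps if now - ts <= window]
        if fresh:
            state[src] = fresh
    return state


def flush_state(path, state):
    """Write to a temp file in the target directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".dns_state.")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace is atomic on POSIX when both paths share a filesystem.
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise
```

Creating the temp file next to the target keeps the rename on one filesystem, so a reader sees either the old file or the new one, never a partial write.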
Rename _txt_times -> _tunnel_times. Add TYPE_CNAME=5, TYPE_NULL=10,
TYPE_PRIVATE=65399 constants. Guard burst counter with _TUNNEL_QTYPES
frozenset instead of TYPE_TXT only. Mixed-type queries from one source
now share a single burst window, closing iodine NULL/CNAME downlink
and AAAA-encoded uplink evasion gaps.
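Sketch of the shared burst window, under stated assumptions: the exact membership of _TUNNEL_QTYPES (in particular whether AAAA and TXT are both in it), the window length, and the note_query helper are not given in this commit. The threshold of 5 follows the "5th event" wording in the state-persistence commit.

```python
import time
from collections import defaultdict, deque

TYPE_CNAME, TYPE_NULL, TYPE_TXT, TYPE_AAAA, TYPE_PRIVATE = 5, 10, 16, 28, 65399

# Assumed membership; the commit only says the guard is no longer TXT-only.
_TUNNEL_QTYPES = frozenset(
    {TYPE_CNAME, TYPE_NULL, TYPE_TXT, TYPE_AAAA, TYPE_PRIVATE}
)

BURST_COUNT = 5      # matches the "5th event" wording elsewhere in this log
BURST_WINDOW = 10.0  # seconds; assumed

_tunnel_times = defaultdict(deque)  # src -> timestamps of tunnel-capable queries


def note_query(src, qtype, now=None):
    """Record one query; return True once src reaches the burst count."""
    if qtype not in _TUNNEL_QTYPES:
        return False
    now = time.time() if now is None else now
    times = _tunnel_times[src]
    times.append(now)
    # Drop timestamps that have aged out of the rolling window.
    while times and now - times[0] > BURST_WINDOW:
        times.popleft()
    return len(times) >= BURST_COUNT
```

Because every qtype in the set appends to the same per-source deque, a client that rotates between TXT, NULL, and CNAME still reaches the threshold.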
_is_tunneling now returns str|None (the detection method) instead of bool.
Two new tunables _QNAME_TOTAL_LEN_THRESHOLD=50 and _QNAME_ENTROPY_THRESHOLD=3.5
catch attackers who split a high-entropy payload across multiple short labels.
tunnel_method field added to tunneling_suspect events for downstream correlation.
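Sketch of the whole-qname check. The two thresholds are from this commit; the use of Shannon entropy over the concatenated labels, the >= comparisons, and the method string "qname_total_entropy" are assumptions made for illustration.

```python
import math
from collections import Counter

_QNAME_TOTAL_LEN_THRESHOLD = 50
_QNAME_ENTROPY_THRESHOLD = 3.5


def shannon_entropy(text):
    """Bits per character of the empirical character distribution."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def qname_tunnel_method(qname):
    """Return a method string or None (mirrors the str|None return type)."""
    # Join all labels so a payload split across short labels is scored as one.
    payload = qname.rstrip(".").replace(".", "").lower()
    if (
        len(payload) >= _QNAME_TOTAL_LEN_THRESHOLD
        and shannon_entropy(payload) >= _QNAME_ENTROPY_THRESHOLD
    ):
        return "qname_total_entropy"  # hypothetical method name
    return None
```

Scoring the joined labels is what defeats the split-label trick: each label can stay short while the total still trips both thresholds.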
_parse_edns_size only extracted the requestor UDP size; every other field in
the OPT record (DO bit, EDNS version, extended RCODE, all sub-options) was
invisible. Replaced with _parse_opt_record returning a full dict:
udp_size, ext_rcode, version, do_bit, z, options[(code, len, data)].
NSID request (option code 3) is now detected as fingerprint_probe with
probe=edns_nsid and contributes to recon_burst. DO bit, COOKIE (10), and
other options are not escalated; udp_size continues to drive amp_probe.
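For reference, a sketch of OPT parsing that yields the dict shape listed above. The field layout follows RFC 6891 (CLASS carries the UDP size; TTL packs extended RCODE, version, DO, and Z). The function signature and has_nsid helper are assumptions, and the input is taken to be a single RR starting at its root-name byte.

```python
import struct

TYPE_OPT = 41
EDNS_NSID = 3


def parse_opt_record(rr):
    """Parse one OPT RR (root name + fixed fields + RDATA); return dict or None."""
    if len(rr) < 11 or rr[0] != 0:
        return None  # OPT owner name must be the root (single zero byte)
    rtype, udp_size, ttl, rdlen = struct.unpack("!HHIH", rr[1:11])
    if rtype != TYPE_OPT or len(rr) < 11 + rdlen:
        return None
    rdata = rr[11:11 + rdlen]
    options = []
    pos = 0
    while pos + 4 <= len(rdata):
        code, length = struct.unpack("!HH", rdata[pos:pos + 4])
        data = rdata[pos + 4:pos + 4 + length]
        if len(data) < length:
            break  # truncated option: stop rather than guess
        options.append((code, length, data))
        pos += 4 + length
    return {
        "udp_size": udp_size,
        "ext_rcode": (ttl >> 24) & 0xFF,
        "version": (ttl >> 16) & 0xFF,
        "do_bit": bool(ttl & 0x8000),
        "z": ttl & 0x7FFF,
        "options": options,
    }


def has_nsid(opt):
    """True when the parsed OPT record carries an NSID request (code 3)."""
    return any(code == EDNS_NSID for code, _length, _data in opt["options"])
```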
Tools like fpdns send OPCODE=IQUERY/STATUS/NOTIFY/UPDATE or set the reserved
Z bit to fingerprint resolver behaviour. Previously all these were parsed as
standard queries with no signal.
- opcode!=0 → fingerprint_probe probe=opcode_<name>, NOTIMP response;
fired before qdcount check so qdcount=0 UPDATE packets are still caught.
- Z bit set OR (AD+CD without RD) → fingerprint_probe probe=header_flags;
AD alone with RD is ignored to avoid tagging DNSSEC-aware stubs.
- Both variants contribute to recon_burst.
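The rules above can be sketched against the standard DNS header bit layout (RFC 1035 plus the AD/CD bits from RFC 4035). classify_header, the opcode name table, and the numeric fallback for unknown opcodes are hypothetical; only the probe labels come from this commit.

```python
import struct

_OPCODE_NAMES = {1: "iquery", 2: "status", 4: "notify", 5: "update"}


def classify_header(packet):
    """Return a probe label for a DNS message, or None for an ordinary query."""
    if len(packet) < 12:
        return None  # too short to carry a header
    (flags,) = struct.unpack("!H", packet[2:4])
    opcode = (flags >> 11) & 0xF
    rd = bool(flags & 0x0100)
    z = bool(flags & 0x0040)   # reserved bit, must be zero in normal traffic
    ad = bool(flags & 0x0020)
    cd = bool(flags & 0x0010)
    # Opcode is checked first, before any look at qdcount.
    if opcode != 0:
        return "opcode_" + _OPCODE_NAMES.get(opcode, str(opcode))
    if z or (ad and cd and not rd):
        return "header_flags"
    return None  # includes AD alone with RD (DNSSEC-aware stub)
```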
qclass=255 in a standard query is unusual enough to be a fingerprinting probe
(fpdns, various scanner scripts). Previously it was logged as a plain query
with qclass=ANY in the event field; now it emits fingerprint_probe with
probe=qclass_any and returns REFUSED — consistent with how we treat other
probe types. Contributes to recon_burst.
The inline probe_map dict inside _handle made tests blind to the probe
catalogue and couldn't be extended without touching the hot path. It is now
module-level _CHAOS_PROBE_MAP. authors.bind. joins the three existing entries
so it gets named correctly instead of carrying the raw qname.
Packets with multiple questions were silently parsed at q0 only; the extra
questions were invisible. Now emits multi_question at severity=5 with the
qdcount and q0 qname, then falls through and answers q0 normally.
Silent drops on <12B packets, qdcount=0, and question-section ValueError gave
fuzzers and scanners a completely dark target. New events malformed_packet,
empty_question_section, and question_parse_error fire at severity=5 so these
probes are visible without counting toward recon_burst.
Adds DNS_FORWARD_BUDGET (default 50) and DNS_FORWARD_WINDOW (default 1.0s)
env vars. _can_forward() maintains a rolling deque of upstream call
timestamps; queries that exceed the budget within the window are answered
with the sinkhole (127.x) instead of being forwarded, making the honeypot
ineligible as a sustained amp vector even when real_recursive is enabled.
Rate limit is global (not per-source) so IP-spoofed amplification floods
hit the ceiling regardless of how many source addresses are rotated.
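Sketch of a global rolling-window budget of the kind _can_forward() is described as maintaining. The ForwardBudget class is hypothetical; the defaults mirror DNS_FORWARD_BUDGET=50 and DNS_FORWARD_WINDOW=1.0.

```python
import time
from collections import deque


class ForwardBudget:
    """Allow at most `budget` upstream calls per rolling `window` seconds."""

    def __init__(self, budget=50, window=1.0):
        self.budget = budget
        self.window = window
        self._calls = deque()  # timestamps of recent upstream calls, all sources

    def can_forward(self, now=None):
        now = time.monotonic() if now is None else now
        # Expire timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) >= self.budget:
            return False  # caller answers with the sinkhole instead
        self._calls.append(now)
        return True
```

The deque is deliberately not keyed by source address, so rotating spoofed sources does not buy an attacker extra upstream calls.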
When DNS_REAL_RECURSIVE=true and DNS_ZONE_MODE=recursive, out-of-zone
queries are forwarded to DNS_UPSTREAM (default 8.8.8.8:53) via async
UDP. Upstream response is relayed as-is; on timeout or error the
already-computed sinkhole (127.x) is returned instead.
_handle() always runs first so logging, tunneling detection, flood
tracking, and recon-burst aggregation fire on every query regardless
of whether the response ultimately comes from upstream. _dispatch()
overlays forwarding on top of the sync handler.
Protocol handlers (UDP datagram_received, TCP session) are now async
via asyncio.ensure_future / await _dispatch(). Service class exposes
real_recursive (bool) and upstream (string) config fields.
RA=1 + empty answer section is immediately detectable as fake by any
open-resolver scanner. Recursive mode now behaves like open mode
(127.0.0.x sinkhole, deterministic on qname) with RA=1 and AA=0,
matching what a real recursive resolver returns.
- Add per-src QPS counter (_qps_window) with flood_suspect event at ≥50 qps/10s;
one event per src per 30s cooldown, does not suppress baseline query events.
- Add tracking_evicted telemetry every 100 LRU evictions so IP-rotation evasion
of _txt_times/_qps_window/_recon_window is observable, not silent.
- Shared _track_lru helper consolidates LRU touch + eviction signalling across
all three bounded OrderedDicts.
- Add TYPE_AAAA=28 support: _fake_ipv6() returns deterministic ULA (fd00::/8)
addresses for in-zone names; extra_records parser now accepts and validates
AAAA entries via socket.inet_pton.
- Add per-src recon-burst aggregation (_recon_window): fingerprint_probe +
zone_transfer + amp_probe are tracked per source in a 60s window; recon_burst
fires when ≥2 distinct signal types seen, once per src per 120s cooldown.
- 47 tests passing (19 new across TestAAAARecords, TestFloodDetection, TestReconBurst).
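Sketch of the recon-burst rule from the list above (60s window, at least 2 distinct signal types, 120s cooldown). The ReconTracker class is hypothetical, and it uses plain dicts; the bounded OrderedDict and LRU eviction signalling mentioned in the same commit are left out for brevity.

```python
import time

RECON_WINDOW = 60.0
RECON_COOLDOWN = 120.0
RECON_SIGNALS = frozenset({"fingerprint_probe", "zone_transfer", "amp_probe"})


class ReconTracker:
    def __init__(self):
        self._seen = {}       # src -> {signal: last timestamp}
        self._last_fire = {}  # src -> timestamp of last recon_burst

    def note(self, src, signal, now=None):
        """Record a signal; return True when recon_burst should fire."""
        if signal not in RECON_SIGNALS:
            return False
        now = time.time() if now is None else now
        seen = self._seen.setdefault(src, {})
        seen[signal] = now
        # Forget signal types last seen outside the window.
        for name in [n for n, ts in seen.items() if now - ts > RECON_WINDOW]:
            del seen[name]
        if len(seen) < 2:
            return False  # repeats of one signal type do not count
        last = self._last_fire.get(src)
        if last is not None and now - last < RECON_COOLDOWN:
            return False  # still cooling down for this source
        self._last_fire[src] = now
        return True
```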
Python asyncio DNS server on UDP+TCP/53 masquerading as BIND 9.x.
Emits five event_type values: query, fingerprint_probe (version.bind /
hostname.bind / id.server CHAOS), zone_transfer (AXFR/IXFR, always
REFUSED), amp_probe (qtype=ANY or EDNS udp_size>1232), and
tunneling_suspect (long high-entropy labels or rapid TXT burst).
Zone persona is generated per-decky from instance_seed (domain name,
SOA serial, NS, A, MX, TXT SPF); overridable via config_schema.
Three zone modes: auth (default), recursive, open (sinkhole).
Import enrich_rpki from decnet.rpki and call it inline after the
ASN lookup. bgp_prefix, rpki_status, rpki_source added to the
record dict that feeds the Attacker upsert. enrich_rpki short-circuits
to (None, None) when asn is None, so private / unannounced IPs
never hit RIPE STAT.
RipeStatValidator makes two RIPE STAT calls per uncached IP:
network-info -> announced prefix, rpki-validation -> ROA state.
2-second timeout; any network failure returns status='unknown'.
SQLite cache keyed by IP, 12-hour TTL, pruned on validator init.
Cache avoids per-event HTTP for the high-churn attacker pool —
steady-state cost approaches zero for repeat offenders.
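Sketch of an IP-keyed SQLite cache with a 12-hour TTL and prune-on-init, as described above. The RpkiCache class, table name, and column layout are hypothetical; the RIPE STAT calls themselves are not shown.

```python
import sqlite3
import time

TTL_SECONDS = 12 * 3600


class RpkiCache:
    def __init__(self, path=":memory:", now=None):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS rpki ("
            "ip TEXT PRIMARY KEY, prefix TEXT, status TEXT, fetched_at REAL)"
        )
        self.prune(now)  # drop stale rows once, when the validator starts

    def prune(self, now=None):
        now = time.time() if now is None else now
        self._db.execute(
            "DELETE FROM rpki WHERE fetched_at <= ?", (now - TTL_SECONDS,)
        )
        self._db.commit()

    def get(self, ip, now=None):
        """Return (prefix, status) for a fresh entry, else None."""
        now = time.time() if now is None else now
        row = self._db.execute(
            "SELECT prefix, status, fetched_at FROM rpki WHERE ip = ?", (ip,)
        ).fetchone()
        if row is None or now - row[2] >= TTL_SECONDS:
            return None
        return row[0], row[1]

    def put(self, ip, prefix, status, now=None):
        now = time.time() if now is None else now
        self._db.execute(
            "INSERT OR REPLACE INTO rpki VALUES (?, ?, ?, ?)",
            (ip, prefix, status, now),
        )
        self._db.commit()
```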
Synthesize the covering CIDR at lookup time from the matched iptoasn
range using ipaddress.summarize_address_range. AsnInfo.prefix is
populated per-query; not persisted in the pickle cache.
enrich_ip now returns (asn, as_name, bgp_prefix, provider_name).
Profiler worker updated to unpack the 4-tuple and write bgp_prefix
into the attacker record dict.
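Sketch of deriving a CIDR from an inclusive start/end range with ipaddress.summarize_address_range. A range that is not CIDR-aligned yields several networks; returning the one that contains the queried IP is an assumption here, and prefix_for is a hypothetical helper.

```python
import ipaddress


def prefix_for(ip, first, last):
    """Return the summarized CIDR within [first, last] that contains ip."""
    target = ipaddress.ip_address(ip)
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    for net in networks:
        if target in net:
            return str(net)
    return None  # ip lies outside the range
```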
Four RFC 4443 stimuli (port-unreach, hop-limit-exceeded, unknown-NH,
bad-dest-option) produce a 4-char matrix + sha256 fingerprint for IPv6
attackers. Auto-registers via ActiveProbeMeta at priority=860 (after v4
icmp_error=850, before ipv6_leak=999). IPv4 targets fast-return None.
Sends four crafted stimuli (UDP/closed-port, TTL=1, DF+oversized,
bad IP option) and records which ICMP error classes come back, the
per-error RTT, and the bytes echoed in each ICMP body. Absence is
as informative as a reply — Linux rate-limiting is a fingerprint signal.
Returns None when no packets could be sent (no CAP_NET_RAW), so the
probe is a no-op in non-root test environments. Port-free ActiveProbe
subclass (priority=850), metaclass auto-registered in the registry.
Also fixes three sets of stale tests left over from the TlsCertProbe
migration (4b2759e0):
- test_active_probe_registry: closed name/order sets updated for
tls_certificate and icmp_error
- test_prober_rotation: dead patches on worker.fetch_leaf_cert removed
- test_prober_worker (TestProbeCycleTLSCert): rewritten to test
TlsCertProbe as an independent registry probe, patch target updated
from worker.fetch_leaf_cert to probes.tlscert_probe.fetch_leaf_cert
ActiveProbe.run/syslog_fields/publish_payload now accept port=None so
non-port-iterating probes can live in the registry. Ipv6LeakProbe replaces
the hand-rolled _ipv6_leak_phase special case in worker.py; it runs last
via priority=999. _probe_cycle no longer has an ad-hoc phase call.
Fixes three stale test files (test_prober_bus, test_prober_rotation,
test_prober_worker) that were broken since the 916b21b6 registry refactor.
_route_info() calls _ip_route_get once and returns (on_link, iface);
worker._ipv6_leak_phase now calls it instead of the two separate helpers.
Bare except clauses at _ip_route_get and response parse now log at debug.
ingester: wrap bootstrap get_state() in a forever-retry loop. When MySQL came
up after the API process, the failed bootstrap call killed the ingestion task
permanently, before it ever entered _run_loop. Regression test added.
deps: idna 3.13→3.15 (CVE-2026-45409), twisted 26.4.0rc2→26.4.0
(PYSEC-2026-160), pip 26.1→26.1.1 (CVE-2026-3219 resolved upstream),
behave-core/behave-shell renamed from decnet-behave-* and bumped to 0.1.1.
pre-commit hook updated to reflect current ignore list.
Replace _jarm_phase / _hassh_phase / _tcpfp_phase boilerplate (3×~50
lines of identical port-iteration logic) with a metaclass-registered ABC.
Adding a new port-iterating active probe is now one class + three methods.
- decnet/prober/base.py: ActiveProbeMeta auto-registers subclasses by
probe_name; ActiveProbe ABC enforces run/syslog_fields/publish_payload
with env-driven DECNET_PROBE_PORTS_<NAME> port override.
- decnet/prober/probes/{jarm,hassh,tcpfp}.py: concrete probe classes.
- decnet/prober/worker.py: single _run_probe driver replaces the three
phase functions; _probe_cycle iterates ActiveProbeMeta.all(); drops
the ports=/ssh_ports=/tcpfp_ports= kwargs from prober_worker.
- IPv6 leak and TLS cert capture stay as special cases (different call
shapes; intentionally outside the registry).
- tests/prober/test_active_probe_registry.py: registry contents, sort
order, priority-10 override, ABC contract per probe class.
- tests/prober/test_run_probe_driver.py: dedup, success, None-skip,
exception, rotation, publish paths for _run_probe.
- tests/prober/test_prober_worker.py: updated patch targets and
_probe_cycle call sites; port control via monkeypatch.setattr.
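Sketch of the metaclass-registered ABC pattern described above. The class names ActiveProbeMeta/ActiveProbe and the three method names are from this commit; the method signatures, the default priority, the tie-break on probe_name, and the priorities given to the two example probes are assumptions.

```python
from abc import ABCMeta, abstractmethod


class ActiveProbeMeta(ABCMeta):
    _registry = {}  # probe_name -> class

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        probe_name = namespace.get("probe_name")
        if probe_name:  # the abstract base leaves this empty and is skipped
            ActiveProbeMeta._registry[probe_name] = cls

    @classmethod
    def all(mcs):
        """Registered probe classes, lowest priority value first."""
        return sorted(
            mcs._registry.values(), key=lambda c: (c.priority, c.probe_name)
        )


class ActiveProbe(metaclass=ActiveProbeMeta):
    probe_name = ""
    priority = 100  # assumed default

    @abstractmethod
    def run(self, ip, port): ...

    @abstractmethod
    def syslog_fields(self, result, port): ...

    @abstractmethod
    def publish_payload(self, result, port): ...


class JarmProbe(ActiveProbe):
    probe_name = "jarm"
    priority = 20  # illustrative value

    def run(self, ip, port):
        return None  # real probing omitted

    def syslog_fields(self, result, port):
        return {"port": port}

    def publish_payload(self, result, port):
        return {"kind": self.probe_name, "port": port}


class HasshProbe(JarmProbe):
    probe_name = "hassh"
    priority = 10  # illustrative value
```

Defining a subclass is the whole registration step; a driver can then iterate ActiveProbeMeta.all() without naming any probe.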
- Add "ipv6_leak" to KNOWN_SOURCE_KINDS in ttp/base.py
- Register Ipv6LeakLifter(store) in factory.py get_tagger()
- Subscribe worker to attacker.fingerprinted; route by Event.type
so JARM/HASSH/ipv6_leak share the topic without source_kind collision
- Add bump_attacker_ipv6_leak() to BaseRepository (abstract) +
TTPMixin (implementation): increments ipv6_leak_count, sets last_ipv6_*
denorm fields, appends-with-dedup to AttackerIdentity.ipv6_link_local_iids
- Call bump_attacker_ipv6_leak from _process_event after insert_tags
- Add DummyRepo stub + coverage call in tests/db/test_base_repo.py
Ipv6LeakLifter subscribes to source_kind="ipv6_leak" events from both
the passive sniffer and active prober. Emits T1090 (Proxy) under TA0011
(C2) when an fe80:: source address is observed: the attacker's VPN only
tunnels IPv4, so their link-local IID leaks their NIC identity.
Rule R0059 sets base confidence 0.85; iid_kind in the evidence carries
the per-observation strength (eui64 = MAC-derived, deterministic;
stable_privacy = RFC 7217; temporary = RFC 4941).
Add ipv6_leak.py with solicit_ipv6_leak() — sends ICMPv6 Echo to
ff02::1 on the attacker's iface and returns fe80:: evidence when a
link-local response arrives. Gated on _is_on_link(): skips when
attacker is behind a router (no L2 adjacency).
Add _ipv6_leak_phase() to worker.py (Phase 4 in _probe_cycle).
Phase runs once per attacker IP per cycle (sentinel at port 0 in
ip_probed["ipv6_leak"]) and publishes kind="ipv6_leak" via publish_fn.
Add list_v6_addrs(iface) to network.py: returns [(addr, scope)] for
all IPv6 addresses on an interface, required for source-routing ICMPv6
from the correct link-local address.
Add _ipv6_iid_classify() to fingerprint EUI-64 vs stable-privacy IIDs
and derive the MAC OUI from EUI-64-encoded link-local addresses.
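Sketch of EUI-64 detection and OUI recovery for a link-local address, following the modified EUI-64 layout (ff:fe in the middle of the IID, universal/local bit inverted). The return shape and the "non_eui64" label are assumptions; telling stable-privacy from temporary IIDs needs more context than one address and is not attempted here.

```python
import ipaddress


def ipv6_iid_classify(addr):
    """Return (kind, mac_oui); mac_oui is None unless the IID is EUI-64."""
    # Strip any zone suffix such as "%eth0" before parsing.
    ip = ipaddress.IPv6Address(addr.split("%", 1)[0])
    if not ip.is_link_local:
        return None, None
    iid = ip.packed[8:]  # low 64 bits
    if iid[3:5] == b"\xff\xfe":
        # Undo the universal/local bit flip to recover the MAC's first octet.
        oui = bytes([iid[0] ^ 0x02, iid[1], iid[2]])
        return "eui64", ":".join(f"{b:02x}" for b in oui)
    return "non_eui64", None  # hypothetical label
```

For example, a NIC with MAC 00:1a:2b:3c:4d:5e autoconfigures fe80::21a:2bff:fe3c:4d5e, from which the OUI 00:1a:2b is recoverable.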
SnifferEngine._on_ipv6_packet() observes fe80::/10 sources destined for
known deckies and emits ipv6_link_local_leak syslog + bus events.
on_packet() now dispatches the IPv6 branch before the v4 TCP path.
BPF default widened from "tcp" to "tcp or ip6" so the sniff loop
captures IPv6 frames without config change.
Attacker gains five denormalized cache fields (ipv6_leak_count,
last_ipv6_leak_at, last_ipv6_link_local, last_ipv6_iid_kind,
last_ipv6_mac_oui) mirroring the rotation_count/last_rotation_at pattern.
AttackerIdentity gains ipv6_link_local_iids (JSON list[dict]) for
EUI-64-derived MAC cluster signals that survive VPN/IP rotation.
No ALTER TABLE helpers — direct SQLModel column additions per pre-v1 policy.
Pins the evidence shape for IPv6 link-local leakage findings. All fields
optional (total=False) so partial observation (passive sniffer vs active
solicitation) fills whatever the vector provides. Lifter lands in a
subsequent commit.
- Add dedicated test-schema Makefile target (xdist logical, 600s timeout,
-m fuzz) so schemathesis runs separately from test-fuzz, which was
spinning up competing uvicorn workers per xdist process
- Exclude all test_schemathesis*.py files from FUZZ_FLAGS via --ignore
- Add schema to _ALL_SUITES between api and fuzz
- Add SCHEMA_QUICK env var (default 0): caps every max_examples to 100
across all four schemathesis files (4520 -> 600 total examples)
- Fix pre-push hook: use .311 venv and delegate to make test-all FAIL_FAST=0
instead of hand-rolling five separate pytest invocations
@pytest.fixture on an async fixture ignores loop_scope, so mysql_repo
ran on the per-function loop while mysql_test_db_url's engine was bound
to the module loop — triggering 'Future attached to a different loop'.
After the ingester._sleep alias fix, three tests in test_service_isolation.py
still patched `decnet.web.ingester.asyncio.sleep` (the old global-singleton
path). The ingester now calls `_sleep` directly, so those patches no longer
controlled the ingester's sleep — the worker looped with real asyncio.sleep
and the tests hung indefinitely.
Also: four API lifespan tests had no tarpit_watcher_worker patch, letting the
real tarpit task start. And test_api_survives_db_init_failure patched
`decnet.web.api.asyncio.sleep` (the singleton) instead of the existing
`_retry_sleep` alias.
Fixes:
- patch("decnet.web.ingester._sleep", ...) in the three ingester tests
- add tarpit_watcher_worker patch to all four api lifespan tests
- patch("decnet.web.api._retry_sleep", ...) in db_init_failure test
Two interacting bugs caused asyncio.sleep to be mocked globally,
letting tarpit_watcher_worker spin the event loop on a non-async
mock and accumulate _increment_mock_call records without bound:
1. test_ingester.py patched `decnet.web.ingester.asyncio.sleep` via
the asyncio singleton — any code in the process using asyncio.sleep
(including the tarpit worker) hit the fake_sleep side_effect.
Fix: add `_sleep = asyncio.sleep` alias in ingester.py and patch
`decnet.web.ingester._sleep` instead — scopes the mock to ingester.
2. test_api_startup_guards.py called `_run_lifespan_startup` without
DECNET_CONTRACT_TEST=true, which started the real tarpit task in a
manually-constructed event loop that the tests never cancelled.
Fix: set DECNET_CONTRACT_TEST=true inside _run_lifespan_startup so
the lifespan skips all background workers.
asyncio_default_fixture_loop_scope was 'module', so all async tests in
a module share one event loop. test_lifespan_startup_and_shutdown patched
log_ingestion_worker/log_collector_worker/attacker_profile_worker but not
tarpit_watcher_worker — the real while-True coroutine was created as an
asyncio task on the shared loop and never cancelled. The xdist worker ran
for 4+ hours (confirmed via py-spy + etime=04:48) consuming 15+ GB before
OOM-kill.
Fixes:
- Patch tarpit_watcher_worker in both TestLifespan tests
- Change asyncio_default_fixture_loop_scope to 'function' so each test
gets its own loop; tasks cannot outlive their test
- Add loop_scope='module' to precision_engine which legitimately needs
a module-scoped event loop
Five list columns (greynoise_tags, abuseipdb_categories, threatfox_threat_types,
threatfox_ioc_types, threatfox_malware_families) and four dict columns
(*_raw) are now Column(JSON) with list/dict type annotations and
default_factory=list/dict. Providers return native Python objects; the
application-layer json.dumps/json.loads round-trip and _decode_json_list
helpers are gone. to_intel_event_payload() reads columns directly.
Also caps pytest xdist at -n 4 and excludes tests/api via norecursedirs
to prevent schemathesis workers from OOM-killing the dev loop.
- test_evidence_shape.py: replace broken (command, BehavioralLifter)
pairing with correct (http_fingerprint, HttpFingerprintLifter) case;
expand _LIFTER_CASES to 5-tuples with per-lifter payloads and rule
factories; wire StubRuleStore + _index.install() per lifter; remove
xfail marker — all 4 parametrized cases now pass
- factory.py: add _span() helper gated on _telemetry._ENABLED; wrap
each per-lifter dispatch in _tag_one() that opens a
ttp.lifter.{name} child span per call
- http_fingerprint_lifter.py: add missing name = "http_fingerprint"
- test_tracing.py: replace pytest.fail() stubs in
test_lifter_child_spans_emitted and test_no_pii_canary_in_span_attributes
with real test bodies; remove xfail markers
Removes the E.3.14b xfail marker and writes the test body:
- _StubRepo gains get_attacker_intel_row_by_uuid(uuid) backed by an
optional intel_rows dict; existing tests pass None (no catch-up, no
change to their behaviour).
- The test drives a session.ended event with NO intel.enriched published,
injects an AttackerIntel row into the stub repo, and asserts the
tagger is called with source_kind='intel' carrying the correct payload
fields (abuseipdb_score, greynoise_classification).
- Pins the asymmetry contract: email.received has no catch-up path
(sibling test already green); intel does.
On every attacker.session.ended event, the TTP worker now reads the
persisted AttackerIntel row (if any) and synthesizes an intel-source
TaggerEvent so intel-derived tags emit even when attacker.intel.enriched
was dropped or arrived before the worker started.
Key changes:
- AttackerIntel.to_intel_event_payload() — single source of truth for
the intel-row → lifter payload projection; shared by future callers
without importing decnet.intel.* (no-SPOF contract preserved).
- BaseRepository.get_attacker_intel_row_by_uuid() — returns the live
SQLModel instance so the catch-up path can call to_intel_event_payload().
- _build_intel_catchup_event() in ttp/worker.py — looks up the intel row,
builds the TaggerEvent, returns None on absent row (silence, not error).
- _process_event() extended: appends the catch-up event to tagger_events
when topic contains "session.ended". Deterministic source_id keeps
compute_tag_uuid idempotent across replays; INSERT OR IGNORE deduplicates
against any prior attacker.intel.enriched path.
DummyRepo stub + coverage call added per feedback_run_base_repo_test.md.
Replace pytest.fail() stub with actual test body: constructs IntelLifter
with R0054, feeds score=30 payload, asserts confidence=0.21 (0.70×0.30)
which is below CONFIDENCE_FLOOR. xfail marker removed.
Corrects docstring: R0054 T1110 base_conf=0.70, not 0.85 as originally written.
- TolerantTagger.tag validates evidence keys against EVIDENCE_SCHEMA TypedDicts;
TypeError (programmer error) propagates instead of being swallowed
- IntelEvidence and EmailEvidence expanded from stubs to full per-provider
key sets (total=False); IntelEvidence old stub fields replaced wholesale
- EVIDENCE_SCHEMA map added to models/ttp.py and imported by base.py
- TTPTag __table_args__ gains confidence [0,1] CheckConstraint (DB-enforced)
- xfail removed from test_confidence_outside_range_rejected_at_insert and
test_evidence_shape_violation_propagates_as_typeerror — both now pass
- TypeError removed from _SWALLOWED_EXCS fuzz list; test_intel_evidence_keys
updated to assert the real provider key set
import decnet.cli as _decnet_cli at module level guarantees the app singleton is
built in master mode before any test can set DECNET_MODE=agent. Without this,
test_defence_in_depth_direct_call_fails_in_agent_mode triggered a fresh import
of decnet.cli with DECNET_MODE=agent active, which stripped master-only commands
and wrote the stripped module to sys.modules["decnet"].cli — a parent-attribute
corruption that no sys.modules dict restore can fix.
- SSH schema: add user + user_password fields (service extended post-test)
- TopologySummary: repo.get_topology() returns model now, not raw dict
- health live: tarpit_watcher added to get_background_tasks(), add to expected set