DECNET

Author	SHA1	Message	Date
anti	ca39552692	feat(ua): classify User-Agent into scanner/cli/library/bot/nonstandard Every http_useragent bounty now carries a `category` label plus an optional tool name and a signals list. The main analytic win is the `nonstandard` bucket — UAs like "FUCKYOU/1.0" or custom one-off scanner labels that don't match any known pattern, which today silently blend into the generic fingerprint list. Buckets (priority order): - scanner: nmap, nuclei, sqlmap, gobuster, nikto, masscan, zgrab, ffuf, wpscan, katana, burp, acunetix, nessus, openvas, arachni, whatweb, wappalyzer, etc. - cli: curl, wget, httpie, xh, fetch. - library: python-requests, aiohttp, httpx, urllib, Go stdlib, Java, okhttp, Apache HttpClient, axios, node-fetch, got, undici, PHP, Guzzle, Ruby stdlib, Faraday, .NET, PostmanRuntime, Insomnia, etc. - bot: anything containing bot / crawler / spider / slurp / monitor (catches Googlebot, bingbot, Baiduspider — many of which ship a Mozilla/5.0 prefix, so the bot check runs BEFORE the browser regex). - browser: Mozilla/5.0-prefixed UAs that aren't bots. - nonstandard: anything else. The interesting bucket. - empty: literal empty User-Agent header. Side signals computed regardless of category: suspicious_short (<8 chars), suspicious_long (>512 chars), nonprintable (control chars), injection_like (SQLi / XSS / path-traversal / Log4Shell markers). A sqlmap UA with a literal SQL-injection payload embedded fires category=scanner + injection_like — the combination tells the analyst the tool is being operated manually vs. on default config. Classification is deterministic (same UA string → same tuple) so add_bounty's payload-hash dedup continues to collapse repeat rows. UI renderer upgraded from FpGeneric to a dedicated FpUserAgent that colours the category tag by risk (scanner=alert-red, nonstandard=warn-yellow, browser=accent-green, etc.) and renders each signal as its own chip. Makes the interesting rows pop in the fingerprints panel. 
Also fixed: the ingester was using `_headers.get("User-Agent") or _headers.get("user-agent")`, which short-circuits away empty-string UAs. An explicit empty UA is itself a signal (real clients always send something) — now captured.	2026-04-24 18:17:18 -04:00
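A minimal sketch of the bucket priority described in ca39552692, not the shipped classifier. The token lists are shortened, and the function name, regexes, and prefix-versus-substring matching choices are assumptions made for illustration.

```python
import re

# Illustrative, shortened token lists; the commit names more tools than shown here.
SCANNER_TOKENS = ("nmap", "nuclei", "sqlmap", "gobuster", "nikto", "masscan", "zgrab", "ffuf", "wpscan")
CLI_PREFIXES = ("curl", "wget", "httpie", "xh")
LIBRARY_TOKENS = ("python-requests", "aiohttp", "httpx", "urllib", "go-http-client",
                  "okhttp", "axios", "node-fetch", "guzzle")
BOT_RE = re.compile(r"bot|crawler|spider|slurp|monitor", re.IGNORECASE)
# Assumed markers for SQLi / XSS / path traversal / Log4Shell.
INJECTION_RE = re.compile(r"'\s*or\s+1=1|union\s+select|<script|\.\./|\$\{jndi:", re.IGNORECASE)


def classify_user_agent(ua):
    """Return a deterministic (category, tool, signals) tuple for one UA string."""
    # Side signals are computed regardless of the category.
    signals = []
    if len(ua) < 8:
        signals.append("suspicious_short")
    if len(ua) > 512:
        signals.append("suspicious_long")
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in ua):
        signals.append("nonprintable")
    if INJECTION_RE.search(ua):
        signals.append("injection_like")
    signals = tuple(signals)

    if ua == "":
        return ("empty", None, signals)
    lowered = ua.lower()
    for token in SCANNER_TOKENS:
        if token in lowered:
            return ("scanner", token, signals)
    for prefix in CLI_PREFIXES:
        if lowered.startswith(prefix):
            return ("cli", prefix, signals)
    for token in LIBRARY_TOKENS:
        if token in lowered:
            return ("library", token, signals)
    # The bot check runs before the browser check: many bots ship a Mozilla/5.0 prefix.
    if BOT_RE.search(ua):
        return ("bot", None, signals)
    if ua.startswith("Mozilla/5.0"):
        return ("browser", None, signals)
    return ("nonstandard", None, signals)
```

Because the result is a plain tuple derived only from the input string, the same UA always produces the same value, which is the property the payload-hash dedup relies on.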
anti	6d1d69443a	fix(xff): split leak from spoof — loopback/private claims aren't leaks An attacker hitting /admin with `X-Forwarded-For: 127.0.0.1` was previously flagged as an IP leak. It isn't — that's the classic IP-allowlist / WAF-bypass payload ("treat me as localhost and skip your auth checks"). Misclassifying it as "LEAKED IPs" in the UI confuses analysts and burns trust in the signal. Split by claim category. After pulling the left-most claimed IP from the proxy header, classify: - public (routable) → bounty_type=ip_leak (real attribution leak; the attacker's upstream proxy forwarded their real IP). - loopback / private / link-local / multicast / reserved / unspecified → bounty_type=fingerprint, fingerprint_type=spoofed_source (WAF-bypass / allowlist-probing attempt; the attacker is telling us they know what XFF does). - unparseable → dropped. Same extraction pipeline; diverges only at the last step. A new shared _classify_proxy_header_claim returns (kind, payload); _detect_ip_leak keeps its public-only contract for backward-compat; _detect_spoofed_source is the new sibling. UI renderer FpSpoofedSource shows the claimed IP in warn color with the claim_category tag (LOOPBACK / PRIVATE / ...) and a WAF-BYPASS ATTEMPT badge — distinct visual from the "LEAKED IPs" row which stays reserved for genuine public-IP leaks. Test addresses updated: RFC 5737 doc ranges (198.51.100.0/24, 203.0.113.0/24) are non-global in Python's ipaddress module (flagged `is_private`, not `is_reserved`), so they now correctly belong to the spoof bucket — tests that meant to exercise real public IPs now use 8.8.8.8 / 1.1.1.1 (Google / Cloudflare DNS). Added eleven new tests locking the classifier + the two detectors' mutual exclusion.	2026-04-24 18:06:29 -04:00
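The claim-category split in 6d1d69443a can be sketched with the standard `ipaddress` module. This is an illustration under assumptions, not the actual `_classify_proxy_header_claim`: the function names, the check order, and the returned tuples are hypothetical.

```python
import ipaddress


def claim_category(raw):
    """Return a category string for a claimed IP, or None if it does not parse."""
    try:
        ip = ipaddress.ip_address(raw.strip())
    except ValueError:
        return None
    # Specific categories first: loopback, link-local and unspecified addresses
    # are also reported as is_private, so the generic check must come last.
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link_local"
    if ip.is_multicast:
        return "multicast"
    if ip.is_unspecified:
        return "unspecified"
    if ip.is_reserved:
        return "reserved"
    if ip.is_private or not ip.is_global:
        return "private"
    return "public"


def bounty_kind(raw):
    """Map a claimed IP to (bounty_type, fingerprint_type), or None when dropped."""
    category = claim_category(raw)
    if category is None:
        return None                              # unparseable: dropped
    if category == "public":
        return ("ip_leak", None)                 # genuine attribution leak
    return ("fingerprint", "spoofed_source")     # WAF-bypass / allowlist probe
```

Exactly one of the two outcomes applies to any parseable claim, which mirrors the mutual exclusion between the two detectors that the commit's tests lock in.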
anti	2c876b4d86	fix(bounties): strip per-request fields from fingerprint payloads add_bounty dedups on (attacker_ip, bounty_type, full payload JSON). Three fingerprint-family bounties (http_useragent, ip_leak, http_quirks) were including method/path / header_count in their payloads — fields that vary per request — so a scanner hitting 100 paths produced 100 rows instead of 1, which is what was swelling AttackerDetail. Payloads now carry identity-only fields: - http_useragent: {fingerprint_type, value}. UA + path combinations no longer collide; one row per distinct User-Agent string. - ip_leak: {source_ip, real_ip_claim, source_header, headers_seen}. One row per distinct (proxy source, leaked IP, leaking header) triple; repeat hits with the same header on different paths dedup. - http_quirks: {fingerprint_type, order_hash, order, casing_hash, casing_category, stable_count, tool_guess}. No more header_count (included volatile headers; Cookie-presence variance broke dedup). Per-request context (path, method, etc.) was never load-bearing for analysts — the logs table already answers "when + where" at per-event resolution. The bounty table is for stable identity. UI: - FpHttpQuirks renderer drops the method/path footer line and the header_count/duplicates tags; shows stable_count instead. - LEAKED-IPs tooltip on AttackerDetail swaps "X on GET /path" for "Leaked via X; source 203.0.113.42" — same information, stable. Tests add a "payload stable across paths and methods" assertion on http_quirks — locks the contract so a future regression that sneaks a per-request field back in fails loudly. Existing duplicate bounty rows don't retroactively collapse. Dev: `decnet db-reset --i-know-what-im-doing drop-tables` and restart. Prod: one SQL pass to dedup by (attacker_ip, bounty_type, payload) — trivial but not automated.	2026-04-24 17:58:54 -04:00
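To see why a per-request field defeats the dedup described in 2c876b4d86, here is a small sketch. The hashing scheme (SHA-256 over key-sorted JSON) and both helper names are assumptions; the commit only states that dedup keys on the attacker IP, the bounty type, and the full payload JSON.

```python
import hashlib
import json


def payload_hash(payload):
    """Hash a payload in a key-order-independent way (assumed scheme)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedup_key(attacker_ip, bounty_type, payload):
    """Hypothetical dedup key: (attacker_ip, bounty_type, payload hash)."""
    return (attacker_ip, bounty_type, payload_hash(payload))


# A scanner hitting 100 paths: with `path` in the payload every row is unique.
with_path = {
    dedup_key("198.51.100.7", "fingerprint",
              {"fingerprint_type": "http_useragent", "value": "curl/8.4.0", "path": f"/p{i}"})
    for i in range(100)
}
# With identity-only fields the same 100 requests collapse to one key.
identity_only = {
    dedup_key("198.51.100.7", "fingerprint",
              {"fingerprint_type": "http_useragent", "value": "curl/8.4.0"})
    for _ in range(100)
}
```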
anti	dccb410bb3	feat(http): header-quirks fingerprint — order + casing + tool guess Per-request HTTP fingerprint derived from the header dict we already log. Captures: - order_hash: SHA-256 prefix (16 hex) over the lowercased header-name sequence, minus volatile/per-request headers (Content-Length, Cookie, Authorization, XFF family, trace IDs). Stable identity for a given client stack regardless of which target / path is hit. - casing_hash: same shape but over the per-header casing category (Title-Case / lower / UPPER / mixed). Attackers frequently spoof User-Agent but forget their stack sends `user-agent` while browsers send `User-Agent`. - tool_guess: prefix match against curl / python-requests / Go-http-client / nmap-nse signatures. Cheap, best-effort — the hash is the hard signal. - duplicates: reserved for when the HTTP template switches from dict(request.headers) to a list form; today it always fires empty because dict() collapses duplicates. Payload is a fingerprint bounty (bounty_type="fingerprint", fingerprint_type="http_quirks"). Bounty dedup collapses identical hashes per attacker — one row per distinct fingerprint — so a chatty scanner doesn't spam the vault, but a tool-chain change from the same IP surfaces as a new row. UI renderer (FpHttpQuirks) shows the two hashes, tool guess badge in violet, casing/count tags, and a collapsible header-order list. Added to the passiveTypes group so it nests with JA3/JA4L/etc. in the AttackerDetail fingerprints panel. One library note: the naive "title-case" classifier failed on tokens like `X-Forwarded-For` because Python's "".islower() returns False so `p[1:].islower()` rejects single-letter tokens like the `X`. Fix: explicitly accept single-char tokens when uppercase.	2026-04-24 17:51:40 -04:00
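A rough sketch of the two hashes from dccb410bb3. The volatile-header set is abbreviated, and the separator used before hashing and all function names are assumptions; only the 16-hex SHA-256 prefix, the lowercasing, and the single-letter title-case fix come from the commit message.

```python
import hashlib

# Abbreviated; the commit also excludes trace IDs and the rest of the XFF family.
VOLATILE = {"content-length", "cookie", "authorization",
            "forwarded", "x-forwarded-for", "x-real-ip"}


def casing_category(name):
    """Classify a header name as lower / upper / title / mixed."""
    if name.islower():
        return "lower"
    if name.isupper():
        return "upper"
    parts = [p for p in name.split("-") if p]
    # "".islower() is False, so a single-letter token such as the `X` in
    # X-Forwarded-For has to be accepted explicitly when it is uppercase.
    if parts and all(p[0].isupper() and (len(p) == 1 or p[1:].islower()) for p in parts):
        return "title"
    return "mixed"


def _stable(names):
    """Drop volatile headers while keeping the original order."""
    return [n for n in names if n.lower() not in VOLATILE]


def order_hash(names):
    """16-hex SHA-256 prefix over the lowercased header-name sequence."""
    joined = ",".join(n.lower() for n in _stable(names))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


def casing_hash(names):
    """Same shape as order_hash, but over each header's casing category."""
    joined = ",".join(casing_category(n) for n in _stable(names))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
```

`names` is expected to be the header names in the order they arrived, for example `list(headers)` on an insertion-ordered dict.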
anti	2a0c5ca410	feat(attackers): XFF mismatch detection — attacker IP leak bounties Attackers routinely front their scanners with VPNs/proxies, so the TCP source we log is the proxy egress, not the real host. But a surprising number of attacker setups are misconfigured: the proxy forwards the real IP in an X-Forwarded-For (or Forwarded / X-Real-IP / CDN-variant) header. From our side that's a free attribution leak. New _detect_ip_leak extractor in decnet/web/ingester.py fires at ingest time per HTTP request. Logic: 1. Require service=http, source_ip present, headers present. 2. If source_ip ∈ DECNET_TRUSTED_PROXIES (comma-separated IPs or CIDRs) → legitimate reverse-proxy forwarding, skip. 3. Walk proxy-family headers in priority order: Forwarded (RFC 7239) → X-Forwarded-For → X-Real-IP → True-Client-IP → CF-Connecting-IP. 4. Extract the left-most parseable IP from the winning header. 5. If that IP differs from the TCP source → emit a bounty with bounty_type="ip_leak" carrying {source_ip, real_ip_claim, source_header, headers_seen, path, method}. Storage is the existing Bounty table — no schema change; de-dup is handled by Bounty's (attacker_ip, bounty_type, payload_hash) key, so repeat requests with the same leaked IP don't spam. AttackerDetail renders a warn-accent "LEAKED IPs:" row under ORIGIN listing distinct real_ip_claim values; hover tooltip shows the source header + path of the most recent leak. Only shown when at least one ip_leak bounty exists. RFC 7239 Forwarded parser handles the full vocabulary — bare IPv4, IPv4:port, quoted, IPv6 in brackets, IPv6 with port — returning only IPs that actually parse. Closes DEVELOPMENT.md "Network Topology Leakage → X-Forwarded-For mismatches". Phase 3 of the three-phase Attacker Intelligence series (phases 1: scanned-vs-interacted, 2: PTR records already shipped). 
DECNET_TRUSTED_PROXIES env shape matches THREAT_MODEL DA-08's "revisit when verified-proxy config lands" note — same token set future rate-limit work will consume.	2026-04-24 17:39:03 -04:00
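Steps 1 to 5 of 2a0c5ca410 can be approximated as follows. This is a sketch, not the real `_detect_ip_leak`: the helper names, the regex for the RFC 7239 `for=` parameter, and the choice to stop at the first proxy header that is present are all assumptions.

```python
import ipaddress
import re

# Priority order from the commit message.
PROXY_HEADERS = ("forwarded", "x-forwarded-for", "x-real-ip",
                 "true-client-ip", "cf-connecting-ip")
FOR_PARAM = re.compile(r'(?:^|;)\s*for=("[^"]*"|[^;\s]+)', re.IGNORECASE)


def parse_trusted(env_value):
    """Parse a comma-separated list of IPs or CIDRs into network objects."""
    return [ipaddress.ip_network(tok.strip(), strict=False)
            for tok in env_value.split(",") if tok.strip()]


def _parse_ip(token):
    """Accept bare IPv4/IPv6, IPv4:port, quoted, [IPv6] and [IPv6]:port."""
    token = token.strip().strip('"')
    if token.startswith("["):
        token = token[1:].split("]", 1)[0]
    elif token.count(":") == 1:
        token = token.split(":", 1)[0]
    try:
        return str(ipaddress.ip_address(token))
    except ValueError:
        return None


def leftmost_ip(header, value):
    """Return the left-most element of the header that parses as an IP."""
    for element in value.split(","):
        if header == "forwarded":
            match = FOR_PARAM.search(element)
            if not match:
                continue
            element = match.group(1)
        ip = _parse_ip(element)
        if ip:
            return ip
    return None


def detect_claim(source_ip, headers, trusted=()):
    """Return the claim payload when a proxy header disagrees with the TCP source."""
    src = ipaddress.ip_address(source_ip)
    if any(src in net for net in trusted):
        return None                       # legitimate reverse proxy, skip
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in PROXY_HEADERS:
        if name in lowered:               # assumed: the first header present wins
            ip = leftmost_ip(name, lowered[name])
            if ip and ip != str(src):
                return {"source_ip": source_ip, "real_ip_claim": ip,
                        "source_header": name}
            return None
    return None
```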
anti	cbb394a160	feat(ingester): publish system.log per committed batch (DEBT-031 worker 6) Ingester connects the bus at startup, emits a batch-committed summary (component/flushed/position) after each successful _flush_batch. Zero-row flushes are suppressed so the topic stays meaningful. Complements the collector's per-line system.log publishes: collector signals ingress, ingester signals DB-persisted progress. Federation forwarder (worker 8) will subscribe to the batch-committed leaf to trigger its upstream push. Bus stays optional: publish_safely swallows failures, get_bus() can return None, DECNET_BUS_ENABLED=false leaves the ingestion loop fully functional.	2026-04-21 16:58:49 -04:00
anti	a10aee282f	perf(ingester): batch log writes into bulk commits The ingester now accumulates up to DECNET_BATCH_SIZE rows (default 100) or DECNET_BATCH_MAX_WAIT_MS (default 250ms) before flushing through repo.add_logs — one transaction, one COMMIT per batch instead of per row. Under attacker traffic this collapses N commits into ⌈N/100⌉ and takes most of the SQLite writer-lock contention off the hot path. Flush semantics are cancel-safe: _position only advances after a batch commits successfully, and the flush helper bails without touching the DB if the enclosing task is being cancelled (lifespan teardown). Un-flushed lines stay in the file and are re-read on next startup. Tests updated to assert on add_logs (bulk) instead of the per-row add_log that the ingester no longer uses, plus a new test that 250 lines flush in ≤5 calls.	2026-04-17 16:37:34 -04:00
anti	70d8ffc607	feat: complete OTEL tracing across all services with pipeline bridge and docs Extends tracing to every remaining module: all 23 API route handlers, correlation engine, sniffer (fingerprint/p0f/syslog), prober (jarm/hassh/tcpfp), profiler behavioral analysis, logging subsystem, engine, and mutator. Bridges the ingester→SSE trace gap by persisting trace_id/span_id columns on the logs table and creating OTEL span links in the SSE endpoint. Adds log-trace correlation via _TraceContextFilter injecting otel_trace_id into Python LogRecords. Includes development/docs/TRACING.md with full span reference (76 spans), pipeline propagation architecture, quick start guide, and troubleshooting.	2026-04-16 00:58:08 -04:00
anti	04db13afae	feat: cross-stage trace propagation and granular per-event spans Collector now creates a span per event and injects W3C trace context into JSON records. Ingester extracts that context and creates child spans, connecting the full event journey: collector -> ingester -> db.add_log + extract_bounty -> db.add_bounty. Profiler now creates per-IP spans inside update_profiles with rich attributes (event_count, is_traversal, bounty_count, command_count). Traces in Jaeger now show the complete execution map from capture through ingestion and profiling.	2026-04-15 23:52:13 -04:00
anti	65ddb0b359	feat: add OpenTelemetry distributed tracing across all DECNET services Gated by DECNET_DEVELOPER_TRACING env var (default off, zero overhead). When enabled, traces flow through FastAPI routes, background workers (collector, ingester, profiler, sniffer, prober), engine/mutator operations, and all DB calls via TracedRepository proxy. Includes Jaeger docker-compose for local dev and 18 unit tests.	2026-04-15 23:23:13 -04:00
anti	89887ec6fd	fix: serialize HTTP headers as JSON so tool detection and bounty extraction work templates/decnet_logging.py calls str(v) on all SD-PARAM values, turning a headers dict into Python repr ('{'User-Agent': ...}') rather than JSON. detect_tools_from_headers() called json.loads() on that string and silently swallowed the error, returning [] for every HTTP event. Same bug prevented the ingester from extracting User-Agent bounty fingerprints. - templates/http/server.py: wrap headers dict in json.dumps() before passing to syslog_line so the value is a valid JSON string in the syslog record - behavioral.py: add ast.literal_eval fallback for existing DB rows that were stored with the old Python repr format - ingester.py: parse headers as JSON string in _extract_bounty so User-Agent fingerprints are stored correctly going forward - tests: add test_json_string_headers and test_python_repr_headers_fallback to exercise both formats in detect_tools_from_headers	2026-04-15 17:03:52 -04:00
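The two header formats mentioned in 89887ec6fd (JSON going forward, Python repr for older rows) can be handled along these lines. `parse_headers` is a hypothetical helper written for illustration, not the code in behavioral.py or ingester.py.

```python
import ast
import json


def parse_headers(raw):
    """Parse a stored headers value; return {} when neither format applies."""
    try:
        parsed = json.loads(raw)               # current format: JSON string
    except ValueError:                         # JSONDecodeError subclasses ValueError
        try:
            parsed = ast.literal_eval(raw)     # legacy format: str(dict)
        except (ValueError, SyntaxError):
            return {}
    return parsed if isinstance(parsed, dict) else {}
```

`ast.literal_eval` only evaluates literals, so it is a reasonable fallback for repr-formatted dicts without executing arbitrary code.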
anti	63efe6c7ba	fix: persist ingester position and profiler cursor across restarts - Ingester now loads byte-offset from DB on startup (key: ingest_worker_position) and saves it after each batch — prevents full re-read on every API restart - On file truncation/rotation the saved offset is reset to 0 - Profiler worker now loads last_log_id from DB on startup — every restart becomes an incremental update instead of a full cold rebuild - Updated all affected tests to mock get_state/set_state; added new tests covering position restore, set_state call, truncation reset, and cursor restore/cold-start paths	2026-04-15 13:58:12 -04:00
anti	2dcf47985e	feat: add HASSHServer and TCP/IP stack fingerprinting to DECNET-PROBER Extends the prober with two new active probe types alongside JARM: - HASSHServer: SSH server fingerprinting via KEX_INIT algorithm ordering (MD5 hash of kex;enc_s2c;mac_s2c;comp_s2c, pure stdlib) - TCP/IP stack: OS/tool fingerprinting via SYN-ACK analysis using scapy (TTL, window size, DF bit, MSS, TCP options ordering, SHA256 hash) Worker probe cycle now runs three phases per IP with independent per-type port tracking. Ingester extracts bounties for all three fingerprint types.	2026-04-14 12:53:55 -04:00
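For reference, the HASSHServer hash named in 2dcf47985e is an MD5 over four semicolon-separated algorithm lists taken from the server's KEX_INIT. The sketch below assumes the lists have already been parsed out of the packet; the function name is hypothetical.

```python
import hashlib


def hassh_server(kex, enc_s2c, mac_s2c, comp_s2c):
    """MD5 of 'kex;enc_s2c;mac_s2c;comp_s2c', each a comma-joined name list."""
    fields = (",".join(kex), ",".join(enc_s2c), ",".join(mac_s2c), ",".join(comp_s2c))
    return hashlib.md5(";".join(fields).encode("ascii")).hexdigest()
```

Algorithm order is preserved on purpose: two servers offering the same algorithms in a different order yield different fingerprints.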
anti	ce2699455b	feat: DECNET-PROBER standalone JARM fingerprinting service Add active TLS probing via JARM to identify C2 frameworks (Cobalt Strike, Sliver, Metasploit) by their TLS server implementation quirks. Runs as a detached host-level process — no container dependency. - decnet/prober/jarm.py: pure-stdlib JARM implementation (10 crafted probes) - decnet/prober/worker.py: standalone async worker with RFC 5424 + JSON output - CLI: `decnet probe --targets ip:port` and `--probe-targets` on deploy - Ingester: JARM bounty extraction (fingerprint type) - 68 new tests covering JARM logic and bounty extraction	2026-04-14 12:14:32 -04:00
anti	ea340065c6	feat: JA4/JA4S/JA4L fingerprints, TLS session resumption, certificate extraction Extend the passive TLS sniffer with next-gen attacker fingerprinting: - JA4 (ClientHello) and JA4S (ServerHello) computation with supported_versions, signature_algorithms, and ALPN parsing - JA4L latency measurement via TCP SYN→SYN-ACK RTT tracking - TLS session resumption detection (session tickets, PSK, 0-RTT early data) - Certificate extraction for TLS ≤1.2 with minimal DER/ASN.1 parser (subject CN, issuer, SANs, validity period, self-signed flag) - Ingester bounty extraction for all new fingerprint types - 116 tests covering all new functionality (1255 total passing)	2026-04-13 23:20:37 -04:00
anti	3dc5b509f6	feat: Phase 1 — JA3/JA3S sniffer, Attacker model, profile worker Add passive TLS fingerprinting via a sniffer container on the MACVLAN interface, plus the Attacker table and periodic rebuild worker that correlates per-IP profiles from Log + Bounty + CorrelationEngine. - templates/sniffer/: Scapy sniffer with pure-Python TLS parser; emits tls_client_hello / tls_session RFC 5424 lines with ja3, ja3s, sni, alpn, raw_ciphers, raw_extensions; GREASE filtered per RFC 8701 - decnet/services/sniffer.py: service plugin (no ports, NET_RAW/NET_ADMIN) - decnet/web/db/models.py: Attacker SQLModel table + AttackersResponse - decnet/web/db/repository.py: 5 new abstract methods - decnet/web/db/sqlite/repository.py: implement all 5 (upsert, pagination, sort by recent/active/traversals, bounty grouping) - decnet/web/attacker_worker.py: 30s periodic rebuild via CorrelationEngine; extracts commands from log fields, merges fingerprint bounties - decnet/web/api.py: wire attacker_profile_worker into lifespan - decnet/web/ingester.py: extract JA3 bounty (fingerprint_type=ja3) - development/DEVELOPMENT.md: full attacker intelligence collection roadmap - pyproject.toml: scapy>=2.6.1 added to dev deps - tests: test_sniffer_ja3.py (40+ vectors), test_attacker_worker.py, test_base_repo.py / test_web_api.py updated for new surface	2026-04-13 20:22:08 -04:00
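The GREASE filtering mentioned in 3dc5b509f6 might look like the sketch below, which builds a JA3 string from already-parsed ClientHello fields. Function names are hypothetical and TLS record parsing is out of scope here.

```python
import hashlib


def is_grease(value):
    """RFC 8701 GREASE values are 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value >> 8) == (value & 0xFF) and (value & 0x0F) == 0x0A


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five JA3 fields, dropping GREASE values from each list."""
    def field(values):
        return "-".join(str(v) for v in values if not is_grease(v))
    return ",".join([str(version), field(ciphers), field(extensions),
                     field(curves), field(point_formats)])


def ja3(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

Filtering GREASE keeps the fingerprint stable for clients that randomise those placeholder values on every handshake.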
anti	435c004760	feat: extract HTTP User-Agent and VNC client version as fingerprint bounties	2026-04-13 08:14:38 -04:00
anti	035499f255	feat: add component-aware RFC 5424 application logging system - Modify Rfc5424Formatter to read decnet_component from LogRecord and use it as RFC 5424 APP-NAME field (falls back to 'decnet') - Add get_logger(component) factory in decnet/logging/__init__.py with _ComponentFilter that injects decnet_component on each record - Wire all five layers to their component tag: cli -> 'cli', engine -> 'engine', api -> 'api' (api.py, ingester, routers), mutator -> 'mutator', collector -> 'collector' - Add structured INFO/DEBUG/WARNING/ERROR log calls throughout each layer per the defined vocabulary; DEBUG calls are suppressed unless DECNET_DEVELOPER=true - Add tests/test_logging.py covering factory, filter, formatter component-awareness, fallback behaviour, and level gating	2026-04-13 07:39:01 -04:00
anti	b2e4706a14	Refactor: implemented Repository Factory and Async Mutator Engine. Decoupled storage logic and enforced Dependency Injection across CLI and Web API. Updated documentation.	2026-04-12 07:48:17 -04:00
anti	de84cc664f	refactor: migrate database to SQLModel and implement modular DB structure	2026-04-09 16:43:30 -04:00
anti	69626d705d	feat: implement Bounty Vault for captured credentials and artifacts	2026-04-09 01:52:50 -04:00
anti	ba2faba5d5	chore: enforce strict typing and internal naming conventions across web components	2026-04-07 19:56:15 -04:00
anti	5f637b5272	feat: switch to JSON-based log ingestion for higher reliability	2026-04-07 15:47:29 -04:00
anti	bad90dfb75	feat: implement background log ingestion from local file	2026-04-07 15:30:44 -04:00

24 Commits