DECNET

Author	SHA1	Message	Date
anti	d90c8b70ce	feat(profiler/behave_shell): emit motor.keystroke_cadence BEHAVE-EXTRACTOR.md Phase B Step B.1. * SessionContext gains typing_bursts: tuple[tuple[float, ...], ...] built by _split_typing_bursts(iats) — splits at gaps > IKI_THINK_MAX_S (1.5s) and drops bursts of fewer than 3 IATs. Mirrors prototype's _split_into_bursts at BEHAVE/prototype_extractors/shell/extract.py:275. * _features/motor.py:keystroke_cadence(ctx) emits one Observation in {steady, bursty, hunt_and_peck, machine}. Median CV across typing bursts; mean IKI < IKI_MACHINE_MAX_S paired with CV < CV_MACHINE_MAX → machine. Confidence 0.85/0.70/0.65/0.60 per the prototype's calibration history. * < MIN_INPUTS_FOR_CADENCE inputs or zero typing bursts → skip emission. v0.1 emits only the burst-CV variant; the prototype's NAIVE session-CV variant is parked for v0.2. * Calibration grid widened (PHASE_A_PRIMITIVES → PHASE_AB_PRIMITIVES) to include motor.keystroke_cadence. Grid green across all five shards. Tests: too-few-inputs → no emit, all-think-pauses → no burst → no emit, uniform IATs → steady, sub-5ms → machine, mixed-pace → bursty, extreme bimodal → hunt_and_peck.	2026-05-03 21:24:13 -04:00
anti	640294f3dc	test(profiler/behave_shell): five-class calibration grid lockdown BEHAVE-EXTRACTOR.md Phase A Step 9 — the gate. Runs the pure engine against each of the five 2026-05-02 calibration shards and pins the contract that all subsequent Phase B-G PRs must keep green: every Phase A primitive (motor.input_modality, motor.paste_burst_rate, cognitive.inter_command_latency_class, cognitive.command_branch_diversity, cognitive.feedback_loop_engagement, cognitive.inter_command_consistency) fires at least once per shard. * tests/profiler/behave_shell/test_calibration_grid.py parametrized over (shard_file, class_label) for HUMAN / YOU-sim / LW-sim / CLAUDE-FF / CLAUDE-CL. Skips entirely when BEHAVE_CALIBRATION_DIR is unset (CI provides the path; local dev doesn't have to). * Plus a discrimination-smoke check: at least one primitive produces different majority values across present classes — catches the "constant-output regression" failure mode where the engine quietly degenerates to a stub. Calibration tweak: BRANCH_DIVERSITY_LINEAR_MIN dropped from 0.80 to 0.70 to align with the prototype's empirical anchors (CLAUDE-CL ≈ 0.55-0.60 adaptive; YOU-sim / CLAUDE-FF scripted recon ≈ 0.75+ linear). Test for the middle band re-pinned at the new boundary. Per-class value pinning (e.g. HUMAN must emit inter_command_consistency=bimodal) is intentionally NOT a hard gate yet — v0.1 thresholds put real human sessions in "variable", and true bimodal detection (Hartigan dip / two-peak) is registry-flagged for v0.2. Tighter pinning lands as the corpus grows.	2026-05-03 08:00:50 -04:00
anti	842b7de950	feat(profiler/behave_shell): emit cognitive.inter_command_consistency BEHAVE-EXTRACTOR.md Phase A Step 8. Dispersion / bimodality of inter-command pauses. HUMAN-bimodal vs LLM-metronomic. * _features/cognitive.py:inter_command_consistency(ctx) emits one Observation in {metronomic, variable, bimodal}. * CV = stdev / mean of ctx.inter_cmd_iats. CV < 0.40 → metronomic (LLM-pure; corpus anchor 0.24); CV ≥ 1.50 → bimodal heuristic (LLM-assisted human; v0.1 placeholder, true bimodal via Hartigan dip is registry-flagged for v0.2); else → variable (human; corpus anchor 0.94). * < 2 IATs or zero mean → skip emission. < 5 commands halves confidence (0.40 vs 0.75) per sample-size honesty. Tests: too-few IATs → no emission, uniform → metronomic, human-like dispersion → variable, extreme bursts+gaps → bimodal, low-sample-count → reduced confidence. Step 8 closes the six-primitive calibration floor for Phase A. Step 9 (calibration grid lockdown) is the gate that pins it.	2026-05-03 07:56:49 -04:00
anti	2f8c107e70	feat(profiler/behave_shell): emit cognitive.feedback_loop_engagement BEHAVE-EXTRACTOR.md Phase A Step 7. The orthogonal axis — does the operator's pause-after-command correlate with bytes of output they just saw? Splits HUMAN/CLAUDE-CL (closed_loop) from LW-sim/CLAUDE-FF (fire_and_forget); cuts ACROSS the LLM/human axis. * _features/cognitive.py:feedback_loop_engagement(ctx) emits one Observation in {closed_loop, fire_and_forget, unknown}. * Pearson correlation between ctx.output_per_cmd[i] and ctx.inter_cmd_iats[i] (paired by construction in Step 4); via statistics.correlation with constant-series fallback to "unknown". * r > FEEDBACK_CORRELATION_MIN (0.30) → closed_loop; otherwise (zero, negative, or undefined) → fire_and_forget. * First primitive that depends on output events: zero output events in the shard or fewer than FEEDBACK_MIN_PAIRS (5) pairs → emit "unknown" at confidence 1.0 (the absence-of-data is itself a high-confidence answer). Zero-command session skips entirely. Tests: no-output → unknown, few-pairs → unknown, strong positive r → closed_loop, constant pace → fire_and_forget/unknown, negative r → fire_and_forget.	2026-05-03 07:55:38 -04:00
anti	3fc6ea5f75	feat(profiler/behave_shell): emit cognitive.command_branch_diversity BEHAVE-EXTRACTOR.md Phase A Step 6. Content-based playbook-vs- adaptive split. Splits CLAUDE-FF (linear_playbook, ~10 distinct tools) from CLAUDE-CL (adaptive_branching, 5-6 tools with curl re-invoked) per the 2026-05-02 empirical anchor. * _features/cognitive.py:command_branch_diversity(ctx) emits one Observation in {linear_playbook, adaptive_branching, unknown}. * unique_first_token_hashes / total_commands ratio. ≥ 0.80 → linear_playbook, otherwise adaptive_branching (the doc instructs bias-to-adaptive in the middle band — that's the discriminative signal we actually want). * < 5 commands → "unknown" at confidence 1.0 (the absence of data is itself a high-confidence answer per the registry's allowed vocabulary). Zero-command session skips emission entirely. Tests cover unique-tokens → linear, repeated-tokens → adaptive, middle band → adaptive (bias), under-floor → unknown @ 1.0, plus PII regression: raw tokens never appear in the serialised observation.	2026-05-03 07:54:13 -04:00
anti	e52a0e0381	feat(profiler/behave_shell): emit cognitive.inter_command_latency_class BEHAVE-EXTRACTOR.md Phase A Step 5. Classifies the operator's thinking pace between commands. Splits LW-sim / CLAUDE-FF / CLAUDE-CL. * _features/cognitive.py:inter_command_latency_class(ctx) emits one Observation in {instant, typing_speed, deliberate, llm_lightweight, llm_heavyweight, long}, computed as the median of ctx.inter_cmd_iats bucketed against the prototype thresholds (v0.2 split: lightweight 2-8s, heavyweight 8-30s). * Sample-size honesty: < 5 commands halves confidence (0.40 vs 0.80) per BEHAVE-EXTRACTOR.md. * Threshold consts (INTER_CMD_*_MAX, MIN_COMMANDS_FOR_FULL_CONFIDENCE, plus parked Step 6/7/8 thresholds for the next three commits) added to _thresholds.py. Tests cover all six buckets at empirically-anchored IATs (15s ≈ Claude Opus driving recon via tmux send-keys), plus the single-command no-IAT and low-sample-count paths.	2026-05-03 07:52:39 -04:00
anti	f3880b24d1	feat(profiler/behave_shell): command segmentation in SessionContext BEHAVE-EXTRACTOR.md Phase A Step 4. Pure refactor inside _ctx.py — no new feature emits. Lays the shared utility for the three cognitive primitives next in line (Steps 5-7). * Command dataclass (frozen): start_ts, end_ts, first_token_hash. PII-safe by construction — only the first whitespace-delimited token of the command is retained, and only as a sha256 hash (decnet/profiler/behave_shell/_parse.py:hash_token). * _segment_commands walks input events char-by-char, splits on \r / \n, hashes the first token, drops the rest. * SessionContext gains commands, inter_cmd_iats, output_per_cmd. output_per_cmd[i] counts bytes between commands[i].end_ts and commands[i+1].start_ts — the natural pairing for Step 7 (feedback_loop_engagement). Tests: empty / unterminated streams, single command (CR + LF terminators), paste-with-newline, multi-command IAT pairing, output-byte counting between boundaries, blank-line skip, first-token-only PII discipline.	2026-05-03 07:50:55 -04:00
anti	6763fceb0b	feat(profiler/behave_shell): emit motor.paste_burst_rate BEHAVE-EXTRACTOR.md Phase A Step 3. Same paste-event ratio as motor.input_modality but coarser-bucketed: this is the habit signal (does the operator reach for paste at all?), where input_modality is the dominant-channel signal. * _features/motor.py:paste_burst_rate(ctx) emits one Observation per session in {none, occasional, habitual} with confidence 0.70 / 0.70 / 0.80. * Thresholds: PASTE_RATE_OCCASIONAL_MIN=0.10, PASTE_RATE_HABITUAL_MIN=0.50. Splits YOU-sim from LW/CLAUDE-FF/CLAUDE-CL — LLM-driven sessions paste habitually, real humans rarely paste. Tests: pure-typed → none; 1-paste-in-10 → occasional; paste-majority → habitual; output-only → no observation; habitual confidence > occasional confidence.	2026-05-03 07:49:03 -04:00
anti	879f5e731b	feat(profiler/behave_shell): emit motor.input_modality BEHAVE-EXTRACTOR.md Phase A Step 2. The first primitive — picked first because it has the highest discriminative value (HUMAN vs everyone) and the simplest implementation (paste-event ratio over total inputs). * _features/motor.py:input_modality(ctx) emits one Observation per session in {typed, pasted, mixed} with confidence 0.75 / 0.70. * _features/_emit.py centralises the make_observation helper so every feature module gets the same Window/source/evidence_ref boilerplate without copy-paste. * Thresholds inherited from the prototype's calibration history (MODALITY_PASTED_MIN=0.40, MODALITY_TYPED_MAX=0.05). * Zero-input session skips emission — registry doesn't admit "unknown" here. Tests: pure-typed → typed, pure-pasted → pasted, mixed → mixed, output-only session → no observation, full envelope round-trip.	2026-05-03 07:47:38 -04:00
anti	c9a81a23c2	feat(profiler/behave_shell): asciinema parser + paste-burst detection BEHAVE-EXTRACTOR.md Phase A Step 1. Lays the shared primitives that Steps 2-3 (motor.input_modality, motor.paste_burst_rate) will consume: * parse_shard_line / parse_shard turn a shard JSONL line/file into AsciinemaEvents, skipping headers and malformed records. * PasteBurst dataclass + _detect_paste_bursts group consecutive paste-class input events (len(d) >= 4 chars per the prototype's empirical floor) into contiguous bursts, splitting on IAT gaps larger than PASTE_BURST_MAX_IAT_S (200ms). * SessionContext now carries iats and paste_bursts derivations. * Threshold constants harvested from BEHAVE/prototype_extractors/shell/extract.py — calibrated against the five 2026-05-02 shards. Tests cover pure-typed, pure-pasted, mixed streams; close vs far paste events; typed events breaking a burst; PasteBurst immutability; and the JSON parser's junk handling.	2026-05-03 07:46:01 -04:00
anti	f8eae04e5d	feat(profiler/behave_shell): scaffold extract_session entry point BEHAVE-EXTRACTOR.md Phase A Step 0. Lays the package skeleton (__init__/extract/_parse/_ctx/_thresholds/_features) with empty FEATURES = (), so the worker plumbing in BEHAVE-INTEGRATION Phase 4 has a stable import path before any primitive lands. extract_session() builds a SessionContext once and fans the registered feature functions across it; at Step 0 that fan-out is empty and the function yields nothing. Step 1 (asciinema parser + paste-burst detector) and Step 2 (motor.input_modality) land next. Smoke suite asserts the empty contract: empty stream → no observations, single event → t_start == t_end, multi-event → events routed into input_events / output_events by kind, evidence_ref defaults to "session:<sid>" or honours an explicit override.	2026-05-03 07:42:09 -04:00
anti	a2a61b636e	feat(web): drop SessionProfile, wire observations into AttackerDetail (DEBT-050 / DEBT-036 closure) Destructive half of BEHAVE-INTEGRATION.md Phase 1. SessionProfile + its kd_* columns + the dialect ALTER TABLE migration helpers are deleted outright; pre-v1, the table shipped empty, no migration ceremony required (per the no-new-_migrate_-pre-v1 memory rule). DEBT-036 closes via DEBT-050 supersedure. AttackerDetail's ``observations`` field is wired to the new ``observations`` table and returns an empty list until the BEHAVE-SHELL extractor (DEBT-050 Phase 2) starts emitting. decnet/web/db/models/attackers.py — SessionProfile class deleted (~135 lines), KD_PAUSE_*/KD_START_OF_ACTION_IDLE_S module constants deleted, module docstring updated to point at the observations table. AttackerIdentity.kd_digraph_simhash is KEPT — it's the v2 federation centroid hook, not a SessionProfile field; docstring repointed to the BEHAVE primitive that will populate it. decnet/web/db/sqlmodel_repo/attackers/sessions.py — DELETED. SessionProfilesMixin dropped from the AttackersMixin MRO. decnet/web/db/repository.py — abstract upsert_session_profile + get_session_profile removed. decnet/web/db/sqlite/repository.py + mysql/repository.py — _migrate_session_profile_table helpers and their initialize() calls removed. mysql initialize() now goes attackers → column_types → admin (no session_profile step). decnet/web/db/models/__init__.py — SessionProfile re-export gone. decnet/web/db/models/attacker_intel.py — docstring cross-reference to SessionProfile.schema_version retargeted to AttackerIdentity. decnet/web/router/attackers/api_get_attacker_detail.py — adds ``observations: []`` to the response by calling ``repo.latest_observation_per_primitive(uuid)`` and projecting to a list sorted by primitive path. Empty until the extractor lands; shape matches BEHAVE-INTEGRATION.md §"AttackerDetail consumer". tests/profiler/test_session_profile.py — DELETED (56 lines). tests/db/test_base_repo.py — DummyRepo loses upsert_session_profile and get_session_profile overrides. tests/db/mysql/test_mysql_migration.py — initialize-call-order assertion updated; session_profile step removed from the expected sequence; docstring records why. tests/ttp/test_lifter_absence.py — docstring "no SessionProfile" → "no ObservationRow".	2026-05-03 07:33:37 -04:00
anti	72cc928ebf	feat(prober-cert): roll up fingerprints onto AttackerIdentity Brings the federation-gossip columns on AttackerIdentity to life — ja3_hashes, hassh_hashes, and the new tls_cert_sha256 — by projecting the union of every member observation's fingerprints JSON onto the identity at clusterer create / link / merge time. - decnet/profiler/identity_rollup.py: pure extract_fp_summaries() reads the production bounty shape (payload.fingerprint_type + payload.{ja3,hash,cert_sha256}) and returns deduped+sorted JSON list[str] per family, or None when a family has no signal so the column stays NULL instead of '[]'. - BaseRepository.update_identity_fingerprints + SQLModel impl: one idempotent write that overwrites the three summary columns and bumps updated_at. - ConnectedComponentsClusterer: after every per-component reconciliation (fresh-create OR existing-merge+link), recomputes and writes the rollup for the target identity. Wrapped in a best-effort helper so a write failure logs but never breaks the tick. - Tests: extract_fp_summaries unit (dedup, sort determinism, unknown types ignored, malformed JSON, nested-stringified payloads, non-string values); end-to-end clusterer ticks populate the columns on create + on later observation links; no-fingerprint clusters keep the columns NULL.	2026-04-28 11:28:54 -04:00
anti	00ecea924a	feat(profiler): backfill Credential.attacker_uuid on attacker upsert Credential capture runs before the profiler mints an Attacker, so Credential.attacker_uuid is nullable on write. The profiler now backfills the FK after each successful upsert_attacker. Soft-fail posture matches the surrounding behavior + smtp rollups so a backfill error never blocks the next attacker.	2026-04-26 03:30:44 -04:00
anti	5a34371009	feat(attackers): PTR record (reverse DNS) enrichment Resolve each attacker IP's rDNS name once at first sighting, store on Attacker.ptr_record, render on AttackerDetail under ORIGIN. Many attackers run infrastructure with forgotten rDNS that instantly identifies them once surfaced: scan-node-42.shodan.io, shady-vps.leasecloud.net, etc. Resolver lives in decnet/geoip/ptr.py — colocated with enrich_ip because the shape matches (take an IP, return supplementary metadata, never raise). Uses the OS resolver via socket.gethostbyaddr offloaded to the default executor, wrapped with asyncio.wait_for timeout=2s so a slow authoritative NS can't stall the profiler tick. Profiler side: _WorkerState grows a ptr_attempted: set[str] bounding resolution to once per worker lifetime. Cold-start batches resolve concurrently (Semaphore(_PTR_CONCURRENCY=10)) so a backlog doesn't serialize 2s ceilings. _build_record gains a keyword-only ptr_record parameter that, when _UNSET, omits the key from the record dict — upsert_attacker's attribute-merge loop then preserves whatever's stored on the row. Explicit None is a "fresh failed attempt" signal and gets written through. Env kill-switch DECNET_PTR_ENABLED=false for locked-down deploys where egress DNS is forbidden. Private / loopback / link-local / multicast / reserved addresses short-circuit before any DNS call. IPv6 reverse DNS works transparently through the stdlib resolver. Schema change — run once on upgrade: ALTER TABLE attackers ADD COLUMN ptr_record VARCHAR(256) NULL DEFAULT NULL; Or drop-and-recreate on dev boxes (db-reset's SQLModel.metadata-driven table discovery now picks it up automatically since `ba155b7`). tests/conftest.py disables DECNET_PTR_ENABLED globally for the same reason it disables DECNET_GEOIP_ENABLED — unit tests must never hit the network. tests/geoip/test_ptr.py re-enables explicitly via an autouse fixture.	2026-04-24 17:26:40 -04:00
anti	ec1079e78b	feat(profiler): wire p0f-v2 matcher into sniffer_rollup priority chain The ~30-signature hand-rolled p0f-lite table in decnet/sniffer/p0f.py misses most real-world attackers (yesterday's SLOW SCAN being a textbook case — 9 hours of events, 19 hits, os_guess = NULL). The 375-sig vendored p0f v2 DB was already there; this commit actually calls it. New resolution chain in sniffer_rollup: 1. Enabled OS-fingerprint providers (p0f-v2 default, via DECNET_OSFP_PROVIDERS) tried in declared order. Provider with highest-confidence match across all enabled sources wins. 2. Modal os_guess label from the sniffer's hand-rolled p0f.py. Kept as fallback because v2's DB predates post-2006 kernels. 3. TTL bucket (linux / windows / embedded). Coarse but never wrong. Wiring details: - _match_via_osfp_providers: never raises — factory / provider failures collapse to None and the chain falls through to the old modal-label / TTL path. A corrupt .fp file or misconfigured DECNET_OSFP_PROVIDERS must never wedge a profile rebuild. - tcp_fp_context tracks whether the LATEST tcp_fp snapshot came from a passive SYN ('syn' → p0f.fp) or an active prober probe ('synack' → p0fa.fp). Routes to the right sig list. - initial-TTL normalisation via decnet.sniffer.p0f.initial_ttl. Observation's TTL may be N hops below the OS's initial; v2 signatures match on the canonical bucket. Soft-field semantics on Signature.score(): df and total_len are now skip-checked when the observation is missing them. Sniffer doesn't currently emit either SD field; a literal-constraint sig shouldn't hard-reject a match solely because of upstream incompleteness. Hard fields (window, ttl, options_sig, quirks) still hard-reject on absent/mismatched input — those are the real discriminators. Promote df / total_len back to hard the moment the sniffer starts emitting them. +2 integration tests on TestSnifferRollup, +2 soft-field tests on test_signature. Full regression: 166 tests across tests/prober/osfp + tests/profiler all green.	2026-04-24 11:56:50 -04:00
anti	ea95a009df	refactor(tests): move flat tests/*.py into per-subsystem subfolders Groups every flat test_*.py under the module it exercises, matching the existing tests/{profiler,sniffer,prober,collector,correlation,cli,web,topology,swarm,bus,updater,api,docker,geoip,...} layout. New folders: services/, fleet/, config/, logging/, db/ (+ db/mysql/), telemetry/, mutator/, core/. Path-dependent __file__ references bumped an extra .parent in three files that moved one level deeper: - tests/sniffer/test_sniffer_ja3.py (template path) - tests/services/test_ssh_capture_emit.py (template path) - tests/cli/test_mode_gating.py (REPO root) - tests/web/test_env_lazy_jwt.py (repo var) Also drops two SQLite runtime artifacts (test_decnet.db-{shm,wal}) that were leaking into the repo from a previous test run. Fixes two test_service_isolation cases that patched asyncio.sleep (no longer on the profiler main-loop hot path — same pre-existing bug I fixed earlier in test_attacker_worker.py) by patching asyncio.wait_for and passing interval=0.	2026-04-23 21:34:25 -04:00
anti	67c2e30f89	feat(profiler): publish attacker.scored per profile upsert (DEBT-031 worker 4) The profiler worker threads its bus publisher through _WorkerState so _update_profiles can emit a compact attacker.scored event for every upsert. Payload carries the headline counts (event/service/decky/ bounty/credential) plus is_traversal, so the MazeNET attacker pool can redraw without a round-trip. Bus stays optional: publish_attacker=None when DECNET_BUS_ENABLED=false or get_bus() fails, and hook exceptions are logged without breaking the upsert path.	2026-04-21 16:54:40 -04:00

18 Commits
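
Commit d90c8b70ce describes splitting inter-arrival times (IATs) into typing bursts: cut at any gap longer than IKI_THINK_MAX_S (1.5s) and drop bursts of fewer than 3 IATs. A minimal sketch of that rule follows. It is not the repository's _split_typing_bursts; the function name without the underscore and the MIN_BURST_IATS constant are hypothetical, and the assumption that the think-pause gap itself belongs to neither burst is my reading of the message.

```python
from typing import Sequence

IKI_THINK_MAX_S = 1.5  # threshold quoted in the commit message
MIN_BURST_IATS = 3     # hypothetical name; the message says "fewer than 3 IATs"


def split_typing_bursts(iats: Sequence[float]) -> tuple[tuple[float, ...], ...]:
    """Split IATs at think pauses and keep only bursts with enough samples."""
    bursts: list[tuple[float, ...]] = []
    current: list[float] = []
    for gap in iats:
        if gap > IKI_THINK_MAX_S:
            # A think pause closes the current burst; the pause is discarded
            # (assumption: it is not counted as part of either burst).
            if len(current) >= MIN_BURST_IATS:
                bursts.append(tuple(current))
            current = []
        else:
            current.append(gap)
    # Flush the trailing burst, if it is long enough.
    if len(current) >= MIN_BURST_IATS:
        bursts.append(tuple(current))
    return tuple(bursts)
```

With this reading, a stream consisting only of think pauses yields no bursts, which matches the "all-think-pauses → no burst → no emit" test listed in the commit.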
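
Commit 842b7de950 classifies inter-command pauses by their coefficient of variation (CV = stdev / mean): CV < 0.40 is metronomic, CV ≥ 1.50 is the bimodal heuristic, anything else is variable, and fewer than 5 commands drops confidence from 0.75 to 0.40. The sketch below illustrates those thresholds only. The function name, the constant names CV_METRONOMIC_MAX and CV_BIMODAL_MIN, the (label, confidence) return shape, and the use of the sample standard deviation are assumptions; the commit does not say which stdev variant is used.

```python
import statistics
from typing import Optional, Sequence

CV_METRONOMIC_MAX = 0.40              # hypothetical name, value from the commit
CV_BIMODAL_MIN = 1.50                 # hypothetical name, value from the commit
MIN_COMMANDS_FOR_FULL_CONFIDENCE = 5  # constant named in commit e52a0e0381


def classify_consistency(
    inter_cmd_iats: Sequence[float], n_commands: int
) -> Optional[tuple[str, float]]:
    """Return (label, confidence), or None when emission should be skipped."""
    if len(inter_cmd_iats) < 2:
        return None  # fewer than 2 IATs: skip emission
    mean = statistics.fmean(inter_cmd_iats)
    if mean == 0:
        return None  # zero mean: CV is undefined, skip emission
    # Assumption: sample standard deviation.
    cv = statistics.stdev(inter_cmd_iats) / mean
    if cv < CV_METRONOMIC_MAX:
        label = "metronomic"
    elif cv >= CV_BIMODAL_MIN:
        label = "bimodal"
    else:
        label = "variable"
    full = n_commands >= MIN_COMMANDS_FOR_FULL_CONFIDENCE
    return label, (0.75 if full else 0.40)
```

Returning None for the skip case keeps "no observation" distinct from a low-confidence observation, which is how the commit message separates the two paths.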
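
Commit 5a34371009 resolves PTR records with socket.gethostbyaddr offloaded to the default executor, bounded by asyncio.wait_for with a 2s timeout, and short-circuits private, loopback, link-local, multicast and reserved addresses before any DNS call. The sketch below shows that shape and is not the code in decnet/geoip/ptr.py; resolve_ptr, _is_public and PTR_TIMEOUT_S are hypothetical names, and the DECNET_PTR_ENABLED kill-switch is left out.

```python
import asyncio
import ipaddress
import socket
from typing import Optional

PTR_TIMEOUT_S = 2.0  # hypothetical name; the commit quotes timeout=2s


def _is_public(ip: str) -> bool:
    """True only for addresses worth sending to the resolver."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
    )


async def resolve_ptr(ip: str) -> Optional[str]:
    """Return the rDNS name for ip, or None on any failure. Never raises."""
    try:
        if not _is_public(ip):
            return None  # short-circuit before any DNS call
        loop = asyncio.get_running_loop()
        hostname, _aliases, _addresses = await asyncio.wait_for(
            loop.run_in_executor(None, socket.gethostbyaddr, ip),
            timeout=PTR_TIMEOUT_S,
        )
        return hostname
    except (ValueError, OSError, asyncio.TimeoutError):
        # socket.herror and socket.gaierror are OSError subclasses.
        return None
```

One caveat with this pattern: when wait_for times out, the awaiting coroutine moves on, but the blocking gethostbyaddr call keeps occupying its executor thread until the OS resolver returns.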