Commit Graph

35 Commits

Author SHA1 Message Date
4586e36d63 fix(test/schema): pin xdist_group to prevent multi-server startup, cap workers at 4
Some checks failed
CI / Test (Standard) (3.11) (push) Has been skipped
CI / Test (Live) (3.11) (push) Has been skipped
CI / Merge testing → main (push) Has been skipped
CI / Lint (ruff) (push) Successful in 1m0s
CI / Dependency audit (pip-audit) (push) Failing after 1m2s
CI / SAST (bandit) (push) Successful in 1m11s
CI / Merge dev → testing (push) Has been skipped
2026-05-16 18:36:26 -04:00
0fe9f895d0 feat(test): add test-schema target and SCHEMA_QUICK=1 mode for schemathesis
- Add dedicated test-schema Makefile target (xdist logical, 600s timeout,
  -m fuzz) so schemathesis runs separately from test-fuzz, which was
  spinning up competing uvicorn workers per xdist process
- Exclude all test_schemathesis*.py files from FUZZ_FLAGS via --ignore
- Add schema to _ALL_SUITES between api and fuzz
- Add SCHEMA_QUICK env var (default 0): caps every max_examples to 100
  across all four schemathesis files (4520 -> 600 total examples)
- Fix pre-push hook: use .311 venv and delegate to make test-all FAIL_FAST=0
  instead of hand-rolling five separate pytest invocations
2026-05-16 18:25:40 -04:00
64610bf96e fix(tests): sync 4 tests to current production contracts
- SSH schema: add user + user_password fields (service extended post-test)
- TopologySummary: repo.get_topology() returns model now, not raw dict
- health live: tarpit_watcher added to get_background_tasks(), add to expected set
2026-05-10 06:48:42 -04:00
e4626879f6 perf(pytest): 194s → 4s collection — lazy heavy imports + norecursedirs
Four-part fix for the collection bottleneck that was blocking the dev loop:

1. Lazy mitreattack.stix20 import in attack_stix.py — deferred to first
   _load() call (TYPE_CHECKING guard at top level)

2. Lazy misp_stix_converter import in both MISP export routers — moved
   from module level into the route handler body

3. Lazy attack_catalog / attack_stix in ttp.py repo mixin — thin wrapper
   functions so the import chain never fires at module load time

4. tests/api/conftest.py — `from decnet.web.api import app` moved inside
   the `client()` fixture; `pytest_ignore_collect` broadened to skip all
   test_schemathesis*.py variants (not just test_schemathesis.py), which
   were launching a subprocess server at module-import time

5. pyproject.toml — `norecursedirs` for tests/live, tests/stress,
   tests/service_testing, tests/docker, tests/perf so these directories
   are never entered; `-m` filter removed from addopts (now redundant);
   `--dist loadscope` → `--dist load` to unblock workers immediately

6. behave_core / behave_shell rename — BEHAVE packages dropped the
   `decnet_` prefix; reinstalled editable installs and updated all 14
   import sites across profiler, ttp, bus, and correlation modules
2026-05-10 06:41:25 -04:00
95593cb804 fix(test): access DeckyRow.uuid as attribute, not dict key 2026-05-10 05:36:07 -04:00
16e032b7a5 fix(test): access LANRow.id as attribute, not dict key 2026-05-10 05:26:49 -04:00
33f7d5a9ff feat(web): expose attribution state on AttackerDetail backend (Phase 6)
GET /api/v1/attackers/{uuid}/attribution

Returns the merger output for an attacker's identity:

    {
        "identity_uuid": "abc..." | null,
        "primitives": [
            {primitive, current_value, state, confidence,
             observation_count, last_change_ts, last_observation_ts},
            ...
        ]
    }

Pre-attribution-worker: identity_uuid=null, primitives=[]. Surfacing
identity_uuid keeps the cross-attacker rollup story visible to the
frontend ahead of v1's clusterer landing.

api_events SSE relay also subscribes to attribution.> and forwards
to the AttackerDetail page filtered on payload.identity_uuid (the
identity is resolved at stream open from the URL's attacker_uuid;
attribution payloads are identity-keyed, not attacker-keyed). New
SSE event names: attribution.state_changed,
attribution.multi_actor_suspected.

Frontend (AttackerDetail.tsx badge rendering, useAttackerStream
consumer) deferred — there's already WIP on AttackerDetail.tsx in
the working tree; merging the badge logic is a separate commit
once that lands.

Tests: 4 endpoint scenarios — 401 unauth, 404 unknown attacker,
200 empty (no stub), 200 with primitive-ordered rows.
2026-05-09 02:21:59 -04:00
bb77d13f9a feat(api/attackers): per-attacker SSE events stream
GET /api/v1/attackers/{uuid}/events streams behavioural events for
one attacker. Mirrors decnet/web/router/topology/api_events.py
end-to-end: ?token= auth, require_stream_viewer gate,
sse_connection_slot per-user cap, snapshot-on-connect, three bus
subscriptions (attacker.observation.>, attacker.fingerprint_rotated,
attacker.scored) merged through asyncio.Queue, 15s keepalive,
request.is_disconnected() exit, finally task cancellation.

Per-attacker filter keys on payload['attacker_uuid'] which the
profiler worker stamps onto every published payload (Phase 5 P5.0
amendment) — O(1) drop without a repo round-trip per event.

_sse_name_for derives SSE event names:
  attacker.observation.<primitive> → observation.<primitive>
  attacker.fingerprint_rotated     → fingerprint.rotated
  attacker.scored                  → attacker.scored

10 tests cover snapshot, live forward, per-attacker filter (drops
other attackers' events), fingerprint.rotated forward, 404, 401, and
the sse-name derivation across all four cases. Topology events
regression green.
2026-05-08 20:23:29 -04:00
588ea4e411 refactor(artifacts): extract shard-finder out of transcripts router
Move `_find_shard_with_sid`, `_resolve_shard`, `_validate_names`,
`_get_index`, and the index cache from
`decnet/web/router/transcripts/api_get_transcript.py` into
`decnet/artifacts/shards.py`. The shared module speaks
`ValueError`; the router keeps thin wrappers that translate to
`HTTPException(400)` so the route's error UX is unchanged.

This unblocks the BEHAVE-INTEGRATION Phase 4 worker wiring — the
profiler worker (and the collector's session aggregator) need to
disk-reach asciinema shards but must not import from a FastAPI
router.

11 new unit tests for the shared helper. Existing transcript router
tests pass (the shard fixture's monkeypatch points at the shared
module's ARTIFACTS_ROOT now).
2026-05-08 18:49:11 -04:00
03beff3840 feat(orchestrator): authoritative failure-count badge endpoint (DEBT-042)
New GET /api/v1/orchestrator/events/stats?since=1h&success=false&kind=...
backed by repo.count_orchestrator_failures(since_ts, kind), which
counts failed rows across both orchestrator_events and
orchestrator_emails since the cutoff.

Window parser accepts ^\d+[smhd]$, capped at 7d. Today only
success=false is accepted on this surface so the endpoint isn't
accidentally repurposed before the next consumer is properly
designed.

Orchestrator.tsx polls the endpoint on mount + every 30 s and
renders the authoritative DB-derived count instead of deriving from
the in-memory SSE buffer + one paginated page (which silently
excluded failures older than the local window).
2026-05-03 05:26:45 -04:00
7036a86e76 refactor(artifacts): extract resolve_artifact_path to shared module
Move artifact path validation + symlink-escape check out of the
admin-gated download endpoint into decnet/artifacts/paths.py so the
TTP EmailLifter can disk-reach .eml files at tag-time without
duplicating regex/root logic (DEBT-047).

The router now catches ArtifactPathError and re-raises HTTPException(400);
behavior is unchanged.
2026-05-02 20:02:47 -04:00
42e9492118 feat(ttp): inspector drawer surfaces evidence + rule_id behind each technique
The TTPsObservedSection rollup tells the operator "we saw T1059" but
not why. Click any technique row → side drawer opens listing every
ttp_tag row in scope with the persisted evidence JSON, firing
rule_id / rule_version, source_kind / source_id, confidence, and
created_at. Mirrors the CredentialReuseInspector / BountyInspector
pattern (drawer-backdrop + bd-head/bd-body + kvs grid).

Backend:
- New `GET /api/v1/ttp/tags/by-{scope}/{uuid}/{technique_id}`
  (`scope ∈ {identity, attacker, session}`, optional
  `?sub_technique_id=`, `?limit=` capped to 1000). Returns raw
  TTPTag rows newest-first.
- New `TTPTagDetailRow` Pydantic model + re-export.
- New repo method `list_tags_by_scope_and_technique` on
  TTPMixin (+ abstract on BaseRepository) — single query branched
  on scope; identity scope projects through `Attacker.identity_id`
  the same way `list_techniques_by_identity` does.
- Tests: evidence round-trips, sub_technique filter, JWT-required,
  empty scope, unknown scope rejected.

Frontend:
- New `TTPInspector.tsx` + `TTPInspector.css` (violet accent, slide
  animation, focus-trapped panel matching the existing inspector
  family).
- `TTPsObservedSection`'s TechniqueBar is now click+keyboard
  activatable; clicking opens the inspector for that
  (technique, sub_technique) tuple.

mypy clean. 532 passed in the targeted sweep.
2026-05-02 02:55:05 -04:00
e08bfc4a73 fix(ttp): /api/v1/ttp/rules returns the live rule catalogue
The endpoint was a contract-phase stub returning `[]` even though the
RuleStore loaded all 58 YAML rules at worker startup. UI saw an empty
table; operators couldn't tell whether anything was wired up.

- `api_list_rules` now calls `get_rule_store().load_compiled()` and
  serializes each CompiledRule + its operational state into a
  RuleCatalogueRow. Sorted by rule_id for stable golden snapshots.
- Add `description: str` to RuleSchema (pydantic) and CompiledRule
  (NamedTuple, defaulted) + propagate through `_compile_one` so the
  catalogue surfaces the human-readable YAML description, not just
  the slug-style `name`.
- Update `tests/ttp/test_rule_engine.py` _fields assertion for the
  new column; new `tests/api/ttp/test_rules_catalogue.py` pins the
  catalogue contents (R0001/R0014 presence, row shape, sort order).

Worker behaviour is unchanged: it was already loading rules
correctly. This is purely a read-side wiring fix on the operator API.
2026-05-02 01:54:06 -04:00
9a31d0e50c feat(ttp): E.3.17 worker registration + scoped schemathesis suite
Wires decnet-ttp as a first-class worker:

* `decnet ttp` CLI command (master-only via MASTER_ONLY_COMMANDS)
* deploy/decnet-ttp.service.j2 systemd unit (After= identity / intel
  / reuse-correlator workers; ProtectHome=read-only since
  FilesystemRuleStore only reads ./rules/ttp/)
* deploy/decnet.target Wants= chain extended with decnet-ttp.service
* `ttp` was already in web/worker_registry.KNOWN_WORKERS

tests/api/test_schemathesis_ttp.py: TTP-routes-only schemathesis
suite, filtered via the OpenAPI tags=["TTP Tagging"] annotation
shared by the eight TTP routes. Reuses the live uvicorn subprocess
the wider test_schemathesis spawns; max_examples=400 keeps the
focused gate fast for E.3.13–E.3.16 iteration.

wiki-checkout/Service-Bus.md committed in its own repo: ttp.tagged
and ttp.rule.fired.<id> flipped from "reserved (TTP worker)" to
"decnet.ttp.worker" now that the worker publishes them.
2026-05-01 21:26:46 -04:00
b7f206c8c5 feat(ttp): E.1.9 API contract — seven router endpoints, admin-gated state mutations, response models
Mounts /api/v1/ttp/* with empty-list / empty-Navigator responses.
GET endpoints viewer-gated; POST/DELETE /rules/{rule_id}/state
admin-gated server-side. POST parses JSON manually so a malformed
body returns the documented 400 (per feedback_schemathesis_400).

Drops xfail-strict markers from E.2.8 tests now that the router is
mounted; 26 tests pass against the contract handlers.
2026-05-01 07:20:13 -04:00
b5a19301a2 test(ttp): E.2.8 API shape + auth — GET 200/401 + admin-only POST/DELETE 401/403/200/400 contract 2026-05-01 07:00:41 -04:00
ebe15310ab fix(api): hydrate planner from DB exactly once on first GET, not on every read
get_config was calling planner.apply_payload on every GET request, racing
concurrent reads on module-level globals. Added a _hydrated flag + lock
so DB hydration runs at most once per process lifetime; put_config marks
it done too. Test fixture resets the flag between tests.
2026-04-30 21:17:03 -04:00
542d129d6f refactor(services_live): replace string-sniffed error dispatch with typed exception subclasses
ServiceNotFoundError (→ 404) and ServiceConflictError (→ 409) replace the
"not found" / "already on" / "not on" substring checks in _map_mutation_error;
base ServiceMutationError still maps to 422. Fixes three pre-existing test
status-code assertions (201 vs 200 on POST endpoints).
2026-04-30 20:49:29 -04:00
07b32e2abe fix(tests): patch add_service/remove_service at the router import, not the module
Monkeypatching services_live.add_service had no effect because api_services
already held a local reference to the name. Patch api_services.add_service
and update fake stubs to accept the config kwarg added to the real signature.
2026-04-29 18:50:21 -04:00
75b1ce3a31 feat(api): per-service config schema endpoint + PUT/POST update+apply for fleet & topology
- GET /topologies/services/{name}/schema serves the declared ServiceConfigField
  metadata so the Inspector can auto-render forms.
- PUT  /(topologies/{id}/)deckies/{decky}/services/{svc}/config persists the
  validated dict (DB + compose); container untouched (Save).
- POST /(topologies/{id}/)deckies/{decky}/services/{svc}/apply persists then
  force-recreates <decky>-<svc> so the new env takes effect (Apply, destructive).
- New engine helper update_service_config wires both fleet and topology paths
  through the existing _persist_fleet_change / _rerender_topology_compose
  machinery; emits decky.<name>.service_config_changed on the bus.
2026-04-29 11:38:06 -04:00
6ac8cac908 feat(deckies): live service add/remove without full redeploy
decnet.engine.services_live exposes add_service / remove_service for
both fleet and topology decky scopes.  The host's _compose() wrapper
already supported per-service targeting (up --no-deps -d <svc>,
stop, rm -f); what was missing was the orchestration around it:

* add: validate against decnet.services.registry (rejects unknown +
  fleet_singleton); persist the new services list; re-render the
  per-scope compose file (so future redeploys reflect the change);
  run docker compose up -d --no-deps --build <decky>-<svc>.
* remove: stop + rm -f the service container; persist; re-render
  compose so a future up -d doesn't bring it back.

Both publish decky.<name>.service.added / .removed on the bus, with
the post-mutation services list.  Topic constants added to
decnet.bus.topics; the matching wiki entry in wiki-checkout/Service-Bus.md
ships in a separate commit on the wiki repo (wiki-checkout/ is gitignored).

Four new admin endpoints:

* POST/DELETE /api/v1/deckies/{name}/services{,/svc}
* POST/DELETE /api/v1/topologies/{id}/deckies/{name}/services{,/svc}

ServiceMutationError messages are mapped at the API boundary to 404
(decky/topology missing), 409 (idempotency violation), 422 (unknown
or fleet_singleton service).
2026-04-28 22:51:42 -04:00
0bc4b05c73 feat(deckies): generic file drops on fleet + MazeNET deckies
Extracts the docker-exec-with-base64-stdin pattern out of canary/planter
and orchestrator/drivers/ssh into a shared decnet.decky_io package.
Both consumers now delegate; the canary planter test still proves the
contract end-to-end.

Adds POST/DELETE /api/v1/deckies/files for arbitrary file drops.
Container resolution is shared with the canary path: topology_id absent
means fleet (<name>-ssh), present routes through resolve_decky_container
which picks <name>-ssh when the topology decky exposes ssh, else the
topology base container decnet_t_<id8>_<name>.

Path validation rejects relative paths and '..' traversal at the request
model layer.  Bad base64 → 400; unknown topology → 404; decky not in
topology → 422; docker exec failure → 409.
2026-04-28 22:43:34 -04:00
3fe999d706 feat(canary): allow custom canaries on MazeNET deckies via API
POST /api/v1/canary/tokens grows an optional topology_id field.  When
present, the server hydrates the topology, validates the named decky is
in it, and resolves the docker container via
planter.resolve_topology_container — <name>-ssh if the decky exposes ssh,
else the topology base container.  Absent ⇒ fleet semantics, unchanged.

The token row gets a nullable topology_id column (no migration helper
per pre-v1 policy).  GET /api/v1/canary/tokens accepts ?topology_id= as
a filter.  DELETE re-resolves the container at revoke time so a
redeployed topology is still reachable.

422 when the named decky isn't in the topology; 404 when the topology
itself doesn't exist.
2026-04-28 22:34:45 -04:00
862e4dbb31 merge: testing → main (reconcile 2-week divergence) 2026-04-28 18:36:00 -04:00
f2cc585d72 fix: align tests with model validation and API error reporting 2026-04-13 01:43:52 -04:00
03f5a7826f Fix: resolved sqlite concurrency errors (table users already exists) by moving DDL to explicit async initialize() and implementing lazy singleton dependency. 2026-04-12 08:01:21 -04:00
b2e4706a14 Refactor: implemented Repository Factory and Async Mutator Engine. Decoupled storage logic and enforced Dependency Injection across CLI and Web API. Updated documentation.
Some checks failed
CI / Lint (ruff) (push) Successful in 12s
CI / SAST (bandit) (push) Successful in 13s
CI / Dependency audit (pip-audit) (push) Successful in 22s
CI / Test (Standard) (3.11) (push) Failing after 54s
CI / Test (Standard) (3.12) (push) Successful in 1m35s
CI / Test (Live) (3.11) (push) Has been skipped
CI / Test (Fuzz) (3.11) (push) Has been skipped
CI / Merge dev → testing (push) Has been skipped
CI / Prepare Merge to Main (push) Has been skipped
CI / Finalize Merge to Main (push) Has been skipped
2026-04-12 07:48:17 -04:00
0f63820ee6 chore: fix unused imports in tests and update development roadmap
Some checks failed
CI / Lint (ruff) (push) Successful in 16s
CI / Test (pytest) (3.11) (push) Failing after 34s
CI / Test (pytest) (3.12) (push) Failing after 36s
CI / SAST (bandit) (push) Successful in 12s
CI / Merge dev → testing (push) Has been cancelled
CI / Open PR to main (push) Has been cancelled
CI / Dependency audit (pip-audit) (push) Has been cancelled
2026-04-12 03:46:23 -04:00
ff38d58508 Testing: Stabilized test suite and achieved 93% total coverage.
- Fixed CLI tests by patching local imports at source (psutil, os, Path).
- Fixed Collector tests by globalizing docker.from_env mock.
- Stabilized SSE stream tests via AsyncMock and immediate generator termination to prevent hangs.
- Achieved >80% coverage on CLI (84%), Collector (97%), and DB Repository (100%).
- Implemented SMTP Relay service tests (100%).
2026-04-12 03:30:06 -04:00
08242a4d84 Implement ICS/SCADA and IMAP Bait features 2026-04-10 01:50:08 -04:00
dbf6d13b95 fix: use :memory: + StaticPool for test DBs, eliminates file:testdb_* garbage 2026-04-09 18:39:36 -04:00
d15c106b44 test: fix async fixture isolation, add fuzz marks, parallelize with xdist
- Rebuild repo.engine and repo.session_factory per-test using unique
  in-memory SQLite URIs — fixes KeyError: 'access_token' caused by
  stale session_factory pointing at production DB
- Add @pytest.mark.fuzz to all Hypothesis and Schemathesis tests;
  default run excludes them (addopts = -m 'not fuzz')
- Add missing fuzz tests to bounty, fleet, histogram, and repository
- Use tmp_path for state file in patch_state_file/mock_state_file to
  eliminate file-path race conditions under xdist parallelism
- Set default addopts: -v -q -x -n logical (26 tests in ~7s)
2026-04-09 18:32:46 -04:00
6fc1a2a3ea test: refactor suite to use AsyncClient, in-memory DBs, and parallel coverage 2026-04-09 16:43:49 -04:00
310c2a1fbe feat: add pytest-asyncio, freezegun, schemathesis, pytest-cov to test toolchain 2026-04-09 12:40:59 -04:00
ec66e01f55 fix: add missing __init__.py to tests/api subpackages to fix relative imports 2026-04-09 12:24:09 -04:00