GET /api/v1/orchestrator/events — paginated list with optional
kind=traffic|file filter. GET /api/v1/orchestrator/events/stream —
SSE: snapshot on connect, live forward of orchestrator.> bus events
mapped to 'traffic' / 'file' SSE event names.
Repo gains list_orchestrator_events(limit, offset, kind?, since_ts?),
count_orchestrator_events(kind?), and prune_orchestrator_events
(per_dst_cap=10000) for periodic worker-side trimming.
Adds a new decnet orchestrate worker whose job is to keep the honeypot
ecosystem from looking suspiciously static — a frozen LAN with no
inter-host traffic and no filesystem aging is its own honeypot tell.
MVP scope:
- New OrchestratorEvent table + repo methods (purpose-built sibling
to Log so synthetic events stay separable from attacker-driven ones).
- New orchestrator.{activity,file}.<decky_id> bus topics +
system.orchestrator.health heartbeat.
- SSH-only driver. Traffic action runs python3 inside src container
to TCP-connect dst:22 and read the SSH banner — real on-the-wire
SSH-protocol traffic without shipping creds. File action drops or
refreshes a small file via docker exec on the destination.
- Random scheduler (50/50 traffic/file when >=2 SSH-capable deckies
are running). Diurnal shaping, role-aware pairing, and session-aware
backoff are explicit non-goals for MVP.
- CLI registration, systemd unit (SupplementaryGroups=docker),
worker-registry entry so the dashboard shows orchestrator health.
- 11 tests: scheduler policy, driver argv shape + injection-safety,
end-to-end one-tick integration with FakeBus + SQLite.
Runs the chained identity + campaign clustering pipeline against all
seven fixtures via from_synthetic / from_synthetic_identity adapters
and ratchets every YAML floor to 1.0 — the production clusterer
(and the reference clusterers used in the per-fixture tests) all
score perfectly across ARI / homogeneity / completeness /
singleton_recall on each fixture.
Three substrate fixes surfaced by the ratchet:
- Tuning: shared_infra now Jaccards payload+C2 only; decky_set moved
into cohort_weight to prevent fleet-scarcity false-merges (F1's
shared_wordlist failure mode). Tier weight raised to 1.0 so
shared payload+C2 alone crosses threshold (F5's intended pass).
- Adapter: from_synthetic_identity now reads SyntheticSession
started_at + duration_s for session_windows and per-decky
timestamps (the production-row adapter still uses start_ts/end_ts
when available).
- Fixture data: paused_campaign.yaml's JA3 collided exactly with
vpn_hopping.yaml's (same TLS extension list). The collision
fused two unrelated campaigns under the chained identity layer
in the noise_floor composite. Made paused's JA3 distinct.
Also wires Campaign / CampaignsResponse into models/__init__.py's
__all__ that was missed in the schema commit.
Adds the campaigns table and the BaseRepository / SQLModelRepository
methods that the campaign-clusterer worker (next commit) needs to
populate it. Mirrors the AttackerIdentity layer: schema_version from
day one for federation gossip, soft-merge via merged_into_uuid with a
chain-walking get_campaign_by_uuid, list_campaigns excluding merged-
out rows while list_all_campaigns returns the unfiltered set for the
revoke pass. attacker_identities.campaign_id gets a real FK now that
the target table exists.
Schema-only commit, first of the five-step substrate for identity
resolution. The clusterer that populates identities lands later; this
ships the table empty and the FK uniformly NULL on existing rows.
* decnet/web/db/models/attackers.py — new AttackerIdentity SQLModel
(uuid PK, schema_version, fingerprint summary lists, kd_digraph_simhash,
merged_into_uuid self-FK, all clusterer-populated fields nullable).
Attacker grows a nullable indexed identity_id FK + docstring marking
it as the per-IP observation row.
* decnet/web/db/models/__init__.py — re-exports AttackerIdentity.
* tests/db/test_identity_schema.py — 9 schema invariants: table exists,
identity_id nullable + indexed, FK targets attacker_identities.uuid,
schema_version defaults to 1, attacker rows inserted with NULL
identity_id, FK constraint blocks orphans.
463 unrelated db/web/profiler/correlation tests still green. See
development/IDENTITY_RESOLUTION.md for the full design.
New TTL-cached threat-intel row keyed by attacker IP, with per-provider
verdict/raw/queried_at columns for GreyNoise, AbuseIPDB, abuse.ch Feodo
Tracker and ThreatFox. Carries schema_version from day one (federation
wire-format precedent set by SessionProfile). Repo gains
upsert_attacker_intel, get_attacker_intel_by_ip, and a
get_unenriched_attacker_ips backfill primitive that picks fresh + stale
rows for the forthcoming 'decnet enrich' worker.
Also documents the open-source intel-source backlog in DEVELOPMENT_V2.
Lays the storage and bus substrate for the "credential reuse patterns"
task in DEVELOPMENT.md and scaffolds decnet/vectorstore/ as the future
substrate for statistical attacker re-identification over behavioral
fingerprints. No correlator, profiler, API, or dashboard wiring in
this commit — see TODO.md for the handoff.
Schema:
- Credential.attacker_uuid (nullable FK to attackers.uuid),
backfilled by the profiler post-write to avoid coupling the
capture path to the profiler's ordering.
- CredentialReuse table — UUID PK, JSON list columns for the
accumulating attacker_uuids/ips/deckies/services, target_count
(the discriminative scalar), confidence reserved for a future
fuzzy-credential pass.
Repo:
- upsert_credential_reuse / list_credential_reuses /
get_credential_reuse_by_id / update_credential_attacker_uuid.
- Renamed pre-existing get_credential_reuse(secret_sha256) to
get_credential_attempts_for_secret(secret_sha256) — the new
findings table needs the cleaner name.
Bus topics:
- credential.captured (one per Credential upsert)
- credential.reuse.detected (correlator-emitted on insert/grow)
Vectorstore subpackage (decnet/vectorstore/, flat layout mirroring
decnet/bus/):
- BaseVectorStore ABC keyed by (kind, id) — kind discriminator
means new feature families are additive, no schema migration.
- FakeVectorStore (in-memory L2 KNN), NullVectorStore (no-op for
DECNET_VECTORSTORE_ENABLED=false), SqliteVecVectorStore (lazy
sqlite_vec extension load, one vec0 virtual table per kind).
- get_vectorstore() env-driven dispatch with graceful fallback
to FakeVectorStore when the sqlite-vec extension isn't on the
host, so workers don't crash on a missing optional dep.
Tests: 26 new (11 cred-reuse repo, 15 vectorstore). Existing
credentials and base-repo tests updated for the rename. Total: 34
passing on the touched files.
Replaces the opaque Bounty.bounty_type='credential' path with a
dedicated `credentials` table whose schema is forward-compatible
across every auth-bearing service in the fleet. Hoisted indexed
columns (secret_sha256, principal, service, attacker_ip) carry the
universal reuse-analytics signal; service-specific JSON keys ride
in `fields`. Cross-service reuse queries become an indexed lookup
on secret_sha256 instead of JSON_EXTRACT scans.
Schema decisions baked in (per ANTI):
- New `Credential` table, not extension to Bounty
- Hoisted `principal` column for cross-service principal-reuse
- Standardized JSON keys: every payload carries secret_b64 +
secret_printable + principal universally; service-specific extras
(user, domain, dn, mech, …) ride alongside
The auth-helper SD-block emits the new shape natively. The ingester
forks at _extract_bounty:
- Native shape (SSH/Telnet, future emitters): secret_b64 present →
direct upsert_credential
- Legacy shape (FTP/POP3/IMAP/SMTP today): username + password →
adapter synthesizes secret_{b64,sha256,printable} on the fly,
upserts into the same Credential table. Tracked as DEBT-039;
one-shot bridge until those service templates migrate.
Defense-in-depth across five layers (input validation):
- C helper: bytes outside [0x20, 0x7f) collapse to '?', RFC 5424
escape rules for \\, ", ]; b64 preserves exact bytes
- Ingester native branch: rejects malformed secret_b64 (regex), drops
the credential row but keeps the underlying Log
- Ingester legacy adapter: same printable-ASCII filter as the C
code; sha256 + b64 over the original utf-8 bytes (lossless, even
when secret_printable is sanitized)
- DB column caps with truncation warning; sha256 always over the
full pre-truncation bytes so reuse queries match across truncation
- JSON serialized with ensure_ascii=True so utf8mb4 columns stay
safe even with non-ASCII service-specific keys
Bounty.bounty_type='credential' is no longer written. Pre-v1: no
historical backfill; existing rows stay untouched but unused.
595 tests pass; new tests cover the model + repo (upsert dedup,
null-principal independence, cross-service reuse, filters), both
ingester branches, b64 validation, sanitization preserving the
fingerprinting signal in b64.
Introduces the webhook egress foundation — a new WebhookSubscription
table, admin-gated CRUD under /api/v1/webhooks, and the shared
delivery client that both the test-ping route and the upcoming worker
will use. No worker yet; this commit is API + model + client only.
Simple-mode enum (AttackerDetail / DeckyStatus / SystemStatus) expands
to bus-topic patterns at the router layer; storage is always the raw
pattern list. Advanced mode lets admins supply raw NATS-style patterns
directly. Filter-at-subscribe: the worker (next commit) will subscribe
to the union of patterns across enabled subscriptions.
Delivery client handles HMAC-SHA256 signing (X-DECNET-Signature),
retry on 429/5xx/network errors with jittered backoff, no-retry on
4xx. Secrets never leave the server on GET/LIST — only the create
response carries the secret for copy-out.
CRUD routes publish WEBHOOK_SUBSCRIPTIONS_CHANGED on the bus after
every mutation so the (future) worker can hot-reload.
Opens DEBT-037 for the deferred items (circuit breaker, dead-letter,
batch delivery, payload templates, secret-at-rest).
Every mutation route that returned an untyped dict now declares
response_model at the decorator. MessageResponse covers the eight
{"message": ...} envelopes (change-password, mutate-decky, mutate-
interval, update-deployment-limit, update-global-mutation-interval,
delete-user, update-user-role, reset-user-password). Purpose-built
models cover the richer shapes (DeployResponse for /deckies/deploy,
PurgeResponse for /config/reinit, ReapReportResponse for /reap-orphans,
UserResponse for /config/users). 204-No-Content and Response/
ORJSONResponse routes stay as-is.
The wire shape for clients is unchanged — the envelopes already only
shipped a message field. What changes is that a handler which
accidentally returns a richer dict (e.g. a full user row including
password_hash) would be silently stripped to the declared fields at
serialization time.
Also flips F4/D "expensive LIKE" to accepted (new DA-09) — the /logs
and /attackers search routes LIKE-scan unbounded columns, but both are
admin-gated, limit-capped, and operator rate-limit scope per DA-04.
FTS5 stays a performance TODO, not a security blocker.
New SmtpTarget table records each (attacker, domain) pair observed via
the SMTP honeypots. Only the domain is stored — local-parts are dropped
at ingestion, so this table holds no user-identifying data beyond the
target organisation's identity.
The profiler worker extracts domains from rcpt_to / rcpt_denied /
message_accepted events, normalizes them (lowercase, strip local-part,
drop blocked TLDs), and upserts one row per pair with a running count +
first_seen / last_seen.
Three repo methods shipped:
* increment_smtp_target(attacker, domain) — upsert + bump
* list_smtp_targets(attacker) — per-attacker view
* smtp_target_seen(domain) — cross-attacker aggregate, shaped as the
federation-gossip RPC that V2 will expose.
The gossip-query shape is load-bearing: each operator can answer
"have any of your attackers targeted corp1.com?" without leaking
which attackers or when — the aggregate returns a bool + total count
+ first/last seen, nothing else.
decnet/web/db/models.py was approaching 1000 lines across User/Log/
Attacker/Swarm/Topology/Workers/Updater/Health domains. Split into a
package with one module per domain; __init__.py re-exports every symbol
so all 52 call sites keep importing from decnet.web.db.models
unchanged.