docs: expand Fingerprinting page — all 6 layers, BEHAVE-SHELL primitives, SMTP signals, TTP detection

2026-05-10 04:19:09 -04:00
parent d45fb08b6d
commit e7d3353bfe

@@ -1,37 +1,48 @@
# Fingerprinting # Fingerprinting
DECNET builds a multi-layer fingerprint of every attacker from three DECNET builds a multi-layer fingerprint of every attacker from four
independent sources: **passive wire capture**, **active probing**, and independent sources: **passive wire capture**, **active probing**,
**inline HTTP inspection**. Each layer contributes distinct evidence; **inline HTTP/protocol inspection**, and **behavioural profiling** of
together they let you tell a curl script from a Metasploit operator from a interactive sessions. Each layer contributes distinct evidence; together
they let you tell a curl script from a Metasploit operator from a
nation-state implant even when the source IP changes. nation-state implant even when the source IP changes.
All fingerprint data is stored as `bounty` rows in the DECNET database and All fingerprint data is stored as `bounty` rows (type `fingerprint`) or
surfaces in the **Attacker detail** page under the *Fingerprints* tab. `ObservationRow` entries in the DECNET database and surfaces in the
**Attacker detail** page.
--- ---
## Layer 1 — Passive sniffer (network layer) ## Layer 1 — Passive sniffer (network / TLS layer)
The sniffer runs fleet-wide on the host interface and reads raw packets The sniffer runs fleet-wide on the host interface and reads raw packets
without touching any decky service. It fires on the first packet of each without touching any decky service. It fires on the first packet of each
connection, so it captures the attacker's stack signature before any connection, so it captures the attacker's stack signature before any
application-level exchange. application-level exchange.
| Fingerprint | What it captures | Algorithm | ### TLS ClientHello fingerprints
| Fingerprint | Description | Key fields |
|---|---|---| |---|---|---|
| **JA3 / JA3S** | TLS ClientHello / ServerHello cipher suite and extension order | MD5 of normalised fields per Salesforce spec | | **JA3** | MD5 of normalised ClientHello fields (cipher suites, extensions, elliptic curves) | `ja3`, `tls_version`, `sni`, `raw_ciphers`, `raw_extensions` |
| **JA4 / JA4S / JA4L** | TLS 1.3-aware version; JA4L adds latency timing | FoxIO JA4 spec | | **JA3S** | MD5 of the ServerHello response | `ja3s` |
| **TCP SYN OS** | MSS, window scale, TCP option order from the SYN | Mini-p0f classifier (`decnet/sniffer/p0f.py`) | | **JA4** | TLS 1.3-aware successor to JA3 (FoxIO spec) | `ja4`, `alpn`, `dst_port` |
| **JA4-QUIC** | QUIC Initial ClientHello — QUIC-specific extensions and transport params | FoxIO JA4-QUIC spec | | **JA4S** | ServerHello counterpart to JA4 | `ja4s` |
| **Flow timing** | Round-trip latency and inter-packet timing | Raw timestamps from the sniffer | | **JA4L** | JA4 + latency: client TTL and measured RTT | `ja4l`, `rtt_ms`, `client_ttl` |
| **TLS certificate** | Server cert metadata — useful when the attacker runs their own TLS service | `subject_cn`, `issuer`, `self_signed`, `not_before`, `not_after`, `sans`, `cert_sha256`, `sni`, `target_ip`, `target_port` |
| **TLS resumption** | Session resumption mechanisms advertised (tickets, session IDs) | `mechanisms` |
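As a rough illustration of the JA3 row above, the sketch below builds the JA3 input string and its MD5 from already-parsed ClientHello fields. The function names are hypothetical and are not taken from the DECNET sniffer; only the field layout (version, ciphers, extensions, curves, point formats, with GREASE values dropped) follows the public JA3 method.

```python
import hashlib

# GREASE values (RFC 8701): 0x0a0a, 0x1a1a, ... 0xfafa. JA3 ignores them.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input string from decimal ClientHello field values."""
    def join(values):
        # Values inside one field are dash-separated, in wire order.
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return the 32-character MD5 hex digest of the JA3 string."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Keeping the raw string next to the hash (as the `raw_ciphers` and `raw_extensions` fields suggest) makes it possible to compare two clients field by field when their hashes differ.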
Sniffer events land as `attacker.observed` or `attacker.fingerprinted` bus ### Network stack fingerprints
events consumed by the correlator and ingester.
> **Limitation:** the sniffer only sees the TLS handshake — it cannot read | Fingerprint | Description | Key fields |
> HTTP headers or QUIC stream frames inside an encrypted session. Layers 2 |---|---|---|
> and 3 fill that gap. | **TCP SYN OS** | Passive OS classifier from SYN options (mini-p0f) | `os_guess`, `mss`, `window_scale`, `sack_ok`, `timestamp`, `options_order` |
| **JA4-QUIC** | QUIC Initial ClientHello — QUIC-specific extensions and transport params | `ja4_quic`, `sni`, `alpn`, `raw_ciphers` |
| **Flow timing** | Inter-packet timing and RTT from the first few packets | stored as `tcp_flow_timing` event |
> The sniffer only sees the TLS handshake — it cannot read HTTP headers or
> QUIC stream frames inside an encrypted session. Layers 3 and 4 fill
> that gap.
--- ---
@@ -39,127 +50,334 @@ events consumed by the correlator and ingester.
After a new attacker is first observed, the prober worker reaches back After a new attacker is first observed, the prober worker reaches back
out to the attacker's IP on a set of default ports to collect out to the attacker's IP on a set of default ports to collect
application-level fingerprints. application-level fingerprints. Probes are stealthy — no DECNET banner,
ordinary client behaviour. See [Security-and-Stealth](Security-and-Stealth).
| Fingerprint | Protocol | Ports probed | | Fingerprint | Protocol | Ports probed | Key fields |
|---|---|---| |---|---|---|---|
| **JARM** | TLS (any HTTPS-ish service) | 443, 8443, 8080, 4443, 50050, 2222, 993, 995, 8888, 9001 | | **JARM** | TLS server fingerprint — 10 hand-crafted ClientHellos, 62-char hash of the responses | 443, 8443, 8080, 4443, 50050, 2222, 993, 995, 8888, 9001 | `hash`, `target_ip`, `target_port` |
| **HASSH** | SSH server | 22, 2222, 22222, 2022 | | **HASSH** | SSH server fingerprint — MD5 of `kex;encryption;mac;compression` from the server `KEXINIT` | 22, 2222, 22222, 2022 | `hash`, `ssh_banner`, `kex_algorithms`, `encryption_s2c`, `mac_s2c`, `compression_s2c`, `target_ip`, `target_port` |
| **TCP fingerprint** | TCP SYN response analysis | 22, 80, 443, 8080, 8443, 445, 3389 | | **TCP fingerprint** | TCP/IP stack OS probe — SYN response TTL, window, options | 22, 80, 443, 8080, 8443, 445, 3389 | `hash`, `raw`, `ttl`, `window_size`, `df_bit`, `mss`, `window_scale`, `options_order` |
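The HASSH row can be sketched as follows, assuming the four server-to-client algorithm lists have already been parsed out of the server `KEXINIT`. The function name is hypothetical and this is not the prober's actual code.

```python
import hashlib


def hassh_server(kex, enc_s2c, mac_s2c, comp_s2c):
    """Return (raw_string, md5_hex) for a server KEXINIT.

    Each argument is a list of algorithm names in the order the server
    advertised them. Names are comma-joined per list, lists are
    semicolon-joined, and the result is hashed with MD5.
    """
    raw = ";".join(",".join(names) for names in (kex, enc_s2c, mac_s2c, comp_s2c))
    return raw, hashlib.md5(raw.encode("ascii")).hexdigest()
```

Because the hash depends on list order, two servers that offer the same algorithms in a different preference order produce different fingerprints.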
Active probes are stealthy: they look like ordinary clients, carry no When any fingerprint changes between probes, an `attacker.fingerprint_rotated`
DECNET-specific banner, and use the same port-rotation patterns an bus event fires — a strong signal of infrastructure churn (VPS swap, cert
informed scanner would use. See [Security-and-Stealth](Security-and-Stealth). rotation, banner rewrite).
When a fingerprint changes between probes, a `attacker.fingerprint_rotated`
bus event fires — that is a strong signal of infrastructure churn (VPS
swap, cert rotation, banner rewrite).
--- ---
## Layer 3 — Inline HTTP fingerprinting (Caddy fp module) ## Layer 3 — Inline protocol inspection (decky services)
### HTTP header fingerprinting (Caddy `decnet_fp` module)
The `http` and `https` decky templates ship with a custom Caddy module The `http` and `https` decky templates ship with a custom Caddy module
(`decnet_fp`) that intercepts connections at the byte level, before that intercepts connections at the **byte level**, before Caddy's HTTP
Caddy's HTTP parser sees them. This gives wire-accurate fingerprints parser. This gives wire-accurate fingerprints that cannot be faked by
that cannot be faked by HTTP-level header manipulation. HTTP-level middleware.
### JA4H (HTTP request header order) #### JA4H (HTTP request header order)
The `decnet_fp` listener wrapper taps the raw TLS stream and buffers the The listener wrapper taps the raw TLS stream:
first request headers of each connection before replaying them to Caddy's
parser.
- **h1:** headers are split by `\r\n` in arrival order. - **HTTP/1.1:** headers split by `\r\n` in arrival order.
- **h2:** a per-connection HPACK decoder maintains the dynamic table and - **HTTP/2:** a per-connection HPACK decoder maintains the dynamic table
emits headers in HPACK decode order — pseudo-headers and emits headers in HPACK decode order — pseudo-headers (`:method`,
(`:method`, `:path`, `:scheme`, `:authority`) appear first, then regular `:path`, `:scheme`, `:authority`) appear first, then regular headers in
headers in the order the client encoded them. the order the client encoded them.
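For the HTTP/1.1 case, the arrival-order capture can be sketched as below. This is a minimal illustration with a hypothetical function name; the real module works on the tapped TLS stream and also handles HTTP/2 through HPACK, which is not shown here.

```python
def ordered_header_names(raw_request: bytes) -> list[str]:
    """Return header names in wire order, with their original casing."""
    head, _, _ = raw_request.partition(b"\r\n\r\n")  # drop any body
    lines = head.split(b"\r\n")[1:]                  # skip the request line
    names = []
    for line in lines:
        name, sep, _ = line.partition(b":")
        if sep:  # ignore malformed lines that have no colon
            names.append(name.decode("latin-1"))
    return names
```

Working on the raw bytes rather than on a parsed header map is what preserves both order and casing.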
The ordered list feeds `_compute_ja4h` in `syslog_bridge.py`, which The ordered list feeds `_compute_ja4h` in `syslog_bridge.py`, producing a
produces a JA4H hash per the FoxIO spec. JA4H hash per the FoxIO spec. Stored with: `ja4h`, `protocol`, `method`,
`path`, `remote_port`.
> Map-iteration order in Go is randomised; DECNET captures order at the > Map-iteration order in Go is randomised; DECNET captures order at the
> *byte level*, not from `http.Header`, so the JA4H is reproducible and > *byte level*, so the JA4H is reproducible and meaningful.
> meaningful.
### H2 SETTINGS #### Header order and header quirks
During the h2 connection preface, the client sends a `SETTINGS` frame Beyond the JA4H hash, the raw ordered list of header names is stored
listing its implementation parameters. The fp module parses the raw (`headers_ordered`). This lets you cluster:
6-byte `(id, value)` tuples in wire order and records:
- `settings` — map of setting name → value - **Presence/absence of headers** — curl sends no `Accept-Encoding` on
(e.g. `HEADER_TABLE_SIZE`, `MAX_CONCURRENT_STREAMS`, `INITIAL_WINDOW_SIZE`) certain invocations; browsers always send it.
- `frame_order` — setting IDs in the exact order the client sent them - **Header ordering** — different HTTP clients and frameworks have
characteristic orderings even when they send the same headers.
- **Header casing** — some tools send `content-type` (lowercase), others
send `Content-Type`; stored verbatim before normalisation.
Different HTTP/2 implementations (curl, Chrome, Firefox, Go net/http, #### HTTP/2 SETTINGS frame
Java HttpClient) have characteristic SETTINGS maps and orderings.
### H3 SETTINGS During the h2 connection preface the client sends a `SETTINGS` frame.
Stored: `settings` (map of name → value) and `frame_order` (IDs in wire
order). Different h2 implementations have characteristic SETTINGS maps
and orderings.
For HTTP/3, the QUIC server is Caddy with native h3 support. Caddy Known settings captured by name:
exposes the client's h3 SETTINGS frame via the `http3.Settingser` `HEADER_TABLE_SIZE`, `ENABLE_PUSH`, `MAX_CONCURRENT_STREAMS`,
interface on the `ResponseWriter`. The fp module captures: `INITIAL_WINDOW_SIZE`, `MAX_FRAME_SIZE`, `MAX_HEADER_LIST_SIZE`.
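A minimal sketch of decoding an HTTP/2 SETTINGS payload into the `settings` map and `frame_order` list: each entry is a 16-bit identifier followed by a 32-bit value, and identifiers 1 to 6 carry the names listed above. The function name and the `UNKNOWN_0x…` label for unrecognised identifiers are assumptions made for this example.

```python
import struct

SETTING_NAMES = {
    0x1: "HEADER_TABLE_SIZE",
    0x2: "ENABLE_PUSH",
    0x3: "MAX_CONCURRENT_STREAMS",
    0x4: "INITIAL_WINDOW_SIZE",
    0x5: "MAX_FRAME_SIZE",
    0x6: "MAX_HEADER_LIST_SIZE",
}


def parse_settings_payload(payload: bytes):
    """Decode a SETTINGS frame payload into (settings, frame_order)."""
    if len(payload) % 6:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    settings, frame_order = {}, []
    # "!HI" = network byte order, 16-bit id, 32-bit value, no padding.
    for ident, value in struct.iter_unpack("!HI", payload):
        name = SETTING_NAMES.get(ident, f"UNKNOWN_0x{ident:x}")  # hypothetical label
        settings[name] = value
        frame_order.append(ident)  # keep the client's wire order
    return settings, frame_order
```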
- `EnableDatagrams` — whether the client advertised H3 datagram support #### HTTP/3 SETTINGS
- `EnableExtendedConnect` — extended CONNECT (used by WebTransport)
- `Other` — any additional settings (including GREASE entries)
### Source port as fingerprint signal For HTTP/3, the module reads client SETTINGS via the `http3.Settingser`
interface: `EnableDatagrams`, `EnableExtendedConnect`, and any additional
settings (including GREASE entries stored as `GREASE_<hex>`).
`remote_addr` in every fp record is the full `host:port` string from #### User-Agent classification
Go's network layer. The collector strips the port before resolving
attacker identity (so 50 connections from the same IP do not produce 50
attackers), but preserves it as `remote_port` in the structured fields.
An attacker whose tooling consistently originates from the same source Every HTTP request captures the `User-Agent` header and classifies it:
port (or a narrow range) is a meaningful signal — some NAT devices, VPN
clients, and C2 frameworks exhibit this behaviour. `remote_port` is | Signal | Description |
stored in the `fingerprint` bounty payload and visible in the Attacker |---|---|
detail page. | Tool category | browser, scanner, curl, python-requests, Go net/http, Java, custom, unknown |
| Tool name | specific tool if detectable (e.g. `Nikto`, `sqlmap`, `Masscan`) |
| Signals | flags such as `headless_browser`, `vuln_scanner`, `exploit_framework` |
Stored as bounty type `fingerprint`, `fingerprint_type: "http_useragent"`.
#### IP leak / source IP signals
Proxy and forwarding headers are inspected on every HTTP request:
- **`ip_leak`** — the attacker's real public IP appeared in `X-Forwarded-For`,
`Forwarded`, `X-Real-IP`, `CF-Connecting-IP`, or `True-Client-IP`. This
happens when an attacker routes through a misconfigured proxy.
Fields: `claimed_ip`, `header_name`, `source_ip`.
- **`spoofed_source`** — a non-routable IP (RFC1918, loopback, link-local,
reserved) appeared in a proxy header — a WAF bypass attempt.
Fields: `claimed_ip`, `header_name`, `category`.
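The two signals above could be derived roughly as follows, using Python's standard `ipaddress` module. This is a sketch under stated assumptions: the function names, the category labels, the exact-case header lookup, and the rule "a globally routable claimed IP that differs from the connection source counts as a leak" are all illustrative, and the `Forwarded` header is left out because its `for=` syntax needs separate parsing.

```python
import ipaddress

PROXY_HEADERS = ("X-Forwarded-For", "X-Real-IP", "CF-Connecting-IP", "True-Client-IP")


def non_routable_category(ip):
    """Return a label for non-routable addresses, or None if none applies."""
    # Order matters: is_private is also true for loopback, link-local,
    # and some reserved ranges, so the narrower checks come first.
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link_local"
    if ip.is_reserved:
        return "reserved"
    if ip.is_private:
        return "private"
    return None


def inspect_proxy_headers(headers: dict, source_ip: str) -> list[dict]:
    """Return ip_leak / spoofed_source findings for one HTTP request."""
    findings = []
    for name in PROXY_HEADERS:
        value = headers.get(name)
        if not value:
            continue
        candidate = value.split(",")[0].strip()  # first hop of a list
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not an IP literal
        category = non_routable_category(ip)
        if category:
            findings.append({"type": "spoofed_source", "claimed_ip": candidate,
                             "header_name": name, "category": category})
        elif ip.is_global and candidate != source_ip:
            findings.append({"type": "ip_leak", "claimed_ip": candidate,
                             "header_name": name, "source_ip": source_ip})
    return findings
```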
#### Source port as fingerprint signal
`remote_addr` from Go's network layer is `host:port`. The collector
strips the port before resolving attacker identity (so 50 connections from
the same IP do not produce 50 attacker rows), but preserves it as
`remote_port` in the bounty payload. An attacker whose tooling
consistently originates from the same source port is a meaningful signal
(some NAT devices, VPN clients, and C2 frameworks exhibit this behaviour).
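The host/port split described above can be sketched in a few lines. The helper name is hypothetical; it assumes the `host:port` form that Go produces, including bracketed IPv6 literals.

```python
def split_remote_addr(remote_addr: str) -> tuple[str, int]:
    """Split 'host:port' (or '[v6]:port') into (host, port)."""
    host, _, port = remote_addr.rpartition(":")  # the port follows the last colon
    return host.strip("[]"), int(port)
```

Identity resolution would then key on the host alone, while the port travels with the record as `remote_port`.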
### VNC
| Signal | Description | Field |
|---|---|---|
| **VNC client version** | RFB protocol version string from the VNC client's greeting | `value` |
### SSH / Telnet — session recording and keystroke dynamics
The `sessrec` module records the full PTY byte stream of every interactive
shell session. Two signals are extracted:
#### Commands executed
Every command entered at the shell prompt is captured with:
- `command` — the raw command string
- `timestamp`, `session_id`, `attacker_ip`, `decky`, `service`
- Aggregated on session end into a command list on the `session_recorded`
event.
Command content reveals intent directly: reconnaissance (`id`, `whoami`,
`uname -a`, `cat /etc/passwd`), lateral movement (`ssh`, `scp`),
persistence (`crontab -e`, `echo >> ~/.bashrc`), exfiltration
(`curl`, `wget`, `base64`, `scp`).
#### Keystroke dynamics (BEHAVE-SHELL spec)
The BEHAVE-SHELL spec (`decnet/profiler/behave_shell/`) extracts
fine-grained typing and session behaviour from the PTY stream. These
become **attribution primitives** — per-`(identity_uuid, primitive)`
state-machine entries that accumulate evidence across sessions.
**Motor patterns** (muscle memory, latency):
| Primitive | Description |
|---|---|
| `interarrival_mean_sec` | Mean time between keystrokes/commands |
| `interarrival_p75_sec`, `interarrival_p99_sec` | Tail latency — distinguishes human from bot |
| `flow_rate_cmd_per_sec` | Command execution rate |
| `burst_event_count` | Clustering in time (burst size) |
| `typing_speed_wpm` | Estimated words per minute |
| `error_correction_ratio` | Backspace and correction frequency |
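To make the interarrival primitives concrete, here is a small sketch that derives them from a list of ascending event timestamps in seconds. The nearest-rank percentile is one of several common definitions and is an assumption; the profiler may use a different estimator.

```python
import math
import statistics


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def interarrival_primitives(timestamps):
    """Compute mean, p75 and p99 of the gaps between successive events."""
    gaps = sorted(b - a for a, b in zip(timestamps, timestamps[1:]))
    if not gaps:
        return {}  # fewer than two events: nothing to measure
    return {
        "interarrival_mean_sec": statistics.fmean(gaps),
        "interarrival_p75_sec": percentile(gaps, 75),
        "interarrival_p99_sec": percentile(gaps, 99),
    }
```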
**Cognitive patterns** (decision-making):
| Primitive | Description |
|---|---|
| `command_error_rate` | Failure-command ratio |
| `retry_on_failure_ratio` | Persistence on error |
| `command_redo_rate` | Repeating the same failed command |
| `pipeline_breadth`, `pipeline_depth` | Command composition style |
| `distinct_tools_used` | Toolkit diversity per session |
| `tool_switch_frequency` | How often the operator changes tool |
| `verbose_flag_usage` | `-v`/`-vv` flag frequency (confidence proxy) |
**Temporal patterns** (working hours, rhythm):
| Primitive | Description |
|---|---|
| `activity_hour_of_day_entropy` | Consistency of working hours |
| `activity_day_of_week_entropy` | Weekly routine |
| `session_duration_p50_sec`, `p95_sec` | Session length distribution |
| `gaps_between_sessions_p50_sec` | Rest period / tool pacing |
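One plausible reading of `activity_hour_of_day_entropy` is the Shannon entropy of the hour-of-day histogram, sketched below. Bucketing in UTC and measuring in bits are assumptions made for this example.

```python
import math
from collections import Counter
from datetime import datetime, timezone


def hour_of_day_entropy(epoch_seconds):
    """Shannon entropy (bits) of the UTC hour at which events occurred."""
    hours = [datetime.fromtimestamp(t, tz=timezone.utc).hour for t in epoch_seconds]
    total = len(hours)
    counts = Counter(hours)
    # 0 bits means every event fell in the same hour; log2(24) is the maximum.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A low value suggests an operator with fixed working hours; a value near the maximum suggests activity spread around the clock, as with automation.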
**Environmental patterns** (operator setup):
| Primitive | Description |
|---|---|
| `shell_type` | bash / sh / zsh / fish / etc. |
| `environment_vars_entropy` | Degree of environment customisation |
| `working_directory_volatility` | Directory-jumping frequency |
| `tty_capabilities` | Terminal rows, cols, and `$TERM` value |
**Operational patterns** (technique selection):
| Primitive | Description |
|---|---|
| `privilege_escalation_attempts` | `sudo` / `su` frequency |
| `lateral_movement_attempts` | SSH/RDP connection attempts |
| `data_exfiltration_indicators` | `scp`, `curl`, `wget`, `base64`, `zcat` |
| `credential_access_attempts` | Grepping for passwords, SSH key files |
| `persistence_technique_count` | Crontab edits, `.bashrc` modifications |
Each primitive has a state machine: `unknown → stable → drifting →
conflicted → multi_actor`. When two or more primitives independently flag
`multi_actor` (e.g. two distinct shell types alternating per session),
an `attribution.profile.multi_actor_suspected` bus event fires — a strong
indicator of a shared credential or a compromised operator account.
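The "two or more primitives" rule can be sketched as a simple count over per-primitive states. The transition logic between states is not described here, so this example covers only the final check; the function name and the `threshold` parameter are hypothetical.

```python
STATES = ("unknown", "stable", "drifting", "conflicted", "multi_actor")


def multi_actor_suspected(primitive_states: dict[str, str], threshold: int = 2) -> bool:
    """True when at least `threshold` primitives are in the multi_actor state."""
    invalid = set(primitive_states.values()) - set(STATES)
    if invalid:
        raise ValueError(f"unknown state(s): {sorted(invalid)}")
    flagged = [name for name, state in primitive_states.items() if state == "multi_actor"]
    return len(flagged) >= threshold
```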
---
## Layer 4 — SMTP / email identity signals
Every inbound email to an `smtp` or `smtp_relay` decky produces a rich set
of identity signals:
### Attacker domains and sender identity
| Signal | Description |
|---|---|
| `mail_from_domain` | Domain in the SMTP envelope `MAIL FROM` |
| `from_domain` | Domain in the `From:` header (may differ from envelope) |
| `return_path_domain` | `Return-Path:` domain |
| `x_mailer` | `X-Mailer` header — identifies the mail client or framework |
| `dkim_signed` | DKIM signature present (bool) |
| `spf_pass` | SPF check result (bool) |
### Victim domain targeting
| Signal | Description |
|---|---|
| `rcpt_domains` | Set of unique domains in the `RCPT TO` list |
| `rcpt_count` | Number of recipients (bulk vs. targeted) |
### Payload and attachment fingerprints
| Signal | Description |
|---|---|
| `body_simhash` | 16-hex-digit (64-bit) similarity hash of the email body; near-duplicate bodies get close hashes, which clusters phishing campaigns |
| `body_sha256` | Exact body hash |
| `attachment_sha256s` | Per-attachment SHA-256 list |
| `attachment_extensions` | File extension set |
| `attachment_macros` | Macro-bearing Office documents detected (bool) |
| `attachment_password_protected` | Encrypted attachment (evasion signal) |
| `html_smuggling` | HTML obfuscation / JS blob smuggling detected (bool) |
| `mal_hash_match` | Any attachment hash matched MalwareBazaar bulk feed (bool) |
| `urls` | Extracted URLs from body |
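As an illustration of how a 16-hex-digit similarity hash can cluster near-duplicate bodies, the sketch below implements a basic 64-bit SimHash over lowercase word tokens. The tokenisation, the use of MD5 as the per-token hash, and the function names are assumptions; the hash DECNET actually uses for `body_simhash` may differ.

```python
import hashlib
import re


def body_simhash(text: str, bits: int = 64) -> str:
    """Return a SimHash of `text` as a zero-padded hex string."""
    votes = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            votes[i] += 1 if (h >> i) & 1 else -1
    value = sum(1 << i for i, v in enumerate(votes) if v > 0)
    return f"{value:0{bits // 4}x}"


def hamming_distance(a: str, b: str) -> int:
    """Number of differing bits between two hex-encoded hashes."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")
```

Two bodies that differ only in a recipient name or a tracking token tend to land within a small Hamming distance of each other, whereas `body_sha256` changes completely.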
---
## Layer 5 — TTP and tool detection
The TTP engine (`decnet/ttp/`) maps collected events onto MITRE ATT&CK
techniques. Detected techniques are stored as `ttp_tag` rows and surfaced
in the Attacker detail page.
**Detected tools** are inferred from:
- Command strings matched against known-tool signatures (nmap, Metasploit,
BloodHound, Mimikatz, linpeas, pspy, etc.)
- User-Agent strings for HTTP tools
- SSH banner strings from the HASSH probe
- TLS fingerprints matching known C2 frameworks (Cobalt Strike JARM, etc.)
---
## Layer 6 — Inter-event timing and phase sequence
The correlator and attribution engine track **how** an attacker behaves
across an entire engagement, not just individual connections.
### Inter-event timing
Time deltas between successive events of the same type reveal automation
vs. human operation:
- Sub-second, uniform intervals → scripted scanner or bot.
- Variable intervals with human-range pauses (2-30 s) → interactive
operator.
- Long gaps between sessions with consistent inter-session intervals →
scheduled beacon or cron-driven implant.
These are captured as attribution primitives (`interarrival_*`) via the
BEHAVE-SHELL profiler and as raw timestamps on `bounty` rows.
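The three timing patterns above can be separated with a coefficient-of-variation test, sketched below. The labels, the one-second cut-off, and the `uniform_cv` threshold are hypothetical values chosen for illustration, not DECNET's tuned parameters.

```python
import statistics


def classify_cadence(intervals_sec, uniform_cv=0.1):
    """Label a list of inter-event intervals (seconds) by regularity."""
    if len(intervals_sec) < 2:
        return "unknown"
    mean = statistics.fmean(intervals_sec)
    # Coefficient of variation: spread relative to the mean interval.
    cv = statistics.pstdev(intervals_sec) / mean if mean else 0.0
    if cv <= uniform_cv:
        # Regular timing: fast means scanner, slow means beacon or cron job.
        return "scripted" if mean < 1.0 else "scheduled"
    return "interactive"
```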
### Phase sequence
The correlator classifies each event into an engagement phase:
`reconnaissance`, `exploitation`, `post-exploitation`, `exfiltration`,
`persistence`, `lateral movement`. The sequence of phases across a
session is a fingerprint in itself — some toolkits always run
reconnaissance before exploitation; human operators often skip phases or
return to earlier ones.
Phase-sequence analysis drives the `phase_sequence` attribution primitive
and feeds the campaign clusterer.
--- ---
## Where fingerprints are stored ## Where fingerprints are stored
Every fingerprint event produces a `bounty` row:
| Bounty `fingerprint_type` | Source | Key discriminating fields | | Bounty `fingerprint_type` | Source | Key discriminating fields |
|---|---|---| |---|---|---|
| `ja3` / `ja4` / `ja4s` | Sniffer | `hash`, `tls_version`, `ciphers` | | `ja3` / `ja3s` / `ja4` / `ja4s` | Sniffer | `hash`, `tls_version`, `sni`, `raw_ciphers` |
| `ja4l` | Sniffer | `rtt_ms`, `client_ttl` |
| `ja4_quic` | Sniffer | `ja4_quic`, `sni`, `alpn` | | `ja4_quic` | Sniffer | `ja4_quic`, `sni`, `alpn` |
| `tcp_os` | Sniffer | `os_guess`, `mss`, `window_scale` | | `tls_certificate` | Sniffer + prober | `cert_sha256`, `subject_cn`, `sans` |
| `jarm` | Prober | `jarm_hash`, `port` | | `tls_resumption` | Sniffer | `mechanisms` |
| `hassh` | Prober | `hassh_server`, `port` | | `tcp_os` | Sniffer | `os_guess`, `mss`, `window_scale`, `options_order` |
| `tcpfp` | Prober | `tcp_fp_hash`, `port` | | `jarm` | Prober | `hash`, `target_port` |
| `hassh_server` | Prober | `hash`, `ssh_banner`, `kex_algorithms` |
| `tcpfp` | Prober | `hash`, `ttl`, `window_size`, `df_bit` |
| `ja4h` | Caddy fp module | `ja4h`, `protocol`, `method`, `remote_port` | | `ja4h` | Caddy fp module | `ja4h`, `protocol`, `method`, `remote_port` |
| `http2_settings` | Caddy fp module | `settings`, `frame_order`, `remote_port` | | `http2_settings` | Caddy fp module | `settings`, `frame_order`, `remote_port` |
| `http3_settings` | Caddy fp module | `settings`, `remote_port` | | `http3_settings` | Caddy fp module | `settings`, `remote_port` |
| `http_useragent` | Ingester (HTTP events) | `category`, `tool`, `signals` |
| `http_header_quirks` | Ingester (HTTP events) | `headers_ordered` |
| `vnc_client_version` | Ingester (VNC events) | `value` |
Bounties are deduplicated per `(attacker_uuid, fingerprint_type, hash)` so Bounties are deduplicated per `(attacker_uuid, fingerprint_type, hash)` so
repeated connections from the same attacker produce one row, not thousands. repeated connections produce one row, not thousands.
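The deduplication rule amounts to keeping the first event per `(attacker_uuid, fingerprint_type, hash)` key. The sketch below shows the idea in memory; in the database this would more likely be a uniqueness constraint, and the function name is hypothetical.

```python
def dedupe_bounties(events):
    """Keep the first event for each (attacker_uuid, fingerprint_type, hash)."""
    seen = set()
    rows = []
    for event in events:
        key = (event["attacker_uuid"], event["fingerprint_type"], event["hash"])
        if key in seen:
            continue  # a repeat connection adds no new row
        seen.add(key)
        rows.append(event)
    return rows
```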
Non-fingerprint bounty types: `ip_leak`, `spoofed_source`, `artifact`
(captured files and emails), `credential` (harvested secrets).
--- ---
## Enabling inline HTTP fingerprinting ## Enabling inline HTTP fingerprinting
The Caddy fp module is **built into the `http` and `https` decky templates The Caddy fp module is **built into the `http` and `https` decky templates
automatically** — no extra configuration is needed. The module activates automatically** — no configuration is needed. For HTTP/3, ensure `http/3`
when the template is deployed. is listed in the service's `http_versions` setting.
For HTTP/3, ensure `http/3` is listed in the service's `http_versions` SSH/Telnet keystroke dynamics require the `behave_shell` feature to be
setting. Caddy's native h3 stack handles UDP/443; the fp module hooks into enabled on the service (see [Service-Personas](Service-Personas)).
it via the `http3.Settingser` interface.
--- ---
## Related pages ## Related pages
- [Identity-Resolution](Identity-Resolution) — how fingerprints are - [Identity-Resolution](Identity-Resolution) — how fingerprints are
clustered into attacker identities clustered into attacker identities and campaigns
- [OS-Fingerprint-Spoofing](OS-Fingerprint-Spoofing) — how DECNET spoofs - [OS-Fingerprint-Spoofing](OS-Fingerprint-Spoofing) — how DECNET spoofs
*its own* OS fingerprint to look like the target OS *its own* OS fingerprint to look like the target OS
- [Security-and-Stealth](Security-and-Stealth) — probe stealth measures - [Security-and-Stealth](Security-and-Stealth) — probe stealth measures
- [Logging-and-Syslog](Logging-and-Syslog) — how fp socket records flow - [Logging-and-Syslog](Logging-and-Syslog) — how fp socket records flow
through syslog_bridge to the collector through syslog_bridge to the collector
- [Service-Personas](Service-Personas) — configuring BEHAVE-SHELL and
session recording per service