Mirrors the Canarytokens.org trick: a base64-wrapped CHANGE REPLICATION
SOURCE TO + START REPLICA block in the dump trailer. Importing the
file into MySQL resolves <slug>.<dns_zone> (DNS trip) and opens a 3306
replica handshake whose SOURCE_USER smuggles @@hostname and
@@lc_time_names of the victim DB.
DNS lookup alone is sufficient for detection via the existing canary
dns_server; capturing the smuggled metadata via a 3306 handshake
responder is a follow-up.
honeydoc previously emitted HTML only — operators picking 'Document'
out of the dropdown got a .html file dropped at /Documents/
quarterly_report.docx, which any attacker would clock the moment they
ran 'file' on it.
Two new generators that emit the real artifact format:
- honeydoc_docx: stdlib zipfile only. Builds a minimal but valid
Office Open XML zip with the same Q3 review body as the HTML
flavor and an external-image relationship pointing at the
callback URL — same trick the operator-upload DOCX instrumenter
uses, fetched on document open by Word and LibreOffice. Reuses
_drawing() and _next_rid() from instrumenters/docx.py to keep
the body/relationships shape identical between synthesised and
instrumented files.
- honeydoc_pdf: pikepdf-backed. One-page PDF in the 14 base fonts
(Helvetica, no font embedding), realistic body, /OpenAction /URI
on the catalog so most viewers fire the callback on document
open. Falls back to a clear error if pikepdf is missing so the
operator can switch to honeydoc / honeydoc_docx.
Default placement paths now reflect each generator's true extension
(.html / .docx / .pdf) so the UI suggests something sensible. Both
generators surfaced in the New Token modal's generator dropdown.
Mirrors the decnet.intel layout (base + factory + lazy concrete
imports). Defines:
- CanaryArtifact / CanaryContext dataclasses + the generator and
instrumenter ABCs they share
- factory dispatch for generators (git_config/env_file/ssh_key/
aws_creds/honeydoc) and instrumenters (docx/xlsx/pdf/html/image/
plain/passthrough), plus pick_instrumenter_for_mime() for MIME-driven
dispatch on operator uploads
- persona-aware default placement paths (Linux vs. Windows-shaped)
and absolute-path validation that the API will use to validate
operator-supplied placement_path values
- on-disk blob store: sha256-keyed two-level fan-out, idempotent
writes, refcount-aware unlink (the DB row is the source of truth)
Also covers prior commits' tests (bus topics, models, repo CRUD)
under tests/canary/. 79 tests, all pass.