fix: serialize HTTP headers as JSON so tool detection and bounty extraction work

templates/decnet_logging.py calls str(v) on all SD-PARAM values, turning a
headers dict into Python repr ("{'User-Agent': ...}") rather than JSON.
detect_tools_from_headers() called json.loads() on that string and silently
swallowed the error, returning [] for every HTTP event. The same bug prevented
the ingester from extracting User-Agent bounty fingerprints.

- templates/http/server.py: wrap headers dict in json.dumps() before passing
  to syslog_line so the value is a valid JSON string in the syslog record
- behavioral.py: add ast.literal_eval fallback for existing DB rows that were
  stored with the old Python repr format
- ingester.py: parse headers as JSON string in _extract_bounty so User-Agent
  fingerprints are stored correctly going forward
- tests: add test_json_string_headers and test_python_repr_headers_fallback
  to exercise both formats in detect_tools_from_headers
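
As a rough illustration of the two header formats described above, the sketch below parses a headers value by trying JSON first and then falling back to ast.literal_eval for legacy repr strings. The helper name parse_headers is hypothetical and is not taken from behavioral.py or ingester.py; the actual code in this commit may be structured differently.

```python
import ast
import json


def parse_headers(raw):
    """Hypothetical helper: normalize a headers value to a dict.

    Accepts a dict, a JSON string (post-fix format), or a Python repr
    string (legacy format produced by str(dict)).
    """
    if isinstance(raw, dict):
        return raw
    if not isinstance(raw, str):
        return {}
    try:
        # Post-fix rows: headers were written with json.dumps().
        parsed = json.loads(raw)
    except ValueError:
        try:
            # Legacy rows: single-quoted Python repr, which is not valid JSON.
            # literal_eval only evaluates literals, so it does not run code.
            parsed = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            return {}
    return parsed if isinstance(parsed, dict) else {}


headers = {"User-Agent": "Nmap Scripting Engine"}
legacy = str(headers)        # old format: Python repr of the dict
fixed = json.dumps(headers)  # new format: JSON string
```

Both legacy and fixed should parse back to the same dict, while input that is neither JSON nor a Python literal yields an empty dict instead of raising.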
2026-04-15 17:03:52 -04:00
parent 02e73a19d5
commit 89887ec6fd
4 changed files with 38 additions and 4 deletions
--- a/tests/test_profiler_behavioral.py
+++ b/tests/test_profiler_behavioral.py
@@ -287,6 +287,18 @@ class TestDetectToolsFromHeaders:
        result = detect_tools_from_headers(events)
        assert result.count("sqlmap") == 1

+    def test_json_string_headers(self):
+        # Post-fix format: headers stored as a JSON string (not a dict).
+        e = _mk(0, event_type="request", service="http",
+                fields={"headers": '{"User-Agent": "Nmap Scripting Engine"}'})
+        assert "nmap" in detect_tools_from_headers([e])
+
+    def test_python_repr_headers_fallback(self):
+        # Legacy format: headers stored as Python repr string (str(dict)).
+        e = _mk(0, event_type="request", service="http",
+                fields={"headers": "{'User-Agent': 'Nmap Scripting Engine'}"})
+        assert "nmap" in detect_tools_from_headers([e])
+

 # ─── phase_sequence ────────────────────────────────────────────────────────