feat(intel,ingester): mal_hash feed + observed_attachments table (DEBT-046)
New MalHashProvider sibling ABC (decnet/intel/base.py), since SHA-256 is a different keyspace from IntelProvider's IPs. MalwareBazaarProvider mirrors FeodoProvider's bulk-feed shape: 24h refresh via _ensure_fresh / _refresh, in-memory set[str] of hex-lowercased hashes, set-membership lookup. Auth-keyed via DECNET_MALWAREBAZAAR_AUTH_KEY; absent key silent-no-ops the lane (single warning, no HTTP traffic).

Per-hash observations persist to a new observed_attachments table. DECNET is a honeypot platform: every attachment hash an attacker delivers is intel, regardless of whether anyone classified it. Verdict is sticky: True never downgrades to False/None on subsequent observations. Out of scope: API surface, federation export, retention.

Ingester _publish_email_received calls the provider for each attachment sha256, sets mal_hash_match on the bus payload (omitted entirely when the message had no attachments, which keeps R0046's `is True` predicate silent on hash-less mail, matching pre-paydown behavior), and upserts the row regardless of provider availability.
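
The sticky-verdict rule from the commit message can be illustrated with a small merge function. This is a minimal sketch and not the commit's code: `merge_verdict` is a hypothetical helper name, and the real upsert against observed_attachments is not part of the diff shown here. The message only states that True never downgrades; preferring a definite False over an unknown None is an assumption made for this example.

```python
from typing import Optional


def merge_verdict(stored: Optional[bool], observed: Optional[bool]) -> Optional[bool]:
    """Combine a stored verdict with a new observation (hypothetical helper)."""
    # True is sticky: once either side is True, the result stays True.
    if stored is True or observed is True:
        return True
    # Assumption for this sketch: a definite False beats "unknown" (None).
    if stored is False or observed is False:
        return False
    # Neither side has a verdict yet.
    return None
```

Under these assumptions, a hash flagged once stays flagged even if a later lookup runs with the provider disabled and returns None.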
@@ -21,7 +21,7 @@ from __future__ import annotations
 import os
 from typing import List
 
-from decnet.intel.base import IntelProvider
+from decnet.intel.base import IntelProvider, MalHashProvider
 
 _KNOWN_PROVIDERS = ("greynoise", "abuseipdb", "feodo", "threatfox")
 
@@ -37,6 +37,40 @@ def _provider_list() -> list[str]:
     return [p.strip().lower() for p in raw.split(",") if p.strip()]
 
 
+_mal_hash_singleton: MalHashProvider | None = None
+_mal_hash_initialized: bool = False
+
+
+def get_mal_hash_provider() -> MalHashProvider | None:
+    """Return the configured malware-hash lookup provider singleton.
+
+    Sibling factory to :func:`get_intel_providers` — different keyspace
+    (file SHA-256 vs IP), different consumer (the email ingester at
+    observation time, not the IP-keyed intel-worker fan-out). Returns
+    ``None`` only if intel is disabled wholesale; otherwise returns a
+    provider whose :meth:`is_known_bad` self-disables to a no-op when
+    ``DECNET_MALWAREBAZAAR_AUTH_KEY`` is unset, so the ingester never
+    has to special-case "no provider configured."
+    """
+    global _mal_hash_singleton, _mal_hash_initialized
+    if _mal_hash_initialized:
+        return _mal_hash_singleton
+    _mal_hash_initialized = True
+    if not _enabled():
+        _mal_hash_singleton = None
+        return None
+    from decnet.intel.mal_hash import MalwareBazaarProvider
+    _mal_hash_singleton = MalwareBazaarProvider()
+    return _mal_hash_singleton
+
+
+def _reset_mal_hash_provider_for_testing() -> None:
+    """Test hook — drop the singleton so the next call re-reads env."""
+    global _mal_hash_singleton, _mal_hash_initialized
+    _mal_hash_singleton = None
+    _mal_hash_initialized = False
+
+
 def get_intel_providers() -> List[IntelProvider]:
     """Return the configured threat-intel providers.
 
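
A caller of the factory still has two cases to handle: `None` (intel disabled wholesale) and a provider instance. The sketch below shows one way a consumer could build the bus payload around that, including the "omit `mal_hash_match` when there are no attachments" rule from the commit message. It is an illustration under assumptions, not the ingester's code: `StubProvider`, `build_payload`, the `attachments` payload key, the synchronous `is_known_bad(sha256)` signature, and the any-attachment-matches semantics are all hypothetical.

```python
from __future__ import annotations

from typing import Any, Iterable, Optional


class StubProvider:
    """Stand-in for a hash provider: a plain in-memory set-membership check."""

    def __init__(self, bad_hashes: Iterable[str]) -> None:
        # Store lowercase hex so lookups are case-insensitive.
        self._bad = {h.lower() for h in bad_hashes}

    def is_known_bad(self, sha256: str) -> Optional[bool]:
        # Assumed signature; the real method may differ (e.g. be async).
        return sha256.lower() in self._bad


def build_payload(hashes: list[str], provider: Optional[StubProvider]) -> dict[str, Any]:
    """Build a bus payload, omitting mal_hash_match for attachment-less mail."""
    payload: dict[str, Any] = {"attachments": list(hashes)}
    if not hashes:
        # No attachments: leave the key out so `is True` checks stay silent.
        return payload
    if provider is None:
        # Intel disabled wholesale: no verdict available (assumed to be None).
        payload["mal_hash_match"] = None
    else:
        # Assumption: the message matches if any attachment hash is known bad.
        payload["mal_hash_match"] = any(
            provider.is_known_bad(h) is True for h in hashes
        )
    return payload
```

Leaving the key out, rather than setting it to False, means a downstream check written as `payload.get("mal_hash_match") is True` behaves the same for attachment-less mail as it did before the field existed.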