feat(intel,ingester): mal_hash feed + observed_attachments table (DEBT-046)
New MalHashProvider sibling ABC (decnet/intel/base.py), since SHA-256 is a different keyspace from IntelProvider's IPs. MalwareBazaarProvider mirrors FeodoProvider's bulk-feed shape: 24h refresh via _ensure_fresh / _refresh, in-memory set[str] of hex-lowercased hashes, set-membership lookup. Auth-keyed via DECNET_MALWAREBAZAAR_AUTH_KEY; an absent key silently no-ops the lane (single warning, no HTTP traffic).

Per-hash observations persist to a new observed_attachments table. DECNET is a honeypot platform: every attachment hash an attacker delivers is intel, regardless of whether anyone classified it. Verdict is sticky: True never downgrades to False/None on subsequent observations. Out of scope: API surface, federation export, retention.

Ingester _publish_email_received calls the provider for each attachment sha256, sets mal_hash_match on the bus payload (omitted entirely when the message had no attachments, which keeps R0046's `is True` predicate silent on hash-less mail, matching pre-paydown behavior), and upserts the row regardless of provider availability.
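The sticky-verdict rule and the payload omission described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this commit: a plain dict stands in for the observed_attachments table, and `merge_verdict`, `upsert_observation`, and `build_payload` are hypothetical names. The message does not say how per-attachment results are combined into mal_hash_match, so `any()` is an assumption, as is storing None when no verdict is available.

```python
from typing import Optional

# Hypothetical stand-in for the observed_attachments table: sha256 -> verdict.
Table = dict[str, Optional[bool]]


def merge_verdict(stored: Optional[bool], observed: Optional[bool]) -> Optional[bool]:
    """Sticky merge: once a verdict is True it never downgrades to False/None."""
    if stored is True or observed is True:
        return True
    # Assumption: otherwise prefer the newest non-None value.
    return observed if observed is not None else stored


def upsert_observation(table: Table, sha256: str, verdict: Optional[bool]) -> None:
    """Record the hash on every observation, whether or not a verdict exists."""
    key = sha256.lower()
    table[key] = merge_verdict(table.get(key), verdict)


def build_payload(base: dict, attachment_verdicts: list[bool]) -> dict:
    """Add mal_hash_match only when the message carried attachments."""
    payload = dict(base)
    if attachment_verdicts:
        # Assumption: one bad attachment is enough to flag the message.
        payload["mal_hash_match"] = any(attachment_verdicts)
    return payload
```

Because the key is left out for attachment-less mail, a downstream check such as `payload.get("mal_hash_match") is True` stays false for that mail without any special casing.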
@@ -78,3 +78,33 @@ class IntelProvider(ABC):
         entire IP. Implementations should also respect
         ``self._semaphore`` to bound in-flight calls.
         """
+
+
+class MalHashProvider(ABC):
+    """Abstract bad-hash lookup provider.
+
+    Sibling to :class:`IntelProvider` — different keyspace (file SHA-256
+    vs IP), different consumer (the email ingester at observation time,
+    not the IP-keyed intel-worker fan-out). Kept as a separate ABC so
+    the ``lookup(ip)`` semantics on ``IntelProvider`` stay honest.
+
+    Concrete impls today:
+
+    * :class:`decnet.intel.mal_hash.MalwareBazaarProvider` — bulk-feed
+      shape mirroring :class:`decnet.intel.feodo.FeodoProvider`.
+
+    Future impls (paid VirusTotal subscription, in-house allowlist) plug
+    in behind the same factory in :func:`decnet.intel.factory.get_mal_hash_provider`.
+    """
+
+    name: str
+
+    @abstractmethod
+    async def is_known_bad(self, sha256: str) -> bool:
+        """Return whether *sha256* is on this provider's bad-hash list.
+
+        MUST NOT raise — return ``False`` on any error (the caller is the
+        ingester, not a worker; an exception here would taint a totally
+        unrelated bus payload). The provider is responsible for logging
+        its own errors.
+        """
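A concrete provider honoring the "MUST NOT raise" contract might look roughly like the sketch below. It is not the MalwareBazaarProvider from this commit: `BulkFeedHashProvider`, the injected `fetch` coroutine, and the `ttl_seconds` parameter are hypothetical, and the real feed URL and response format are deliberately left out. The ABC is repeated in trimmed form only so the block runs on its own.

```python
import asyncio
import logging
import time
from abc import ABC, abstractmethod
from typing import Awaitable, Callable, Iterable, Optional

log = logging.getLogger(__name__)


class MalHashProvider(ABC):
    """Trimmed copy of the ABC from the diff, so the sketch is self-contained."""

    name: str

    @abstractmethod
    async def is_known_bad(self, sha256: str) -> bool: ...


class BulkFeedHashProvider(MalHashProvider):
    """Hypothetical bulk-feed provider: periodic refresh, set-membership lookup."""

    name = "bulk_feed_sketch"

    def __init__(
        self,
        fetch: Callable[[str], Awaitable[Iterable[str]]],  # assumed: returns hex hashes
        auth_key: Optional[str],
        ttl_seconds: float = 24 * 3600,
    ) -> None:
        self._fetch = fetch
        self._auth_key = auth_key
        self._ttl = ttl_seconds
        self._hashes: set[str] = set()
        self._fetched_at: Optional[float] = None
        self._warned = False

    async def _ensure_fresh(self) -> None:
        now = time.monotonic()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return
        hashes = await self._fetch(self._auth_key or "")
        self._hashes = {h.strip().lower() for h in hashes}
        self._fetched_at = now

    async def is_known_bad(self, sha256: str) -> bool:
        if not self._auth_key:
            # No key: warn once, never call fetch, always answer False.
            if not self._warned:
                log.warning("%s: no auth key configured, lane disabled", self.name)
                self._warned = True
            return False
        try:
            await self._ensure_fresh()
        except Exception:
            # Contract: log here and return False instead of raising.
            # A failed refresh is retried on the next lookup.
            log.exception("%s: feed refresh failed", self.name)
            return False
        return sha256.lower() in self._hashes
```

Injecting `fetch` keeps the sketch free of any real HTTP client and makes the no-key and error paths easy to exercise with a fake coroutine.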