- Core Telegram monitoring pipeline (scraper, processor, notifier, downloaders)
- Textual TUI frontend with thread-safe event bus
- SQLite persistence, severity scoring, dedup cache
- Fixed ULP parser: handles https:// truncation, port+path URLs, semicolon separator
- Test suite: 88 tests across scorer, cache, database, processor
utils/cache.py
Tracks already-processed Telegram document IDs to avoid redownloading.
Persists to data/cache.json as a JSON array of integers.
Public API
from utils.cache import is_seen, mark_seen
is_seen(file_id: int) -> bool
Returns True if this document ID has been processed before.
Loads the cache from disk on every call, so lookups reflect entries written by other processes. This is slightly slow for hot loops, which is not an issue given the download cadence.
mark_seen(file_id: int) -> None
Adds file_id to the cache and persists to disk.
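The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of utils/cache.py: the private _load helper and the optional path parameter are assumptions added so the sketch can be exercised against a temporary file, whereas the documented signatures take only file_id.

```python
import json
from pathlib import Path

CACHE_PATH = Path("data/cache.json")  # location given in the Storage section


def _load(path: Path = CACHE_PATH) -> set[int]:
    """Read the JSON array from disk (hypothetical helper).

    A missing file is treated as an empty cache.
    """
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def is_seen(file_id: int, path: Path = CACHE_PATH) -> bool:
    """Return True if file_id is in the cache; reloads from disk each call."""
    return file_id in _load(path)


def mark_seen(file_id: int, path: Path = CACHE_PATH) -> None:
    """Add file_id to the cache and write the whole array back to disk."""
    seen = _load(path)
    seen.add(file_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(seen)))
```

Because mark_seen does a full load, modify, and save, two writers running at the same time could overwrite each other's additions, which is the non-atomic behaviour mentioned in the Notes.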
Storage
- File: data/cache.json
- Format: JSON array of integers, e.g. [123456789, 987654321, ...]
- No expiry: grows indefinitely. Safe to delete to re-process all files.
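As a rough illustration of the storage format, the sketch below writes a sample array, reads it back, and then deletes the file to reset the cache. It works in a temporary directory rather than on the real data/cache.json, and reset_cache is a hypothetical helper that is not part of utils.cache.

```python
import json
import tempfile
from pathlib import Path


def reset_cache(path: Path) -> None:
    """Delete the cache file if present (hypothetical helper)."""
    path.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "cache.json"  # stand-in for data/cache.json

    # The on-disk format is a plain JSON array of integer document IDs.
    cache_file.write_text(json.dumps([123456789, 987654321]))
    ids = json.loads(cache_file.read_text())

    # Deleting the file forgets every ID, so all files get processed again.
    reset_cache(cache_file)
    still_there = cache_file.exists()
```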
Notes
- is_seen + mark_seen are called in core/scraper.py after a successful download+process cycle, not before, so a file that fails mid-process will be retried on the next run.
- Not thread-safe (load/modify/save is not atomic). Acceptable because downloads are sequential within the bot loop.
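The retry behaviour in the first note can be sketched as a loop that marks an ID only after processing succeeds. This is an illustration under stated assumptions, not the code in core/scraper.py: run_once and download_and_process are hypothetical names, the cache is replaced by an in-memory set, and the sketch assumes the is_seen check guards the download, which is one reading of the note.

```python
from typing import Callable, Iterable, List, Set


def run_once(
    file_ids: Iterable[int],
    is_seen: Callable[[int], bool],
    mark_seen: Callable[[int], None],
    download_and_process: Callable[[int], None],
) -> None:
    """Process each unseen ID sequentially; mark it seen only on success."""
    for file_id in file_ids:
        if is_seen(file_id):
            continue  # already handled on an earlier run
        try:
            download_and_process(file_id)
        except Exception:
            continue  # not marked, so the next run retries this ID
        mark_seen(file_id)


# In-memory stand-ins for utils.cache, for illustration only.
seen: Set[int] = set()
attempts: List[int] = []


def flaky(file_id: int) -> None:
    """Fail the first time ID 2 is processed, succeed otherwise."""
    attempts.append(file_id)
    if file_id == 2 and attempts.count(2) == 1:
        raise RuntimeError("simulated mid-process failure")


run_once([1, 2], seen.__contains__, seen.add, flaky)
first_run = set(seen)  # ID 2 failed, so only ID 1 is marked
run_once([1, 2], seen.__contains__, seen.add, flaky)
```

On the second call, ID 1 is skipped and ID 2 is attempted again, since the failed attempt never reached mark_seen.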