ULP Monitor - Quick Reference
For Claude Code: read the per-file `.md` alongside each `.py` before editing.
Full docs in `README.md`.
Project layout
ulp_monitor/
├── main.py                Entry point (--no-tui flag for CLI mode)
├── config.py              All settings - edit this for keywords, channels, paths
│
├── core/                  Telegram I/O pipeline (all async, Telethon-dependent)
│   ├── scraper.py         Live listener + backfill orchestration
│   ├── tdl_downloader.py  tdl subprocess wrapper + Telethon fallback
│   ├── bot_downloader.py  Inline "DOWNLOAD" button click flow
│   ├── processor.py       Archive extraction (.zip/.7z/.rar) + line search
│   └── notifier.py        Scoring → dedup → DB → hits.txt/csv → Telegram alert
│
├── utils/                 Pure logic, no Telegram deps, no async
│   ├── scorer.py          Severity scoring (CRITICAL/HIGH/MEDIUM/LOW)
│   ├── cache.py           Seen file-ID dedup (data/cache.json)
│   └── database.py        SQLite read/write (data/hits.db)
│
├── tui/                   Textual TUI - runs in main thread
│   ├── app.py             MonitorApp + all screens + bot thread launcher
│   └── events.py          Thread-safe queue.Queue event bus
│
└── data/                  Runtime output - gitignored
    ├── hits.db
    ├── hits.txt
    ├── hits.csv
    ├── cache.json
    ├── dedup.json
    └── logs/monitor.log
Data flow
Telegram channel
  └─ new message with file / download button
       │
       ├─ core/scraper.py          detects + guards (size, extension, dedup)
       │
       ├─ core/tdl_downloader.py   downloads via tdl (batched)
       │    └─ core/scraper.py     Telethon fallback if tdl fails
       │
       ├─ core/bot_downloader.py   handles inline button → bot reply flow
       │
       ├─ core/processor.py        extracts archive → searches .txt line by line
       │
       └─ core/notifier.py         scores → deduplicates → persists → alerts
            ├─ utils/scorer.py
            ├─ utils/database.py
            └─ tui/events.py       posts EvHit to TUI
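
The extract-and-search step can be sketched roughly as follows. This is only an illustration, not the code in core/processor.py: it handles .zip alone via the standard library (the real pipeline also lists .7z and .rar and tries archive passwords), and the function name `search_zip` is hypothetical.

```python
import io
import re
import zipfile


def search_zip(data: bytes, patterns: list[str]) -> list[str]:
    """Return lines from .txt members of a zip archive that match any pattern.

    Hypothetical helper for illustration; unencrypted .zip only.
    """
    compiled = [re.compile(p) for p in patterns]
    hits = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            # Only plain-text members are searched.
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            with zf.open(info) as member:
                # Iterate line by line so a large dump is never held in memory whole.
                for raw in member:
                    line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
                    if any(rx.search(line) for rx in compiled):
                        hits.append(line)
    return hits
```

Each matching line would then be handed on to the scoring and notification stage.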
Threading architecture
main thread (Textual's event loop)
  ├─ MonitorApp.on_mount()
  │    ├─ bus.init_bus()                       creates queue.Queue on THIS loop
  │    ├─ threading.Thread → _run_bot_thread()
  │    └─ set_interval(0.1, _drain_bus)
  │
  ├─ _drain_bus()  [every 100ms]
  │    └─ queue.Queue.get_nowait() → dispatch to widgets
  │
  └─ Textual widgets, screens, keybindings

bot thread (own asyncio event loop)
  └─ _bot_main()
       ├─ bot_client.connect() + sign_in()
       ├─ user_client.connect() + is_user_authorized()
       ├─ warm_entity_cache()
       ├─ _make_handler() → NewMessage handler registered
       ├─ backfill_all()
       └─ run_until_disconnected() + _watch_channels()  [gathered]

cross-thread communication
  bot → TUI:  bus.post(event)               [queue.Queue.put_nowait, always safe]
  TUI → bot:  loop.call_soon_threadsafe()   [asyncio.Event.set for channel changes]
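
The two cross-thread directions can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in for tui/events.py and the bot thread launcher: `run_bot_thread`, `drain`, `demo`, and the `shared` dict are hypothetical names, and the drain runs once here rather than on a 100 ms timer.

```python
import asyncio
import queue
import threading


def run_bot_thread(bus: queue.Queue, shared: dict, ready: threading.Event) -> None:
    """Bot thread: runs its own asyncio event loop."""

    async def bot_main() -> None:
        shared["loop"] = asyncio.get_running_loop()
        shared["channels_changed"] = asyncio.Event()
        ready.set()                        # loop and event now exist
        bus.put_nowait("bot started")      # bot → TUI: queue.Queue is thread-safe
        await shared["channels_changed"].wait()
        bus.put_nowait("channels reloaded")

    asyncio.run(bot_main())


def drain(bus: queue.Queue) -> list:
    """Main-thread drain: non-blocking reads until the queue is empty."""
    events = []
    while True:
        try:
            events.append(bus.get_nowait())
        except queue.Empty:
            return events


def demo() -> list:
    bus: queue.Queue = queue.Queue()
    shared: dict = {}
    ready = threading.Event()
    bot = threading.Thread(target=run_bot_thread, args=(bus, shared, ready), daemon=True)
    bot.start()
    ready.wait(timeout=5)
    # TUI → bot: asyncio.Event is not thread-safe, so set() is scheduled
    # on the bot's loop instead of being called directly from this thread.
    shared["loop"].call_soon_threadsafe(shared["channels_changed"].set)
    bot.join(timeout=5)
    return drain(bus)
```

Calling `demo()` collects the events the bot thread posted before and after it was woken. The design point is that neither thread ever touches the other's event loop objects directly.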
Config quick reference (config.py)
| Setting | Type | Description |
|---|---|---|
| `API_ID` | int | From my.telegram.org |
| `API_HASH` | str | From my.telegram.org |
| `BOT_TOKEN` | str | From @BotFather |
| `NOTIFY_CHAT_ID` | int | Your Telegram user/group ID |
| `SESSION_NAME` | str | Session file name (default: monitor_session) |
| `TARGET_KEYWORDS` | list[str] | Regex patterns. @-prefixed → employee email (CRITICAL). Plain → domain match (LOW) |
| `WATCHED_CHANNELS` | list[str\|int] | Usernames or -100xxxxxxxxxx IDs |
| `BACKFILL_LIMIT` | int | Messages to scan per channel on startup (0 = off) |
| `ALLOWED_EXTENSIONS` | set | .txt .zip .7z .rar |
| `MAX_FILE_SIZE` | int | Bytes (default 4 GB) |
| `ARCHIVE_PASSWORDS` | list[bytes] | Tried in order on locked archives |
| `TDL_NAMESPACE` | str\|None | tdl login -n <name> namespace |
| `TDL_THREADS` | int | Chunk workers per file (-t) |
| `TDL_PERFILE` | int | Concurrent files per tdl call (-l) |
| `TDL_AMOUNT` | int | Messages per batch |
| `TEMP_DIR` | Path | data/tmp |
| `HITS_FILE` | Path | data/hits.txt |
| `LOG_FILE` | Path | data/logs/monitor.log |
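
For orientation, a config.py fragment matching the types above might look like the following. Every value is a placeholder chosen for illustration (credentials, channel names, password, and the reading of "4 GB" as binary gigabytes are all assumptions), and the tdl tuning settings are left out because their defaults are not stated here.

```python
from pathlib import Path

# Placeholder values only - substitute your own.
API_ID = 123456                                   # int, from my.telegram.org
API_HASH = "your-api-hash"                        # str, from my.telegram.org
BOT_TOKEN = "your-bot-token"                      # str, from @BotFather
NOTIFY_CHAT_ID = 123456789                        # int, your user/group ID
SESSION_NAME = "monitor_session"

# @-prefixed pattern → employee email; plain pattern → domain match.
TARGET_KEYWORDS = [r"@myorg\.cl", r"myorg\.cl"]
WATCHED_CHANNELS = ["example_channel", -1001234567890]   # hypothetical channels
BACKFILL_LIMIT = 0                                # 0 = backfill off

ALLOWED_EXTENSIONS = {".txt", ".zip", ".7z", ".rar"}
MAX_FILE_SIZE = 4 * 1024**3                       # bytes; assumes 4 GB means 4 GiB
ARCHIVE_PASSWORDS = [b"example-password"]         # bytes, tried in order
TDL_NAMESPACE = None                              # or the name used with tdl login -n

TEMP_DIR = Path("data/tmp")
HITS_FILE = Path("data/hits.txt")
LOG_FILE = Path("data/logs/monitor.log")
```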
Severity scoring summary
| Severity | Score | Triggers |
|---|---|---|
| CRITICAL | 40 | Employee email (@myorg.cl in username) · Privileged service URL (admin, vpn, rdp, gitlab…) |
| HIGH | 30 | Internal service URL (intranet, erp, sso, owa…) |
| MEDIUM | 20 | Client-facing URL (app, booking, helpdesk…) |
| LOW | 10 | Org domain appears anywhere in line |
@-keyword rule: the pattern requires a literal @ immediately before the domain, so a line with the username user@gmail.com and a URL containing myorg.cl does not trigger CRITICAL.
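
The tiers above could be approximated as in the sketch below. It is not the logic in utils/scorer.py: the names `ORG_DOMAIN` and `score` are hypothetical, the service lists contain only the examples shown in the table, and matching service names against hostname labels is an assumption about how a URL is classified.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

ORG_DOMAIN = r"myorg\.cl"                       # example domain from the table
EMAIL_RX = re.compile("@" + ORG_DOMAIN, re.IGNORECASE)
DOMAIN_RX = re.compile(ORG_DOMAIN, re.IGNORECASE)

# Only the service names listed in the table; the real lists are longer.
PRIVILEGED = {"admin", "vpn", "rdp", "gitlab"}
INTERNAL = {"intranet", "erp", "sso", "owa"}
CLIENT = {"app", "booking", "helpdesk"}


def score(url: str, username: str) -> Optional[Tuple[str, int]]:
    """Return (severity, score), or None if the org domain is absent."""
    if not (DOMAIN_RX.search(url) or DOMAIN_RX.search(username)):
        return None
    # @-keyword rule: a literal @ must sit directly before the domain.
    if EMAIL_RX.search(username):
        return ("CRITICAL", 40)
    # Assumption: a service is identified by a label in the hostname.
    host = urlsplit(url if "://" in url else "//" + url).hostname or ""
    labels = set(host.lower().split("."))
    if labels & PRIVILEGED:
        return ("CRITICAL", 40)
    if labels & INTERNAL:
        return ("HIGH", 30)
    if labels & CLIENT:
        return ("MEDIUM", 20)
    return ("LOW", 10)
```

With this sketch, `score("https://www.myorg.cl/login", "user@gmail.com")` lands in the LOW tier, consistent with the @-keyword rule.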
TUI keybindings
| Key | Action | Screen |
|---|---|---|
| `s` | Search hits DB | → SearchScreen |
| `h` | Browse hits by severity | → HitsDBScreen |
| `k` | Edit keyword patterns live | → KeywordsScreen |
| `c` | Clear download + hits logs | main |
| `r` | Force-refresh stats bar | main |
| `q` / `ctrl+c` | Quit | any |
| `Escape` | Back to main | sub-screens |
| `1`/`2`/`3`/`4` | Filter CRITICAL/HIGH/MEDIUM/LOW | HitsDBScreen |
| `r` | Load recent 50 | HitsDBScreen |
Per-file reference docs
| File | Reference |
|---|---|
| `utils/scorer.py` | `utils/scorer.md` |
| `utils/cache.py` | `utils/cache.md` |
| `utils/database.py` | `utils/database.md` |
| `core/scraper.py` | `core/scraper.md` |
| `core/processor.py` | `core/processor.md` |
| `core/notifier.py` | `core/notifier.md` |
| `core/tdl_downloader.py` | `core/tdl_downloader.md` |
| `core/bot_downloader.py` | `core/bot_downloader.md` |
| `tui/app.py` | `tui/app.md` |
| `tui/events.py` | `tui/events.md` |
Common tasks
Add a new keyword at runtime: open the TUI → press k → add pattern → active immediately. Copy to config.TARGET_KEYWORDS to persist.
Add a channel at runtime: type username or numeric ID in the Channels panel → ➕ Add. Handler re-registers immediately. Edit config.WATCHED_CHANNELS to persist.
Query hits from CLI:
sqlite3 data/hits.db "SELECT severity, username, url FROM hits WHERE seen_before=0 ORDER BY score DESC LIMIT 20"
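
The same query can be run from Python with the standard sqlite3 module. This is a sketch under the assumption that the `hits` table has exactly the columns used in the command above; `top_new_hits` is a hypothetical helper, not part of utils/database.py.

```python
import sqlite3


def top_new_hits(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return (severity, username, url) rows for unseen hits, highest score first."""
    return conn.execute(
        "SELECT severity, username, url FROM hits "
        "WHERE seen_before = 0 ORDER BY score DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Usage would be along the lines of `top_new_hits(sqlite3.connect("data/hits.db"))`.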
Re-process all files (wipe cache):
rm data/cache.json data/dedup.json
Check what's happening: tail -f data/logs/monitor.log