
anti 4c104cddd2 Add web frontend with JWT auth, RBAC, SSE dashboard, and config editor

- FastAPI + htmx + Jinja2 web frontend, started with --web flag
- JWT HS256 auth (WEB_SECRET_KEY) with httpOnly cookies; access (15 min) +
  refresh (7 day) tokens; refresh rotation + JTI revocation in data/web.db
- RBAC: superadmin > admin > reader enforced per route
- Live SSE dashboard fed by tui/events broadcast queue
- Config editor: keyword groups and channel list saved to data/runtime_config.json
  and hot-reloaded in-process (scorer.reload_from_config, signal_channel_changed)
- config.py migrated to load groups/channels from runtime_config.json;
  falls back to hardcoded defaults when file absent
- tui/events.py: subscribe/unsubscribe broadcast, set_bot_context/signal_channel_changed
- utils/scorer.py: import config as _config (fixes local binding); reload_from_config()
- utils/database.py: count_by_severity, recent_for_domains, count_by_severity_for_domains
- 53 new tests (events bus, JWT lifecycle, web DB CRUD, RBAC enforcement,
  config round-trip); total 141 passing

2026-04-02 11:41:46 -03:00

3.6 KiB


utils/scorer.py

Severity scoring for credential hits. No Telegram deps. Pure logic.

Public API

from utils.scorer import score_hit, score_hits, summarize, ScoredHit
from utils.scorer import CRITICAL, HIGH, MEDIUM, LOW, SEVERITY_EMOJI, SEVERITY_SCORES

`score_hit(line: str) -> ScoredHit`

Score a single raw credential line. Parses ULP format (url:user:pass), runs all checks, returns a ScoredHit.

`score_hits(lines: list[str]) -> list[ScoredHit]`

Score a list of lines. Returns the results sorted by score, highest first.

`summarize(scored: list[ScoredHit]) -> dict`

Returns {CRITICAL: n, HIGH: n, MEDIUM: n, LOW: n}.
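
The ordering and the summary shape described above can be sketched in a few lines. This is an illustrative stand-in, not the module's code: `Hit`, `sort_hits` and `summarize_sketch` are hypothetical names, and the severity constants are assumed to be plain strings.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: severities are plain strings mapped to the documented scores.
SEVERITY_SCORES = {"CRITICAL": 40, "HIGH": 30, "MEDIUM": 20, "LOW": 10}


@dataclass
class Hit:
    """Hypothetical, trimmed-down stand-in for ScoredHit."""
    raw: str
    severity: str
    score: int


def sort_hits(hits: list[Hit]) -> list[Hit]:
    # Highest score first; sorted() is stable, so ties keep input order.
    return sorted(hits, key=lambda h: h.score, reverse=True)


def summarize_sketch(hits: list[Hit]) -> dict[str, int]:
    # Always emit all four keys, even when a severity has no hits.
    counts = Counter(h.severity for h in hits)
    return {sev: counts.get(sev, 0) for sev in SEVERITY_SCORES}
```

Emitting every key, including zero counts, keeps the summary shape fixed for callers such as a dashboard; whether the real `summarize` does the same is not stated here.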

ScoredHit dataclass

| Field | Type | Description |
|---|---|---|
| `raw` | str | Original credential line |
| `severity` | str | CRITICAL / HIGH / MEDIUM / LOW |
| `score` | int | 40 / 30 / 20 / 10 |
| `reasons` | list[str] | Human-readable match reasons |
| `url` | str\|None | Parsed URL field |
| `username` | str\|None | Parsed username/email field |
| `password` | str\|None | Parsed password field |
| `.emoji` | property | 🔴🟠🟡🟢 |
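
As a rough picture of the fields listed above, a dataclass along these lines would match the table. The class name `ScoredHitSketch`, the default values, and the emoji-to-severity mapping (assumed to follow the order shown) are illustrative assumptions, not the module's definition.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: emojis map to severities in the order they are listed.
SEVERITY_EMOJI = {"CRITICAL": "🔴", "HIGH": "🟠", "MEDIUM": "🟡", "LOW": "🟢"}


@dataclass
class ScoredHitSketch:
    """Hypothetical mirror of the documented ScoredHit fields."""
    raw: str
    severity: str
    score: int
    reasons: list[str] = field(default_factory=list)
    url: Optional[str] = None       # None when the line did not parse as ULP
    username: Optional[str] = None
    password: Optional[str] = None

    @property
    def emoji(self) -> str:
        # Derived from severity, so it never goes out of sync with it.
        return SEVERITY_EMOJI[self.severity]
```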

Scoring rules (highest match wins)

| Severity | Triggers |
|---|---|
| CRITICAL | Employee email domain after `@` in username/line · Privileged service URL (admin, vpn, ssh, rdp, gitlab, jira…) |
| HIGH | Internal service URL (intranet, erp, crm, sso, owa, sharepoint…) |
| MEDIUM | Client-facing URL (app, patient, booking, helpdesk…) |
| LOW | Org domain appears anywhere in line (baseline) |

Check 6 (no severity change): flags weak passwords ≤6 chars or common strings.
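
The "highest match wins" rule, together with the password check that only adds a reason, can be sketched as follows. Everything here is an assumption made for illustration: the function name `classify`, plain substring matching on the URL, the truncated keyword tuples (only the examples named above), and the `COMMON_PASSWORDS` set.

```python
SEVERITY_SCORES = {"CRITICAL": 40, "HIGH": 30, "MEDIUM": 20, "LOW": 10}

# Only the example keywords from the table; the real lists are longer.
PRIVILEGED = ("admin", "vpn", "ssh", "rdp", "gitlab", "jira")
INTERNAL = ("intranet", "erp", "crm", "sso", "owa", "sharepoint")
CLIENT = ("app", "patient", "booking", "helpdesk")
COMMON_PASSWORDS = {"password", "12345678", "qwerty123"}  # hypothetical


def classify(url: str, password: str, is_employee: bool = False):
    u = url.lower()
    # Tiers are ordered from highest to lowest; the first hit decides.
    tiers = [
        ("CRITICAL", is_employee or any(k in u for k in PRIVILEGED)),
        ("HIGH", any(k in u for k in INTERNAL)),
        ("MEDIUM", any(k in u for k in CLIENT)),
    ]
    severity = next((name for name, hit in tiers if hit), "LOW")
    reasons = [f"{severity} rule matched"]
    # Weak-password check: adds a reason, never changes the severity.
    if len(password) <= 6 or password.lower() in COMMON_PASSWORDS:
        reasons.append("weak password")
    return severity, SEVERITY_SCORES[severity], reasons
```

Substring matching is deliberately naive here (for instance, `app` would match many unrelated hosts); the real checks may be stricter.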

Employee domain matching

Keywords in config.TARGET_KEYWORDS containing @ become employee patterns.
Pattern: @<domain>(?:[^a-zA-Z0-9.\-]|$) — requires literal @ before the domain.
user@gmail.com on a URL containing myorg.cl does NOT trigger CRITICAL.

Keywords without @ go only to ORG_DOMAINS (LOW baseline).
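
A minimal sketch of how keywords could be split into the two pattern lists, using the anchored pattern quoted above. The function name `build_domain_patterns`, the `@myorg.cl` keyword form, and case-insensitive matching are assumptions.

```python
import re


def build_domain_patterns(keywords):
    """Return (employee_patterns, org_patterns) for a keyword list."""
    employee, org = [], []
    for kw in keywords:
        domain = kw.lstrip("@").lower()
        # Every keyword contributes a plain org-domain pattern (LOW baseline).
        org.append(re.compile(re.escape(domain), re.IGNORECASE))
        if "@" in kw:
            # Literal @ before the domain, and no domain character after it,
            # so user@myorg.cl.evil.com does not count as an employee.
            anchored = re.compile(
                "@" + re.escape(domain) + r"(?:[^a-zA-Z0-9.\-]|$)",
                re.IGNORECASE,
            )
            employee.append((domain, anchored))
    return employee, org
```

With the keyword `@myorg.cl`, a line such as `https://portal.myorg.cl/:user@gmail.com:secret` matches only the org pattern, while `https://portal.myorg.cl/:juan@myorg.cl:secret` also matches the employee pattern.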

ULP line parser (`ULP_PATTERN`)

Separators: : ; , | \t (any of these between the three fields).

The URL field handles two common stealer-log complications:

- :// is not treated as a separator. The optional scheme prefix (?:https?|ftp):// is consumed before the character-class match, so https:// never gets split at the colon.
- Port + path are consumed into the URL. The optional group (?::\d+/[^\s:;,|\t]*) absorbs :port/path when the port is pure digits immediately followed by /. This correctly handles http://host:8085/path/:user:pass but intentionally skips patterns like :24145487-8 (a RUT number: hyphen after the digits, no /).

Known limitation: a bare port with no path (e.g. https://host:8080:user:pass) mis-parses 8080 as the username. This has not been observed in practice, because stealer logs always include at least a trailing / after the port.
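
The parsing behaviour described in this section can be reproduced with a regex assembled from the two fragments quoted above. This is a reconstruction under assumptions, not the module's actual `ULP_PATTERN`: the names `ULP_SKETCH` and `parse_ulp`, the username class, and letting the password run to the end of the line are all guesses.

```python
import re

SEP = r":;,|\t"  # the documented separators, as a character-class body

ULP_SKETCH = re.compile(
    r"^(?P<url>(?:(?:https?|ftp)://)?"   # optional scheme, consumed whole
    rf"[^\s{SEP}]+"                      # host and path up to a separator
    rf"(?::\d+/[^\s{SEP}]*)?)"           # optional :port/path continuation
    rf"[{SEP}]"
    rf"(?P<username>[^\s{SEP}]+)"
    rf"[{SEP}]"
    r"(?P<password>.+)$"                 # assumption: rest of line
)


def parse_ulp(line: str):
    """Return (url, username, password), or None if the line does not parse."""
    m = ULP_SKETCH.match(line.strip())
    if m is None:
        return None
    return m.group("url"), m.group("username"), m.group("password")
```

Under these assumptions, `http://host:8085/path/:user:pass` yields the URL `http://host:8085/path/`, a RUT-style username such as `24145487-8` is left intact, and the bare-port case `https://host:8080:user:pass` reproduces the limitation noted above by returning `8080` as the username.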

Module-level globals (rebuilt on import + via reload_from_config)

| Name | Type | Description |
|---|---|---|
| `EMPLOYEE_DOMAINS` | `list[tuple[str, Pattern]]` | `(domain_str, anchored_pattern)` for `@`-keywords |
| `ORG_DOMAINS` | `list[Pattern]` | Plain domain patterns for all keywords |

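scorer uses import config as _config (not from config import TARGET_KEYWORDS), so a new value assigned to config.TARGET_KEYWORDS at runtime is visible to the scorer: the _build_* helpers read the live module attribute each time they run. The compiled globals above are only refreshed on import or when reload_from_config() is called.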

To rebuild after editing config.TARGET_KEYWORDS at runtime:

import utils.scorer as scorer
scorer.reload_from_config()

`reload_from_config() -> None`

Rebuilds EMPLOYEE_DOMAINS and ORG_DOMAINS from the current config.TARGET_KEYWORDS. Called by web config routes after config.save_runtime_config() writes new keyword groups.
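
The reason `import config as _config` matters can be shown with a stand-in module. In this sketch `config_sketch` and both `build_*` functions are hypothetical; it only demonstrates Python's binding behaviour, not the scorer's internals.

```python
import types

# Stand-in for the real config module.
config = types.ModuleType("config_sketch")
config.TARGET_KEYWORDS = ["@old.cl"]

# Equivalent of `from config import TARGET_KEYWORDS`:
# a separate name bound once to the list that existed at import time.
bound_at_import = config.TARGET_KEYWORDS


def build_from_local():
    # Reads the name captured at import time.
    return list(bound_at_import)


def build_from_module():
    # Equivalent of `_config.TARGET_KEYWORDS`: attribute lookup at call time.
    return list(config.TARGET_KEYWORDS)


# A config route rebinding the attribute at runtime.
config.TARGET_KEYWORDS = ["@new.cl"]
```

After the reassignment, `build_from_local()` still sees the old list while `build_from_module()` sees the new one, which is why a rebuild that reads through the module object picks up edited keywords.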
