# utils/database.py

SQLite persistence layer for credential hits.

DB file: `data/hits.db`

## Public API

```python
from utils.database import init_db, insert_hits, search, recent, by_severity, stats
```

### Setup

#### `init_db() -> None`

Creates `hits` table and indexes if they don't exist. Call once on startup. Safe to call multiple times (idempotent).

---

### Writing

#### `insert_hits(scored_hits, source, filename, seen_before=False) -> int`

Inserts a list of `ScoredHit` objects. Returns row count inserted.

```python
insert_hits(new_hits, source="channelname", filename="combo.zip")
insert_hits(dupe_hits, source="channelname", filename="combo.zip", seen_before=True)
```

---

### Querying

#### `search(keyword: str) -> list[sqlite3.Row]`

Full-text search across `url`, `username`, `raw`. Returns rows sorted by score DESC, timestamp DESC.

#### `recent(limit: int = 50) -> list[sqlite3.Row]`

Most recent hits, newest first.

#### `by_severity(severity: str) -> list[sqlite3.Row]`

All unique (non-duplicate) hits at a given severity, newest first.

`severity` must be one of: `"CRITICAL"`, `"HIGH"`, `"MEDIUM"`, `"LOW"`

#### `stats() -> dict`

Returns summary counters:

```python
{
    "total": int,       # all rows
    "unique": int,      # seen_before=0
    "duplicates": int,  # seen_before=1
    "critical": int,    # unique CRITICAL
    "high": int,
    "medium": int,
    "low": int,
    "sources": int,     # distinct source channels
    "top_source": {"source": str, "cnt": int} | None,
}
```

---

## Schema

```sql
hits (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    url         TEXT,
    username    TEXT,
    password    TEXT,
    raw         TEXT NOT NULL,     -- full original credential line
    source      TEXT,              -- channel username or ID
    filename    TEXT,              -- downloaded file name
    timestamp   TEXT NOT NULL,     -- "YYYY-MM-DD HH:MM:SS UTC"
    severity    TEXT NOT NULL,     -- CRITICAL/HIGH/MEDIUM/LOW
    score       INTEGER NOT NULL,  -- 40/30/20/10
    reasons     TEXT,              -- pipe-separated reason strings
    seen_before INTEGER NOT NULL   -- 0=new, 1=duplicate
)
```

Indexes: `url`, `username`, `source`, `timestamp`, `severity`.
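
To illustrate how the schema and indexes above can be created idempotently, here is a minimal sketch using `IF NOT EXISTS`. The function name `init_db_sketch`, its `conn` parameter, and the `idx_hits_*` index names are assumptions made for this example; the module's actual `init_db()` takes no arguments and may be written differently.

```python
import sqlite3

# Hypothetical DDL mirroring the documented columns and indexed fields.
# The idx_hits_* names are illustrative, not taken from the module.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    url         TEXT,
    username    TEXT,
    password    TEXT,
    raw         TEXT NOT NULL,
    source      TEXT,
    filename    TEXT,
    timestamp   TEXT NOT NULL,
    severity    TEXT NOT NULL,
    score       INTEGER NOT NULL,
    reasons     TEXT,
    seen_before INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_hits_url       ON hits(url);
CREATE INDEX IF NOT EXISTS idx_hits_username  ON hits(username);
CREATE INDEX IF NOT EXISTS idx_hits_source    ON hits(source);
CREATE INDEX IF NOT EXISTS idx_hits_timestamp ON hits(timestamp);
CREATE INDEX IF NOT EXISTS idx_hits_severity  ON hits(severity);
"""


def init_db_sketch(conn: sqlite3.Connection) -> None:
    """Create the table and indexes; repeated calls change nothing."""
    conn.executescript(SCHEMA)


# An in-memory database keeps the example free of side effects.
conn = sqlite3.connect(":memory:")
init_db_sketch(conn)
init_db_sketch(conn)  # second call is a no-op thanks to IF NOT EXISTS
```

Because every statement carries `IF NOT EXISTS`, calling the function on each startup is harmless, which matches the idempotency described for `init_db()`.
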

---

## Notes

- Each query opens and closes its own connection via the `_connect()` context manager.
- `conn.row_factory = sqlite3.Row` - rows support both index and column-name access.
- Transactions: commit on success, rollback on exception.