stealergram/utils/database.md
anti 741e6bb0d3 Rename to stealergram, add pyproject.toml, purge em-dashes
- Rename project to stealergram throughout
- Add pyproject.toml (replaces requirements.txt split, folds pytest.ini)
- Replace all em-dashes with hyphens across all source files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 10:06:30 -04:00

utils/database.py

SQLite persistence layer for credential hits.
DB file: data/hits.db

Public API

from utils.database import init_db, insert_hits, search, recent, by_severity, stats

Setup

init_db() -> None

Creates the hits table and its indexes if they do not already exist. Call once on startup.
Idempotent: calling it again on an existing database changes nothing.
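
One common way to get this idempotent behaviour in SQLite is IF NOT EXISTS on every DDL statement. The sketch below assumes that approach; the function name init_db_sketch and the idx_hits_* index names are hypothetical, and the real init_db() may be written differently. The columns are copied from the Schema section.

```python
import sqlite3

# Columns follow the documented schema; index names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    url         TEXT,
    username    TEXT,
    password    TEXT,
    raw         TEXT NOT NULL,
    source      TEXT,
    filename    TEXT,
    timestamp   TEXT NOT NULL,
    severity    TEXT NOT NULL,
    score       INTEGER NOT NULL,
    reasons     TEXT,
    seen_before INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_hits_url       ON hits(url);
CREATE INDEX IF NOT EXISTS idx_hits_username  ON hits(username);
CREATE INDEX IF NOT EXISTS idx_hits_source    ON hits(source);
CREATE INDEX IF NOT EXISTS idx_hits_timestamp ON hits(timestamp);
CREATE INDEX IF NOT EXISTS idx_hits_severity  ON hits(severity);
"""


def init_db_sketch(conn: sqlite3.Connection) -> None:
    """Create the table and indexes; a repeat call is a no-op."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
init_db_sketch(conn)
init_db_sketch(conn)  # second call succeeds without error

index_names = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name LIKE 'idx_hits_%'"
    )
]
```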


Writing

insert_hits(scored_hits, source, filename, seen_before=False) -> int

Inserts a list of ScoredHit objects. Returns the number of rows inserted.

insert_hits(new_hits, source="channelname", filename="combo.zip")
insert_hits(dupe_hits, source="channelname", filename="combo.zip", seen_before=True)

Querying

search(keyword: str) -> list[sqlite3.Row]

Full-text search across url, username, raw. Returns rows sorted by score DESC, timestamp DESC.
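
The source does not say how the match is implemented. As a rough illustration of the documented ordering only, the sketch below assumes a parameterised LIKE substring match rather than an SQLite FTS index; search_sketch and the reduced demo table are hypothetical, and the sample rows are dummy data.

```python
import sqlite3


def search_sketch(conn: sqlite3.Connection, keyword: str) -> list:
    """Substring match on url, username, raw; best score first, then newest."""
    pattern = f"%{keyword}%"  # note: % and _ inside keyword are not escaped
    return conn.execute(
        """
        SELECT * FROM hits
        WHERE url LIKE ? OR username LIKE ? OR raw LIKE ?
        ORDER BY score DESC, timestamp DESC
        """,
        (pattern, pattern, pattern),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
# Reduced table for the demo: only the columns the query touches.
conn.execute(
    "CREATE TABLE hits (url TEXT, username TEXT, raw TEXT, "
    "score INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?, ?)",
    [
        ("https://example.com/login", "alice", "line-a", 20, "2026-01-01 00:00:00 UTC"),
        ("https://bank.example.com", "bob", "line-b", 40, "2026-01-01 00:00:00 UTC"),
        ("https://example.com", "carol", "line-c", 40, "2026-01-02 00:00:00 UTC"),
        ("https://other.test", "dave", "line-d", 10, "2026-01-03 00:00:00 UTC"),
    ],
)

matched_users = [row["username"] for row in search_sketch(conn, "example")]
```

Binding the pattern as a parameter keeps the keyword out of the SQL text, which avoids injection regardless of what the caller passes in.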

recent(limit: int = 50) -> list[sqlite3.Row]

Most recent hits, newest first.

by_severity(severity: str) -> list[sqlite3.Row]

All unique (non-duplicate) hits at a given severity, newest first.
severity must be one of: "CRITICAL", "HIGH", "MEDIUM", "LOW"

stats() -> dict

Returns summary counters:

{
    "total":      int,   # all rows
    "unique":     int,   # seen_before=0
    "duplicates": int,   # seen_before=1
    "critical":   int,   # unique CRITICAL
    "high":       int,
    "medium":     int,
    "low":        int,
    "sources":    int,   # distinct source channels
    "top_source": {"source": str, "cnt": int} | None,
}
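
Counters of this shape can be computed with a couple of aggregate queries. The sketch below is one possible approach, not the module's actual code: stats_sketch is a hypothetical name, the demo table is reduced to the three columns the counters need, and it assumes all four severity counters count unique rows only, as the comment on "critical" suggests.

```python
import sqlite3


def stats_sketch(conn: sqlite3.Connection) -> dict:
    """Build the documented counters from two aggregate queries."""
    row = conn.execute(
        """
        SELECT COUNT(*)                                                     AS total,
               COALESCE(SUM(seen_before = 0), 0)                            AS uniq,
               COALESCE(SUM(seen_before = 1), 0)                            AS dupes,
               COALESCE(SUM(seen_before = 0 AND severity = 'CRITICAL'), 0)  AS critical,
               COALESCE(SUM(seen_before = 0 AND severity = 'HIGH'), 0)      AS high,
               COALESCE(SUM(seen_before = 0 AND severity = 'MEDIUM'), 0)    AS medium,
               COALESCE(SUM(seen_before = 0 AND severity = 'LOW'), 0)       AS low,
               COUNT(DISTINCT source)                                       AS sources
        FROM hits
        """
    ).fetchone()
    top = conn.execute(
        "SELECT source, COUNT(*) AS cnt FROM hits "
        "GROUP BY source ORDER BY cnt DESC LIMIT 1"
    ).fetchone()
    return {
        "total": row["total"],
        "unique": row["uniq"],
        "duplicates": row["dupes"],
        "critical": row["critical"],
        "high": row["high"],
        "medium": row["medium"],
        "low": row["low"],
        "sources": row["sources"],
        # No rows at all means there is no top source.
        "top_source": dict(top) if top is not None else None,
    }


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE hits (severity TEXT, seen_before INTEGER, source TEXT)")
empty_summary = stats_sketch(conn)

conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("CRITICAL", 0, "chan_a"),
        ("CRITICAL", 1, "chan_a"),
        ("HIGH", 0, "chan_a"),
        ("LOW", 0, "chan_b"),
    ],
)
summary = stats_sketch(conn)
```

COALESCE matters on an empty table, where SUM would otherwise return NULL instead of 0.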

Schema

hits (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    url         TEXT,
    username    TEXT,
    password    TEXT,
    raw         TEXT NOT NULL,      -- full original credential line
    source      TEXT,               -- channel username or ID
    filename    TEXT,               -- downloaded file name
    timestamp   TEXT NOT NULL,      -- "YYYY-MM-DD HH:MM:SS UTC"
    severity    TEXT NOT NULL,      -- CRITICAL/HIGH/MEDIUM/LOW
    score       INTEGER NOT NULL,   -- 40/30/20/10
    reasons     TEXT,               -- pipe-separated reason strings
    seen_before INTEGER NOT NULL    -- 0=new, 1=duplicate
)

Indexes: url, username, source, timestamp, severity.


Notes

  • Each query opens and closes its own connection via the _connect() context manager.
  • conn.row_factory = sqlite3.Row, so rows support both index and column-name access.
  • Transactions: commit on success, rollback on exception.
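
The three notes above describe a familiar pattern. A minimal sketch of such a context manager follows, assuming contextlib.contextmanager is used; connect_sketch is a hypothetical stand-in, and the real _connect() may differ in signature and detail. The demo writes to a temporary file because each in-memory connection would get its own empty database.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def connect_sketch(path: str):
    """Open a connection, commit on success, roll back on error, always close."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # index and column-name access
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Successful block: the insert is committed.
with connect_sketch(db_path) as conn:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")

# Failing block: the insert is rolled back and the exception propagates.
try:
    with connect_sketch(db_path) as conn:
        conn.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

with connect_sketch(db_path) as conn:
    row = conn.execute("SELECT COUNT(*) AS n, MAX(x) AS top FROM t").fetchone()
    count_by_name, count_by_index, top_value = row["n"], row[0], row["top"]
```

Opening a fresh connection per query keeps each call self-contained at the cost of a small per-call overhead, which is usually acceptable for a local SQLite file.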