feat(db): extend SessionProfile schema with DEBT-036 keystroke features
Adds columns for the three signals motivated by the manual keystroke
analysis in DEBT-036 directly to the SessionProfile table (five columns
in total). The project is pre-v1, so we modify the schema in place;
Alembic arrives at v1.
Columns:
- kd_top_bigrams (TEXT): JSON of the top-N most common digraphs with the
mean inter-arrival time (IAT) per bigram. Complements
kd_digraph_simhash ("same typist?") with "same typist in same mental
state?", since being tired, rested, or distracted shifts
bigram-specific IATs measurably.
- kd_start_of_action_latency (REAL/DOUBLE): median IAT of the first
keystroke after an idle gap > 1s. Separates "initiating a command"
from "executing a remembered one"; real humans have measurable
start-of-action latency, bots don't.
- kd_pause_hist_burst / _think / _distracted (INT): three-bucket
histogram of pause counts (<0.2s / 0.2-1.5s / >1.5s). More
discriminating than the existing flat burst_ratio / think_ratio pair:
C2 operators concentrate in burst with a thin tail; opportunistic
humans have a fat think bucket and a long distracted tail.
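For illustration only, here is a rough sketch of how the three signals above could be derived from a list of (timestamp, char) events. It is not the extraction code from this commit. The function name keystroke_features, the JSON layout of kd_top_bigrams, and the choice to put an IAT of exactly 1.5s in the think bucket are all assumptions. So is the reading of "start-of-action latency" as the interval between the first keystroke after an idle gap and the keystroke that follows it.

```python
import json
from collections import defaultdict
from statistics import mean, median

# Thresholds taken from the commit message.
BURST_MAX_S = 0.2   # IAT below this counts as "burst"
THINK_MAX_S = 1.5   # IAT up to this counts as "think" (inclusive: an assumption)
IDLE_GAP_S = 1.0    # a gap longer than this marks the start of a new action


def keystroke_features(events, top_n=3):
    """events: list of (timestamp_seconds, char) tuples in arrival order."""
    hist = {"burst": 0, "think": 0, "distracted": 0}
    bigram_iats = defaultdict(list)
    start_latencies = []
    prev_was_idle = False

    for (t0, c0), (t1, c1) in zip(events, events[1:]):
        iat = t1 - t0

        # Three-bucket pause histogram (raw counts, not ratios).
        if iat < BURST_MAX_S:
            hist["burst"] += 1
        elif iat <= THINK_MAX_S:
            hist["think"] += 1
        else:
            hist["distracted"] += 1

        # Per-bigram timing, keyed by the two characters only.
        bigram_iats[c0 + c1].append(iat)

        # Assumed reading: if the previous interval was an idle gap, then
        # (t0, c0) opened a new action and this IAT is its latency.
        if prev_was_idle:
            start_latencies.append(iat)
        prev_was_idle = iat > IDLE_GAP_S

    # Rank bigrams by frequency, breaking ties alphabetically.
    ranked = sorted(bigram_iats.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    top = [
        {"bigram": bg, "count": len(v), "mean_iat": round(mean(v), 4)}
        for bg, v in ranked[:top_n]
    ]

    return {
        "kd_top_bigrams": json.dumps(top),
        "kd_start_of_action_latency": median(start_latencies) if start_latencies else None,
        "kd_pause_hist_burst": hist["burst"],
        "kd_pause_hist_think": hist["think"],
        "kd_pause_hist_distracted": hist["distracted"],
    }
```

Returning None for the latency when no idle gap was seen matches the nullable columns added by the migration; whether the real extractor does the same is not stated here.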
Both backends get an idempotent ADD COLUMN migration
(_migrate_session_profile_table) wired into initialize() alongside
the existing _migrate_attackers_table path. It guards on PRAGMA
table_info (SQLite) / information_schema.COLUMNS (MySQL), so reruns
are safe.
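The rerun-safety claim for the SQLite path can be checked with a small synchronous sketch using the standard-library sqlite3 module. It mirrors the guard in the diff but is not the repository code: the function name migrate_session_profile and the starting table layout are hypothetical.

```python
import sqlite3

# Same column list as the migration in this commit.
ADDITIONS = [
    ("kd_top_bigrams", "TEXT"),
    ("kd_start_of_action_latency", "REAL"),
    ("kd_pause_hist_burst", "INTEGER"),
    ("kd_pause_hist_think", "INTEGER"),
    ("kd_pause_hist_distracted", "INTEGER"),
]


def migrate_session_profile(conn):
    """Add any missing columns; return the names that were actually added."""
    rows = conn.execute("PRAGMA table_info(session_profile)").fetchall()
    if not rows:
        return []  # table absent; nothing to migrate
    existing = {r[1] for r in rows}  # column name is the second field
    added = []
    for name, col_type in ADDITIONS:
        if name not in existing:
            # Names and types come from the constant list above, never from input.
            conn.execute(f"ALTER TABLE session_profile ADD COLUMN {name} {col_type}")
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
# Hypothetical pre-migration table, just enough to exercise the guard.
conn.execute(
    "CREATE TABLE session_profile (id INTEGER PRIMARY KEY, kd_digraph_simhash TEXT)"
)

first = migrate_session_profile(conn)   # adds all five columns
second = migrate_session_profile(conn)  # finds nothing left to add
print(first, second)
```

Without the PRAGMA check, the second call would raise sqlite3.OperationalError for a duplicate column name.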
Adds a PII-discipline comment on kd_digraph_simhash and kd_top_bigrams:
both operate on bigram CHARACTERS, never on raw input stream content.
Attacker passwords typed over SSH must not land here.
Test updated for the MySQL initialize() migration-order contract.
@@ -54,6 +54,31 @@ class SQLiteRepository(SQLModelRepository):
                        "ALTER TABLE attackers ADD COLUMN country_source VARCHAR(16)"
                    ))

    async def _migrate_session_profile_table(self) -> None:
        """Add DEBT-036 keystroke-dynamics columns (start-of-action latency,
        three-bucket pause histogram, top-bigrams JSON) to existing tables.

        SQLite's ``ALTER TABLE ADD COLUMN`` fails if the column already
        exists, so gate on ``PRAGMA table_info`` to stay idempotent.
        """
        async with self.engine.begin() as conn:
            rows = (await conn.execute(text("PRAGMA table_info(session_profile)"))).fetchall()
            if not rows:
                return  # table absent; create_all() handles it.
            existing_cols = {r[1] for r in rows}
            additions = [
                ("kd_top_bigrams", "TEXT"),
                ("kd_start_of_action_latency", "REAL"),
                ("kd_pause_hist_burst", "INTEGER"),
                ("kd_pause_hist_think", "INTEGER"),
                ("kd_pause_hist_distracted", "INTEGER"),
            ]
            for col_name, col_type in additions:
                if col_name not in existing_cols:
                    await conn.execute(text(
                        f"ALTER TABLE session_profile ADD COLUMN {col_name} {col_type}"
                    ))

    def _json_field_equals(self, key: str):
        # SQLite stores JSON as text; json_extract is the canonical accessor.
        return text(f"json_extract(fields, '$.{key}') = :val")