# DECNET Web & Database Models: Architectural Deep Dive > [!IMPORTANT] > **DEVELOPMENT DISCLAIMER**: DECNET is currently in active development. The storage schemas and API signatures defined in `decnet/web/db/models.py` are subject to radical change as the framework's analytical capabilities and distributed features expand. ## 1. Introduction & Philosophy The `decnet/web/db/models.py` file represents the structural backbone of the DECNET web interface and its underlying analytical engine. It serves a dual purpose that is central to the project's architecture: 1. **Unified Source of Truth**: By utilizing **SQLModel**, DECNET collapses the traditional barrier between Pydantic data validation and SQLAlchemy ORM mapping. This allows a single class definition to act as both a database table and an API data object, drastically reducing the "boilerplate" associated with traditional web-database pipelines. 2. **Analytical Scalability**: The models are designed to scale from small-scale local deployments using **SQLite** to large-scale, enterprise-ready environments backed by **MySQL**. This is achieved through clever usage of SQLAlchemy "Variants" and abstraction layers for large text blobs. --- ## 2. The Database Layer (SQLModel Entities) These models define the physical tables within the DECNET infrastructure. Every class marked with `table=True` is interpreted by the repository layer to generate the corresponding DDL (Data Definition Language) for the target database. ### 2.1 Identity & Security: The `User` Entity The `User` model handles dashboard access control and basic identity management. * `uuid`: A unique string identifier. While integers are often used for IDs, DECNET uses strings to support potential future transitions to UUIDs without schema breakage. * `username`: The primary login handle. It is both `unique` and `indexed` for rapid authentication lookups. * `password_hash`: Stores the Argon2 or bcrypt hash. Length constraints in various routers ensure that raw passwords never exceed 72 characters, preventing "Long Password Denial of Service" attacks on various hashing algorithms. * `role`: A simple string-based permission field (e.g., `admin`, `viewer`). * `must_change_password`: A boolean flag used for fresh deployments or manual administrative resets, forcing the user to rotate their credentials upon their first authenticated session. ### 2.2 Intelligence & Attribution: `Attacker` and `AttackerBehavior` These two tables form the core of DECNET's "Attacker Profiling" system. They are split into two tables to maintain "Narrow vs. Wide" performance characteristics. #### The `Attacker` Entity (Broad Analytics) The `Attacker` table stores the "primary" record for every unique IP discovered by the honeypot fleet. * `ip`: The source IP address. This is the primary key and is heavily indexed. * `first_seen` / `last_seen`: Tracking the lifecycle of an attacker's engagement with the network. * `event_count` / `service_count` / `decky_count`: Aggregated counters used by the stats dashboard to visualize the magnitude of an engagement. * `services` / `deckies`: JSON-serialized lists of every service and machine reached by the attacker. Using `_BIG_TEXT` here allows these lists to grow significantly during long-term campaigns. * `traversal_path`: A string representation (e.g., `omega → epsilon → zulu`) that helps analysts visualize lateral movement attempts recorded by the correlation engine. #### The `AttackerBehavior` Entity (Granular Analytics) This "Wide" table stores behavioral signatures. It is separated from the main `Attacker` record so that high-frequency updates to timing stats or sniffer-derived packet signatures don't lock the primary attribution rows. * `os_guess`: Derived from the `os_fingerprint` and `sniffer` engines, providing an estimate of the attacker's operating system based on TCP/IP stack nuances. * `tcp_fingerprint`: A JSON blob storing the raw TCP signature (Window size, MSS, Option sequence). * `behavior_class`: A classification (e.g., `beaconing`, `interactive`, `brute_force`) derived from log inter-arrival timing (IAT). * `timing_stats`: Stores a JSON dictionary of mean/median/stdev for event timing, used to detect automated tooling. ### 2.3 Telemetry: `Log` and `Bounty` These tables store the "raw" data generated by the honeypots. * **`Log` Table**: The primary event sink. Every line from the collector ends up here. * `event_type`: The MSGID from the RFC 5424 header (e.g., `connect`, `exploit`). * `raw_line`: The full, un-parsed syslog string for forensic verification. * `fields`: A JSON blob containing the structured data (SD-ELEMENTS) extracted during normalization. * **`Bounty` Table**: Specifically for high-value events. When a service detects "Gold" (like a plain-text password or a known PoC payload), it is mirrored here for rapid analyst review. ### 2.4 System State: The `State` Entity The `State` table acts as the orchestrator's brain. It stores the `decnet-state.json` content within the database when the system is integrated with the web layer. * `key`: The configuration key (e.g., `global_config`, `active_deployment`). * `value`: A `MEDIUMTEXT` JSON blob. This is potentially its largest field, storing the entire resolved configuration of every running Decky. --- ## 3. The API Layer (Pydantic DTOs) These models define how data moves across the wire between the FastAPI backend and the frontend. ### 3.1 Authentication Pipeline * `LoginRequest`: Validates incoming credentials before passing them to the security middleware. * `Token`: The standard OAuth2 bearer token response, enriched with the `must_change_password` hint. * `ChangePasswordRequest`: Ensures the old password is provided and the new one meets the project's security constraints. ### 3.2 Reporting & Pagination DECNET uses a standardized "Envelope" pattern for broad analytical responses (`LogsResponse`, `AttackersResponse`, `BountyResponse`). * `total`: The total count of matching records in the database (ignoring filters). * `limit` / `offset`: The specific slice of data returned, supporting "Infinite Scroll" or traditional pagination in the UI. * `data`: A list of dictionaries. By using `dict[str, Any]` here, the API remains flexible with SQLModel's dynamic attribute loading. ### 3.3 System Administration * **`DeployIniRequest`**: The most critical input model. It takes `ini_content` as a validated string. By using the `IniContent` annotated type, the API rejects malformed deployments before they ever touch the fleet builder. * **`MutateIntervalRequest`**: Uses a strict REGEX pattern (`^[1-9]\d*[mdMyY]$`) to ensure intervals like `30m` (30 minutes) or `2d` (2 days) are valid before being applied to the orchestrator. --- ## 4. Technical Foundations ### 4.1 Cross-DB Compatibility Logic The project uses a custom variant system to handle the discrepancies between SQLite (which has simplified typing) and MySQL (which has strict size constraints). ```python _BIG_TEXT = Text().with_variant(MEDIUMTEXT(), "mysql") ``` This abstraction ensures that fields like `Attacker.services` (which can grow to thousands of items) are stored as `MEDIUMTEXT` (16 MiB) on MySQL, whereas standard SQLAlchemy `Text` (often 64 KiB on MySQL) would silently truncate the data, leading to analytical loss. ### 4.2 High-Fidelity Normalization Data arriving from distributed honeypots is often "dirty." The models include custom pre-validators like `_normalize_null`. * **Null Coalescing**: Services often emit logging values as `"null"` or `"undefined"` strings. The `NullableString` type automatically converts these "noise" strings into actual Python `None` types during ingestion. * **Timestamp Integrity**: `NullableDatetime` ensures that various ISO formats or epoch timestamps provided by different service containers are normalized into standard UTC datetime objects. --- ## 5. Integration Case Studies (Deep Analysis) To understand how these models function, we must examine their lifecycle across the web stack. ### 5.1 The Repository Layer (`decnet/web/db/sqlmodel_repo.py`) The repository is the primary consumer of the "Entities." It utilizes the metadata generated by SQLModel to: 1. **Generate DDL**: On startup, the repository calls `SQLModel.metadata.create_all()`. This takes every `table=True` class and translates it into `CREATE TABLE` statements tailored to the active engine (SQLite or MySQL). 2. **Translate DTOs**: When the repository fetches an `Attacker` from the DB, SQLModel automatically populates the Pydantic-style attributes, allowing the repository to return objects that are immediately serializeable by the routers. ### 5.2 The Dashboard Routers Specific endpoints rely on these models for boundary safety: * **`api_deploy_deckies.py`**: Uses `DeployIniRequest`. This ensures that even if a user tries to POST a massive binary file instead of an INI, the Pydantic layer (powered by `decnet.models.validate_ini_string`) will intercept and reject the request with a `422 Unprocessable Entity` error before it reaches the orchestrator. * **`api_get_stats.py`**: Uses `StatsResponse`. This model serves as a "rollup" that aggregates data from the `Log`, `Attacker`, and `State` tables into a single unified JSON object for the dashboard's "At a Glance" view. * **`api_get_health.py`**: Uses `HealthResponse`. This model provides a nested view of the system, where each sub-component (Engine, Collector, DB) is represented as a `ComponentHealth` object, allowing the UI to show granular "Success" or "Failure" states. --- ## 6. Futureproofing & Guidelines As the project grows, the following habits must be maintained: 1. **Keep the Row Narrow**: Always separate behavioral data that updates frequently into auxiliary tables like `AttackerBehavior`. 2. **Use Variants**: Never use standard `String` or `Text` for JSON blobs; always use `_BIG_TEXT` to respect MySQL's storage limitations. 3. **Validate at the Boundary**: Ensure every new API request model uses Pydantic's strict typing to prevent malicious payloads from reaching the database layer.