10 KiB
DECNET Web & Database Models: Architectural Deep Dive
Important
DEVELOPMENT DISCLAIMER: DECNET is currently in active development. The storage schemas and API signatures defined in
decnet/web/db/models.pyare subject to radical change as the framework's analytical capabilities and distributed features expand.
1. Introduction & Philosophy
The decnet/web/db/models.py file represents the structural backbone of the DECNET web interface and its underlying analytical engine. It serves a dual purpose that is central to the project's architecture:
- Unified Source of Truth: By utilizing SQLModel, DECNET collapses the traditional barrier between Pydantic data validation and SQLAlchemy ORM mapping. This allows a single class definition to act as both a database table and an API data object, drastically reducing the "boilerplate" associated with traditional web-database pipelines.
- Analytical Scalability: The models are designed to scale from small-scale local deployments using SQLite to large-scale, enterprise-ready environments backed by MySQL. This is achieved through clever usage of SQLAlchemy "Variants" and abstraction layers for large text blobs.
2. The Database Layer (SQLModel Entities)
These models define the physical tables within the DECNET infrastructure. Every class marked with table=True is interpreted by the repository layer to generate the corresponding DDL (Data Definition Language) for the target database.
2.1 Identity & Security: The User Entity
The User model handles dashboard access control and basic identity management.
uuid: A unique string identifier. While integers are often used for IDs, DECNET uses strings to support potential future transitions to UUIDs without schema breakage.username: The primary login handle. It is bothuniqueandindexedfor rapid authentication lookups.password_hash: Stores the Argon2 or bcrypt hash. Length constraints in various routers ensure that raw passwords never exceed 72 characters, preventing "Long Password Denial of Service" attacks on various hashing algorithms.role: A simple string-based permission field (e.g.,admin,viewer).must_change_password: A boolean flag used for fresh deployments or manual administrative resets, forcing the user to rotate their credentials upon their first authenticated session.
2.2 Intelligence & Attribution: Attacker and AttackerBehavior
These two tables form the core of DECNET's "Attacker Profiling" system. They are split into two tables to maintain "Narrow vs. Wide" performance characteristics.
The Attacker Entity (Broad Analytics)
The Attacker table stores the "primary" record for every unique IP discovered by the honeypot fleet.
ip: The source IP address. This is the primary key and is heavily indexed.first_seen/last_seen: Tracking the lifecycle of an attacker's engagement with the network.event_count/service_count/decky_count: Aggregated counters used by the stats dashboard to visualize the magnitude of an engagement.services/deckies: JSON-serialized lists of every service and machine reached by the attacker. Using_BIG_TEXThere allows these lists to grow significantly during long-term campaigns.traversal_path: A string representation (e.g.,omega → epsilon → zulu) that helps analysts visualize lateral movement attempts recorded by the correlation engine.
The AttackerBehavior Entity (Granular Analytics)
This "Wide" table stores behavioral signatures. It is separated from the main Attacker record so that high-frequency updates to timing stats or sniffer-derived packet signatures don't lock the primary attribution rows.
os_guess: Derived from theos_fingerprintandsnifferengines, providing an estimate of the attacker's operating system based on TCP/IP stack nuances.tcp_fingerprint: A JSON blob storing the raw TCP signature (Window size, MSS, Option sequence).behavior_class: A classification (e.g.,beaconing,interactive,brute_force) derived from log inter-arrival timing (IAT).timing_stats: Stores a JSON dictionary of mean/median/stdev for event timing, used to detect automated tooling.
2.3 Telemetry: Log and Bounty
These tables store the "raw" data generated by the honeypots.
LogTable: The primary event sink. Every line from the collector ends up here.event_type: The MSGID from the RFC 5424 header (e.g.,connect,exploit).raw_line: The full, un-parsed syslog string for forensic verification.fields: A JSON blob containing the structured data (SD-ELEMENTS) extracted during normalization.
BountyTable: Specifically for high-value events. When a service detects "Gold" (like a plain-text password or a known PoC payload), it is mirrored here for rapid analyst review.
2.4 System State: The State Entity
The State table acts as the orchestrator's brain. It stores the decnet-state.json content within the database when the system is integrated with the web layer.
key: The configuration key (e.g.,global_config,active_deployment).value: AMEDIUMTEXTJSON blob. This is potentially its largest field, storing the entire resolved configuration of every running Decky.
3. The API Layer (Pydantic DTOs)
These models define how data moves across the wire between the FastAPI backend and the frontend.
3.1 Authentication Pipeline
LoginRequest: Validates incoming credentials before passing them to the security middleware.Token: The standard OAuth2 bearer token response, enriched with themust_change_passwordhint.ChangePasswordRequest: Ensures the old password is provided and the new one meets the project's security constraints.
3.2 Reporting & Pagination
DECNET uses a standardized "Envelope" pattern for broad analytical responses (LogsResponse, AttackersResponse, BountyResponse).
total: The total count of matching records in the database (ignoring filters).limit/offset: The specific slice of data returned, supporting "Infinite Scroll" or traditional pagination in the UI.data: A list of dictionaries. By usingdict[str, Any]here, the API remains flexible with SQLModel's dynamic attribute loading.
3.3 System Administration
DeployIniRequest: The most critical input model. It takesini_contentas a validated string. By using theIniContentannotated type, the API rejects malformed deployments before they ever touch the fleet builder.MutateIntervalRequest: Uses a strict REGEX pattern (^[1-9]\d*[mdMyY]$) to ensure intervals like30m(30 minutes) or2d(2 days) are valid before being applied to the orchestrator.
4. Technical Foundations
4.1 Cross-DB Compatibility Logic
The project uses a custom variant system to handle the discrepancies between SQLite (which has simplified typing) and MySQL (which has strict size constraints).
_BIG_TEXT = Text().with_variant(MEDIUMTEXT(), "mysql")
This abstraction ensures that fields like Attacker.services (which can grow to thousands of items) are stored as MEDIUMTEXT (16 MiB) on MySQL, whereas standard SQLAlchemy Text (often 64 KiB on MySQL) would silently truncate the data, leading to analytical loss.
4.2 High-Fidelity Normalization
Data arriving from distributed honeypots is often "dirty." The models include custom pre-validators like _normalize_null.
- Null Coalescing: Services often emit logging values as
"null"or"undefined"strings. TheNullableStringtype automatically converts these "noise" strings into actual PythonNonetypes during ingestion. - Timestamp Integrity:
NullableDatetimeensures that various ISO formats or epoch timestamps provided by different service containers are normalized into standard UTC datetime objects.
5. Integration Case Studies (Deep Analysis)
To understand how these models function, we must examine their lifecycle across the web stack.
5.1 The Repository Layer (decnet/web/db/sqlmodel_repo.py)
The repository is the primary consumer of the "Entities." It utilizes the metadata generated by SQLModel to:
- Generate DDL: On startup, the repository calls
SQLModel.metadata.create_all(). This takes everytable=Trueclass and translates it intoCREATE TABLEstatements tailored to the active engine (SQLite or MySQL). - Translate DTOs: When the repository fetches an
Attackerfrom the DB, SQLModel automatically populates the Pydantic-style attributes, allowing the repository to return objects that are immediately serializeable by the routers.
5.2 The Dashboard Routers
Specific endpoints rely on these models for boundary safety:
api_deploy_deckies.py: UsesDeployIniRequest. This ensures that even if a user tries to POST a massive binary file instead of an INI, the Pydantic layer (powered bydecnet.models.validate_ini_string) will intercept and reject the request with a422 Unprocessable Entityerror before it reaches the orchestrator.api_get_stats.py: UsesStatsResponse. This model serves as a "rollup" that aggregates data from theLog,Attacker, andStatetables into a single unified JSON object for the dashboard's "At a Glance" view.api_get_health.py: UsesHealthResponse. This model provides a nested view of the system, where each sub-component (Engine, Collector, DB) is represented as aComponentHealthobject, allowing the UI to show granular "Success" or "Failure" states.
6. Futureproofing & Guidelines
As the project grows, the following habits must be maintained:
- Keep the Row Narrow: Always separate behavioral data that updates frequently into auxiliary tables like
AttackerBehavior. - Use Variants: Never use standard
StringorTextfor JSON blobs; always use_BIG_TEXTto respect MySQL's storage limitations. - Validate at the Boundary: Ensure every new API request model uses Pydantic's strict typing to prevent malicious payloads from reaching the database layer.