REDB Architecture: The Thirteen-Table Schema Behind Typed Persistence

Jun 04, 2026 - 19:49
Updated: 2 hours ago
0 0
REDB Architecture: The Thirteen-Table Schema Behind Typed Persistence

REDB replaces untyped EAV storage with a thirteen-table relational schema that preserves C# type information at the database level. By routing field values into dedicated typed columns and utilizing a denormalized metadata cache, the engine eliminates runtime casting overhead. This architecture enables efficient array and dictionary storage while maintaining strict indexability and predictable query performance across PostgreSQL and Microsoft SQL Server.

Modern application development frequently demands data persistence layers that balance flexibility with rigorous type safety. Traditional object-relational mappers often force developers into rigid schema definitions or retreat into unstructured document storage. A different architectural approach emerges when examining the internal mechanics of REDB, a .NET persistence engine that deliberately abandons classical Entity-Attribute-Value patterns in favor of a highly structured, thirteen-table relational foundation.

REDB replaces untyped EAV storage with a thirteen-table relational schema that preserves C# type information at the database level. By routing field values into dedicated typed columns and utilizing a denormalized metadata cache, the engine eliminates runtime casting overhead. This architecture enables efficient array and dictionary storage while maintaining strict indexability and predictable query performance across PostgreSQL and Microsoft SQL Server.

Why does the engine reject classical EAV storage models?

Database architects frequently encounter the Entity-Attribute-Value pattern when building flexible data models. This approach stores all attributes within a single table using string identifiers for field names and a unified text column for values. While this design offers maximum schema flexibility, it introduces severe performance penalties during query execution. Filtering specific data types requires runtime casting operations that prevent the query planner from utilizing standard B-tree indexes effectively. Complex queries inevitably devolve into multiple self-joins against the same table, producing execution plans that degrade rapidly as data volume increases.

REDB addresses these limitations by embedding runtime type information directly into the relational schema. Instead of relying on string-based attribute names, the engine stores a foreign key that references a dedicated type catalog table. Each row in the primary values table contains multiple nullable columns, each corresponding to a specific C# data type. The persistence layer writes data to the appropriate column based on the declared property type, ensuring that every stored value maintains its original type identity. This design allows the database engine to apply native indexing strategies to numeric and temporal data without requiring expensive conversion operations during query execution.

The distinction between these two models extends beyond mere performance metrics. Classical EAV systems require application logic to reconstruct type boundaries during every read operation, creating a constant dependency between the database layer and the application framework. REDB shifts this responsibility to the database engine itself, allowing the relational system to enforce type boundaries natively. This separation of concerns reduces application complexity and prevents type-related runtime errors from propagating into production environments. The approach aligns with broader industry trends toward strongly typed data persistence, as seen in recent discussions about Eliminating Redundant Database Queries With Window Functions.

How does the thirteen-table architecture organize complex object graphs?

The persistence layer relies on a carefully partitioned schema that separates type definitions, structural metadata, object instances, and actual data values. The type catalog table establishes a foundational registry of supported data types, utilizing negative integer constants to distinguish system-defined types from user-generated identifiers. This separation prevents key collisions and ensures that internal markers for classes, arrays, and dictionaries never interfere with application data. The scheme and structure tables work in tandem to map C# classes to relational rows, capturing property names, ordering, and collection metadata during initialization.

Object instances reside in a dedicated table that supports unbounded hierarchical relationships through a self-referencing foreign key. This tree structure enables sections, categories, and organizational charts to be stored as native relational nodes rather than flattened arrays. The primary values table handles all field assignments by routing data into type-specific columns. Arrays and dictionaries utilize a clever marker row system that distinguishes empty collections from null values. Child elements reference their parent marker through a dedicated foreign key column, while a secondary text column stores positional or key-based identifiers. This approach eliminates the need for separate collection tables while preserving strict referential integrity across nested data structures.

Unique constraints enforce structural consistency across the values table. Three distinct indexes guarantee that scalar fields, collection markers, and individual array elements maintain strict uniqueness relative to their parent objects. These constraints operate at the database level, preventing data corruption before application logic can intervene. The design also accommodates nested class hierarchies by allowing child value rows to reference parent structure markers. This recursive referencing pattern mirrors the original object graph without requiring dynamic table generation or schema migration during runtime. The architecture demonstrates how relational databases can handle complex hierarchical data without sacrificing query performance or data integrity.

What role does the metadata cache play in query optimization?

Every read operation requires the persistence layer to determine which database columns correspond to specific object properties. Resolving this mapping through repeated joins against the structure and type catalog tables would introduce significant overhead during high-throughput workloads. The engine mitigates this cost by maintaining a denormalized metadata cache that precomputes the relationship between schemes and their underlying column definitions. This cache table stores type names, collection markers, and ordering information in a flat structure that mirrors the original schema definitions.

The cache updates automatically when the underlying scheme structure changes, triggering a lazy rebuild process that refreshes the denormalized rows. Query execution paths bypass the heavy join operations entirely by reading directly from this optimized cache. The persistence layer materializes object graphs by fetching all value rows in a single database call, then reconstructing the original C# object hierarchy entirely within application memory. This two-phase approach drastically reduces round-trip latency and allows the database to focus on raw data retrieval rather than structural resolution. The design proves particularly effective for deeply nested object graphs that would otherwise require recursive common table expressions to traverse.

Memory-based reconstruction replaces database recursion during object materialization. The engine loads all relevant value rows into a temporary array, then walks the property tree using in-memory traversal algorithms. This technique eliminates the need for repeated database round trips when resolving nested classes or iterating through array elements. The approach aligns with modern engineering practices that prioritize application-side computation over database-side recursion, as explored in Designing APIs for Agents: Moving Beyond RESTful Conventions. The cache mechanism ensures that schema evolution does not degrade query performance, providing a stable foundation for long-term application development.

How does the engine handle permissions and soft deletion?

Security and data lifecycle management require dedicated architectural components that operate independently from the core data storage layer. The permission system utilizes a many-to-many relationship table that maps user and role identifiers to specific object references. Permission resolution follows a strict inheritance model that walks upward through the object tree to locate the nearest ancestor with an explicit access rule. This recursive lookup process caps at a fixed depth to prevent performance degradation, relying on automatic triggers that cascade resolved permissions to newly created child objects. The system maintains a global fallback configuration that applies default access rules to objects without explicit permission assignments.

Data retention strategies employ a sophisticated soft deletion mechanism that avoids physical row removal during normal operations. The engine maintains a dedicated deleted scheme identifier that serves as a container for archived objects. Marking an object for deletion triggers a recursive reparenting operation that moves the entire object subtree under this archive container. Physical cleanup occurs through a separate batched purge process that runs independently of application transactions. This separation ensures that data lake ingestion tools and change data capture pipelines never encounter destructive delete operations on active data paths.

Cross-database compatibility requires careful handling of cascade constraints and dialect-specific syntax. PostgreSQL implementations rely on native foreign key cascade rules to manage object deletion automatically. Microsoft SQL Server configurations utilize custom trigger logic to replicate this behavior while avoiding multiple cascade path compilation errors. The engine abstracts these dialect differences behind a unified interface, allowing developers to deploy the same persistence layer across different relational platforms without modifying core logic. This abstraction layer demonstrates how modern database engines can maintain consistent application behavior while accommodating underlying platform constraints.

What structural conventions prevent key collisions and schema drift?

System identifiers utilize negative integer constants positioned near the minimum bounds of the bigint data type. This mathematical separation guarantees that application-generated keys, which typically originate from positive sequences, will never overlap with internal markers. The type catalog table stores approximately thirty-seven built-in types, each assigned a unique negative identifier. Numeric types consolidate into long and double columns, while semantic variants like email addresses and phone numbers route through a single string column. Marker types for classes, arrays, and dictionaries occupy dedicated slots without corresponding value columns. This design ensures that collection boundaries remain distinct from scalar data throughout the persistence lifecycle.

Schema evolution relies on a cryptographic hash mechanism that monitors structural changes across application deployments. Each scheme maintains a unique identifier derived from its complete field configuration. When developers modify property definitions, the engine recalculates this hash and compares it against the stored value. A mismatch triggers an automatic synchronization routine that updates the structure table and invalidates the corresponding metadata cache entries. This approach eliminates manual migration scripts and prevents schema drift between development and production environments. The system validates field names during insertion, rejecting reserved keywords and system prefixes before they reach the database layer.

The decimal storage configuration reflects deliberate engineering trade-offs regarding precision and storage overhead. The engine utilizes a thirty-eight digit precision with eighteen decimal places for all monetary and scientific calculations. This configuration prevents floating-point rounding errors that commonly affect financial applications. The additional storage requirement remains negligible compared to the performance gains achieved through native type routing. Developers benefit from predictable numerical behavior without sacrificing memory efficiency. The architecture demonstrates how careful type selection can resolve long-standing precision issues in enterprise data systems.

Architectural implications for modern persistence layers

The design choices within REDB demonstrate a deliberate commitment to predictable performance and type safety over schema flexibility. By embedding type information directly into the relational structure and precomputing metadata mappings, the engine eliminates the runtime overhead that typically plagues flexible data stores. This approach provides a reliable foundation for complex application domains while maintaining compatibility across major relational database platforms. Engineers evaluating persistence solutions should consider how early type enforcement and denormalized metadata can simplify long-term maintenance.

What's Your Reaction?

Like Like 0
Dislike Dislike 0
Love Love 0
Funny Funny 0
Wow Wow 0
Sad Sad 0
Angry Angry 0
Christopher Holloway

Christopher Holloway is the founder and director of Progressive Robot, a UK-based technology company. A full-stack engineer with more than two decades of experience, he works across PHP development, ecommerce, Linux infrastructure, technical SEO and AI automation, and writes here on technology, AI, hardware and software.

Comments (0)

User