High-Level Overview
TundraDB is an in-memory graph database with relational join semantics, bitemporal versioning, and Apache Arrow-based columnar output.
Query Layer
TundraQL parser (ANTLR4) → Query builder → Two-phase executor (traverse + BFS materialize) → Arrow Table output.
Storage Layer
In-memory node arenas + edge store. Parquet-based persistence with JSON metadata. Snapshot-based commit model.
Temporal Layer
Bitemporal versioning with VALIDTIME and TXNTIME. Per-node version chains with field-level copy-on-write.
Compute Layer
Intel TBB for parallel graph traversal. LLVM DenseMap for hot-path lookups. Arrow Compute for aggregations.
Database Layer
The Database class is the top-level entry point. It owns all managers and stores.
Database
├── SchemaRegistry   → schema name → Schema (fields, types, layout)
├── ShardManager     → schema → vector&lt;Shard&gt;
├── NodeManager      → global node registry (arena-based)
├── EdgeStore        → edge_type → source_id → vector&lt;Edge&gt;
├── Storage          → Parquet read/write
├── MetadataManager  → JSON metadata persistence
└── SnapshotManager  → orchestrates COMMIT
auto config = tundradb::make_config()
.with_db_path("./my-database")
.with_persistence_enabled(true)
.with_versioning_enabled(true) // enable temporal
.with_shard_capacity(100000) // nodes per shard
.with_chunk_size(10000) // Arrow chunk size
.with_memory_scale_factor(2.0) // double all pool sizes
.build();
tundradb::Database db(config);
Sharding
Nodes are partitioned into shards by schema. Each shard holds up to shard_capacity nodes and maintains its own memory arena.
ShardManager
├── "users"     → [ Shard(0..99999), Shard(100000..199999), ... ]
└── "companies" → [ Shard(0..99999) ]
Each Shard contains:
├── NodeArena       → fixed-layout memory for node data
├── id → NodeHandle → DenseMap for O(1) node lookup
├── min_id / max_id → ID range this shard covers
└── Arrow Table     → columnar cache (lazy, built on query)
When a shard reaches capacity, a new shard is created automatically. IDs are globally unique (atomic counter) while each schema also has a per-schema index counter.
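The ID-allocation and shard-assignment scheme above can be sketched as follows. This is a hypothetical simplification (the names `allocate_node_id`, `shard_index_for`, and `kShardCapacity` are illustrative, not TundraDB's actual API): a global atomic counter hands out unique IDs, and with fixed-capacity shards covering contiguous ID ranges, the owning shard falls out of simple arithmetic.

```cpp
#include <atomic>
#include <cstdint>
#include <cstddef>

// Matches the with_shard_capacity(100000) setting shown earlier.
constexpr uint64_t kShardCapacity = 100000;

// Global atomic counter: every caller gets a unique, monotonically
// increasing ID with no locking.
std::atomic<uint64_t> global_id_counter{0};

uint64_t allocate_node_id() {
  return global_id_counter.fetch_add(1, std::memory_order_relaxed);
}

// Shards cover contiguous ID ranges, so the shard index for an ID
// is a single division — no lookup structure needed.
size_t shard_index_for(uint64_t id) { return id / kShardCapacity; }
```

In the real system IDs are global across schemas while each schema keeps its own shard list and index counter; the arithmetic above only illustrates the range-partitioning idea.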
Node Storage
Nodes are stored in arena-allocated memory with a fixed-layout SchemaLayout that maps field names to offsets.
SchemaLayout
Pre-computed field offsets, sizes, and null bitmap position for each schema. Enables direct pointer arithmetic for field access — no hash map lookups in the hot path.
NodeArena
Bulk memory allocator for node data. Each node is a contiguous byte region at a known offset. NodeHandle contains a pointer to the base + version chain.
┌─────────────┬─────────────────────────────────────────┐
│ Null Bitmap │ Field Data (packed by SchemaLayout)     │
│  (N bits)   │ [name: StringRef] [age: int64] [...]    │
└─────────────┴─────────────────────────────────────────┘
       │                        │
 1 bit per field     Direct memory, no indirection
                     StringRef → StringArena pool
Edge Store
Edges are stored in an EdgeStore keyed by edge type, then source ID. Each edge has a source ID, target ID, source schema, and target schema.
EdgeStore
├── "friend"   → { 0: [Edge(0→1), Edge(0→3)],
│                  2: [Edge(2→4)] }
└── "works_at" → { 1: [Edge(1→100)],
                   3: [Edge(3→101)] }

// O(1) lookup: edge_store.get_edges("friend", source_id=0)
// Returns: [Edge(0→1), Edge(0→3)]
During MATCH execution, the engine iterates over source IDs from the QueryState and for each source does an O(1) lookup into the EdgeStore to find outgoing edges.
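The two-level keying described above can be sketched with a hypothetical mini-EdgeStore (using `std::unordered_map` in place of whatever map types the real store uses): edges are keyed by type, then by source ID, so fetching a node's outgoing edges of a given type costs two average-O(1) hash lookups, independent of the total edge count.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Edge {
  uint64_t source;
  uint64_t target;
};

// Illustrative sketch, not the actual EdgeStore class.
class EdgeStore {
 public:
  void add_edge(const std::string& type, uint64_t src, uint64_t dst) {
    edges_[type][src].push_back({src, dst});
  }

  // Two hash lookups: type, then source ID.
  const std::vector<Edge>& get_edges(const std::string& type,
                                     uint64_t src) const {
    static const std::vector<Edge> empty;
    auto t = edges_.find(type);
    if (t == edges_.end()) return empty;
    auto s = t->second.find(src);
    return s == t->second.end() ? empty : s->second;
  }

 private:
  // edge_type → source_id → outgoing edges
  std::unordered_map<std::string,
                     std::unordered_map<uint64_t, std::vector<Edge>>>
      edges_;
};
```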
Temporal Versioning
TundraDB supports bitemporal versioning, meaning each piece of data carries two independent time dimensions:
VALIDTIME
When the fact was true in the real world. Example: "Alice's salary was $100K from Jan 2024 to Mar 2024."
TXNTIME
When the database learned about the fact. Automatically set on writes. Enables "what did we know on date X?" queries.
Version Chain
Each node has a linked list of VersionInfo records. Each version stores:
- version_id — monotonically increasing
- valid_from / valid_to — VALIDTIME interval [from, to)
- tx_from / tx_to — TXNTIME interval [from, to)
- updated_fields — only the fields that changed (copy-on-write)
- prev — pointer to the previous version
A version V is visible at snapshot (vt, tt) if:
    V.valid_from ≤ vt < V.valid_to
    AND
    V.tx_from ≤ tt < V.tx_to

The resolver walks the version chain and returns the first version that satisfies both conditions. If no version matches, the node is invisible at that time.
v1: Alice created (age=25)
    valid: [0, ∞), tx: [0, ∞)
v2: Alice turns 26
    valid: [T₁, ∞), tx: [T₁, ∞)  — only the age field stored in the delta
v3: Retroactive correction
    valid: [0, T₁), tx: [T₂, ∞)  — fix: Alice was actually 24 at creation
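The visibility rule and chain walk can be sketched as follows (hypothetical struct layout; the real `VersionInfo` carries more fields, and superseded versions get their intervals closed on update): a version is visible at snapshot `(vt, tt)` when both half-open intervals contain the snapshot, and the resolver returns the first visible version walking newest to oldest.

```cpp
#include <cstdint>
#include <limits>

// "∞" in the intervals above, modeled as the max representable timestamp.
constexpr uint64_t kInf = std::numeric_limits<uint64_t>::max();

// Illustrative sketch of a version record, not the actual class.
struct VersionInfo {
  uint64_t valid_from, valid_to;  // VALIDTIME [from, to)
  uint64_t tx_from, tx_to;        // TXNTIME  [from, to)
  const VersionInfo* prev;        // older version in the chain
};

// Both half-open intervals must contain the snapshot point.
bool visible_at(const VersionInfo& v, uint64_t vt, uint64_t tt) {
  return v.valid_from <= vt && vt < v.valid_to &&
         v.tx_from <= tt && tt < v.tx_to;
}

// Walk newest → oldest; nullptr means the node is invisible at (vt, tt).
const VersionInfo* resolve(const VersionInfo* head, uint64_t vt, uint64_t tt) {
  for (const VersionInfo* v = head; v; v = v->prev)
    if (visible_at(*v, vt, tt)) return v;
  return nullptr;
}
```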
updated_fields is a SmallDenseMap&lt;uint16_t, char*&gt; that holds only the changed fields; unchanged fields are resolved by following prev to older versions. A lazy field cache (field_cache_) avoids repeated chain traversals.
Memory Architecture
TundraDB uses a hierarchy of arena allocators to minimize allocation overhead and maximize cache locality.
Database (1 GB default pool)
├── ShardManager (100 MB pool)
│   └── pmr::unordered_map for shard lookups
├── Per-Shard (10 MB each)
│   ├── NodeArena     → bulk node data
│   └── FreeListArena → reusable node slots
└── StringArena (per NodeManager)
    ├── Pool 0: FIXED_STRING16 (16 bytes/slot)
    ├── Pool 1: FIXED_STRING32 (32 bytes/slot)
    ├── Pool 2: FIXED_STRING64 (64 bytes/slot)
    └── Pool 3: Variable STRING (heap-allocated)
String values use a tiered StringArena. Short strings (≤16, ≤32, ≤64 bytes) go into fixed-size pools, giving predictable allocation sizes. Longer strings fall back to heap allocation. All strings are stored as StringRef (pointer + length).
Why arenas? Per-object new/delete causes heap fragmentation and cache misses. Arenas allocate in large contiguous blocks, improving locality and enabling bulk deallocation.
Persistence
Data is persisted using Apache Parquet (columnar format) with JSON metadata. The COMMIT command triggers a full snapshot write.
my-database/
├── metadata.json  → schemas, shard metadata, edge metadata
└── data/
    ├── users/
    │   ├── shard_0.parquet
    │   └── shard_1.parquet
    ├── companies/
    │   └── shard_0.parquet
    └── edges/
        ├── friend.parquet
        └── works_at.parquet
Commit Flow
1. SnapshotManager triggered
Collects all shards, edges, and schema metadata.
Shard → Arrow Table → Parquet
Each shard is converted to an Arrow Table, then written as a Parquet file via Storage::write_shard().
Edges → Parquet
Edge data for each type is serialized to Parquet.
Metadata → JSON
Schema definitions, shard ranges, and edge metadata written to metadata.json.
On startup, if persistence is enabled, the database reads metadata.json and restores all shards and edges from Parquet files.
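The commit ordering can be sketched with stub types (the real code converts each shard to an Arrow Table and writes Parquet through Storage::write_shard; here the writes are replaced by path strings so the orchestration is visible): shards first, then edges, then metadata last, so metadata.json only ever describes files that already exist.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for a shard's identity; not the actual Shard class.
struct Shard {
  std::string schema;
  int index;
};

// Returns the files a COMMIT would write, in order. In the real flow each
// entry corresponds to a Parquet (or JSON) write, not a string append.
std::vector<std::string> commit(const std::vector<Shard>& shards,
                                const std::vector<std::string>& edge_types) {
  std::vector<std::string> written;
  for (const auto& s : shards)  // step 2: shard → Arrow Table → Parquet
    written.push_back("data/" + s.schema + "/shard_" +
                      std::to_string(s.index) + ".parquet");
  for (const auto& t : edge_types)  // step 3: edges → Parquet
    written.push_back("data/edges/" + t + ".parquet");
  written.push_back("metadata.json");  // step 4: metadata written last
  return written;
}
```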
Technology Stack
Apache Arrow
Columnar in-memory format for query result tables. Enables zero-copy interop with analytics tools. Arrow Compute for filtering and aggregation.
Apache Parquet
Columnar on-disk format for persistence. Efficient compression and encoding. Direct Arrow ↔ Parquet conversion.
Intel TBB
Thread Building Blocks for parallel graph traversal. Batch processing of large ID sets across traversal phases.
LLVM
DenseMap and SmallVector for cache-efficient hot-path data structures. Lower overhead than std::unordered_map.
ANTLR4
Parser generator for TundraQL. Grammar → C++ lexer + parser. Powers the tundra_shell interactive console.
spdlog
Fast, header-only logging library. Structured logging throughout the database engine.