High-Level Overview
TundraDB is an in-memory graph database with relational join semantics, bitemporal versioning, and Apache Arrow-based columnar output.
Query Layer
TundraQL parser (ANTLR4) → Query builder → Two-phase executor (traverse + BFS materialize) → Arrow Table output.
Storage Layer
In-memory node arenas + edge store. Parquet-based persistence with JSON metadata. Snapshot-based commit model.
Temporal Layer
Bitemporal versioning with VALIDTIME and TXNTIME. Per-node version chains with field-level copy-on-write.
Compute Layer
Intel TBB for parallel graph traversal. LLVM DenseMap for hot-path lookups. Arrow Compute for aggregations.
Database Layer
The Database class is the top-level entry point. It owns all managers and stores.
Database
├── SchemaRegistry   → schema name → Schema (fields, types, layout)
├── ShardManager     → schema → vector&lt;Shard&gt;
├── NodeManager      → global node registry (arena-based)
├── EdgeStore        → edge_type → source_id → vector&lt;Edge&gt;
├── Storage          → Parquet read/write
├── MetadataManager  → JSON metadata persistence
└── SnapshotManager  → orchestrates COMMIT
auto config = tundradb::make_config()
.with_db_path("./my-database")
.with_persistence_enabled(true)
.with_versioning_enabled(true) // enable temporal
.with_shard_capacity(100000) // nodes per shard
.with_chunk_size(10000) // Arrow chunk size
.with_memory_scale_factor(2.0) // double all pool sizes
.build();
tundradb::Database db(config);
Sharding
Nodes are partitioned into shards by schema. Each shard holds up to shard_capacity nodes and maintains its own memory arena.
ShardManager
├── "users"     → [ Shard(0..99999), Shard(100000..199999), ... ]
└── "companies" → [ Shard(0..99999) ]
Each Shard contains:
├── NodeArena       → fixed-layout memory for node data
├── id → NodeHandle → DenseMap for O(1) node lookup
├── min_id / max_id → ID range this shard covers
└── Arrow Table     → columnar cache (lazy, built on query)
When a shard reaches capacity, a new shard is created automatically. IDs are globally unique (atomic counter) while each schema also has a per-schema index counter.
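The ID-allocation and shard-assignment scheme above can be sketched as follows. This is a hypothetical simplification (the names `allocate_node_id`, `shard_index_for`, and `kShardCapacity` are illustrative, not TundraDB's actual API): a global atomic counter hands out unique IDs, and with fixed-capacity shards covering contiguous ID ranges, the owning shard falls out of simple arithmetic.

```cpp
#include <atomic>
#include <cstdint>
#include <cstddef>

// Matches the with_shard_capacity(100000) setting shown earlier.
constexpr uint64_t kShardCapacity = 100000;

// Global atomic counter: every caller gets a unique, monotonically
// increasing ID with no locking.
std::atomic<uint64_t> global_id_counter{0};

uint64_t allocate_node_id() {
  return global_id_counter.fetch_add(1, std::memory_order_relaxed);
}

// Shards cover contiguous ID ranges, so the shard index for an ID
// is a single division — no lookup structure needed.
size_t shard_index_for(uint64_t id) { return id / kShardCapacity; }
```

In the real system IDs are global across schemas while each schema keeps its own shard list and index counter; the arithmetic above only illustrates the range-partitioning idea.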
Node Storage
Nodes are stored in arena-allocated memory with a fixed-layout SchemaLayout that maps field names to offsets.
SchemaLayout
Pre-computed field offsets, sizes, and null bitmap position for each schema. Enables direct pointer arithmetic for field access — no hash map lookups in the hot path.
NodeArena
Bulk memory allocator for node data. Each node is a contiguous byte region at a known offset. NodeHandle contains a pointer to the base + version chain.
┌─────────────┬─────────────────────────────────────────┐
│ Null Bitmap │ Field Data (packed by SchemaLayout)     │
│  (N bits)   │ [name: StringRef] [age: int64] [...]    │
└─────────────┴─────────────────────────────────────────┘
       │                        │
 1 bit per field     Direct memory, no indirection
                     StringRef → StringArena pool
Edge Store
Edges are stored in an EdgeStore keyed by edge type, then source ID. Each edge has a source ID, target ID, source schema, and target schema.
EdgeStore
├── "friend"   → { 0: [Edge(0→1), Edge(0→3)],
│                  2: [Edge(2→4)] }
└── "works_at" → { 1: [Edge(1→100)],
                   3: [Edge(3→101)] }

// O(1) lookup: edge_store.get_edges("friend", source_id=0)
// Returns: [Edge(0→1), Edge(0→3)]
During MATCH execution, the engine iterates over source IDs from the QueryState and for each source does an O(1) lookup into the EdgeStore to find outgoing edges.
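The two-level keying described above can be sketched with a hypothetical mini-EdgeStore (using `std::unordered_map` in place of whatever map types the real store uses): edges are keyed by type, then by source ID, so fetching a node's outgoing edges of a given type costs two average-O(1) hash lookups, independent of the total edge count.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct Edge {
  uint64_t source;
  uint64_t target;
};

// Illustrative sketch, not the actual EdgeStore class.
class EdgeStore {
 public:
  void add_edge(const std::string& type, uint64_t src, uint64_t dst) {
    edges_[type][src].push_back({src, dst});
  }

  // Two hash lookups: type, then source ID.
  const std::vector<Edge>& get_edges(const std::string& type,
                                     uint64_t src) const {
    static const std::vector<Edge> empty;
    auto t = edges_.find(type);
    if (t == edges_.end()) return empty;
    auto s = t->second.find(src);
    return s == t->second.end() ? empty : s->second;
  }

 private:
  // edge_type → source_id → outgoing edges
  std::unordered_map<std::string,
                     std::unordered_map<uint64_t, std::vector<Edge>>>
      edges_;
};
```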
Temporal Versioning
TundraDB supports bitemporal versioning, meaning each piece of data carries two independent time dimensions:
VALIDTIME
When the fact was true in the real world. Example: "Alice's salary was $100K from Jan 2024 to Mar 2024."
TXNTIME
When the database learned about the fact. Automatically set on writes. Enables "what did we know on date X?" queries.
Version Chain
Each node has a linked list of VersionInfo records. Each version stores:
- version_id — monotonically increasing
- valid_from / valid_to — VALIDTIME interval [from, to)
- tx_from / tx_to — TXNTIME interval [from, to)
- updated_fields — only the fields that changed (copy-on-write)
- prev — pointer to the previous version
A version V is visible at snapshot (vt, tt) if:
    V.valid_from ≤ vt < V.valid_to
    AND
    V.tx_from ≤ tt < V.tx_to

The resolver walks the version chain and returns the first version that satisfies both conditions. If no version matches, the node is invisible at that time.
v1: Alice created (age=25)
    valid: [0, ∞), tx: [0, ∞)
v2: Alice turns 26
    valid: [T₁, ∞), tx: [T₁, ∞)  — only the age field stored in the delta
v3: Retroactive correction
    valid: [0, T₁), tx: [T₂, ∞)  — fix: Alice was actually 24 at creation
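The visibility rule and chain walk can be sketched as follows (hypothetical struct layout; the real `VersionInfo` carries more fields, and superseded versions get their intervals closed on update): a version is visible at snapshot `(vt, tt)` when both half-open intervals contain the snapshot, and the resolver returns the first visible version walking newest to oldest.

```cpp
#include <cstdint>
#include <limits>

// "∞" in the intervals above, modeled as the max representable timestamp.
constexpr uint64_t kInf = std::numeric_limits<uint64_t>::max();

// Illustrative sketch of a version record, not the actual class.
struct VersionInfo {
  uint64_t valid_from, valid_to;  // VALIDTIME [from, to)
  uint64_t tx_from, tx_to;        // TXNTIME  [from, to)
  const VersionInfo* prev;        // older version in the chain
};

// Both half-open intervals must contain the snapshot point.
bool visible_at(const VersionInfo& v, uint64_t vt, uint64_t tt) {
  return v.valid_from <= vt && vt < v.valid_to &&
         v.tx_from <= tt && tt < v.tx_to;
}

// Walk newest → oldest; nullptr means the node is invisible at (vt, tt).
const VersionInfo* resolve(const VersionInfo* head, uint64_t vt, uint64_t tt) {
  for (const VersionInfo* v = head; v; v = v->prev)
    if (visible_at(*v, vt, tt)) return v;
  return nullptr;
}
```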
updated_fields is a SmallDenseMap&lt;uint16_t, char*&gt; that holds only the changed fields; unchanged fields are resolved by following prev to older versions. A lazy field cache (field_cache_) avoids repeated chain traversals.
Memory Architecture
TundraDB uses a hierarchy of arena allocators to minimize allocation overhead and maximize cache locality.
Database (1 GB default pool)
├── ShardManager (100 MB pool)
│   └── pmr::unordered_map for shard lookups
├── Per-Shard (10 MB each)
│   ├── NodeArena     → bulk node data
│   └── FreeListArena → reusable node slots
└── StringArena (per NodeManager)
    ├── Pool 0: FIXED_STRING16 (16 bytes/slot)
    ├── Pool 1: FIXED_STRING32 (32 bytes/slot)
    ├── Pool 2: FIXED_STRING64 (64 bytes/slot)
    └── Pool 3: Variable STRING (heap-allocated)
String values use a tiered StringArena. Short strings (≤16, ≤32, ≤64 bytes) go into fixed-size pools, giving predictable allocation sizes. Longer strings fall back to heap allocation. All strings are stored as StringRef (pointer + length).
Why arenas? Per-object new/delete causes heap fragmentation and cache misses. Arenas allocate in large contiguous blocks, improving locality and enabling bulk deallocation.
Persistence
Data is persisted using Apache Parquet (columnar format) with JSON metadata. The COMMIT command triggers a full snapshot write.
my-database/
├── metadata.json  → schemas, shard metadata, edge metadata
└── data/
    ├── users/
    │   ├── shard_0.parquet
    │   └── shard_1.parquet
    ├── companies/
    │   └── shard_0.parquet
    └── edges/
        ├── friend.parquet
        └── works_at.parquet
Commit Flow
1. SnapshotManager triggered
Collects all shards, edges, and schema metadata.
Shard → Arrow Table → Parquet
Each shard is converted to an Arrow Table, then written as a Parquet file via Storage::write_shard().
Edges → Parquet
Edge data for each type is serialized to Parquet.
Metadata → JSON
Schema definitions, shard ranges, and edge metadata written to metadata.json.
On startup, if persistence is enabled, the database reads metadata.json and restores all shards and edges from Parquet files.
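The commit ordering can be sketched with stub types (the real code converts each shard to an Arrow Table and writes Parquet through Storage::write_shard; here the writes are replaced by path strings so the orchestration is visible): shards first, then edges, then metadata last, so metadata.json only ever describes files that already exist.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for a shard's identity; not the actual Shard class.
struct Shard {
  std::string schema;
  int index;
};

// Returns the files a COMMIT would write, in order. In the real flow each
// entry corresponds to a Parquet (or JSON) write, not a string append.
std::vector<std::string> commit(const std::vector<Shard>& shards,
                                const std::vector<std::string>& edge_types) {
  std::vector<std::string> written;
  for (const auto& s : shards)  // step 2: shard → Arrow Table → Parquet
    written.push_back("data/" + s.schema + "/shard_" +
                      std::to_string(s.index) + ".parquet");
  for (const auto& t : edge_types)  // step 3: edges → Parquet
    written.push_back("data/edges/" + t + ".parquet");
  written.push_back("metadata.json");  // step 4: metadata written last
  return written;
}
```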
Technology Stack
Apache Arrow
Columnar in-memory format for query result tables. Enables zero-copy interop with analytics tools. Arrow Compute for filtering and aggregation.
Apache Parquet
Columnar on-disk format for persistence. Efficient compression and encoding. Direct Arrow ↔ Parquet conversion.
Intel TBB
Thread Building Blocks for parallel graph traversal. Batch processing of large ID sets across traversal phases.
LLVM
DenseMap and SmallVector for cache-efficient hot-path data structures. Lower overhead than std::unordered_map.
ANTLR4
Parser generator for TundraQL. Grammar → C++ lexer + parser. Powers the tundra_shell interactive console.
spdlog
Fast, header-only logging library. Structured logging throughout the database engine.