Roadmap and Phases

AxiomDB is developed in phases, each of which adds a coherent vertical slice of functionality. The design is organized in four blocks:

  • Block 1 (Phases 1–7): Core engine — storage, indexing, WAL, transactions, SQL parsing, and concurrent MVCC.
  • Block 2 (Phases 8–14): SQL completeness — full query planner, optimizer, advanced SQL features, and MySQL wire protocol.
  • Block 3 (Phases 15–34): Production hardening — replication, backups, distributed execution, column store, and AI/ML integration.
  • Block 4 (Phases 35–39): Platform surfaces and storage evolution — deployment and DevEx, AxiomQL, a Wasm runtime, and the clustered index storage engine.

Current Status

Last completed subphase: 40.1b CREATE INDEX on clustered tables — removed ensure_heap_runtime guard; CREATE INDEX / CREATE UNIQUE INDEX now work on clustered (PRIMARY KEY) tables using ClusteredSecondaryLayout-based index build with partial index, NULL-skipping, and uniqueness enforcement at build time.

Active development: Phase 40 — Clustered engine performance optimizations (40.1 ClusteredInsertBatch done; 40.1b CREATE INDEX on clustered tables done; statement plan cache, transaction write set, vectorized scan next)

Next milestone: 40.2 — Statement plan cache (per-session CachedPlanSource with OID-based invalidation)

Concurrency note: the current server already supports concurrent read-only queries, but mutating statements are still serialized through a database-wide Arc<RwLock<Database>> write guard. The next concurrency milestone is Phase 13.7 row-level locking, followed by deadlock detection and explicit locking clauses.


Phase Progress

Block 1 — Core Engine

| Phase | Name | Status | Key deliverables |
|---|---|---|---|
| 1.1 | Workspace setup | ✅ | Cargo workspace, crate structure |
| 1.2 | Page format | ✅ | 16 KB pages, header, CRC32c checksum |
| 1.3 | MmapStorage | ✅ | mmap-backed storage engine |
| 1.4 | MemoryStorage | ✅ | In-memory storage for tests |
| 1.5 | FreeList | ✅ | Bitmap page allocator |
| 1.6 | StorageEngine trait | ✅ | Unified interface + heap pages |
| 2.1 | B+ Tree insert/split | ✅ | CoW insert with recursive splits |
| 2.2 | B+ Tree delete | ✅ | Rebalance, redistribute, merge |
| 2.3 | B+ Tree range scan | ✅ | RangeIter with tree traversal |
| 2.4 | Prefix compression | ✅ | CompressedNode for internal keys |
| 3.1 | WAL entry format | ✅ | Binary format, CRC32c, backward scan |
| 3.2 | WAL writer | ✅ | WalWriter with file header |
| 3.3 | WAL reader | ✅ | Forward and backward iterators |
| 3.4 | TxnManager | ✅ | BEGIN/COMMIT/ROLLBACK, snapshot |
| 3.5 | Checkpoint | ✅ | 5-step checkpoint protocol |
| 3.6 | Crash recovery | ✅ | CRASHED→RECOVERING→REPLAYING→VERIFYING→READY |
| 3.7 | Durability tests | ✅ | 9 crash scenarios |
| 3.8 | Post-recovery checker | ✅ | Heap structural + MVCC invariants |
| 3.9 | Catalog bootstrap | ✅ | axiom_tables, axiom_columns, axiom_indexes |
| 3.10 | Catalog reader | ✅ | MVCC-aware schema lookup |
| 3.17 | WAL batch append | ✅ | record_insert_batch(): O(1) write_all for N entries via reserve_lsns+write_batch |
| 3.18 | WAL PageWrite | ✅ | EntryType::PageWrite=9: 1 WAL entry/page vs N/row; 238× fewer for 10K-row insert |
| 3.19 | WAL Group Commit | ✅ | CommitCoordinator: batches fsyncs across connections; up to 16× concurrent throughput |
| 4.1 | SQL AST | ✅ | All statement types |
| 4.2 | SQL lexer | ✅ | logos DFA, ~85 tokens, zero-copy |
| 4.3 | DDL parser | ✅ | CREATE/DROP/ALTER TABLE, CREATE/DROP INDEX |
| 4.4 | DML parser | ✅ | SELECT (all clauses), INSERT, UPDATE, DELETE |
| 4.17 | Expression evaluator | ✅ | Three-valued NULL logic, all operators |
| 4.18 | Semantic analyzer | ✅ | BindContext, col_idx resolution |
| 4.18b | Type coercion matrix | ✅ | coerce(), coerce_for_op(), CoercionMode strict/permissive |
| 4.23 | QueryResult type | ✅ | Row, ColumnMeta, QueryResult (Rows/Affected/Empty) |
| 4.5b | Table engine | ✅ | TableEngine scan/insert/delete/update over heap; later generalized by Phase 39 table-root metadata |
| 4.5 + 4.5a | Basic executor | ✅ | SELECT/INSERT/UPDATE/DELETE, DDL, txn control, SELECT without FROM |
| 4.25 + 4.7 | Error handling framework | ✅ | Complete SQLSTATE mapping; ErrorResponse{sqlstate,message,detail,hint} |
| 4.8 | JOIN (nested loop) | ✅ | INNER/LEFT/RIGHT/CROSS; USING; multi-table; FULL→NotImplemented |
| 4.9a+4.9c+4.9d | GROUP BY + Aggregates + HAVING | ✅ | COUNT/SUM/MIN/MAX/AVG; hash-based; HAVING; NULL grouping |
| 4.10+4.10b+4.10c | ORDER BY + LIMIT/OFFSET | ✅ | Multi-column; NULLS FIRST/LAST; LIMIT/OFFSET pagination |
| 4.12 | DISTINCT | ✅ | HashSet dedup on output rows; NULL=NULL; pre-LIMIT |
| 4.24 | CASE WHEN | ✅ | Searched + simple form; NULL semantics; all contexts |
| 4.6 | INSERT … SELECT | ✅ | Reuses execute_select; MVCC prevents self-reads |
| 6.1–6.3 | Secondary indexes + planner | ✅ | CREATE INDEX, index maintenance, B-Tree point/range lookup |
| 6.4 | Bloom filter per index | ✅ | BloomRegistry; zero B-Tree reads for definite-absent keys (1% FPR) |
| 6.5/6.6 | Foreign key constraints | ✅ | REFERENCES, ALTER TABLE FK; INSERT/DELETE/CASCADE/SET NULL enforcement |
| 6.7 | Partial UNIQUE index | ✅ | CREATE INDEX … WHERE predicate; soft-delete uniqueness pattern |
| 6.8 | Fill factor | ✅ | WITH (fillfactor=N) on CREATE INDEX; B-Tree leaf split at ⌈FF×ORDER_LEAF/100⌉ |
| 6.9 | FK + Index improvements | ✅ | PK B-Tree population; FK composite key index; composite index planner |
| 6.10–6.12 | Index statistics + ANALYZE | ✅ | Per-column NDV/row_count; planner cost gate (sel > 20% → Scan); ANALYZE command; staleness tracking |
| 6.16 | PK SELECT planner parity | ✅ | PRIMARY KEY equality/range now participate in single-table SELECT planning; PK equality bypasses the scan-biased cost gate |
| 6.17 | Indexed UPDATE candidate path | ✅ | UPDATE now discovers PK / indexed candidates through B-Tree access before entering the 5.20 write path |
| 6.18 | Indexed multi-row INSERT batch path | ✅ | Immediate multi-row VALUES statements now reuse grouped heap/index apply on indexed tables while preserving strict same-statement UNIQUE semantics |
| 6.19 | WAL fsync pipeline | 🔄 | Server commits now use an always-on leader-based fsync pipeline and the old timer-based CommitCoordinator path is gone, but the single-connection insert_autocommit benchmark still misses target throughput |
| 6.20 | UPDATE apply fast path | ✅ | PK-range UPDATE now batches candidate heap reads, skips no-op rows, batches UpdateInPlace WAL writes, and groups per-index delete+insert/root persistence |
| 5 | Executor (advanced) | ⚠️ Planned | JOIN, GROUP BY, ORDER BY, index lookup, aggregate |
| 6.8+ | Index statistics, FK improvements | ⚠️ Planned | Fill factor, composite FKs, ON UPDATE CASCADE, ANALYZE, index-only scans |
| 7 | Full MVCC | ⚠️ Planned | SSI, write-write conflicts, epoch reclamation |

Block 2 — SQL Completeness

| Phase | Name | Status | Key deliverables |
|---|---|---|---|
| 8 | Advanced SQL | ⚠️ Planned | Window functions, CTEs, recursive queries |
| 9 | VACUUM / GC | ⚠️ Planned | Dead row cleanup, freelist compaction |
| 10 | MySQL wire protocol | ⚠️ Planned | COM_QUERY, result set packets, handshake |
| 11 | TOAST | ⚠️ Planned | Out-of-line storage for large values |
| 12 | Full-text search | ⚠️ Planned | Inverted index, BM25 ranking |
| 13 | Foreign key checks | ⚠️ Planned | Constraint validation on insert/delete |
| 14 | Vectorized execution | ⚠️ Planned | SIMD scans, morsel-driven pipeline |

Block 3 — Production Hardening

| Phase | Name | Status |
|---|---|---|
| 15 | Connection pooling | ⚠️ Planned |
| 16 | Replication (primary-replica) | ⚠️ Planned |
| 17 | Point-in-time recovery (PITR) | ⚠️ Planned |
| 18 | Online DDL | ⚠️ Planned |
| 19 | Partitioning | ⚠️ Planned |
| 20 | Column store (HTAP) | ⚠️ Planned |
| 21 | VECTOR index (ANN) | ⚠️ Planned |
| 22–34 | Distributed, cloud-native, AI/ML | ⚠️ Future |

Block 4 — Platform Surfaces and Storage Evolution

| Phase | Name | Status | Key deliverables |
|---|---|---|---|
| 35 | Deployment and DevEx | ⚠️ Planned | Docker, config tooling, release UX |
| 36 | AxiomQL Core | ⚠️ Planned | Alternative read query language over the same AST/executor |
| 37 | AxiomQL Write + DDL + Control | ⚠️ Planned | AxiomQL DML, DDL, control flow, maintenance |
| 38 | AxiomDB-Wasm | ⚠️ Planned | Browser runtime, OPFS backend, sync, live queries |
| 39 | Clustered index storage engine | 🔄 In progress | Inline PK rows, clustered internal/leaf pages, PK bookmarks in secondary indexes, logical clustered WAL/rollback, clustered crash recovery, clustered-aware CREATE TABLE |

Completed Phases — Summary

Phase 1 — Storage Engine

A generic storage layer with two implementations: MmapStorage for production disk use and MemoryStorage for tests. Every higher-level component uses only the StorageEngine trait — storage is pluggable. Pages are 16 KB with a 64-byte header (magic, page type, CRC32c checksum, page_id, LSN, free pointers). Heap pages use a slotted format: slots grow from the start, tuples grow from the end toward the center.
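The slotted-page free-space accounting described above can be sketched as follows. This is an illustrative model under stated assumptions (a 4-byte slot entry; the struct and method names are hypothetical, not AxiomDB's actual types):

```rust
// Model of a slotted heap page: 16 KB total, 64-byte header,
// slot array growing down from the header, tuples growing up from the end.
const PAGE_SIZE: usize = 16 * 1024;
const HEADER_SIZE: usize = 64;
const SLOT_SIZE: usize = 4; // assumed: 2-byte offset + 2-byte length per slot

struct HeapPage {
    slot_count: usize, // slots occupied after the header
    free_end: usize,   // start of the tuple area (moves toward the header)
}

impl HeapPage {
    fn new() -> Self {
        HeapPage { slot_count: 0, free_end: PAGE_SIZE }
    }

    /// Free bytes between the end of the slot array and the tuple area.
    fn free_space(&self) -> usize {
        self.free_end - (HEADER_SIZE + self.slot_count * SLOT_SIZE)
    }

    /// Try to place a tuple of `tuple_len` bytes; returns its slot id if it fits.
    fn insert(&mut self, tuple_len: usize) -> Option<usize> {
        // A new tuple costs its payload plus one new slot entry.
        if self.free_space() < tuple_len + SLOT_SIZE {
            return None;
        }
        self.free_end -= tuple_len;
        let slot_id = self.slot_count;
        self.slot_count += 1;
        Some(slot_id)
    }
}
```

The two regions meeting in the middle is what makes the format cheap: inserting never shifts existing tuples, and free space is a single subtraction.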

Phase 2 — B+ Tree CoW

A persistent, Copy-on-Write B+ Tree over StorageEngine. Keys up to 64 bytes; ORDER_INTERNAL = 223, ORDER_LEAF = 217 (derived to fill exactly one 16 KB page). Root is an AtomicU64 — readers are lock-free by design. Supports insert (with recursive split), delete (with rebalance/redistribute/merge), and range scan via RangeIter. Prefix compression for internal nodes in memory.
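The lock-free reader property follows from the AtomicU64 root: a CoW insert builds entirely new pages off to the side, then publishes the new root with one atomic store, while in-flight readers keep traversing the old (still intact) tree. A minimal sketch of that handoff — the type and method names are illustrative, not AxiomDB's actual API:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch: the only shared mutable state between readers and a CoW writer
// is the root page id. Readers never take a lock; they just load it.
struct BTree {
    root: AtomicU64, // page id of the current root
}

impl BTree {
    /// A reader snapshots the root once and traverses from there.
    fn current_root(&self) -> u64 {
        self.root.load(Ordering::Acquire)
    }

    /// After a CoW insert finishes writing new pages, the writer
    /// publishes the new root in a single release store.
    fn publish_root(&self, new_root_page: u64) {
        self.root.store(new_root_page, Ordering::Release);
    }
}
```

Acquire/Release ordering ensures a reader that observes the new root id also observes the page writes that produced it.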

Phase 3 — WAL and Transactions ✅ 100% complete

Append-only Write-Ahead Log with binary entries, CRC32c checksums, and forward/backward scan iterators. TxnManager coordinates BEGIN/COMMIT/ROLLBACK with snapshot assignment. Five-step checkpoint protocol. Crash recovery state machine (five states). Catalog bootstrap creates the three system tables on first open. CatalogReader provides MVCC-consistent schema reads. Nine crash scenario tests with a post-recovery integrity checker.

Phase 3 late additions (3.17–3.19):

  • 3.17 WAL batch append — record_insert_batch() uses WalWriter::reserve_lsns(N) + write_batch() to write N Insert WAL entries in a single write_all call. Reduces BufWriter overhead from O(N rows) to O(1) for bulk inserts.

  • 3.18 WAL PageWrite — EntryType::PageWrite = 9. One WAL entry per affected heap page instead of one per row. new_value = post-modification page bytes (16 KB) + embedded slot IDs for crash recovery undo. For a 10K-row bulk insert: 42 WAL entries instead of 10,000 — 238× fewer serializations and 30% smaller WAL file.

  • 3.19 WAL Group Commit — CommitCoordinator batches DML commits from concurrent connections. DML commits write to the WAL BufWriter, register with the coordinator, and release the Database lock before awaiting fsync confirmation. A background Tokio task performs one flush+fsync per batch window (group_commit_interval_ms), then notifies all waiting connections. Enables near-linear concurrent write scaling.
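The batching logic of 3.19 can be modeled deterministically in a few lines. This is a single-threaded sketch of the accounting only — the real coordinator runs as a background Tokio task and batches across connections on a timer; the names here are illustrative, not AxiomDB's types:

```rust
// Model of group commit accounting: many transactions register their
// commit LSNs, then one flush makes the whole batch durable with a
// single (simulated) fsync.
struct CommitCoordinator {
    pending: Vec<u64>, // commit LSNs awaiting durability
    durable_lsn: u64,  // highest LSN covered by an fsync so far
    fsync_count: usize,
}

impl CommitCoordinator {
    fn new() -> Self {
        CommitCoordinator { pending: Vec::new(), durable_lsn: 0, fsync_count: 0 }
    }

    /// A connection registers its commit and returns without fsyncing.
    fn register(&mut self, lsn: u64) {
        self.pending.push(lsn);
    }

    /// The background task runs this once per batch window.
    fn flush_batch(&mut self) {
        if let Some(&max) = self.pending.iter().max() {
            self.fsync_count += 1; // one fsync covers the entire batch
            self.durable_lsn = self.durable_lsn.max(max);
            self.pending.clear();
        }
    }

    fn is_durable(&self, lsn: u64) -> bool {
        lsn <= self.durable_lsn
    }
}
```

The win is visible in the counters: 16 registered commits become durable with one fsync instead of 16, which is where the concurrent-throughput multiplier comes from.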

Phase 4 — SQL Processing

SQL AST covering all DML (SELECT, INSERT, UPDATE, DELETE) and DDL (CREATE/DROP/ALTER TABLE, CREATE/DROP INDEX). logos-based lexer with ~85 tokens, case-insensitive keywords, zero-copy identifiers. Recursive descent parser with full expression precedence. Expression evaluator with three-valued NULL logic (AND, OR, NOT, IS NULL, BETWEEN, LIKE, IN). Semantic analyzer with BindContext, qualified/unqualified column resolution, ambiguity detection, and subquery support. Row codec with null bitmap, u24 string lengths, and O(n) encoded_len().
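The three-valued NULL logic mentioned above follows the standard SQL truth tables, which can be expressed compactly with Option<bool> (None modeling NULL). A sketch, not AxiomDB's actual evaluator:

```rust
// SQL three-valued logic: note that NULL AND FALSE = FALSE and
// NULL OR TRUE = TRUE, because one operand already decides the result.
fn sql_and(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None, // every remaining combination involves NULL
    }
}

fn sql_or(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        (Some(true), _) | (_, Some(true)) => Some(true),
        (Some(false), Some(false)) => Some(false),
        _ => None,
    }
}

fn sql_not(a: Option<bool>) -> Option<bool> {
    a.map(|v| !v) // NOT NULL is still NULL
}
```

WHERE clauses then keep only rows whose predicate evaluates to Some(true); both Some(false) and None (UNKNOWN) are filtered out.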


Near-Term Priorities

Phase 13 — Row-Level Writer Concurrency

The current implementation uses Arc<tokio::sync::RwLock<Database>>: reads can overlap, but mutating statements are still serialized at whole-database scope. Phase 13.7 removes that bottleneck with row-level locking. Phase 13.8 adds deadlock detection, and 13.8b adds SELECT ... FOR UPDATE, NOWAIT, and SKIP LOCKED.
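One common shape for the row-level locking direction described above is a lock table keyed by RecordId, so writers contend only when they touch the same row. This is a hypothetical sketch of that idea, not the planned Phase 13.7 implementation — the types and names are assumptions:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical per-row lock table: disjoint writers take disjoint locks,
// so they no longer serialize behind one database-wide write guard.
#[derive(Default)]
struct RowLockTable {
    locks: Mutex<HashMap<u64, Arc<Mutex<()>>>>, // RecordId -> row lock
}

impl RowLockTable {
    /// Fetch (or lazily create) the lock guarding one row.
    fn lock_for(&self, rid: u64) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(rid)
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}
```

The map itself is still guarded, but only for the brief lookup; transactions then hold just their per-row locks for the duration of the write. Deadlock detection (Phase 13.8) becomes necessary exactly because two transactions can now acquire row locks in opposite orders.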

Phase 5

Phase 5 is now complete. The final subphase to close was:

  • 5.15 DSN parsing — AxiomDB-owned surfaces now accept typed DSNs: AXIOMDB_URL for server bootstrap plus Db::open_dsn, AsyncDb::open_dsn, and axiomdb_open_dsn for embedded mode. mysql:// and postgres:// are parse aliases only; the server still speaks MySQL wire only and embedded mode still accepts only local-path DSNs.

Phase 5 also closed the recent runtime/perf subphases:

  • 5.11c Explicit connection state machine — the MySQL server now has an explicit CONNECTED → AUTH → IDLE → EXECUTING → CLOSING transport lifecycle with fixed auth timeout, wait_timeout vs interactive_timeout behavior, net_write_timeout for packet writes, and socket keepalive configured separately from SQL session state.
  • 5.19a Executor decomposition — the SQL executor now lives in a responsibility-based executor/ module tree instead of one monolithic file, which lowers the cost of later DML and planner work.
  • 5.19 B+Tree batch delete — DELETE WHERE and the old-key half of UPDATE now stage exact encoded keys per index and remove them with one ordered delete_many_in(...) pass per tree instead of one delete_in(...) traversal per row.
  • 5.19b Eval decomposition — the expression evaluator now lives under a responsibility-based eval/ module tree with the same public API, which lowers the cost of future built-in and collation work without changing SQL behavior.
  • 5.20 Stable-RID UPDATE fast path — UPDATE can now rewrite rows in the same heap slot when the new encoded row fits, preserve the RecordId, and skip unnecessary index maintenance for indexes whose logical key membership is unchanged.
  • 5.21 Transactional INSERT staging — explicit transactions now buffer consecutive INSERT ... VALUES statements per table and flush them together on COMMIT or the next barrier statement, preserving savepoint semantics by flushing before the next statement savepoint whenever the batch cannot continue.

Phase 6 closing note — Integrity and recovery

Phase 6 now closes with startup index integrity verification:

  • every catalog-visible index is compared against heap-visible rows after WAL recovery
  • readable divergence is repaired automatically from heap contents
  • unreadable index trees fail open with IndexIntegrityFailure

SQL REINDEX remains deferred to the later diagnostics / administration phases.

Phase 6 closing note — Indexed multi-row INSERT on indexed tables

Phase 6 also closes the remaining immediate multi-row VALUES debt on indexed tables:

  • shared batch-apply helpers are now reused by both 5.21 staging flushes and the immediate INSERT ... VALUES (...), (...) path
  • PRIMARY KEY and secondary indexes no longer force a per-row fallback for multi-row VALUES statements
  • same-statement UNIQUE detection remains strict because the immediate path does not reuse the staged committed_empty shortcut

Phase 5 executor operators

The Phase 5 executor is assembled from these operator nodes:

  1. Index range scan — range predicate via RangeIter.
  2. Projection — evaluate SELECT expressions over rows from the scan.
  3. Filter — apply WHERE expression using the evaluator from Phase 4.17.
  4. Nested loop join — INNER JOIN, LEFT JOIN.
  5. Sort — ORDER BY with NULLS FIRST/LAST.
  6. Limit/Offset — LIMIT n OFFSET m.
  7. Hash aggregate — GROUP BY with COUNT, SUM, AVG, MIN, MAX.
  8. INSERT / UPDATE / DELETE — write path with WAL integration.

The executor will be a simple volcano-model interpreter in Phase 5. Vectorized execution (morsel-driven, SIMD) is planned for Phase 14.
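The volcano model named above composes operators through a single pull-based next() interface. A minimal self-contained sketch (the operator names mirror the list above; the concrete types are illustrative, not AxiomDB's executor):

```rust
// Volcano (iterator) model: every operator yields one row per next() call
// and pulls from its child on demand, so plans compose by nesting.
type Row = Vec<i64>;

trait Operator {
    fn next(&mut self) -> Option<Row>;
}

/// Leaf operator: yields rows from an in-memory "table".
struct Scan {
    rows: std::vec::IntoIter<Row>,
}

impl Operator for Scan {
    fn next(&mut self) -> Option<Row> {
        self.rows.next()
    }
}

/// Filter: pulls from its child until a row satisfies the predicate.
struct Filter<P: FnMut(&Row) -> bool> {
    child: Box<dyn Operator>,
    pred: P,
}

impl<P: FnMut(&Row) -> bool> Operator for Filter<P> {
    fn next(&mut self) -> Option<Row> {
        while let Some(row) = self.child.next() {
            if (self.pred)(&row) {
                return Some(row);
            }
        }
        None
    }
}

/// Limit: stops pulling after `remaining` rows.
struct Limit {
    child: Box<dyn Operator>,
    remaining: usize,
}

impl Operator for Limit {
    fn next(&mut self) -> Option<Row> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        self.child.next()
    }
}
```

A plan like `SELECT … WHERE x > 18 LIMIT 1` is then just `Limit(Filter(Scan))`, and the interpreter is the loop that calls next() on the root until it returns None.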


AxiomQL — Alternative Query Language (Phases 36–37)

AxiomDB will support two query languages sharing one AST and executor:

SQL stays as the primary language with full wire protocol compatibility. Every ORM, client, and tool works without changes.

AxiomQL is an optional method-chain alternative designed to be learned in minutes by any developer who already uses .filter().sort().take() in JavaScript, Python, Rust, or C#:

users
  .filter(active, age > 18)
  .join(orders)
  .group(country, total: count())
  .sort(total.desc)
  .take(10)

Both languages compile to the same Stmt AST — zero executor overhead, and every SQL feature is automatically available in AxiomQL. Planned after the MySQL wire protocol phase.
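The "two front ends, one AST" idea can be sketched with a small method-chain builder that lowers into a shared statement value. The Query/QueryBuilder types below are stand-ins for illustration, not AxiomDB's real Stmt AST:

```rust
// Hypothetical AxiomQL front end: method chaining that lowers into the
// same query representation a SQL parser would produce.
#[derive(Debug, PartialEq)]
struct Query {
    table: String,
    filters: Vec<String>, // predicate text, as a stand-in for real exprs
    limit: Option<usize>,
}

struct QueryBuilder {
    q: Query,
}

/// Entry point: `table("users")` plays the role of the bare `users` head.
fn table(name: &str) -> QueryBuilder {
    QueryBuilder {
        q: Query { table: name.into(), filters: Vec::new(), limit: None },
    }
}

impl QueryBuilder {
    fn filter(mut self, pred: &str) -> Self {
        self.q.filters.push(pred.into());
        self
    }

    fn take(mut self, n: usize) -> Self {
        self.q.limit = Some(n);
        self
    }

    /// "Compile": both languages end at the same Query value,
    /// so the executor never sees which surface syntax produced it.
    fn build(self) -> Query {
        self.q
    }
}
```

Because the chain terminates in the shared representation, the executor and planner need no AxiomQL-specific code paths — which is the zero-overhead claim above.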

| Phase | Scope |
|---|---|
| 36 | AxiomQL parser: SELECT, filter, join, group, subqueries, let bindings |
| 37 | AxiomQL write + DDL: insert, update, delete, create, transaction, proc |