References & credits¶
Moon stands on the shoulders of decades of systems research and a vibrant open-source ecosystem. This page lists the projects, papers, and specifications that directly shaped Moon's design, along with the core runtime dependencies and the rationale for each.
Design inspirations¶
Moon's architecture is not invented — it's assembled from ideas that have been validated in production by others. Credit where it's due.
- Dragonfly — C++ shared-nothing, thread-per-core Redis alternative. Validated that the thread-per-core model is the right answer for key-value stores in 2024+. Moon follows the same top-level architecture with a Rust implementation.
- ScyllaDB / Seastar — pioneered thread-per-core shared-nothing for databases. The
CachePadded<T>discipline, SPSC cross-shard channels, and "never share data, always share work" philosophy come from Seastar. - KeyDB — multi-threaded Redis fork. A useful counter-example: demonstrated the spinlock ceiling at ~4 threads, which is exactly what shared-nothing avoids.
- Garnet (Microsoft Research) — .NET Redis alternative with a Tsavorite log-structured store. Validated that RESP compatibility plus a modern storage engine can beat Redis on both latency and memory.
- TiKV — large-scale Rust KV store. Production-proven jemalloc tuning and tracing patterns Moon adopts.
- ByteDance Monoio — the thread-per-core
io_uringruntime Moon uses by default on Linux. Production-proven at ByteDance scale.
Research papers¶
Algorithms Moon implements directly, with papers worth reading if you want to understand why the code looks the way it does.
- Dash: Scalable Hashing on Persistent Memory (VLDB 2020) — the segmented-hash-table design that
DashTableinsrc/storage/dashtable/is based on. Optimized for cache-line locality and concurrent lock-free reads. - Swiss Table / Abseil — SIMD control-byte probing. Moon uses Swiss Table-style probing within each Dash segment: one SIMD load scans 16 slots' worth of metadata before touching any key data. See
src/storage/dashtable/segment.rs. - VLL: Very Lightweight Locking for Main Memory Database Systems (VLDB 2012) — multi-key transaction coordination across shards without heavy locking. Informs Moon's approach to cross-shard
MGET/MSET/MULTIand future cluster-wide transactions. - TurboQuant: Fast 4-bit Vector Quantization — the foundation for Moon's
TQ-4bitvector compression. Seesrc/vector/turbo_quant/for the implementation anddocs/vector-search-guide.mdfor how it's wired intoFT.CREATE. - HNSW: Efficient and Robust Approximate Nearest Neighbor Search (Malkov & Yashunin, 2016) — the graph index used for
FT.SEARCH. Moon's HNSW implementation lives insrc/vector/hnsw/. - The Probabilistic Relevance Framework: BM25 and Beyond (Robertson & Zaragoza, 2009) — the Okapi BM25 ranking function Moon implements for full-text scoring. The
score = Σ IDF(t)·(tf·(k1+1))/(tf + k1·(1−b+b·dl/avgdl))formula and thek1=1.2, b=0.75defaults live insrc/text/bm25.rs. - Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods (Cormack, Clarke & Büttcher, SIGIR 2009) — the RRF algorithm behind Moon's three-way hybrid fusion (BM25 + dense + sparse). See
src/command/vector_search/hybrid.rsandhybrid_multi.rs. - HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm (Flajolet, Fusy, Gandouet & Meunier, 2007) — the cardinality sketch behind
PFADD/PFCOUNT. Moon'ssrc/storage/hll.rsis byte-identical with the Redis 7.x HYLL wire format (HLL_P=14, 16384 registers, 6-bit). It estimates with the Ertl improved estimator (Ertl, 2017) (hll_sigma/hll_tau) rather than Google's HLL++ bias-correction tables. - Fast String Correction with Levenshtein-Automata (Schulz & Mihov, 2002) — the Levenshtein-automaton construction behind
FT.SEARCHtypo tolerance. Moon builds a term-dictionary FST (src/text/fst_dict.rs, via thefstandlevenshtein_automatacrates) and intersects it with a Levenshtein automaton for bounded-edit-distance fuzzy expansion. - Efficient IO with io_uring (Jens Axboe) — the canonical design document for the Linux
io_uringinterface. Background for Moon'smonoio-based thread-per-core driver and itstokiobridge. - Coordinated Omission (Gil Tene) — why closed-loop benchmarking under-reports tail latency. Moon's benchmark methodology follows open-loop principles.
Protocol & compatibility specifications¶
Moon targets drop-in Redis compatibility. These are the specs it implements.
- Redis Protocol Specification (RESP2/RESP3) — the wire protocol parsed in
src/protocol/. - Redis Commands Reference — command semantics Moon preserves. Any deviation is a bug.
- Redis Cluster Specification — 16,384 hash slots, gossip protocol,
MOVED/ASKredirections, epoch-based failover. - PSYNC2 Replication Protocol — partial resynchronization. Moon's replication in
src/replication/implements PSYNC2 so Redis replicas can peer with Moon primaries and vice versa.
Core runtime dependencies¶
Each dependency was chosen for a specific, load-bearing reason. Swapping any of them would require measurable justification.
| Crate | Purpose | Why chosen |
|---|---|---|
| monoio | Thread-per-core async runtime | io_uring on Linux, kqueue on macOS. Production-proven at ByteDance. Lower per-op overhead than Tokio for request-response workloads. |
| tokio | Fallback async runtime | Broad ecosystem, cross-platform, CI-friendly. Used as portable alternative behind a feature flag. |
| tikv-jemallocator | Memory allocator | Reduced fragmentation for long-running servers. Production-proven by TiKV. |
| rustls | TLS implementation | Pure Rust, no OpenSSL dependency, async-native. |
| aws-lc-rs | Cryptographic backend | FIPS-capable, high-performance AES-GCM and ChaCha20. |
| mlua | Lua 5.4 VM | Redis EVAL/EVALSHA compatibility via safe Rust bindings. |
| memchr | SIMD byte search | ~6.5× faster CRLF scanning than std. SSE2, AVX2, NEON dispatched at runtime. |
| fst | Finite-state term dictionary | Compact, sorted FST map for FT.SEARCH prefix and fuzzy term expansion. Feature-gated behind text-index. |
| levenshtein_automata | Bounded-edit fuzzy matching | Builds Levenshtein automata intersected with the term FST for typo-tolerant full-text search. |
| bumpalo | Bump allocation arenas | ~2 ns allocation, O(1) bulk deallocation per request. Used for per-request scratch buffers. |
| bytes | Zero-copy buffers | Bytes::freeze() for shared response data without copying; reference-counted slices for pipeline batches. |
| xxhash-rust | Non-cryptographic hashing | Fast key hashing for DashTable segment routing. |
| crossbeam-utils | Concurrency primitives | CachePadded<T> for false-sharing prevention on hot atomics. |
| ringbuf | SPSC ring buffer | Lock-free cross-shard message passing. |
| phf | Perfect hash map | Static command metadata registry; constant-time lookup for cold commands (hot commands use a direct match — see dispatch hot-path recovery). |
| ordered-float | Total-ordered floats | Sorted-set score keys. |
| parking_lot | Non-poisoning locks | Faster, poisoning-free RwLock/Mutex for per-shard state. |
Benchmarking methodology¶
Moon's benchmark numbers follow industry-standard practices. These are the references worth understanding before interpreting any results.
- Redis vs Dragonfly Performance (Redis blog) — fair comparison methodology: same cores, cluster vs single-process, honest caveats.
- memtier_benchmark — industry-standard Redis benchmarking tool. Moon's benchmark scripts use
redis-benchmarkfor parity with published Redis numbers, andmemtier_benchmarkfor latency distribution analysis. - Coordinated Omission (Gil Tene) — why closed-loop benchmarking lies about tail latency, and how to measure honestly.
License & attribution¶
Moon is Apache 2.0 licensed. Each dependency carries its own license — see Cargo.lock and cargo tree --format "{p} {l}" for the full list. If you ship Moon in a product, comply with the combined license set. The maintainers recommend running cargo about generate to produce a THIRD-PARTY.md for your distribution.