References & credits¶

Moon stands on the shoulders of decades of systems research and a vibrant open-source ecosystem. This page lists the projects, papers, and specifications that directly shaped Moon's design, along with the core runtime dependencies and the rationale for each.

Design inspirations¶

Moon's architecture is not invented — it's assembled from ideas that have been validated in production by others. Credit where it's due.

Dragonfly — C++ shared-nothing, thread-per-core Redis alternative. Validated that the thread-per-core model is the right answer for key-value stores in 2024+. Moon follows the same top-level architecture with a Rust implementation.
ScyllaDB / Seastar — pioneered thread-per-core shared-nothing for databases. The CachePadded<T> discipline, SPSC cross-shard channels, and "never share data, always share work" philosophy come from Seastar.
KeyDB — multi-threaded Redis fork. A useful counter-example: demonstrated the spinlock ceiling at ~4 threads, which is exactly what shared-nothing avoids.
Garnet (Microsoft Research) — .NET Redis alternative with a Tsavorite log-structured store. Validated that RESP compatibility plus a modern storage engine can beat Redis on both latency and memory.
TiKV — large-scale Rust KV store. Production-proven jemalloc tuning and tracing patterns Moon adopts.
ByteDance Monoio — the thread-per-core io_uring runtime Moon uses by default on Linux. Production-proven at ByteDance scale.

Research papers¶

Algorithms Moon implements directly, with papers worth reading if you want to understand why the code looks the way it does.

Dash: Scalable Hashing on Persistent Memory (VLDB 2020) — the segmented-hash-table design that DashTable in src/storage/dashtable/ is based on. Optimized for cache-line locality and concurrent lock-free reads.
Swiss Table / Abseil — SIMD control-byte probing. Moon uses Swiss Table-style probing within each Dash segment: one SIMD load scans 16 slots' worth of metadata before touching any key data. See src/storage/dashtable/segment.rs.
VLL: Very Lightweight Locking for Main Memory Database Systems (VLDB 2012) — multi-key transaction coordination across shards without heavy locking. Informs Moon's approach to cross-shard MGET/MSET/MULTI and future cluster-wide transactions.
TurboQuant: Fast 4-bit Vector Quantization — the foundation for Moon's TQ-4bit vector compression. See src/vector/turbo_quant/ for the implementation and docs/vector-search-guide.md for how it's wired into FT.CREATE.
HNSW: Efficient and Robust Approximate Nearest Neighbor Search (Malkov & Yashunin, 2016) — the graph index used for FT.SEARCH. Moon's HNSW implementation lives in src/vector/hnsw/.
The Probabilistic Relevance Framework: BM25 and Beyond (Robertson & Zaragoza, 2009) — the Okapi BM25 ranking function Moon implements for full-text scoring. The score = Σ IDF(t)·(tf·(k1+1))/(tf + k1·(1−b+b·dl/avgdl)) formula and the k1=1.2, b=0.75 defaults live in src/text/bm25.rs.
Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods (Cormack, Clarke & Büttcher, SIGIR 2009) — the RRF algorithm behind Moon's three-way hybrid fusion (BM25 + dense + sparse). See src/command/vector_search/hybrid.rs and hybrid_multi.rs.
HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm (Flajolet, Fusy, Gandouet & Meunier, 2007) — the cardinality sketch behind PFADD/PFCOUNT. Moon's src/storage/hll.rs is byte-identical with the Redis 7.x HYLL wire format (HLL_P=14, 16384 registers, 6-bit). It estimates with the Ertl improved estimator (Ertl, 2017) (hll_sigma/hll_tau) rather than Google's HLL++ bias-correction tables.
Fast String Correction with Levenshtein-Automata (Schulz & Mihov, 2002) — the Levenshtein-automaton construction behind FT.SEARCH typo tolerance. Moon builds a term-dictionary FST (src/text/fst_dict.rs, via the fst and levenshtein_automata crates) and intersects it with a Levenshtein automaton for bounded-edit-distance fuzzy expansion.
Efficient IO with io_uring (Jens Axboe) — the canonical design document for the Linux io_uring interface. Background for Moon's monoio-based thread-per-core driver and its tokio bridge.
Coordinated Omission (Gil Tene) — why closed-loop benchmarking under-reports tail latency. Moon's benchmark methodology follows open-loop principles.

Protocol & compatibility specifications¶

Moon targets drop-in Redis compatibility. These are the specs it implements.

Redis Protocol Specification (RESP2/RESP3) — the wire protocol parsed in src/protocol/.
Redis Commands Reference — command semantics Moon preserves. Any deviation is a bug.
Redis Cluster Specification — 16,384 hash slots, gossip protocol, MOVED/ASK redirections, epoch-based failover.
PSYNC2 Replication Protocol — partial resynchronization. Moon's replication in src/replication/ implements PSYNC2 so Redis replicas can peer with Moon primaries and vice versa.

Core runtime dependencies¶

Each dependency was chosen for a specific, load-bearing reason. Swapping any of them would require measurable justification.

Crate	Purpose	Why chosen
monoio	Thread-per-core async runtime	`io_uring` on Linux, `kqueue` on macOS. Production-proven at ByteDance. Lower per-op overhead than Tokio for request-response workloads.
tokio	Fallback async runtime	Broad ecosystem, cross-platform, CI-friendly. Used as portable alternative behind a feature flag.
tikv-jemallocator	Memory allocator	Reduced fragmentation for long-running servers. Production-proven by TiKV.
rustls	TLS implementation	Pure Rust, no OpenSSL dependency, async-native.
aws-lc-rs	Cryptographic backend	FIPS-capable, high-performance AES-GCM and ChaCha20.
mlua	Lua 5.4 VM	Redis `EVAL`/`EVALSHA` compatibility via safe Rust bindings.
memchr	SIMD byte search	~6.5× faster CRLF scanning than `std`. SSE2, AVX2, NEON dispatched at runtime.
fst	Finite-state term dictionary	Compact, sorted FST map for `FT.SEARCH` prefix and fuzzy term expansion. Feature-gated behind `text-index`.
levenshtein_automata	Bounded-edit fuzzy matching	Builds Levenshtein automata intersected with the term FST for typo-tolerant full-text search.
bumpalo	Bump allocation arenas	~2 ns allocation, O(1) bulk deallocation per request. Used for per-request scratch buffers.
bytes	Zero-copy buffers	`Bytes::freeze()` for shared response data without copying; reference-counted slices for pipeline batches.
xxhash-rust	Non-cryptographic hashing	Fast key hashing for DashTable segment routing.
crossbeam-utils	Concurrency primitives	`CachePadded<T>` for false-sharing prevention on hot atomics.
ringbuf	SPSC ring buffer	Lock-free cross-shard message passing.
phf	Perfect hash map	Static command metadata registry; constant-time lookup for cold commands (hot commands use a direct match — see dispatch hot-path recovery).
ordered-float	Total-ordered floats	Sorted-set score keys.
parking_lot	Non-poisoning locks	Faster, poisoning-free `RwLock`/`Mutex` for per-shard state.

Benchmarking methodology¶

Moon's benchmark numbers follow industry-standard practices. These are the references worth understanding before interpreting any results.

Redis vs Dragonfly Performance (Redis blog) — fair comparison methodology: same cores, cluster vs single-process, honest caveats.
memtier_benchmark — industry-standard Redis benchmarking tool. Moon's benchmark scripts use redis-benchmark for parity with published Redis numbers, and memtier_benchmark for latency distribution analysis.
Coordinated Omission (Gil Tene) — why closed-loop benchmarking lies about tail latency, and how to measure honestly.

License & attribution¶

Moon is Apache 2.0 licensed. Each dependency carries its own license — see Cargo.lock and cargo tree --format "{p} {l}" for the full list. If you ship Moon in a product, comply with the combined license set. The maintainers recommend running cargo about generate to produce a THIRD-PARTY.md for your distribution.