Clustering and replication

Moon supports Redis-compatible replication and cluster mode for high availability and horizontal scaling.

Replication

Moon implements PSYNC2-compatible replication with per-shard WAL streaming and partial resync support.

Warning

v0.1.x limitation — master must run --shards 1.

PSYNC on a multi-shard master (--shards N where N > 1) currently fails with the error "-ERR PSYNC across multiple shards is not yet supported (use --shards 1 on the master)". The master's N shard-local databases cannot yet be serialized into a single consistent RDB stream for a replica to consume.

Supported deployment shape for v0.1.x:

  • Master: --shards 1 (single-core writer, ~1–1.5 M ops/s ceiling)
  • Replicas: any --shards N (multi-core read scaling is unaffected)

Multi-shard master replication is scheduled for v0.2 (see .planning/rfcs/multi-shard-replication-design.md). If you need a multi-core master today, run without replication; if you need replication, accept the single-shard master ceiling.

Observability caveats (v0.1.x):

  • WAIT always returns 0, because the master does not yet parse REPLCONF ACK <offset> (v0.2 scope).
  • CLIENT LIST TYPE replica does not filter by type yet; it returns all clients.
  • master_link_status in INFO replication correctly reflects the handshake state; use it to detect a failed REPLICAOF.

Set up a replica

Start the master (must be --shards 1 in v0.1.x)

./target/release/moon --port 6379 --shards 1

Start the replica (any shard count)

./target/release/moon --port 6380 --shards 4

Connect the replica to the master

redis-cli -p 6380 REPLICAOF 127.0.0.1 6379
redis-cli -p 6380 INFO replication | grep master_link_status
# Expect: master_link_status:up
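
Since master_link_status is the field to watch for a failed REPLICAOF, a script can poll it rather than check it once. The following is a minimal sketch, assuming redis-cli is on PATH; link_status and wait_for_link are hypothetical helper names, not part of Moon or redis-cli.

```shell
# Hypothetical helpers, not part of Moon. Assumes redis-cli is on PATH.

# Read INFO replication text on stdin and print the master_link_status value.
# INFO lines end in CRLF, so carriage returns are stripped first.
link_status() {
  tr -d '\r' | awk -F: '$1 == "master_link_status" { print $2 }'
}

# Poll a replica until its link is up.
# Arguments: port, then maximum attempts (default 30, one second apart).
# Returns non-zero if the link never reports "up".
wait_for_link() {
  port=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    if [ "$(redis-cli -p "$port" INFO replication | link_status)" = "up" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

After issuing REPLICAOF, calling wait_for_link 6380 and checking its exit status would distinguish a completed handshake from one that never came up.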

Replication features

  • PSYNC2 protocol — compatible with Redis replication clients
  • Per-shard WAL streaming — each shard streams its own WAL independently (once connected)
  • Partial resync — reconnecting replicas resume from where they left off via the replication backlog
  • Lazy backlog — the replication backlog is only allocated when the first replica handshake begins (REPLCONF), saving ~12 MB baseline memory

Cluster mode

Moon implements the Redis Cluster specification with 16,384 hash slots, a gossip protocol, and automatic failover.

Start a cluster

# Start three nodes
./target/release/moon --port 7000 --cluster-enabled true --shards 2
./target/release/moon --port 7001 --cluster-enabled true --shards 2
./target/release/moon --port 7002 --cluster-enabled true --shards 2

# Join nodes
redis-cli -p 7000 CLUSTER MEET 127.0.0.1 7001
redis-cli -p 7000 CLUSTER MEET 127.0.0.1 7002

# Assign slots (roughly equal distribution; {a..b} is bash/zsh brace expansion)
redis-cli -p 7000 CLUSTER ADDSLOTS {0..5461}
redis-cli -p 7001 CLUSTER ADDSLOTS {5462..10922}
redis-cli -p 7002 CLUSTER ADDSLOTS {10923..16383}
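
The ranges above come from dividing 16,384 slots across three nodes, with the one leftover slot going to the first node. For a different node count the same arithmetic can be scripted. This is a minimal sketch; slot_ranges is a hypothetical helper, not a Moon or redis-cli command.

```shell
# Hypothetical helper: print one "start end" slot range per node.
# Slots are split as evenly as possible; any remainder goes to the first nodes.
slot_ranges() {
  nodes=$1
  total=16384
  base=$((total / nodes))   # slots every node receives
  rem=$((total % nodes))    # nodes that receive one extra slot
  start=0
  i=0
  while [ "$i" -lt "$nodes" ]; do
    count=$base
    if [ "$i" -lt "$rem" ]; then
      count=$((count + 1))
    fi
    end=$((start + count - 1))
    echo "$start $end"
    start=$((end + 1))
    i=$((i + 1))
  done
}
```

Running slot_ranges 3 prints 0 5461, 5462 10922, and 10923 16383, one pair per line, which matches the ranges assigned above. In shells without brace expansion, seq 0 5461 can generate the slot list passed to CLUSTER ADDSLOTS.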

Cluster features

  • 16,384 hash slots with CRC16-based routing
  • Gossip protocol for node discovery and failure detection
  • MOVED/ASK redirections for client-side routing
  • Live slot migration for rebalancing without downtime
  • Majority consensus failover with automatic promotion

Hash tags

Use {tag} in key names to co-locate related keys on the same slot:

# All these keys route to the same slot
SET user:{1234}:name "Alice"
SET user:{1234}:email "alice@example.com"
MGET user:{1234}:name user:{1234}:email

This eliminates cross-shard dispatch overhead for multi-key operations like MGET and MSET.
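
To check which keys share a slot without a running node, the routing rule can be reproduced offline. The sketch below assumes Moon follows the Redis Cluster specification here: the slot is the CRC16 (XMODEM variant) of the key modulo 16384, and when the key contains a non-empty {...} section, only the text between the first { and the next } is hashed. crc16, hash_tag, and key_slot are hypothetical helper names.

```shell
# Hypothetical helpers. Assumes Redis Cluster key hashing rules apply to Moon.

# CRC16, XMODEM variant (polynomial 0x1021, initial value 0), computed bit by bit.
crc16() {
  crc=0
  # od prints each byte of the key as an unsigned decimal number.
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$((bit + 1))
    done
  done
  echo "$crc"
}

# Print the part of the key that gets hashed: the text between the first "{"
# and the next "}" if that text is non-empty, otherwise the whole key.
hash_tag() {
  key=$1
  open='{'
  close='}'
  case $key in
    *"$open"*"$close"*)
      rest=${key#*"$open"}      # everything after the first "{"
      tag=${rest%%"$close"*}    # up to the first "}" that follows
      if [ -n "$tag" ]; then
        key=$tag
      fi
      ;;
  esac
  printf '%s' "$key"
}

# Print the hash slot (0..16383) for a key.
key_slot() {
  echo $(( $(crc16 "$(hash_tag "$1")") % 16384 ))
}
```

With these helpers, key_slot 'user:{1234}:name' and key_slot 'user:{1234}:email' print the same number, because both hash only 1234. A key such as user:{}:name has an empty tag, so the whole key is hashed.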

Cluster commands

CLUSTER INFO, CLUSTER NODES, CLUSTER SLOTS, CLUSTER MEET, CLUSTER ADDSLOTS, CLUSTER DELSLOTS, CLUSTER SETSLOT, CLUSTER FAILOVER, CLUSTER MYID

Configuration

| Flag | Default | Description |
|------|---------|-------------|
| --cluster-enabled | false | Enable cluster mode |
| --cluster-node-timeout | 15000 | Node timeout in milliseconds before failover |