Runbook: OOM During Snapshot (BGSAVE)
Symptoms
- Moon process killed by OOM killer during BGSAVE
- `dmesg | grep oom` shows the moon process
- Snapshot file is incomplete or missing
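To confirm the kernel-log symptom, the log can be filtered for OOM-killer entries that name the process. This is a minimal sketch, assuming the process is named `moon` (as in the symptoms above) and that the kernel logs the usual "Killed process <pid> (<name>)" wording, which differs slightly between kernel versions. `filter_oom_kills` is a hypothetical helper name, not part of Moon.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that record an OOM kill
# of the given process name (default: moon). Reads log text on stdin.
filter_oom_kills() {
    proc="${1:-moon}"
    # Matches e.g. "Out of memory: Killed process 4242 (moon) ..."
    grep -i -E 'killed process [0-9]+ \('"$proc"'\)'
}

# Usage (reading the kernel log may require root):
#   dmesg -T | filter_oom_kills moon
```

If the filter prints nothing, the process may have exited for a different reason, or the kernel ring buffer may already have rotated past the event.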
Root Cause
BGSAVE serializes the entire dataset to disk. Unlike Redis, which forks a child process to do this, Moon uses forkless compartmentalized snapshots, but its serialization buffers can still cause a temporary spike in memory usage.
Recovery Steps
Step 1: Restart Moon
Step 2: Verify data integrity
Step 3: Address the OOM root cause
# Check current memory usage
redis-cli -p 6379 INFO memory
# Option A: Increase available memory
# Option B: Set maxmemory to leave headroom for snapshots
# Rule of thumb: maxmemory = 75% of available RAM
redis-cli -p 6379 CONFIG SET maxmemory <bytes>
# Option C: Use AOF-only persistence (no BGSAVE spikes)
redis-cli -p 6379 CONFIG SET save ""
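The `<bytes>` value for Option B can be derived from the 75% rule of thumb. The sketch below assumes that "available RAM" means the host's total memory as reported by `MemTotal` in `/proc/meminfo` (in kB); inside a container the cgroup memory limit would be the relevant figure instead. `maxmemory_bytes` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert total RAM (in kB, as /proc/meminfo reports it)
# into a maxmemory value in bytes, using a percentage (default: 75).
maxmemory_bytes() {
    total_kb="$1"
    percent="${2:-75}"
    echo $(( total_kb * 1024 * percent / 100 ))
}

# Usage on Linux:
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   redis-cli -p 6379 CONFIG SET maxmemory "$(maxmemory_bytes "$total_kb")"
```

For a host with 16 GiB of RAM (16777216 kB), this yields 12884901888 bytes, i.e. 12 GiB, leaving 4 GiB of headroom for snapshot buffers and the rest of the system.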
Step 4: Monitor
- Set up RSS alerts at 80% of available memory
- Monitor the `moon_rss_bytes` Prometheus metric (if the admin port is enabled)
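Where no alerting stack is in place yet, the 80% RSS threshold can be approximated with a periodic shell check. This is only a sketch: it assumes a Linux host, a single process named `moon`, and that "available memory" means `MemTotal`. `rss_over_threshold` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when RSS is at or above the given
# percentage of total memory (default: 80). All sizes are in kB.
rss_over_threshold() {
    rss_kb="$1"
    total_kb="$2"
    percent="${3:-80}"
    # Integer comparison: rss/total >= percent/100, without floating point.
    [ $(( rss_kb * 100 )) -ge $(( total_kb * percent )) ]
}

# Usage on Linux (assumes exactly one process named "moon"):
#   pid=$(pgrep -x moon)
#   rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   if rss_over_threshold "$rss_kb" "$total_kb" 80; then
#       echo "WARNING: moon RSS is ${rss_kb} kB, at or above 80% of RAM"
#   fi
```

The comparison is kept in a separate function so the threshold logic can be exercised with fixed numbers, independent of the host it runs on.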