02 · OPERATIONS

Backup & restore

Segments are immutable once written and the WAL is append-only, so backups are a plain filesystem-level operation. No snapshot API or coordination step is required: you copy the data directory while the server is running, and you restore by stopping the server and copying the directory back.

Hot backup

Segments never change after they land on disk, so a running rsync of the data directory is consistent per-segment. The only moving pieces are:

The tail WAL file (may get appends during the copy).
The active memtables (not on disk yet).

Before the rsync, call POST /v1/admin/flush to force a memtable flush so recent writes are persisted as segments:

$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
    http://127.0.0.1:8080/v1/admin/flush
{"flushed_indices": 4, "segments_written": 4, "wal_checkpoint": 12847}

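Then rsync the data directory to your backup target, and repeat at whatever interval you want between snapshots. Because segments are content-addressed and never rewritten, an rsync into a destination that already holds an earlier copy transfers only the new segments. The example below writes to a fresh dated directory, so each run copies everything unless you reuse the destination or point rsync's --link-dest option at the previous snapshot.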

$ rsync -aH --delete \
    /var/lib/xerj/ \
    backup.internal:/srv/backups/xerj/$(date +%F)/
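The flush and the rsync can be chained so that a failed flush stops the backup instead of silently copying stale state. The following is a minimal sketch, not a shipped tool: the function names and the XERJ_URL variable are assumptions made for illustration, and the parsing relies on the response shape shown in the flush example above.

```shell
# Sketch: flush, sanity-check the response, then rsync.
# wal_checkpoint_from, hot_backup and XERJ_URL are hypothetical names.

# Print the integer wal_checkpoint found in a flush response body.
# Returns 1 (and prints nothing) if the field is missing.
wal_checkpoint_from() {
  local body value
  body=$1
  value=$(printf '%s' "$body" |
    sed -n 's/.*"wal_checkpoint"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  [ -n "$value" ] || return 1
  printf '%s\n' "$value"
}

# Flush the memtables, then copy the data directory to the rsync target in $1.
hot_backup() {
  local url dest body checkpoint
  url=${XERJ_URL:-http://127.0.0.1:8080}
  dest=$1
  # -f makes curl exit non-zero on an HTTP error, so a failed flush aborts here.
  body=$(curl -sf -X POST -H "Authorization: ApiKey $XERJ_API_KEY" \
    "$url/v1/admin/flush") || return 1
  checkpoint=$(wal_checkpoint_from "$body") || return 1
  echo "flushed up to WAL checkpoint $checkpoint" >&2
  rsync -aH --delete /var/lib/xerj/ "$dest"
}
```

A call such as hot_backup backup.internal:/srv/backups/xerj/current/ reuses one destination, so later runs send only segments that are not already there.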

This is compatible with ZFS snapshots, LVM snapshots, or any filesystem-level backup tool. If you use ZFS, the snapshot after a flush is genuinely point-in-time consistent and you don't need the rsync step.

S3 / object store backup

XERJ exposes POST /v1/admin/backup, which streams the data directory to any S3-compatible endpoint:

$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
    -H "Content-Type: application/json" \
    http://127.0.0.1:8080/v1/admin/backup \
    -d '{
      "destination": "s3://my-backups/xerj/nightly",
      "credentials": {
        "endpoint":   "https://s3.us-east-1.amazonaws.com",
        "access_key": "AKIA...",
        "secret_key": "..."
      },
      "flush_first": true
    }'
{"backup_id":"bk-2026-04-15T03-00-00Z","bytes":142857600,"duration_ms":8421}

Backups are incremental by default — only segments the destination doesn't already hold are uploaded.
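For scheduled runs it can be preferable to keep the object-store secret out of the curl command line. One possible approach, sketched here under assumptions, builds the request body from environment variables and pipes it to curl on stdin. The variable names S3_ENDPOINT, S3_ACCESS_KEY and S3_SECRET_KEY and both function names are hypothetical; only the JSON field names are taken from the request shown above.

```shell
# Sketch: build the backup request from the environment and send it via stdin.
# S3_ENDPOINT, S3_ACCESS_KEY, S3_SECRET_KEY and the function names are hypothetical.

# Print the request body for the destination given in $1.
# Values are inserted verbatim, so they must not contain double quotes or backslashes.
backup_request_body() {
  printf '{\n'
  printf '  "destination": "%s",\n' "$1"
  printf '  "credentials": {\n'
  printf '    "endpoint":   "%s",\n' "$S3_ENDPOINT"
  printf '    "access_key": "%s",\n' "$S3_ACCESS_KEY"
  printf '    "secret_key": "%s"\n' "$S3_SECRET_KEY"
  printf '  },\n'
  printf '  "flush_first": true\n'
  printf '}\n'
}

# POST the body to the backup endpoint; "-d @-" tells curl to read it from stdin.
run_backup() {
  backup_request_body "$1" |
    curl -sf -X POST \
      -H "Authorization: ApiKey $XERJ_API_KEY" \
      -H "Content-Type: application/json" \
      -d @- \
      http://127.0.0.1:8080/v1/admin/backup
}
```

With the three variables exported, run_backup s3://my-backups/xerj/nightly sends the same request as the example above.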

Restore

Stop the server, replace the data directory, start the server. No import step — the WAL replay on startup reconstructs memtable state from whatever files are on disk.

$ sudo systemctl stop xerj
# --delete clears stale files, including dotfiles that a shell glob would miss
$ sudo rsync -aH --delete backup.internal:/srv/backups/xerj/2026-04-15/ /var/lib/xerj/
$ sudo chown -R xerj:xerj /var/lib/xerj
$ sudo systemctl start xerj
$ curl -sf http://127.0.0.1:8080/v1/health/ready
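A readiness check issued immediately after the start may run before the server is answering, for example while it is still replaying the WAL. A small retry helper, sketched below, polls instead of checking once. The name wait_until is hypothetical, and the one-second interval is an arbitrary choice.

```shell
# Sketch: retry a command once per second until it succeeds.
# wait_until is a hypothetical helper name.

# Usage: wait_until TRIES COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 after TRIES failed attempts.
wait_until() {
  local tries
  tries=$1
  shift
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    # Only sleep if another attempt will follow.
    if [ "$tries" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
```

For the restore above, wait_until 30 curl -sf http://127.0.0.1:8080/v1/health/ready would give the server up to roughly 30 seconds to report ready.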

Per-index backup

To back up a single index (e.g. before a destructive migration), copy its subdirectory after a flush:

$ curl -sXPOST -H "Authorization: ApiKey $XERJ_API_KEY" \
    http://127.0.0.1:8080/v1/indices/events/flush
$ sudo tar czf events-$(date +%F).tar.gz -C /var/lib/xerj events

Clustered backup

In a cluster, each node holds a subset of shards — you need a consistent backup of every node. The simplest pattern is to drain one node at a time, snapshot it, and reactivate:

# Drain
$ curl -sXPOST .../v1/cluster/nodes/b/drain
# Wait for "shards_remaining": 0
$ rsync -aH /var/lib/xerj/ backup.internal:/srv/backups/xerj/b/
$ curl -sXPOST .../v1/cluster/nodes/b/activate
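The wait step can be scripted by polling until the reported shards_remaining reaches 0. The sketch below assumes the drain progress arrives as JSON containing a "shards_remaining" field, as the comment above suggests. The source does not name the request that returns it, so the status command is passed in by the caller; is_drained and wait_drained are hypothetical helper names.

```shell
# Sketch: block until a node reports zero remaining shards.
# is_drained and wait_drained are hypothetical helper names.

# Succeeds only when the JSON on stdin reports "shards_remaining": 0.
is_drained() {
  grep -Eq '"shards_remaining"[[:space:]]*:[[:space:]]*0([^0-9]|$)'
}

# Usage: wait_drained STATUS_COMMAND [ARGS...]
# Runs the status command every 5 seconds until its output reports a drained node.
wait_drained() {
  until "$@" | is_drained; do
    sleep 5
  done
}
```

Call wait_drained with whichever curl invocation returns the drain status for the node, then run the rsync and the activate request once it returns.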

Alternatively, take per-node S3 backups nightly and trust that Raft replay will reconcile the metadata when a restored node rejoins.

Source · engine/crates/storage/src/backend.rs · engine/crates/api/src/native.rs
