03 · START
Migrate from Elasticsearch
XERJ runs a second API listener on port 9200 that speaks the Elasticsearch wire format. The fastest way to try XERJ is to point your existing ES client at it and let the compat layer take the calls. Coverage is not complete; this page lists exactly what is and isn't supported.
What works day one
- Index CRUD: PUT /:index, DELETE /:index, GET /:index, PUT /:index/_mapping, GET /:index/_mapping.
- Document CRUD: POST /:index/_doc, PUT /:index/_doc/:id, GET /:index/_doc/:id, DELETE /:index/_doc/:id.
- Bulk: POST /:index/_bulk with index/create/update/delete actions.
- Query DSL: 32 query types including match, bool, range, wildcard, phrase, span, fuzzy, and AI-native (knn, semantic_search, hybrid).
- Aggregations: terms, range, histogram, date_histogram, filter, missing, composite, plus the metric family (avg, sum, min, max, stats, cardinality, percentiles).
- Cluster endpoints: GET /_cluster/health, GET /_cat/indices.
- Delete by query: POST /:index/_delete_by_query.
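As an illustration of the bulk endpoint listed above, here is a minimal sketch that builds a two-document NDJSON body and, only when SEND=1 is set, posts it to POST /:index/_bulk. The index name logs-demo, the document fields, and the XERJ_URL and SEND variables are assumptions made for this example; the NDJSON layout is the standard Elasticsearch bulk format, which the compat layer is assumed to accept unchanged.

```shell
#!/bin/sh
# Assumption: the ES-compatible listener is reachable on localhost:9200.
XERJ_URL="${XERJ_URL:-http://localhost:9200}"

# Emit one "index" action plus its source document as two NDJSON lines.
# $1 = index name, $2 = document id, $3 = document as compact JSON
bulk_index() {
  printf '{"index":{"_index":"%s","_id":"%s"}}\n%s\n' "$1" "$2" "$3"
}

# Build a two-document bulk body. Index name and fields are hypothetical.
build_body() {
  bulk_index logs-demo 1 '{"level":"info","msg":"started"}'
  bulk_index logs-demo 2 '{"level":"error","msg":"disk full"}'
}

if [ "${SEND:-0}" = "1" ]; then
  # --data-binary keeps the newlines that the bulk format depends on.
  build_body | curl -s -X POST "$XERJ_URL/logs-demo/_bulk" \
    -H 'Content-Type: application/x-ndjson' --data-binary @-
else
  # Dry run: print the body so it can be inspected before sending.
  build_body
fi
```

Running the script without SEND=1 only prints the body, which makes it easy to check the payload before pointing it at a live listener.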
What needs rewriting
- Scripted fields / scripted metrics. Painless is not implemented. Use function_score for the cases it covers.
- Runtime fields. Not supported. Compute at ingest or query time.
- Cross-cluster search. Single-cluster only in v0.1.
- Machine-learning jobs. Not a feature.
- Watcher / alerting framework. Use the /v1/explain-plan endpoint + an external rule runner.
- ILM (Index Lifecycle Management). Retention is controlled per-index via [logs] retention_days.
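To make the "external rule runner" idea concrete, the sketch below shows one possible shape for a Watcher replacement: a small script, run from cron or a systemd timer, that counts matching documents through the compat _search endpoint and compares the count to a threshold. Everything specific here is an assumption for illustration: the index logs-demo, the fields level and @timestamp, the limit of 100, the RUN variable, and that XERJ returns hits.total.value and accepts now-5m date math the way Elasticsearch does. It does not use /v1/explain-plan, whose request format this page does not describe.

```shell
#!/bin/sh
# Assumption: the ES-compatible listener is reachable on localhost:9200.
XERJ_URL="${XERJ_URL:-http://localhost:9200}"

# Succeed (exit status 0) when the observed count exceeds the limit.
# $1 = observed count, $2 = limit
breached() {
  [ "$1" -gt "$2" ]
}

# Count error-level documents from the last 5 minutes.
# Index name, field names, and date math support are hypothetical.
count_errors() {
  curl -s -X POST "$XERJ_URL/logs-demo/_search" \
    -H 'Content-Type: application/json' \
    -d '{"size":0,"query":{"bool":{"filter":[
          {"match":{"level":"error"}},
          {"range":{"@timestamp":{"gte":"now-5m"}}}]}}}' \
  | jq -r '.hits.total.value'
}

# Only touch the network when explicitly asked to.
if [ "${RUN:-0}" = "1" ]; then
  n="$(count_errors)"
  if breached "${n:-0}" 100; then
    echo "ALERT: $n errors in the last 5 minutes"
  fi
fi
```

Keeping the threshold check in its own function means the rule can be tested without a running server; the echo can be swapped for whatever notifier you already use.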
What's different (for the better)
- No heap tuning. No -Xmx, no GC pauses, no page-fault storms.
- Single binary. 11 MB static executable vs 620 MB ES tarball.
- First-class explain plan. /v1/explain-plan returns the optimizer tree, not a profile dump.
- Columnar ingest. Encodings chosen at write time, per column: delta-of-delta, dictionary, ZSTD, SQ8 for vectors.
- Vector dims up to 16384. ES caps at 4096.
A 10-minute migration
# 1. run XERJ alongside ES
$ xerj --config xerj.toml &
# 2. reindex one index, one-shot (copies only the first scroll page, up to 1000 docs)
$ curl -sX POST 'http://es:9200/logs-2026-04/_search?scroll=1m' \
    -H 'Content-Type: application/json' \
    -d '{"size":1000,"query":{"match_all":{}}}' \
  | jq -c '.hits.hits[]._source' \
  | xerj-ingest http://localhost:9200 logs-2026-04
# 3. compare one query (fields such as "took" differ on every run)
$ diff <(curl -s http://es:9200/logs-2026-04/_search -H 'Content-Type: application/json' -d @q.json) \
       <(curl -s http://localhost:9200/logs-2026-04/_search -H 'Content-Type: application/json' -d @q.json)
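The one-shot command in step 2 reads a single scroll page, so an index with more than 1000 documents is only partly copied. A hedged sketch of a full export follows: it keeps calling the Elasticsearch scroll API until a page comes back empty and streams every _source to xerj-ingest. The ES_URL, XERJ_URL, INDEX, and RUN variables and the helper names are assumptions for this example, and xerj-ingest is invoked exactly as in step 2, since its other options are not documented here.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://es:9200}"
XERJ_URL="${XERJ_URL:-http://localhost:9200}"
INDEX="${INDEX:-logs-2026-04}"

# Request body asking Elasticsearch for the next page of an open scroll.
# $1 = scroll id returned by the previous page
scroll_body() {
  printf '{"scroll":"1m","scroll_id":"%s"}' "$1"
}

# Number of hits in a search response read from stdin (0 if unparsable).
hit_count() {
  n="$(jq '.hits.hits | length' 2>/dev/null)"
  printf '%s' "${n:-0}"
}

# Stream every document's _source as NDJSON, one scroll page at a time.
export_all() {
  page="$(curl -s -X POST "$ES_URL/$INDEX/_search?scroll=1m" \
    -H 'Content-Type: application/json' \
    -d '{"size":1000,"query":{"match_all":{}}}')"
  while [ "$(printf '%s' "$page" | hit_count)" -gt 0 ]; do
    printf '%s' "$page" | jq -c '.hits.hits[]._source'
    sid="$(printf '%s' "$page" | jq -r '._scroll_id')"
    page="$(curl -s -X POST "$ES_URL/_search/scroll" \
      -H 'Content-Type: application/json' -d "$(scroll_body "$sid")")"
  done
}

# Only touch the network when explicitly asked to.
if [ "${RUN:-0}" = "1" ]; then
  export_all | xerj-ingest "$XERJ_URL" "$INDEX"
fi
```

Invoke it as RUN=1 INDEX=logs-2026-04 sh export.sh (the file name is arbitrary). The loop stops on the first empty page, which is how the scroll API signals that the result set is exhausted.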
Source · engine/crates/api/src/es_compat.rs