---
sidebar_position: 1
slug: /benchmark
---
This document compares the following key metrics of Elasticsearch, Qdrant, and Infinity:

- Recall
- Time to insert & build index
- Time to import & build index
- Query latency
- QPS
You need to monitor resource usage (persisted index size, peak memory, peak CPU, system load, etc.) manually. Keep the environment clean so that the database under test can use all of the system's resources, and avoid running multiple databases at the same time, as each one is a significant resource consumer.
Test environment:
- OS: OpenSUSE Tumbleweed x86_64
- CPU: Intel Core i5-13500H, 16 vCPUs
- RAM: 32GB
- Disk: 1TB
| Engine | Version |
| --- | --- |
| Elasticsearch | v8.13.4 |
| Qdrant | v1.9.2 |
| Infinity | v0.1.0 |
- Install necessary dependencies.

```bash
cd python/benchmark
pip install -r requirements.txt
```
Preprocess the dataset (strip the header line):

```bash
sed '1d' datasets/enwiki/enwiki-20120502-lines-1k.txt > datasets/enwiki/enwiki.csv
```
- Start up the databases to compare:

```bash
mkdir -p $HOME/elasticsearch
docker run -d --name elasticsearch --network host \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms16384m -Xmx32000m" \
  -e "xpack.security.enabled=false" \
  -v $HOME/elasticsearch:/usr/share/elasticsearch \
  elasticsearch:8.13.4

mkdir -p $HOME/qdrant/storage
docker run -d --name qdrant --network host \
  -v $HOME/qdrant/storage:/qdrant/storage \
  qdrant/qdrant:v1.9.2

mkdir -p $HOME/infinity
docker run -d --name infinity --network host \
  --ulimit nofile=500000:500000 \
  -v $HOME/infinity:/var/infinity \
  infiniflow/infinity:0.1.0
```
- Run the benchmark:

Drop the file cache before each benchmark run:

```bash
echo 3 | sudo tee /proc/sys/vm/drop_caches
```
The Python script `run.py` performs the following tasks:

- Generate the fulltext query set.
- Measure the time to import data and build the index.
- Measure the query latency.
- Measure the QPS.
```bash
$ python run.py -h
usage: run.py [-h] [--generate] [--import] [--query QUERY] [--query-express QUERY_EXPRESS] [--concurrency CONCURRENCY] [--engine ENGINE] [--dataset DATASET]

RAG Database Benchmark

options:
  -h, --help            show this help message and exit
  --generate            Generate fulltext query set based on the dataset (default: False)
  --import              Import dataset into database engine (default: False)
  --query QUERY         Run the query set only once using given number of clients with recording the result and latency. This is for result validation and latency analysis (default: 0)
  --query-express QUERY_EXPRESS
                        Run the query set randomly using given number of clients without recording the result and latency. This is for QPS measurement. (default: 0)
  --concurrency CONCURRENCY
                        Choose concurrency mechanism, one of: mp - multiprocessing(recommended), mt - multithreading. (default: mp)
  --engine ENGINE       Choose database engine to benchmark, one of: infinity, qdrant, elasticsearch (default: infinity)
  --dataset DATASET     Choose dataset to benchmark, one of: gist, sift, geonames, enwiki (default: enwiki)
```
The following commands are for engine `infinity` and dataset `enwiki`:

```bash
python run.py --generate --engine infinity --dataset enwiki
python run.py --import --engine infinity --dataset enwiki
python run.py --query=16 --engine infinity --dataset enwiki
python run.py --query-express=16 --engine infinity --dataset enwiki
```
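The QPS measurement in `--query-express` mode boils down to several clients firing queries in a loop for a fixed time window, then dividing the total query count by the elapsed time. The following is a simplified sketch, not the actual `run.py` implementation: it uses threads for brevity (the script defaults to multiprocessing), and `send_query` is a stub standing in for a real client call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def send_query(query):
    # Stub standing in for a real client call (e.g., a fulltext search request).
    return len(query)

def client_loop(queries, duration):
    """One client: fire queries in a loop until the deadline; return the count."""
    deadline = time.monotonic() + duration
    count = 0
    while time.monotonic() < deadline:
        send_query(queries[count % len(queries)])
        count += 1
    return count

def measure_qps(queries, clients=16, duration=1.0):
    """Run `clients` concurrent query loops and return total queries / elapsed seconds."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(client_loop, queries, duration) for _ in range(clients)]
        total = sum(f.result() for f in futures)
    return total / (time.monotonic() - start)
```

With a real client, per-process workers (the recommended `mp` mode) avoid the GIL limiting the request rate.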
The following commands issue a single query so that you can compare results across engines:

```bash
curl -X GET "http://localhost:9200/elasticsearch_enwiki/_search" -H 'Content-Type: application/json' -d'{"size":10,"_source":"doctitle","query": {"match": { "body": "wraysbury istorijos" }}}'

psql -h 0.0.0.0 -p 5432 -c "SELECT doctitle, ROW_ID(), SCORE() FROM infinity_enwiki SEARCH MATCH TEXT ('body', 'wraysbury istorijos', 'topn=10;block_max=true');"
```
- Metric: L2
- 10,000 queries

| Engine | QPS | Recall | Time to insert & build index | Time to import & build index | Disk | Peak memory |
| --- | --- | --- | --- | --- | --- | --- |
| Elasticsearch | 934 | 0.992 | 131 s | N/A | 874 MB | 1.463 GB |
| Qdrant | 1303 | 0.979 | 46 s | N/A | 418 MB | 1.6 GB |
| Infinity | 16320 | 0.973 | 74 s | 28 s | 792 MB | 0.95 GB |
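Recall in these tables is computed against the dataset's ground truth: the fraction of the true top-k neighbors that appear in the retrieved top-k, averaged over all queries. A minimal sketch (function names are illustrative, not taken from the benchmark scripts):

```python
def recall_at_k(retrieved, groundtruth, k):
    """Fraction of the true top-k neighbors present in the retrieved top-k."""
    return len(set(retrieved[:k]) & set(groundtruth[:k])) / k

def mean_recall(all_retrieved, all_groundtruth, k):
    """Average recall@k over a whole query set."""
    scores = [recall_at_k(r, g, k) for r, g in zip(all_retrieved, all_groundtruth)]
    return sum(scores) / len(scores)
```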
- Metric: L2
- 1,000 queries

| Engine | QPS | Recall | Time to insert & build index | Time to import & build index | Disk | Peak memory |
| --- | --- | --- | --- | --- | --- | --- |
| Elasticsearch | 305 | 0.885 | 872 s | N/A | 13 GB | 6.9 GB |
| Qdrant | 339 | 0.947 | 366 s | N/A | 4.4 GB | 7.3 GB |
| Infinity | 2200 | 0.946 | 463 s | 112 s | 4.7 GB | 6.0 GB |
- 33,000,000 documents
- 100,000 `OR` queries generated from the dataset. All terms are extracted from the dataset, and very rare terms (occurrence < 100) are excluded. The number of terms per query follows the weights [0.03, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.03, 0.02].
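The weights above give the probability that a generated query contains 1, 2, ..., 9 terms. The sampling step can be sketched as follows, assuming a pre-filtered term list (terms with occurrence >= 100); this is an illustration, not the script's actual generator:

```python
import random

# Probability that a query has 1, 2, ..., 9 terms.
TERM_COUNT_WEIGHTS = [0.03, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.03, 0.02]

def generate_queries(terms, n, rng=None):
    """Generate n OR queries; query length (1-9 terms) follows the weights."""
    rng = rng or random.Random()
    queries = []
    for _ in range(n):
        k = rng.choices(range(1, 10), weights=TERM_COUNT_WEIGHTS)[0]
        queries.append(" ".join(rng.sample(terms, k)))
    return queries
```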
| Engine | Time to insert & build index | Time to import & build index | P95 latency (ms) | QPS (16 Python clients) | Memory | vCPU |
| --- | --- | --- | --- | --- | --- | --- |
| Elasticsearch | 2289 s | N/A | 14.75 | 1340 | 21.0 GB | 10.6 |
| Infinity | 1562 s | 2244 s | 1.86 | 12328 | 10.0 GB | 11.0 |
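The P95 latency column is the 95th percentile of the recorded per-query latencies. A minimal way to compute it by the nearest-rank method (a sketch; the benchmark script may use a different percentile definition):

```python
import math

def p95(latencies_ms):
    """95th-percentile latency: the smallest sample >= 95% of all samples."""
    s = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * len(s)))  # 1-based nearest rank
    return s[rank - 1]
```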
Infinity provides a Python script for benchmarking the SIFT1M and GIST1M datasets.
You have two options for building Infinity. Choose the option that best fits your needs:
To obtain the benchmark datasets, download them with `wget`:

```bash
# Download the SIFT1M benchmark
wget ftp://ftp.irisa.fr/local/texmex/corpus/sift.tar.gz
# Download the GIST1M benchmark
wget ftp://ftp.irisa.fr/local/texmex/corpus/gist.tar.gz
```

Alternatively, download them manually from http://corpus-texmex.irisa.fr/.
```bash
# Unzip and move the SIFT1M benchmark files.
tar -zxvf sift.tar.gz
mv sift/sift_base.fvecs test/data/benchmark/sift_1m/sift_base.fvecs
mv sift/sift_query.fvecs test/data/benchmark/sift_1m/sift_query.fvecs
mv sift/sift_groundtruth.ivecs test/data/benchmark/sift_1m/sift_groundtruth.ivecs

# Unzip and move the GIST1M benchmark files.
tar -zxvf gist.tar.gz
mv gist/gist_base.fvecs test/data/benchmark/gist_1m/gist_base.fvecs
mv gist/gist_query.fvecs test/data/benchmark/gist_1m/gist_query.fvecs
mv gist/gist_groundtruth.ivecs test/data/benchmark/gist_1m/gist_groundtruth.ivecs
```
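The `.fvecs`/`.ivecs` files use the TEXMEX format: each vector is stored as a little-endian `int32` dimension followed by that many 4-byte components (`float32` for `.fvecs`, `int32` for `.ivecs`). A minimal stdlib reader, as a sketch (the benchmark scripts have their own loaders):

```python
import struct

def read_vecs(path, fmt="f"):
    """Read a TEXMEX .fvecs ('f') or .ivecs ('i') file into a list of tuples."""
    vecs = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break
            (dim,) = struct.unpack("<i", header)  # per-vector dimension prefix
            vecs.append(struct.unpack("<%d%s" % (dim, fmt), f.read(4 * dim)))
    return vecs
```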
```bash
cd python
pip install -r requirements.txt
pip install .
cd benchmark
```
```bash
# options:
#   -h, --help            show this help message and exit
#   -d DATA_SET, --data DATA_SET
python remote_benchmark_knn_import.py -d sift_1m
python remote_benchmark_knn_import.py -d gist_1m
```
```bash
# options:
#   -h, --help            show this help message and exit
#   -t THREADS, --threads THREADS
#   -r ROUNDS, --rounds ROUNDS
#   -d DATA_SET, --data DATA_SET
# ROUNDS is the number of times the benchmark is executed; the reported result
# is the average duration per run.

# Latency benchmark on the SIFT1M dataset: a single thread, one round.
python remote_benchmark_knn.py -t 1 -r 1 -d sift_1m
# Latency benchmark on the GIST1M dataset: a single thread, one round.
python remote_benchmark_knn.py -t 1 -r 1 -d gist_1m
# QPS benchmark on the SIFT1M dataset: 16 threads, one round.
python remote_benchmark_knn.py -t 16 -r 1 -d sift_1m
# QPS benchmark on the GIST1M dataset: 16 threads, one round.
python remote_benchmark_knn.py -t 16 -r 1 -d gist_1m
```
- Hardware: Intel i5-12500H, 16C, 16GB
- Operating system: Ubuntu 22.04
- Dataset: SIFT1M; topk: 100; recall: 97%+
- P99 QPS: 15,688 (16 clients)
- P99 Latency: 0.36 ms
- Memory usage: 408 MB