GustANN

About GustANN

GustANN is a high-throughput, cost-effective, billion-scale vector search engine designed to run on a single GPU. Based on research published at SIGMOD '26, it uses a graph-based index to reach approximately 250,000 queries per second on billion-scale datasets such as SIFT1B (top-10, recall 0.9), 7.81x faster than DiskANN. The system is memory efficient: billion-scale search needs only about 40GB of memory for both GPU and CPU. It supports three search modes (SSD-based, all-in-DRAM, and all-in-GPU) and three storage backends (SPDK, liburing, and libaio). GustANN targets x86 systems with NVIDIA GPUs such as the A100 and supports datasets of fewer than 2 billion vectors with record sizes under 4KB. It relies on DiskANN for index construction and provides automated setup scripts for rapid deployment. The software is suitable for large-scale

GustANN is a high-throughput, billion-scale, graph-based vector store built for GPUs. It is based on our SIGMOD '26 paper:

📄 High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU.

✨ Key Features

  • 🚀 High Throughput: Achieves ~250K QPS on billion-scale datasets (SIFT1B, top-10, recall=0.9), 7.81x faster than DiskANN.
  • 🧠 Memory Efficient: Requires only ~40GB of memory for both GPU and CPU on billion-scale datasets.
  • 🔀 Flexible Interface: Supports multiple search modes (SSD-based, all-in-DRAM, all-in-GPU) and storage backends (SPDK, liburing, libaio).

[!TIP] For convenience, we highly recommend modifying scripts/setup.sh first to specify your file paths and save locations, and then using the provided automated scripts.

If you prefer to execute the commands step-by-step manually (or need deeper customization), please refer to our guide.


⚡ Quickstart

You can quickly set up a 1M vector database to try GustANN using our automated script:

./scripts/quick_start.sh

💻 System Requirements

To ensure optimal performance, verify that your hardware meets the following requirements:

  • CPU: x86 CPU supporting 1GB huge pages (verify with: grep pdpe1gb /proc/cpuinfo).
  • System RAM (DRAM): ~40GB minimum for vector search. (Note: Building the index requires additional memory).
  • GPU: ~40GB VRAM required for billion-scale search (e.g., NVIDIA A100).
  • Dataset Constraints: Fewer than 2 billion vectors (to avoid integer overflow). Record size (vector_size + 4 + 4 * num_neighbors) must be < 4KB.
  • Storage (SSD): ~700GB for SIFT1B or ~1TB for DEEP1B. Multi-SSD configurations are supported.
    • Recommendation: Use SPDK for maximum performance. Alternatively, use liburing, libaio, or in-memory indexing.
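
The dataset constraints above can be checked with a few lines of shell before any index is built. This is a minimal sketch, not part of GustANN: the helper names are hypothetical, "4KB" is assumed to mean 4096 bytes, and the neighbor count of 64 in the example is an illustrative value rather than a GustANN default.

```shell
# Sketch: check the dataset constraints before building an index.
# Assumptions (not from GustANN): "4KB" = 4096 bytes; all sizes are in bytes.
MAX_RECORD_BYTES=4096
MAX_VECTORS=2000000000

# record_size_bytes <vector_size_bytes> <num_neighbors>
# Implements: vector_size + 4 + 4 * num_neighbors
record_size_bytes() {
    echo $(( $1 + 4 + 4 * $2 ))
}

# dataset_fits <num_vectors> <vector_size_bytes> <num_neighbors>
# Succeeds (exit status 0) only if both limits hold.
dataset_fits() {
    [ "$1" -lt "$MAX_VECTORS" ] &&
        [ "$(record_size_bytes "$2" "$3")" -lt "$MAX_RECORD_BYTES" ]
}

# Example: 128-dimensional uint8 vectors (128 bytes each, as in SIFT1B)
# with a hypothetical graph degree of 64 neighbors.
record_size_bytes 128 64   # prints 388
if dataset_fits 1000000000 128 64; then
    echo "dataset fits"
else
    echo "dataset exceeds a limit"
fi
```

With these example numbers a record is far below the limit, so the neighbor count or the vector dimension could grow considerably before the 4KB bound matters.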

πŸ› οΈ Installation & Build

1. Install Dependencies

GustANN relies on DiskANN for index building. Install the following system dependencies (Ubuntu 22.04):

sudo apt update
sudo apt install make cmake g++ libaio-dev libgoogle-perftools-dev clang-format libboost-all-dev libmkl-full-dev libjemalloc-dev

[!NOTE] You must also install CUDA following NVIDIA's official instructions.

2. Clone the Repository

git clone https://github.com/thustorage/GustANN.git --recursive
cd GustANN

3. Build GustANN

mkdir -p build && cd build
cmake .. # Use flags here for specific backends (see below)
make -j
cd ..

Note: To build with a specific storage backend, append the switch to the CMake command: -DCMAKE_USE_{SPDK,URING,AIO}=ON.
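
As an illustration of the switch naming, the sketch below maps a backend name to the matching CMake option. The function and its argument names (spdk, uring, aio, none) are hypothetical conveniences; only the -DCMAKE_USE_* switches come from this README.

```shell
# backend_cmake_flag <spdk|uring|aio|none>
# Prints the CMake switch for the chosen storage backend (empty for none).
backend_cmake_flag() {
    case "$1" in
        spdk)    echo "-DCMAKE_USE_SPDK=ON" ;;
        uring)   echo "-DCMAKE_USE_URING=ON" ;;
        aio)     echo "-DCMAKE_USE_AIO=ON" ;;
        none|"") echo "" ;;
        *)       echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Example: print (not run) the configure command for the liburing backend.
echo "cmake .. $(backend_cmake_flag uring)"
```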


📊 Dataset and Index Preparation

[!NOTE] If you already have a built index, you can skip this step. For complete dataset preparation instructions from scratch, refer to the PipeANN repository.

1. Build DiskANN

First, compile DiskANN:

cd deps/DiskANN
mkdir build && cd build
cmake .. && make -j
cd ../../..

2. Convert Dataset Format

To build a DiskANN index, you need to prepare your dataset in bin format. DiskANN provides a utility to convert from the bvecs/fvecs formats (e.g., the format used by the SIFT dataset):

./deps/DiskANN/build/apps/utils/fvecs_to_bin <float/uint8> input_vecs output_bin
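
A quick sanity check on the converted file can catch a wrong data type or a truncated conversion early. The sketch below relies on our understanding of DiskANN's bin layout (a 4-byte point count, a 4-byte dimension, then the vectors stored contiguously) and on a little-endian host; the helper names are hypothetical.

```shell
# bin_header <file>: print "<num_points> <dim>" read from the 8-byte header.
# od decodes the two 4-byte integers in host byte order (little-endian on x86).
bin_header() {
    od -An -td4 -N8 "$1" | awk '{ print $1, $2 }'
}

# expected_bin_size <num_points> <dim> <bytes_per_value>
# Header (8 bytes) plus num_points * dim values.
expected_bin_size() {
    echo $(( 8 + $1 * $2 * $3 ))
}

# Demo on a tiny hand-written header: 2 points, 3 dimensions.
tmp=$(mktemp)
printf '\002\000\000\000\003\000\000\000' > "$tmp"
bin_header "$tmp"           # prints: 2 3 (on a little-endian host)
expected_bin_size 2 3 4     # prints 32 (float values are 4 bytes each)
rm -f "$tmp"
```

For a real file, compare the output of expected_bin_size with `wc -c < output_bin`, using 4 bytes per value for float data and 1 for uint8.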

3. Build the DiskANN Index

Once the dataset is converted, update scripts/setup.sh with your paths, and run:

./scripts/build_disann_index.sh <pq_size> <memory>
  • pq_size: 3.3 for 100M-scale datasets, 33 for 1B-scale datasets (generates 32-byte PQ vectors).
  • memory: Maximum memory available for building the index.
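
Both documented pq_size values share the same ratio: 33 per billion vectors. Assuming that ratio also holds for other dataset sizes (an extrapolation on our part, not a documented GustANN rule), a value can be derived as in this sketch. The function name is hypothetical.

```shell
# pq_size_for <num_vectors>
# Scales the documented values linearly: 33 per 1e9 vectors.
# Assumption: the ratio is constant across dataset sizes.
pq_size_for() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 33 / 1e9 }'
}

pq_size_for 100000000    # prints 3.3
pq_size_for 1000000000   # prints 33.0
```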

4. Prepare the GustANN Index (Pivot Graph)

In addition to the DiskANN index, GustANN requires building a pivot graph. Update scripts/setup.sh, then run:

./scripts/gen_pivot_graph.sh

πŸƒ Running GustANN

Mode A: In-Memory Search (Recommended for < 100M vectors)

For smaller datasets, keeping data in DRAM or GPU memory yields the best performance. After updating scripts/setup.sh, run:

./scripts/run_mem.sh --topk <topk> --ef_search <L> [<L2> ...] -R <R> [-G]
  • -G Flag: Enables pure GPU search (fastest, but limited to small datasets).
  • L (--ef_search): Number of candidate vectors kept during search (higher = more accurate).
  • -R Flag: Repeats the query set R times for accurate benchmarking on small query sets.

Mode B: On-SSD Search (Billion-Scale)

You can run GustANN using SPDK (fastest), liburing, or libaio.

🔥 SPDK Setup (Highest Performance)

[!WARNING] SPDK requires root privileges. There must be NO partitions or filesystems on the target SSDs. Using nvme format WILL ERASE ALL DATA ON THE DISK. Do this at your own risk!

  1. Build SPDK & GustANN:

    git clone https://github.com/spdk/spdk deps/spdk 
    # Note: We have tested GustANN on git commit 7c0720d1d
    
    cd deps/spdk
    sudo scripts/pkgdep.sh # Install SPDK's dependencies
    ./configure && make -j
    cd ../..
    
    # Remember to rebuild GustANN with SPDK enabled: 
    # cd build && cmake .. -DCMAKE_USE_SPDK=ON && make -j && cd ..
  2. Setup SPDK:

    sudo ./deps/spdk/scripts/setup.sh
    sudo ./deps/spdk/build/examples/hello_world # Verify it works
  3. Prepare SSD List & Write Index: Collect the PCIe addresses of your SSDs into a file (ssd_list.txt), update scripts/setup.sh, and run:

    sudo ./scripts/write_spdk.sh
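
Because the SPDK path writes directly to raw devices, it is worth validating the SSD list before running the write script. The sketch below assumes ssd_list.txt holds one PCIe address per line in the usual domain:bus:device.function form (for example 0000:3b:00.0); the exact file format GustANN expects is not specified here, so treat both that layout and the helper names as assumptions.

```shell
# is_pcie_addr <string>
# Succeeds if the argument looks like domain:bus:device.function.
is_pcie_addr() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
}

# check_ssd_list <file>
# Reports every non-empty line that is not a well-formed address.
# Assumption: the file lists one address per line.
check_ssd_list() {
    status=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        if ! is_pcie_addr "$line"; then
            echo "invalid PCIe address: $line" >&2
            status=1
        fi
    done < "$1"
    return "$status"
}

# Example: a single well-formed address passes the check.
is_pcie_addr 0000:3b:00.0 && echo "address looks valid"
```

A check like this only validates syntax. It does not confirm that the address belongs to an NVMe device or that the device is safe to overwrite.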

βš™οΈ liburing / libaio Setup

  • liburing:

    git clone https://github.com/axboe/liburing.git deps/liburing 
    # Note: We have tested on commit 20b3fe67
    
    cd deps/liburing
    ./configure && make -j
    cd ../..
    
    # Remember to rebuild GustANN with: 
    # cd build && cmake .. -DCMAKE_USE_URING=ON && make -j && cd ..
  • libaio: Ensure your Linux kernel supports libaio. Rebuild GustANN with -DCMAKE_USE_AIO=ON.

▶️ Execute On-SSD Search

Update scripts/setup.sh, then run the appropriate script for your backend:

sudo ./scripts/run_spdk.sh  --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R> # for SPDK
./scripts/run_uring.sh      --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R> # for liburing
./scripts/run_aio.sh        --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R> # for libaio
./scripts/run.sh            --topk <topk> --ef_search <L> -B <B> -T <T> -C <C> -R <R> # for in-memory fallback testing

[!IMPORTANT]
Crucial Tuning Parameters (B, T, C): Different I/O backends favor different worker configurations:

  • SPDK: -T 2 (Worker threads), -C 20 (Minibatches/thread), -B >=1000 (Minibatch size)
  • Uring / Memory: -T 20, -C 1, -B >=1000
  • AIO: -T 2, -C 10, -B 256 (Setting B too large may cause AIO to crash!)
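
The recommendations above can be captured in a small lookup so that benchmark scripts pick consistent starting values. This is a sketch with a hypothetical helper name. Where the README says -B >=1000, the sketch uses the lower bound 1000, and the topk, ef_search, and R values in the example are illustrative only.

```shell
# tuning_flags <spdk|uring|mem|aio>
# Prints the starting -T/-C/-B values recommended for each backend.
tuning_flags() {
    case "$1" in
        spdk)      echo "-T 2 -C 20 -B 1000" ;;
        uring|mem) echo "-T 20 -C 1 -B 1000" ;;
        aio)       echo "-T 2 -C 10 -B 256" ;;
        *)         echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Example: print (not run) a libaio invocation with illustrative values.
echo "./scripts/run_aio.sh --topk 10 --ef_search 64 $(tuning_flags aio) -R 1"
```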

After execution, the runtime, total SSD I/Os, and recall metrics will be printed to stdout.


🧪 Experimental Features

  • GPU Direct-IO Support: See bam.md for experimental GPU Direct Storage setup.

📚 Citation

If you find GustANN useful in your research, please cite our SIGMOD '26 paper:

@inproceedings{sigmod26gustann,
    author = {Haodi Jiang and Hao Guo and Minhui Xie and Jiwu Shu and Youyou Lu},
    title = {{High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU}},
    year = {2026},
    publisher = {Association for Computing Machinery},
    booktitle = {Proceedings of the 2026 International Conference on Management of Data},
    address = {Bengaluru, India},
    series = {SIGMOD '26}
}

πŸ™ Acknowledgements

Some GPU kernel implementations are adapted from CuHNSW. We greatly appreciate their open-source contributions.