MLGraph Documentation

Distributed Vector Search, Actually Scalable

Built on FAISS, optimized for SSDs, and distributed from the ground up. Because vector databases shouldn't cost you 500GB of RAM.

SSD-First Architecture

Inspired by turbopuffer. Store billions of vectors without breaking the bank.
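
For a sense of where a number like 500GB comes from, here is a back-of-the-envelope sketch. It assumes uncompressed float32 vectors and ignores index overhead; the helper names are hypothetical and not part of MLGraph.

```cpp
#include <cassert>
#include <cstdint>

// Raw size in bytes of `count` float32 vectors with `dimension` components each.
// Assumption: no compression or quantization, and no index overhead
// (IDs, centroids, metadata), so a real in-memory index would need more.
inline std::uint64_t RawVectorBytes(std::uint64_t count, std::uint64_t dimension) {
    assert(dimension > 0);
    return count * dimension * sizeof(float);
}

// Converts bytes to decimal gigabytes (1 GB = 10^9 bytes).
inline double ToGigabytes(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1e9;
}

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");
```

With 100M vectors at 1536 dimensions, `RawVectorBytes(100000000, 1536)` comes to 614,400,000,000 bytes, roughly 614 GB of RAM for the raw vectors alone if everything is held in memory.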

Distributed Training

Train across multiple nodes. 100M vectors in hours, not days.

BTree Index

O(log n) lookups. Find any vector quickly, even with billions stored.
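
To illustrate the O(log n) bound, here is a minimal sketch of an ordered index that maps a vector ID to a byte offset. It is not MLGraph's implementation: `OffsetIndex` is a hypothetical name, and `std::map` is an in-memory red-black tree swapped in for a B-tree. The logarithmic lookup cost is the same, but a real B-tree uses wide nodes to keep disk reads low.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical ID -> byte-offset index backed by an ordered tree.
class OffsetIndex {
public:
    // Inserts or overwrites the offset stored for `id`.
    void Put(std::uint64_t id, std::uint64_t offset) { offsets_[id] = offset; }

    // Looks up `id` in O(log n). On a hit, writes the offset to `*out`
    // and returns true; on a miss, leaves `*out` untouched and returns false.
    bool Find(std::uint64_t id, std::uint64_t* out) const {
        assert(out != nullptr);
        const auto it = offsets_.find(id);
        if (it == offsets_.end()) return false;
        *out = it->second;
        return true;
    }

private:
    std::map<std::uint64_t, std::uint64_t> offsets_;
};
```

At a billion entries a balanced binary tree needs about 30 comparisons per lookup, and a B-tree with a large fan-out needs only a few node visits.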

Production-Ready

Replication, health monitoring, circuit breakers. Built for the real world.
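
For readers new to the circuit breaker pattern, a minimal sketch follows. It shows the general technique only; the class name, the consecutive-failure threshold, and the cooldown policy are assumptions, not MLGraph's actual behavior.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical circuit breaker: opens after N consecutive failures and
// lets requests through again once a cooldown has elapsed.
class CircuitBreaker {
public:
    using Clock = std::chrono::steady_clock;

    CircuitBreaker(int failure_threshold, std::chrono::milliseconds cooldown)
        : failure_threshold_(failure_threshold), cooldown_(cooldown) {
        assert(failure_threshold > 0);
    }

    // Returns true if a request may be sent at time `now`.
    bool Allow(Clock::time_point now) const {
        if (consecutive_failures_ < failure_threshold_) return true;  // closed
        // Open: reject until the cooldown has passed, then allow probes.
        // Simplification: every request after the cooldown counts as a probe.
        return now - opened_at_ >= cooldown_;
    }

    // A successful call closes the breaker.
    void RecordSuccess() { consecutive_failures_ = 0; }

    // A failed call counts toward the threshold; at or past the threshold
    // it (re)opens the breaker and restarts the cooldown.
    void RecordFailure(Clock::time_point now) {
        ++consecutive_failures_;
        if (consecutive_failures_ >= failure_threshold_) opened_at_ = now;
    }

private:
    int failure_threshold_;
    std::chrono::milliseconds cooldown_;
    int consecutive_failures_ = 0;
    Clock::time_point opened_at_{};
};
```

A caller would check `Allow(Clock::now())` before contacting a node and report the outcome with `RecordSuccess()` or `RecordFailure(Clock::now())`, so a failing node is skipped quickly and not hammered with retries.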

Quick Example

Your First Vector Search in 5 Minutes

Create an index, add vectors, and search. It's that simple.

#include <iostream>
#include <vector>

#include "client/CentroidClient.h"

int main() {
    // Connect to MLGraph server
    CentroidClient client("localhost:50051");

    // Configure index
    centroidservice::IndexConfig config;
    config.set_index_id("my_embeddings");
    config.set_dimension(1536);        // OpenAI ada-002
    config.set_nlist(1000);            // 1000 clusters
    config.set_data_dir("/data/mlgraph");

    // Initialize
    client.Initialize({config});

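    // Add vectors (your embeddings here).
    // get_embedding() is a placeholder for your own embedding function.
    std::vector<float> embedding = get_embedding("Hello world");
    client.AddVector("my_embeddings", 1, embedding, 0);

    // Search for the 10 nearest neighbors of a query embedding
    std::vector<float> query = get_embedding("Hi there");
    auto results = client.Search("my_embeddings", query, 10);  // k = 10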
    for (const auto& result : results) {
        std::cout << "Found: " << result.id
                  << " (distance: " << result.distance << ")\n";
    }
}
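
The `nlist` setting in the example controls how many clusters the index partitions vectors into. As a rough sketch of what an IVF-style index does when a vector is added, each vector is assigned to the cluster whose centroid is nearest. This is a simplified illustration, not MLGraph's or FAISS's actual code; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean distance between two vectors of equal length.
inline float SquaredL2(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Returns the index of the centroid closest to `v` (ties go to the lowest index).
// With nlist centroids this is a linear scan over nlist candidates.
inline std::size_t NearestCentroid(const std::vector<std::vector<float>>& centroids,
                                   const std::vector<float>& v) {
    assert(!centroids.empty());
    std::size_t best = 0;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        const float dist = SquaredL2(centroids[c], v);
        if (dist < best_dist) {
            best_dist = dist;
            best = c;
        }
    }
    return best;
}
```

At query time the same distance computation picks the closest clusters to scan, so a search touches only a fraction of the stored vectors.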
Architecture

Three-Tier Storage for Maximum Efficiency

IDMap (Hot) - In-Memory

Lightning-fast access for actively queried vectors. ~1-2ms latency. Like your favorite coffee shop—small but perfect.

IVF (Warm) - Memory-Mapped

Balanced performance for medium-frequency access. ~5-10ms latency. Your seasonal wardrobe—ready when you need it.

OnDisk (Cold) - SSD-Backed

Massive scale with optimized I/O. ~50-100ms latency. The storage unit—unlimited capacity, accessed when needed.
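
One way to picture the three tiers is as a placement decision driven by how often data is queried. The sketch below is purely illustrative: the thresholds, the queries-per-hour metric, and all names are assumptions, and MLGraph's real placement policy may differ.

```cpp
#include <cassert>

// The three storage tiers described above.
enum class Tier { kIDMap, kIVF, kOnDisk };

// Hypothetical cutoffs, in queries per hour, for promoting data to a faster tier.
struct TierThresholds {
    double hot_min_qph;   // at or above this rate: IDMap (in-memory)
    double warm_min_qph;  // at or above this rate: IVF (memory-mapped)
};

// Picks a tier from an observed query rate: hot data stays in memory,
// medium-frequency data is memory-mapped, everything else lives on SSD.
inline Tier ChooseTier(double queries_per_hour, const TierThresholds& t) {
    assert(t.hot_min_qph >= t.warm_min_qph);
    if (queries_per_hour >= t.hot_min_qph) return Tier::kIDMap;
    if (queries_per_hour >= t.warm_min_qph) return Tier::kIVF;
    return Tier::kOnDisk;
}
```

For example, with assumed thresholds of 100 and 1 queries per hour, a collection queried 500 times an hour would land in IDMap, one queried 10 times an hour in IVF, and one queried once a day in OnDisk.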

IDMap (100K vectors): 1.2ms p50 latency, 800 QPS
IVF (10M vectors): 8.5ms p50 latency, 120 QPS
OnDisk (100M vectors): 45ms p50 latency, 22 QPS
Mixed workload: ~35ms p50 latency, 150 QPS

Questions? We're Here to Help

Join our community, check GitHub issues, or reach out directly. We actually respond to emails.