MLGraph Documentation

Distributed Vector Search, Actually Scalable

Built on FAISS, optimized for SSDs, and distributed from the ground up. Because vector databases shouldn't cost you 500GB of RAM.


SSD-First Architecture

Inspired by turbopuffer. Store billions of vectors without breaking the bank.

Distributed Training

Train across multiple nodes. 100M vectors in hours, not days.

BTree Index

O(log n) lookups. Find any vector instantly, even with billions.

Production-Ready

Replication, health monitoring, circuit breakers. Built for the real world.
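
To make the circuit-breaker idea concrete, here is a minimal sketch of the pattern: after a run of consecutive failures the breaker "opens" and rejects calls to a node until a cool-down has passed, then lets a trial request through. This page does not show MLGraph's own implementation, so the class name `CircuitBreaker`, its methods, the failure threshold, and the cool-down are all hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical sketch of the circuit-breaker pattern; not MLGraph's actual API.
class CircuitBreaker {
public:
    using Clock = std::chrono::steady_clock;

    CircuitBreaker(int failure_threshold, std::chrono::milliseconds cooldown)
        : failure_threshold_(failure_threshold), cooldown_(cooldown) {
        assert(failure_threshold > 0);
    }

    // Returns true if a request may be sent at time `now`.
    bool AllowRequest(Clock::time_point now) const {
        if (!open_) return true;
        // Once the cool-down has elapsed, allow a trial request (half-open).
        return now - opened_at_ >= cooldown_;
    }

    // A successful call closes the breaker and resets the failure count.
    void RecordSuccess() {
        consecutive_failures_ = 0;
        open_ = false;
    }

    // A failed call opens (or re-opens) the breaker once the threshold is hit.
    void RecordFailure(Clock::time_point now) {
        ++consecutive_failures_;
        if (consecutive_failures_ >= failure_threshold_) {
            open_ = true;
            opened_at_ = now;
        }
    }

    bool IsOpen() const { return open_; }

private:
    int failure_threshold_;
    std::chrono::milliseconds cooldown_;
    int consecutive_failures_ = 0;
    bool open_ = false;
    Clock::time_point opened_at_{};
};
```

A caller would check `AllowRequest(Clock::now())` before contacting a node and report the outcome with `RecordSuccess()` or `RecordFailure(Clock::now())`. Passing the time in explicitly keeps the sketch easy to test.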

Documentation

Everything you need to build with MLGraph, from zero to production

Getting Started

Installation, first index creation, and basic operations

10 min read

Architecture

S3-backed storage, TLS security, Parquet support, and system design

15 min read

Cluster Management

Distributed deployment, replication, sharding, failover, and scaling

20 min read

Performance

Benchmarks, scalability analysis, and production tuning tips

15 min read

API Reference

Complete gRPC and REST API documentation with examples

30 min read

Quick Example

Your First Vector Search in 5 Minutes

Create an index, add vectors, and search—it's that simple

#include <iostream>
#include <vector>

#include "client/CentroidClient.h"

int main() {
    // Connect to the MLGraph server
    CentroidClient client("localhost:50051");

    // Configure the index
    centroidservice::IndexConfig config;
    config.set_index_id("my_embeddings");
    config.set_dimension(1536);        // OpenAI ada-002
    config.set_nlist(1000);            // 1000 IVF clusters
    config.set_data_dir("/data/mlgraph");

    // Initialize
    client.Initialize({config});

    // Add a vector. get_embedding() is a placeholder for your own embedding code.
    std::vector<float> embedding = get_embedding("Hello world");
    client.AddVector("my_embeddings", 1, embedding, 0);

    // Search for the 10 nearest neighbors of a query embedding
    std::vector<float> query = get_embedding("Hi there");
    auto results = client.Search("my_embeddings", query, 10);
    for (const auto& result : results) {
        std::cout << "Found: " << result.id
                  << " (distance: " << result.distance << ")\n";
    }
    return 0;
}
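
The example sets the dimension to 1536, and a back-of-the-envelope calculation shows why keeping every vector in RAM gets expensive at that size. The sketch below assumes vectors are stored as uncompressed 32-bit floats (this page does not confirm the storage format) and ignores index overhead such as IDs and centroids. The helper names `RawVectorBytes` and `ToGigabytes` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// Raw size in bytes of `num_vectors` float32 vectors of the given dimension.
// Assumption: no compression and no per-vector index overhead.
std::uint64_t RawVectorBytes(std::uint64_t num_vectors, std::uint64_t dimension) {
    const std::uint64_t bytes_per_vector = dimension * sizeof(float);
    // Guard against 64-bit overflow for very large inputs.
    assert(bytes_per_vector == 0 ||
           num_vectors <= std::numeric_limits<std::uint64_t>::max() / bytes_per_vector);
    return num_vectors * bytes_per_vector;
}

// Converts bytes to decimal gigabytes (1 GB = 10^9 bytes).
double ToGigabytes(std::uint64_t bytes) {
    return static_cast<double>(bytes) / 1e9;
}
```

Under those assumptions, one 1536-dimensional vector takes 6,144 bytes, and `ToGigabytes(RawVectorBytes(100000000, 1536))` comes to about 614 GB for 100 million vectors, before any index overhead.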


Architecture

Three-Tier Storage for Maximum Efficiency

IDMap (Hot) - In-Memory

Lightning-fast access for actively queried vectors. ~1-2ms latency. Like your favorite coffee shop—small but perfect.

IVF (Warm) - Memory-Mapped

Balanced performance for medium-frequency access. ~5-10ms latency. Your seasonal wardrobe—ready when you need it.

OnDisk (Cold) - SSD-Backed

Massive scale with optimized I/O. ~50-100ms latency. The storage unit—unlimited capacity, accessed when needed.
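
One way to picture the three tiers is as a placement decision driven by how often a set of vectors is queried. The sketch below is illustrative only: this page does not describe how MLGraph promotes or demotes data, so the `TierPolicy` thresholds, the queries-per-hour metric, and the function names are assumptions.

```cpp
#include <cassert>
#include <string>

enum class Tier { kHot, kWarm, kCold };

// Hypothetical thresholds; the real promotion/demotion rules are not documented here.
struct TierPolicy {
    double hot_min_queries_per_hour = 100.0;
    double warm_min_queries_per_hour = 1.0;
};

// Picks a storage tier from an observed access rate.
Tier ChooseTier(double queries_per_hour, const TierPolicy& policy = TierPolicy()) {
    assert(queries_per_hour >= 0.0);
    if (queries_per_hour >= policy.hot_min_queries_per_hour) return Tier::kHot;
    if (queries_per_hour >= policy.warm_min_queries_per_hour) return Tier::kWarm;
    return Tier::kCold;
}

// Maps a tier to the storage type named in the tier descriptions above.
std::string TierName(Tier tier) {
    switch (tier) {
        case Tier::kHot:  return "IDMap (in-memory)";
        case Tier::kWarm: return "IVF (memory-mapped)";
        case Tier::kCold: return "OnDisk (SSD-backed)";
    }
    return "unknown";
}
```

With the default (assumed) policy, `TierName(ChooseTier(500.0))` returns "IDMap (in-memory)", while data queried less than once an hour lands in "OnDisk (SSD-backed)".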

Deep Dive into Features

Tier             Dataset         p50 latency   Throughput
IDMap            100K vectors    1.2 ms        800 QPS
IVF              10M vectors     8.5 ms        120 QPS
OnDisk           100M vectors    45 ms         22 QPS
Mixed workload   -               ~35 ms        150 QPS
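
A quick sanity check on these figures uses Little's law: the average number of requests in flight equals throughput multiplied by latency. The sketch below uses the p50 latency as a stand-in for mean latency, which is only an approximation, and the function name `ImpliedConcurrency` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Little's law: mean requests in flight = throughput (req/s) * mean latency (s).
// Assumption: p50 latency is used here as a rough proxy for the mean.
double ImpliedConcurrency(double qps, double latency_ms) {
    assert(qps >= 0.0 && latency_ms >= 0.0);
    return qps * (latency_ms / 1000.0);
}
```

Plugging in the numbers gives about 0.96 for IDMap (800 QPS at 1.2 ms), 1.02 for IVF, 0.99 for OnDisk, and 5.25 for the mixed workload. The single-tier rows therefore look like roughly one request in flight at a time, and the mixed row like about five. The benchmark setup is not stated on this page, so treat this as a reading aid rather than a description of how the numbers were measured.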

Questions? We're Here to Help

Join our community, check GitHub issues, or reach out directly. We actually respond to emails.
