Centroid Allocation Strategies

Choose how centroids are distributed across nodes to optimize for simplicity, balance, or locality.

Figure: Comparison of allocation strategies with trade-offs

Allocation Strategies

Round-Robin Allocator

Simple, deterministic assignment: centroid N goes to node N % num_nodes.

Pros

  • Predictable placement
  • Zero coordination
  • Even distribution

Cons

  • No load awareness
  • Reshuffles on node change
  • No locality
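
The modulo rule and its reshuffling cost can be sketched in a few lines of Python. This is an illustrative sketch only; the function names `round_robin_node` and `moved_on_resize` are hypothetical and not part of any documented API.

```python
def round_robin_node(centroid_id: int, num_nodes: int) -> int:
    """Return the node index for a centroid: centroid N goes to node N % num_nodes."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return centroid_id % num_nodes


def moved_on_resize(num_centroids: int, old_nodes: int, new_nodes: int) -> int:
    """Count centroids whose node changes when the node count changes."""
    return sum(
        1
        for c in range(num_centroids)
        if round_robin_node(c, old_nodes) != round_robin_node(c, new_nodes)
    )


# Placement of centroids 0..7 on a 4-node cluster.
placement = [round_robin_node(c, 4) for c in range(8)]

# Growing from 4 to 5 nodes: how many of 1000 centroids have to move?
moved = moved_on_resize(1000, 4, 5)
```

In this sketch, `placement` is `[0, 1, 2, 3, 0, 1, 2, 3]` and `moved` is 800, so adding a single node relocates 80% of the centroids. That is the "reshuffles on node change" cost listed above.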

Load-Balanced Allocator

Tracks the centroid count and vector count on each node and assigns each new centroid to the least-loaded node.

Pros

  • Handles skewed clusters
  • Adapts to load
  • Better resource usage

Cons

  • Requires coordination
  • Less predictable
  • Rebalancing needed
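
A least-loaded assignment might look like the following Python sketch. The class `LoadBalancedAllocator`, the `NodeLoad` record, and the tie-breaking order (fewest vectors, then fewest centroids, then node ID) are assumptions made for illustration; the source does not specify how the two counts are combined.

```python
from dataclasses import dataclass


@dataclass
class NodeLoad:
    centroids: int = 0
    vectors: int = 0


class LoadBalancedAllocator:
    """Hypothetical allocator that assigns each centroid to the least-loaded node."""

    def __init__(self, node_ids):
        self.loads = {node_id: NodeLoad() for node_id in node_ids}

    def assign(self, vector_count: int) -> str:
        # Assumed ordering: fewest vectors first, then fewest centroids, then node ID.
        node_id = min(
            self.loads,
            key=lambda n: (self.loads[n].vectors, self.loads[n].centroids, n),
        )
        self.loads[node_id].centroids += 1
        self.loads[node_id].vectors += vector_count
        return node_id


allocator = LoadBalancedAllocator(["node-1", "node-2", "node-3"])
cluster_sizes = [5000, 3000, 2000, 1500, 1000, 500]  # skewed cluster sizes
assignments = [allocator.assign(size) for size in cluster_sizes]
```

With these sizes the sketch ends with 5000, 4000, and 4000 vectors on the three nodes, whereas round-robin placement of the same six centroids would yield 6500, 4000, and 2500. The shared `loads` table is the state that has to be coordinated across the cluster.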

Locality-Aware Allocator (Recommended)

Groups similar centroids on the same node by running k-means on the centroid vectors (hierarchical grouping is also available through groupingMethod).

Pros

  • Reduces query fanout
  • Better cache locality
  • Zone-aware option

Cons

  • Complex algorithm
  • Requires recomputation
  • May create hotspots
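
The grouping step can be illustrated with plain k-means (Lloyd's algorithm) in pure Python. This is a minimal sketch under stated assumptions: the function `group_centroids`, its deterministic seeding, and the fixed iteration count are hypothetical choices, and a real implementation would likely also cap group sizes to limit hotspots.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_centroids(centroids, num_groups, iterations=20):
    """Group centroid vectors with k-means; the group index doubles as the node index."""
    if not 0 < num_groups <= len(centroids):
        raise ValueError("num_groups must be between 1 and len(centroids)")
    # Deterministic seeding for the sketch: evenly spaced picks from the input.
    step = len(centroids) / num_groups
    means = [list(centroids[int(i * step)]) for i in range(num_groups)]
    assignment = [0] * len(centroids)
    for _ in range(iterations):
        # Assignment step: each centroid joins the group with the nearest mean.
        assignment = [
            min(range(num_groups), key=lambda g: sq_dist(vec, means[g]))
            for vec in centroids
        ]
        # Update step: recompute each mean; an empty group keeps its old mean.
        for g in range(num_groups):
            members = [vec for vec, a in zip(centroids, assignment) if a == g]
            if members:
                means[g] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


centroids = [
    (0.0, 0.1), (0.1, 0.0), (0.2, 0.1),  # three centroids close together
    (5.0, 5.1), (5.1, 5.0), (4.9, 5.2),  # three more, far from the first three
]
node_of = group_centroids(centroids, num_groups=2)
```

Here `node_of` is `[0, 0, 0, 1, 1, 1]`: the nearby centroids share a node, so a query that probes the first three touches one node instead of two. Nothing in plain k-means forces the groups to be the same size, which is one way the hotspot risk noted above can arise.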

Configuration

{
  "allocation": {
    "strategy": "locality-aware",  // round-robin, load-balanced, locality-aware

    "roundRobin": {
      // No additional config needed
    },

    "loadBalanced": {
      "rebalanceThreshold": 0.2,    // Trigger rebalance at 20% imbalance
      "rebalanceInterval": "1h"
    },

    "localityAware": {
      "groupingMethod": "kmeans",   // kmeans or hierarchical
      "numGroups": null,            // Auto = num_nodes
      "zoneAware": true,
      "zones": {
        "us-east": ["node-1", "node-2"],
        "us-west": ["node-3", "node-4"]
      }
    }
  }
}
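
How two of these settings might be interpreted is sketched below in Python. Both helpers are hypothetical. In particular, the source does not define "imbalance", so the metric used here, (max load - mean load) / mean load, is an assumption chosen only to make the 20% threshold concrete.

```python
VALID_STRATEGIES = {"round-robin", "load-balanced", "locality-aware"}


def resolve_num_groups(config: dict, num_nodes: int) -> int:
    """Return the effective group count: numGroups, or num_nodes when it is null."""
    allocation = config["allocation"]
    if allocation["strategy"] not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {allocation['strategy']!r}")
    num_groups = allocation.get("localityAware", {}).get("numGroups")
    return num_nodes if num_groups is None else num_groups


def needs_rebalance(vectors_per_node: dict, threshold: float) -> bool:
    """Assumed metric: (max load - mean load) / mean load exceeds the threshold."""
    loads = list(vectors_per_node.values())
    mean = sum(loads) / len(loads)
    if mean == 0:
        return False
    return (max(loads) - mean) / mean > threshold


config = {
    "allocation": {
        "strategy": "locality-aware",
        "loadBalanced": {"rebalanceThreshold": 0.2, "rebalanceInterval": "1h"},
        "localityAware": {"groupingMethod": "kmeans", "numGroups": None},
    }
}
groups = resolve_num_groups(config, num_nodes=4)
skewed = needs_rebalance(
    {"node-1": 130, "node-2": 100, "node-3": 100, "node-4": 70}, threshold=0.2
)
```

In this sketch `groups` resolves to 4 because `numGroups` is null, and `skewed` is true because the busiest node sits 30% above the mean load.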

Query Fanout Impact

The allocation strategy directly affects how many nodes a query touches:

| Strategy | nprobe=8 | nprobe=32 | nprobe=128 |
|---|---|---|---|
| Round-robin (5 nodes) | 5 nodes (100%) | 5 nodes (100%) | 5 nodes (100%) |
| Load-balanced (5 nodes) | 4-5 nodes (80-100%) | 5 nodes (100%) | 5 nodes (100%) |
| Locality-aware (5 nodes) | 1-2 nodes (20-40%) | 2-3 nodes (40-60%) | 4-5 nodes (80-100%) |
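
Fanout is the number of distinct nodes that own the probed centroids. The toy Python sketch below shows why the nprobe=8 column differs so sharply between strategies. It assumes 100 centroids on 5 nodes and uses contiguous centroid IDs as a stand-in for "similar" centroids; both are assumptions for illustration, not measurements.

```python
def query_fanout(probed_centroids, node_of):
    """Return the set of nodes a query must contact for the centroids it probes."""
    return {node_of[c] for c in probed_centroids}


num_nodes, num_centroids = 5, 100

# Round-robin layout: centroid N lives on node N % num_nodes.
round_robin = {c: c % num_nodes for c in range(num_centroids)}

# Assumed locality-aware layout: each node holds one contiguous block of IDs.
per_node = num_centroids // num_nodes
locality = {c: c // per_node for c in range(num_centroids)}

# nprobe=8, assuming the 8 nearest centroids are neighbors in ID space.
probed = range(40, 48)
rr_fanout = len(query_fanout(probed, round_robin))
loc_fanout = len(query_fanout(probed, locality))
```

In this sketch `rr_fanout` is 5 and `loc_fanout` is 1, matching the shape of the nprobe=8 column: round-robin spreads neighboring centroids over every node, while a locality-aware layout keeps them together.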

When to Use Each

  • Round-robin: Development, uniform workloads, simple deployments
  • Load-balanced: Skewed cluster sizes, heterogeneous nodes
  • Locality-aware: Large clusters, multi-region, latency-sensitive