m2m-vector-search

About m2m-vector-search

Edge vector search engine with Vulkan GPU acceleration, hierarchical indexing (HRM2), and native LangChain integration. It uses a Gaussian splat-based architecture for similarity search on resource-constrained devices.

M2M Vector Search

Machine-to-Memory — A vector search engine with probabilistic Gaussian Splats, online learning via feedback, and multi-backend GPU acceleration.

Features

| Feature | Description |
|---|---|
| 🔮 Gaussian Splats | Probabilistic vector representation: score(x, i) = αᵢ · exp(-κᵢ · ‖x − μᵢ‖²) |
| 📈 Online Learning | Hebbian update rules adapt α, κ, μ from user feedback (relevant / not_relevant) |
| ⚡ Multi-GPU Backend | CPU, NVIDIA CUDA, and AMD Vulkan via a single API |
| 🧠 Energy-Based Model | Native uncertainty quantification via energy landscape |
| 🔍 HRM2 Engine | Hierarchical Routing with Mixture Models and adaptive probing |
| 🧹 SOC Consolidation | Self-Organized Criticality prunes low-α splats automatically |
| 🗄️ Semantic Memory | Hybrid BM25 + vector search with Reciprocal Rank Fusion |
| 🔗 LangChain Ready | Native Retriever interface |

Quick Start

Install

pip install m2m-vector-search

Minimal Example

from m2m import SimpleVectorDB
import numpy as np

# Create a vector database
db = SimpleVectorDB(latent_dim=128)

# Add vectors
vectors = np.random.randn(1000, 128).astype(np.float32)
db.add(vectors=vectors, ids=[f"doc_{i}" for i in range(1000)])

# Search
query = np.random.randn(128).astype(np.float32)
results = db.search(query, k=10)

for r in results:
    print(f"  {r.id}: score={r.score:.4f}")

Gaussian Splats with Online Learning

This is the core differentiator. Each vector is a Gaussian Splat with three learnable parameters:

  • μ (mean): position in embedding space
  • κ (concentration): how sharply the splat responds; higher κ = more precise match required
  • α (amplitude): how "important" the memory is; grows with positive feedback, decays with negative

from m2m import AdvancedVectorDB
import numpy as np

db = AdvancedVectorDB(latent_dim=128, use_gaussian_splats=True)
vectors = np.random.randn(500, 128).astype(np.float32)
db.add(vectors=vectors, ids=[f"item_{i}" for i in range(500)])

# Search uses Gaussian scoring: α·exp(-κ·‖x−μ‖²)
query = np.random.randn(128).astype(np.float32)
results = db.search(query, k=10)

# Provide feedback: the system learns from it
db.feedback(
    query=query,
    relevant_ids=[results[0].id, results[1].id],
    irrelevant_ids=[results[8].id, results[9].id],
)

# Next search adapts: strong splats get promoted, weak ones demoted
results2 = db.search(query, k=10)
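
For intuition, the Gaussian scoring used in the search step can be written in a few lines of NumPy. This is a minimal sketch of the formula only, not the library's internals: the helper name `gaussian_scores`, the array layout, and the initial κ and α values are assumptions made for illustration.

```python
import numpy as np

def gaussian_scores(query, mu, kappa, alpha):
    """Score every splat against one query: alpha_i * exp(-kappa_i * ||x - mu_i||^2).

    Hypothetical helper for illustration; not part of the m2m API.
    Shapes: query (d,), mu (n, d), kappa (n,), alpha (n,).
    """
    sq_dist = np.sum((mu - query) ** 2, axis=1)  # squared Euclidean distance per splat
    return alpha * np.exp(-kappa * sq_dist)

rng = np.random.default_rng(0)
mu = rng.standard_normal((500, 128)).astype(np.float32)
kappa = np.full(500, 0.01, dtype=np.float32)  # assumed starting concentration
alpha = np.ones(500, dtype=np.float32)        # assumed starting amplitude

# Query placed close to splat 42, so that splat should score highest
query = mu[42] + 0.05 * rng.standard_normal(128).astype(np.float32)
scores = gaussian_scores(query, mu, kappa, alpha)
top_k = np.argsort(-scores)[:10]              # indices of the 10 highest scores
print(top_k[0], scores[top_k[0]])
```

Because α multiplies the whole score, a splat that has received positive feedback can outrank a slightly closer splat with a lower α.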

Update rules (Hebbian + temporal decay):

| Event | α (importance) | κ (concentration) | μ (position) |
|---|---|---|---|
| Relevant feedback | α += lr_α · α | κ += lr_κ · ‖x − μ‖⁻² | μ += lr_μ · (x − μ) (drift toward query) |
| Irrelevant feedback | α *= (1 − lr_α) | κ -= 0.5 · lr_κ | - |
| Temporal decay | α *= exp(-λ·Δt) | - | - |
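
The update rules can be read as three small functions applied to a single splat. The sketch below transcribes them directly. The learning-rate defaults, the `eps` guard against division by zero, and the choice to compute the κ update from the pre-update μ are assumptions, not confirmed library behavior.

```python
import numpy as np

def relevant_update(x, mu, kappa, alpha,
                    lr_alpha=0.1, lr_kappa=0.01, lr_mu=0.05, eps=1e-8):
    """'Relevant feedback' rule for one splat; returns new (mu, kappa, alpha)."""
    diff = x - mu
    sq_dist = float(diff @ diff)
    alpha = alpha + lr_alpha * alpha            # alpha += lr_alpha * alpha
    kappa = kappa + lr_kappa / (sq_dist + eps)  # kappa += lr_kappa * ||x - mu||^-2 (eps is an added guard)
    mu = mu + lr_mu * diff                      # mu drifts toward the query
    return mu, kappa, alpha

def irrelevant_update(kappa, alpha, lr_alpha=0.1, lr_kappa=0.01):
    """'Irrelevant feedback' rule; mu is left unchanged."""
    return kappa - 0.5 * lr_kappa, alpha * (1.0 - lr_alpha)

def temporal_decay(alpha, lam, dt):
    """Temporal decay rule: alpha *= exp(-lam * dt)."""
    return alpha * np.exp(-lam * dt)

x = np.array([1.0, 0.0])
mu = np.array([0.0, 0.0])
mu2, kappa2, alpha2 = relevant_update(x, mu, kappa=1.0, alpha=1.0)
kappa3, alpha3 = irrelevant_update(kappa=1.0, alpha=1.0)
decayed = temporal_decay(alpha=1.0, lam=0.1, dt=10.0)
print(mu2, kappa2, alpha2, kappa3, alpha3, decayed)
```

Note that the κ decrement is not clamped here, so repeated negative feedback could push κ below zero; a real implementation would presumably enforce a floor.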

Architecture

┌──────────────────────────────────────────────────┐
│                   REST API (FastAPI)             │
│            Collections · CRUD · Search           │
├──────────────────────────────────────────────────┤
│           SemanticMemoryDB / VectorDB            │
│      Hybrid Search · Fusion · Temporal Decay     │
├──────────┬──────────┬───────────┬────────────────┤
│  Splats  │  HRM2    │  EBM      │  SOC           │
│  (μ,κ,α) │  Engine  │  Energy   │  Consolidate   │
├──────────┴──────────┴───────────┴────────────────┤
│              Backend Layer (pluggable)           │
├─────────┬──────────┬──────────┬──────────────────┤
│   CPU   │  CUDA    │  Vulkan  │  Transformed     │
├─────────┴──────────┴──────────┴──────────────────┤
│              Storage Layer                       │
├─────────┬─────────────────┬──────────────────────┤
│  WAL    │  Persistence    │  GPUVectorIndex      │
└─────────┴─────────────────┴──────────────────────┘

Semantic Memory System

from m2m import SemanticMemoryDB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")
encoder = lambda text: model.encode(text, show_progress_bar=False)

mem = SemanticMemoryDB(
    encoder=encoder,
    latent_dim=384,
    fusion_method="rrf",           # Reciprocal Rank Fusion
    temporal_decay=True,            # Recent memories rank higher
    temporal_half_life_days=30.0,
    auto_categorize=True,
)

mem.store("User prefers dark mode for coding", metadata={"category": "preference"})
mem.store("We decided to use Qdrant for production", metadata={"category": "decision"})

results = mem.search("what did we decide about databases?", k=5)
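
The `temporal_half_life_days` setting suggests an exponential recency weight that halves every 30 days. How `SemanticMemoryDB` applies it internally is not shown here, so the sketch below only illustrates the half-life arithmetic. The helper `recency_weight` and the multiplication into a base score are assumptions.

```python
import math

def recency_weight(age_days, half_life_days=30.0):
    """Weight that halves every `half_life_days` (hypothetical helper, not the m2m API)."""
    return 0.5 ** (age_days / half_life_days)

# Same curve written as exp(-lam * age), with lam = ln(2) / half_life
lam = math.log(2) / 30.0

for age in (0, 30, 60, 90):
    print(age, recency_weight(age))  # weights: 1.0, 0.5, 0.25, 0.125

# Assumed combination: scale a base relevance score by the recency weight
base_score = 0.8
print(base_score * recency_weight(45))
```

With a 30-day half-life, a memory stored three months ago keeps one eighth of its original weight, so recent memories rank higher when relevance is otherwise equal.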

Hybrid Search

| Method | Tuning Required | Best For |
|---|---|---|
| RRF | No | General-purpose (recommended) |
| Weighted | Yes | Domain-specific with known priorities |
| vector_only | No | Pure semantic search |
| bm25_only | No | Pure keyword search |
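
RRF needs no tuning because it combines rank positions rather than raw scores, so the BM25 and vector scales never have to be normalized against each other. Below is a minimal sketch of the standard formula, score(d) = Σ 1 / (k + rank). The function name and the conventional constant k = 60 are assumptions; the library's own constant is not documented here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists; each list adds 1 / (k + rank) per document, rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

bm25_ranking = ["doc_a", "doc_b", "doc_c"]    # keyword results, best first
vector_ranking = ["doc_b", "doc_d", "doc_a"]  # semantic results, best first

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here `doc_b` comes out on top because it ranks well in both lists, while documents that appear in only one list fall to the bottom.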

Benchmarks

All data below is measured. No synthetic or estimated numbers.

System: AMD Ryzen 5 3400G, 16 GB RAM, NVIDIA RTX 3090, Python 3.12

CPU Scale Progression (dim=640, k=64)

| Splats (N) | Linear Scan (ms) | M2M HRM2 (ms) | Speedup | QPS |
|---|---|---|---|---|
| 100 | 0.12 | - | - | 8,337 |
| 1,000 | 1.45 | - | - | 691 |
| 10,000 | 10.04 | - | - | 100 |
| 100,000 | 94.79 | 0.99 | 32.4x | 1,013 |

At 100K splats, HRM2 hierarchical routing achieves a reported 32.4x speedup over linear scan.
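
The speedup comes from scanning only a fraction of the splats per query. The sketch below shows the general idea with a generic two-level, cluster-routed search in NumPy. It is not the HRM2 implementation: the randomly sampled centroids, the fixed `n_probe`, and all names are assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, n_clusters, n_probe, k = 2000, 32, 20, 4, 5
data = rng.standard_normal((n, dim)).astype(np.float32)

# Coarse level: sample centroids from the data (a real index would fit them)
centroids = data[rng.choice(n, size=n_clusters, replace=False)]

def sq_dists(points, q):
    """Squared Euclidean distance from q to every row of points."""
    return np.sum((points - q) ** 2, axis=1)

# Assign each vector to its nearest centroid and keep per-cluster member lists
pairwise = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
assign = np.argmin(pairwise, axis=1)
members = [np.flatnonzero(assign == c) for c in range(n_clusters)]

def routed_search(q, k=k, n_probe=n_probe):
    """Scan only the n_probe clusters whose centroids are closest to q."""
    probe = np.argsort(sq_dists(centroids, q))[:n_probe]
    cand = np.concatenate([members[c] for c in probe])
    order = np.argsort(sq_dists(data[cand], q))[:k]
    return cand[order], len(cand)

# Query equal to a stored vector, so the expected top hit is known
query = data[7].copy()
ids, scanned = routed_search(query)
brute = np.argsort(sq_dists(data, query))[:k]
recall = len(set(ids.tolist()) & set(brute.tolist())) / k
print(f"scanned {scanned} of {n} vectors, top hit {ids[0]}, recall@{k} = {recall:.2f}")
```

Raising `n_probe` trades speed for recall. An adaptive scheme would pick that value per query instead of fixing it.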

Gaussian Scoring Overhead

The Gaussian re-ranking step (α·exp(-κ·d²)) adds ~0.1ms on top of the candidate retrieval, and promotes high-α splats in the ranking.
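
To see how re-ranking promotes high-α splats, the sketch below re-scores three candidates whose squared distances come from a retrieval stage. The helper `gaussian_rerank` and the sample values are illustrative assumptions, not library code or measured data.

```python
import numpy as np

def gaussian_rerank(cand_ids, sq_dist, kappa, alpha):
    """Re-order candidates by alpha * exp(-kappa * d^2), highest score first."""
    scores = alpha * np.exp(-kappa * sq_dist)
    order = np.argsort(-scores)
    return cand_ids[order], scores[order]

cand_ids = np.array([10, 11, 12])
sq_dist = np.array([0.50, 0.55, 0.90])  # squared distances, already sorted by the retriever
kappa = np.array([1.0, 1.0, 1.0])
alpha = np.array([1.0, 1.5, 1.0])       # candidate 11 has earned a higher amplitude

ids, scores = gaussian_rerank(cand_ids, sq_dist, kappa, alpha)
print(ids, scores)
```

By distance alone the order is 10, 11, 12. After re-ranking, candidate 11 moves to the top because its larger α outweighs the small distance gap.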


Development

git clone https://github.com/schwabauerbriantomas-gif/m2m-vector-search.git
cd m2m-vector-search
pip install -e ".[all]"

# Run tests
pytest tests/ -v  # 394 tests

# Code quality
black src/ tests/
flake8 src/ tests/

License

GNU Affero General Public License v3.0. See LICENSE for details.