memfuse

About memfuse

Official Core Services for MemFuse - the lightning-fast open-source memory layer that gives LLMs persistent, queryable memory across conversations and sessions.

m

Published by

memfuse

Visit View Profile

README.md

View on GitHub

MemFuse Core Services
The official core services for MemFuse, the open-source memory layer for LLMs.
Explore the Docs »

View Demo · Report Bug · Request Feature

Table of Contents

Why MemFuse?
Key Features
Quick Start
Documentation
Roadmap
Community & Support

Why MemFuse?

Large language model applications are inherently stateless by design. When the context window reaches its limit, previous conversations, user preferences, and critical information simply disappear.

MemFuse bridges this gap by providing a persistent, queryable memory layer between your LLM and storage backend, enabling AI agents to:

Remember user preferences and context across sessions
Recall facts and events from thousands of interactions later
Optimize token usage by avoiding redundant chat history resending
Learn continuously and improve performance over time

This repository contains the official server core services for seamless integration with MemFuse Client SDK. For comprehensive information about the MemFuse Client features, please visit the MemFuse Client Python SDK.

✨ Key Features

Category	What you get
Lightning Fast	Efficient buffering with write aggregation, intelligent prefetching, and query caching for exceptional performance
Unified Cognitive Search	Seamlessly combines vector, graph, and keyword search with intelligent fusion and re-ranking for superior accuracy and insights
Cognitive Memory Architecture	Human-inspired layered memory system: M0 (raw data/episodic), M1 (structured facts/semantic), and M2 (knowledge graph/conceptual)
Local-First	Run the server locally or deploy with Docker — no mandatory cloud dependencies or fees
Pluggable Backends	Built on TimescaleDB with custom pgai implementation, compatible with Qdrant, Neo4j, Redis, and expanding backend support
Multi-Tenant Support	Secure isolation between users, agents, and sessions with robust scoping and access controls
Framework-Friendly	Seamless integration with LangChain, AutoGen, Vercel AI SDK, and direct OpenAI/Anthropic/Gemini/Ollama API calls
Production-Ready Testing	Comprehensive layered testing framework (smoke, integration, e2e, performance) ensuring reliability at scale
Apache 2.0 Licensed	Fully open source — fork, extend, customize, and deploy as you need

🚀 Quick Start

Installation

Note: This repository contains the MemFuse Core Server. If you need to know more about the standalone Python SDK for client applications, please visit the MemFuse Client Python SDK.

Setting Up the MemFuse Server

To set up the MemFuse server locally:

Clone this repository:

git clone https://github.com/memfuse/memfuse.git
cd memfuse

Verify Docker availability and pull the required database image:

# Ensure Docker daemon is running first
docker --version  # Verify Docker is available

# Pull the TimescaleDB image
docker pull timescale/timescaledb-ha:pg17

Install dependencies and run the server using one of the following methods:

Using Poetry (Recommended)

poetry install
poetry run python scripts/memfuse_launcher.py

Using pip

pip install -e .
python scripts/memfuse_launcher.py

Installing the Client SDK

To use MemFuse in your applications, install the Python SDK simply from PyPI

pip install memfuse

Database Requirements

MemFuse uses TimescaleDB as its primary database backend with a custom pgai-like implementation for advanced vector operations and automatic embedding generation.

Prerequisites:

Docker daemon must be running (required for database startup)
Docker and Docker Compose installed
TimescaleDB: Automatically provided via timescale/timescaledb-ha:pg17 Docker image
pgvector: Vector similarity search (included in TimescaleDB image)
Custom pgai: Event-driven embedding system (built-in, no external extension needed)

⚠️ Important: Make sure Docker daemon is running before starting MemFuse. The service automatically starts the TimescaleDB container.

Note: MemFuse implements its own pgai-like functionality and does not require TimescaleDB's official pgai extension.

For detailed installation instructions, configuration options, and troubleshooting tips, see the online Installation Guide.

Basic Usage

Here's a comprehensive example demonstrating how to use the MemFuse Python SDK with OpenAI to interact with the MemFuse server:

from memfuse.llm import OpenAI
from memfuse import MemFuse
import os

memfuse_client = MemFuse(
  # api_key=os.getenv("MEMFUSE_API_KEY")
  # base_url=os.getenv("MEMFUSE_BASE_URL"),
)

memory = memfuse_client.init(
  user="alice",
  # agent="agent_default",
  # session=<randomly-generated-uuid>
)

# Initialize your LLM client with the memory scope
llm_client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),  # Your OpenAI API key
    memory=memory
)

# Make a chat completion request
response = llm_client.chat.completions.create(
    model="gpt-4o", # Or any model supported by your LLM provider
    messages=[{"role": "user", "content": "I'm planning a trip to Mars. What is the gravity there?"}]
)

print(f"Response: {response.choices[0].message.content}")
# Example Output: Response: Mars has a gravity of about 3.721 m/s², which is about 38% of Earth's gravity.

Contextual Follow-up

Now, ask a follow-up question. MemFuse will automatically recall relevant context from the previous conversation:

# Ask a follow-up question. MemFuse automatically recalls relevant context.
followup_response = llm_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What are some challenges of living on that planet?"}]
)

print(f"Follow-up: {followup_response.choices[0].message.content}")
# Example Output: Follow-up: Some challenges of living on Mars include its thin atmosphere, extreme temperatures, high radiation levels, and the lack of liquid water on the surface.

🔥 That's it! Every subsequent call under the same scope automatically stores notable facts to memory and retrieves them when relevant.

Database Management

MemFuse provides comprehensive database management tools:

# Check database status and extensions
poetry run python scripts/database_manager.py status

# Reset data (keep schema) 
poetry run python scripts/database_manager.py reset

# Validate schema and triggers
poetry run python scripts/database_manager.py validate

# Complete schema rebuild (⚠️ destroys data)
poetry run python scripts/database_manager.py recreate

For detailed database management, see the Scripts Documentation.

📚 Documentation

Installation Guide: Comprehensive instructions for installing and configuring MemFuse
Getting Started: Step-by-step guide to integrating MemFuse into your projects
Examples: Sample implementations for chatbots, autonomous agents, customer support, LangChain integration, and more
Testing Guide: Practical guide for running tests with different configurations

🛣 Roadmap

📦 Phase 1 – MVP ("Fast & Transparent Core")

[x] Lightning-fast performance — Efficient buffering with write aggregation, intelligent prefetching, and query caching
[x] Level 0 Memory Layer — Raw chat history storage and retrieval
[x] Multi-tenant support — Secure user, agent, and session isolation
[x] Level 1 Memory Layer — Semantic and episodic memory processing
[x] TimescaleDB Integration — Production-ready database backend with custom pgai implementation
[x] Enhanced Memory Architecture — M0 (raw), M1 (episodic), M2 (semantic), M3 (procedural), MSMG (graph) layers
[x] Complete REST API — Users, Agents, Sessions, Messages, and Knowledge endpoints
[x] Re-ranking plugin — LLM-powered memory relevance scoring
[x] Python SDK — Complete client library for Python applications
[x] Benchmarks — LongMemEval and MSC evaluation frameworks

🧭 Phase 2 – Temporal Mastery & Quality

[ ] JavaScript SDK — Client library for Node.js and browser applications
[ ] Multimodal memory support — Image, audio, and video memory capabilities
[ ] Level 2 KG memory support — Knowledge graph-based conceptual memory
[ ] Time-decay policies — Automatic forgetting of stale information

💡 Have an idea? Open an issue or participate in our discussion board!

🤝 Community & Support

GitHub Discussions: Participate in roadmap votes, RFCs, and Q&A sessions
Issues: Report bugs and request new features
Documentation: Comprehensive guides and API references

If MemFuse enhances your projects, please ⭐ star the repository — it helps the project grow and reach more developers!

License

This MemFuse Server repository is licensed under the Apache 2.0 License. See the LICENSE file for complete details.