NoKV

About NoKV

AI native file system

f

Published by

feichai0017

Visit View Profile

README.md

View on GitHub

Give your agents the interface they were trained on.

NoKV is a metadata control plane for agent workspaces — one filesystem-shaped namespace, built in Rust, over the runs, logs, checkpoints, and artifacts your AI work produces.

Docs · Quick Start · Benchmarks · Discussions

Listed In The AI-Native Storage Ecosystem

Linux Foundation CNCF Landscape
_{Listed in AI Native Infra / Storage and Cloud Native Storage.}

DBDB.io Database of Databases
_{Historical database profile; current NoKV is the Rust filesystem product line.}

What Is NoKV?

To your tools and agents, NoKV looks like a filesystem: paths, folders, files — mountable, listable, readable. Underneath, file bodies live as immutable blocks in S3-compatible object storage such as RustFS, MinIO, Ceph RGW, or AWS S3, and NoKV's built-in path-native metadata engine (Holt) keeps the namespace — what exists, where, in which version — transactional, queryable, and snapshot-able.

FUSE / SDK / CLI
  -> NoKV metadata service     (self-contained; no separate metadata DB to run)
  -> Holt inode/dentry metadata
  -> S3-compatible object store for file bodies

NoKV owns namespace truth, metadata transactions, snapshots, watches, and object-reference GC. The object store owns byte durability and replication. The metadata engine is built in, so local deployments operate a filesystem rather than a filesystem plus a separate Redis, MySQL, or TiKV cluster.

Why NoKV

Agent workflows are artifact-heavy; their workspaces aren't. Every run leaves behind configs, metrics, logs, checkpoints — and that state scatters across folders, JSON files, object-store keys, and database rows. Agents pay a navigation tax in tokens every time they go looking. NoKV gives that state one address, with the metadata guarantees the workload actually needs:

Checkpoints publish atomically. Readers see the complete new checkpoint or the previous one — never a half-written file, even across a crash.
Snapshots are time travel. Pin a frozen view of any subtree and keep reading it while jobs write; GC never deletes what a snapshot still needs.
Changes are events, not polls. Every create, rename, and publish lands as a typed, replayable event with a cursor.
Artifacts carry body references and digests, with cleanup of failed staged uploads.
Bodies are immutable, versioned blocks. Replacement publishes a new generation, so node-local caches never invalidate object bytes after publish.

The primary write model is write-once publish, matching how datasets, checkpoints, and artifacts are commonly written.

🤖 The Agent Interface

ls · stat · catalog · find · aggregate · read · grep

Seven verbs, one progressive-disclosure surface: an agent discovers what exists, learns what is queryable, and pays to read only what it needs. Predicates, sort, and projection are pushed into the engine, so a "top-5 runs by val_loss" report costs two calls — one catalog, one find. grep sweeps a subtree and returns line-numbered matches with citable evidence URIs (nokv-native://path@generation:N#L3).

The verbs are defined in nokv-client: the tool definitions are LLM-ready JSON schemas, and execute_agent_tool routes calls over the same AgentNamespace trait whether the namespace is remote (metadata RPC) or embedded.

Today the agent verbs ship in the Rust SDK; filesystem operations ship in the nokv CLI and FUSE mount. An MCP server is in development — follow #354.

📊 Measured Evidence

Agent interface. We gave the same agent (gpt-5.4-mini) the same 875-run experiment corpus through two surfaces — raw SQL over SQLite, and the NoKV namespace — across five tasks, 10 repeats per arm and task (100 fully stateless runs), judged against deterministic gold facts neither arm can see:

Set mean (per 5-task pass)	Raw SQLite	NoKV namespace
Tasks solved correctly	4.40 / 5	4.50 / 5
Prompt tokens (incl. cached)	151,572	82,827 (−45%)
Cost (USD, list rates)	$0.0708	$0.0433 (−39%)

In this 10-repeat sample, the token gap widens to ~2.4× on the compound-exploration subset, and SQL won the single-shot analytics task — per-task results, wins and losses both, are in the report. Harness, tasks, judge, and the raw telemetry of all 100 runs are committed, so every published number is recomputable: see bench/agent-interface/.

Storage engine. Local engineering baselines, not official MLPerf results. Single-node service numbers are release builds through the NoKV server and Holt metadata path. FUSE comparison numbers depend on kernel/FUSE, object backend, cache settings, and workload shape.

Workload	Result
Metadata create (`mdtest`, 65k records)	~127K ops/s (single-writer, batched service path)
Same, one directory of 65k entries	Same order of throughput; path-native ART does not degrade on large directories
Checkpoint publish (1 MiB blocks, concurrency 16)	~1.1 GiB/s in the service/object benchmark
Dataset read (16 KiB samples, concurrency 16)	~3,000 samples/s in the service/object benchmark
Resident metadata	~1.5 KiB / file in the measured shape
Atomic checkpoint	Object bytes land first; metadata publishes a new generation atomically

Same-machine FUSE-vs-FUSE smoke against one RustFS endpoint currently shows NoKV behind JuiceFS on the end-to-end mounted path. That gap is expected to come from FUSE/RPC fixed costs and data-plane cache/writeback maturity, not from the Holt metadata engine alone.

NoKV vs JuiceFS

NoKV follows the same high-level separation used by systems like JuiceFS and 3FS: metadata is separate from file body storage. The difference is that NoKV ships its metadata engine as part of the filesystem and optimizes for agent-workspace and artifact publish/read patterns first.

	JuiceFS	NoKV
Metadata engine	External DB such as Redis, MySQL, or TiKV	Built-in, path-native Holt engine
Atomic checkpoint publish	POSIX rename/write semantics over the metadata engine	First-class publish-by-generation primitive
Block model	Slice/block model supporting broad POSIX behavior	Immutable object blocks plus new-generation manifests
Workspace-native primitives	Layered on top of the filesystem	Snapshots, typed watch, body descriptors, and GC floors are core metadata concepts
Agent query surface	None	`ls`/`stat`/`catalog`/`find`/`aggregate`/`read`/`grep` with push-down and line-numbered evidence
POSIX completeness	Mature production filesystem	P0 subset implemented; still hardening compatibility gates
Maturity	Production, large deployments	Young Rust implementation; single-node local mode is usable, replication is roadmap

NoKV is currently a usable single-node object-backed filesystem with a built-in Holt metadata engine behind a long-running metadata server. It is not yet a JuiceFS/3FS class distributed filesystem.

🏗️ Architecture

crates/
  nokv-types     storage-neutral namespace model types
  nokv-protocol  framed metadata RPC DTOs and binary codec
  nokv-meta      schema, MetadataCommand, Holt store, service core
  nokv-object    S3-compatible object body storage helpers
  nokv-client    Rust SDK over metadata service and object backend
  nokv-fuse      low-level FUSE frontend
  nokv-server    long-running metad process and framed RPC service
  nokv           CLI binary

bench/             system workload benchmark harness
docs/              product, architecture, layout, RustFS, and benchmark docs

For artifact and checkpoint publish, object bytes are uploaded first, then the metadata commit publishes the dentry, inode projection, and body manifest atomically. A crash between the two leaves orphan objects for GC, never a corrupt namespace. See Architecture.

🚦 Quick Start

Build and test:

cargo test --workspace
cargo build --release -p nokv --bin nokv

Start a local RustFS-compatible S3 endpoint and create the default bucket:

mkdir -p /tmp/rustfs-data
RUSTFS_ACCESS_KEY=rustfsadmin \
RUSTFS_SECRET_KEY=rustfsadmin \
rustfs server --address 127.0.0.1:9000 /tmp/rustfs-data &

AWS_ACCESS_KEY_ID=rustfsadmin \
AWS_SECRET_ACCESS_KEY=rustfsadmin \
aws --endpoint-url http://127.0.0.1:9000 \
  s3api create-bucket --bucket nokv

By default NoKV expects bucket nokv at http://127.0.0.1:9000 with development credentials rustfsadmin / rustfsadmin. See docs/rustfs.md for other deployment modes.

Start the metadata server, then initialize the namespace. Every other command talks to the server on 127.0.0.1:7777, so keep it running:

cargo run --release -p nokv --bin nokv -- serve &

cargo run --release -p nokv --bin nokv -- init

Publish and read an artifact:

cargo run --release -p nokv --bin nokv -- \
  put-artifact /runs/1/checkpoint.bin ./checkpoint.bin

cargo run --release -p nokv --bin nokv -- \
  cat /runs/1/checkpoint.bin > restored.bin

Mount with FUSE:

mkdir -p /tmp/nokv-mount

cargo run --release -p nokv --bin nokv -- \
  mount /tmp/nokv-mount

On macOS this requires macFUSE. NoKV passes the noappledouble mount option to avoid Finder/resource-fork AppleDouble sidecars; user xattr roundtrip is covered by the FUSE smoke test.

🧩 Crates

Crate	Role
`nokv-types`	Storage-neutral namespace model
`nokv-protocol`	Framed metadata RPC DTOs and binary codec
`nokv-object`	S3-compatible object body storage
`nokv-meta`	Schema, `MetadataCommand`, Holt store, service core
`nokv-client`	Rust SDK over the metadata service
`nokv-fuse`	Low-level FUSE frontend
`nokv-server`	Long-running metad process and framed RPC
`nokv`	`nokv` CLI binary

✅ Current Status

Implemented today:

low-level FUSE frontend for lookup, getattr, readdir, readdirplus, create, mkdir, symlink/readlink, rename, unlink, rmdir, read, write, flush, release, fsync, setattr/truncate, hardlink, xattr, advisory locks, special files, statfs, lseek, fallocate, and copy_file_range;
Holt-backed local metadata service with inode/dentry canonical metadata, dentry projection, command predicates, command dedupe, and history records;
chunked object data path where file bodies are split into immutable object blocks and published by metadata manifest;
S3-compatible object backend, with RustFS as the local development default;
Rust SDK and nokv CLI for namespace operations, artifact publish, metadata server access, and object range reads;
the seven-verb agent query surface (ls/stat/catalog/find/ aggregate/read/grep) in the Rust SDK, with LLM-ready tool definitions;
long-running nokv-server with health, readiness, stats, manual GC, and framed binary metadata RPC;
read-only snapshot mounts, snapshot-version reads, typed watch replay, and FUSE cache invalidation from watch events;
pending-object GC and metadata history GC tied to snapshot retention.

Not implemented yet:

distributed metadata replication and HA — NoKV is single-node today;
an MCP server for the agent verbs — in development, tracked in #354;
Python/fsspec and Kubernetes CSI packages;
full POSIX hardening such as ACL enforcement, broad external compatibility gate coverage, and mature multi-client cache coherence.

Benchmarks

The root bench/ package contains all benchmark entry points. System workload runs use nokv-bench:

cargo run --release -p nokv-bench --bin nokv-bench -- \
  --profile smoke \
  --workload all

Key workloads:

mdtest-easy and mdtest-hard metadata smoke workloads;
metadata-negative-lookup, artifact-index-lookup, and metadata-concurrent-read Holt metadata read-path workloads;
checkpoint-publish object-backed checkpoint publish/read;
training-read dataset-shaped object reads;
mlperf-dlio generated MLPerf Storage/DLIO-style I/O shape.

All workloads are single-node service runs; see docs/benchmarks.md for the full workload list, profiles, and gates.

The agent-interface benchmark — harness, tasks, judge, report, and the raw telemetry behind the numbers above — lives under bench/agent-interface/ and runs through the same package:

cargo run --release -p nokv-bench --bin yanex-agent-bench -- list-tasks

📚 Documentation

📄 License

Apache-2.0. See LICENSE.