<div align="center">
    <img alt="Catai Logo" src="docs/demo/logo.webp" width="360px"/>
    <h1>Catai</h1>
</div>

<div align="center">

[Build](https://github.com/withcatai/catai/actions/workflows/build.yml)
[npm package](https://www.npmjs.com/package/catai)

</div>
<br />

> 🚀 **Exciting Updates Coming Soon!**<br><br>**New UI**, **Function Calling**, and **more amazing features** are on the way! Stay tuned for updates.

Run GGUF models on your computer with a chat UI.

> Your own AI assistant runs locally on your computer.

Inspired by [Node-Llama-Cpp](https://github.com/withcatai/node-llama-cpp), [Llama.cpp](https://github.com/ggerganov/llama.cpp)

## Installation & Use

Make sure you have [Node.js](https://nodejs.org/en/) (**download current**) installed.

```bash
npm install -g catai

catai install qwen3-4b-q4_k_m
catai up
```

## Features

- Auto-detect programming language 🧑‍💻
- Click on the user icon to show the original message 💬
- Real-time text streaming ⏱️
- Fast model downloads 🚀

## CLI

```
Usage: catai [options] [command]

Options:
  -V, --version                    output the version number
  -h, --help                       display help for command

Commands:
  install|i [options] [models...]  Install any GGUF model
  models|ls [options]              List all available models
  use [model]                      Set model to use
  serve|up [options]               Open the chat website
  update                           Update server to the latest version
  active                           Show active model
  remove|rm [options] [models...]  Remove a model
  uninstall                        Uninstall server and delete all models
  node-llama-cpp|cpp [options]     Node llama.cpp CLI - recompile node-llama-cpp binaries
  help [command]                   display help for command
```

### Install command

```
Usage: cli install|i [options] [models...]

Install any GGUF model

Arguments:
  models                Model name/url/path

Options:
  -t --tag [tag]        The name of the model in local directory
  -l --latest           Install the latest version of a model (may be unstable)
  -b --bind [bind]      The model binding method
  -bk --bind-key [key]  key/cookie that the binding requires
  -h, --help            display help for command
```

### Cross-platform

You can use it on Windows, Linux and Mac.

This package uses [node-llama-cpp](https://github.com/withcatai/node-llama-cpp), which supports the following platforms:

- darwin-x64
- darwin-arm64
- linux-x64
- linux-arm64
- linux-armv7l
- linux-ppc64le
- win32-x64-msvc

### Good to know

- All downloaded data is stored in the `~/catai` folder by default.
- Downloads are multi-threaded, so they may use a lot of bandwidth, but they finish faster!

## Web API

There is also a simple API that you can use to ask the model questions.

```js
const response = await fetch('http://127.0.0.1:3000/api/chat/prompt', {
    method: 'POST',
    body: JSON.stringify({
        prompt: 'Write me 100 words story'
    }),
    headers: {
        'Content-Type': 'application/json'
    }
});

const data = await response.text();
```

For more information, please read the [API guide](https://github.com/withcatai/catai/blob/main/docs/api.md)

## Development API

You can also use the development API to interact with the model.

```ts
import {createChat, downloadModel, initCatAILlama, LlamaJsonSchemaGrammar} from "catai";

// skip downloading the model if you already have it
await downloadModel("qwen3-4b-q4_k_m");

const llama = await initCatAILlama();
const chat = await createChat({
    model: "qwen3-4b-q4_k_m"
});

const fullResponse = await chat.prompt("Give me array of random numbers (10 numbers)", {
    grammar: new LlamaJsonSchemaGrammar(llama, {
        type: "array",
        items: {
            type: "number",
            minimum: 0,
            maximum: 100
        },
    }),
    topP: 0.8,
    temperature: 0.8,
});

console.log(fullResponse); // [10, 2, 3, 4, 6, 9, 8, 1, 7, 5]
```

(For the full list of models, run `catai models`.)

### Node-llama-cpp@beta low level integration

You can use the model with [node-llama-cpp@beta](https://github.com/withcatai/node-llama-cpp/pull/105)

Catai enables you to easily manage the models and chat with them.

```ts
import {downloadModel, getModelPath, initCatAILlama, LlamaChatSession} from 'catai';

// download the model, skip if you already have the model
await downloadModel(
    "https://huggingface.co/giladgd/Qwen3-Reranker-4B-GGUF/resolve/main/Qwen3-Reranker-4B.Q3_K_M.gguf?download=true",
    "qwen3-reranker-4b"
);

// get the model path with catai
const modelPath = getModelPath("qwen3-reranker-4b");

const llama = await initCatAILlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);
```

## Configuration

You can edit the configuration via the web UI.

More information [here](https://github.com/withcatai/catai/blob/main/docs/configuration.md)

## Contributing

Contributions are welcome!

Please read our [contributing guide](./CONTRIBUTING.md) to get started.

## License

This project uses [Llama.cpp](https://github.com/ggerganov/llama.cpp) to run models on your computer.
So any license applied to Llama.cpp is also applied to this project.

<br />

<div align="center" width="360">
    <img alt="Star please" src="docs/demo/star.please.png" style="border-radius: 12px" width="360px" margin="auto" />
    <br/>
    <p align="right">
        <i>If you like this repo, star it ✨</i>
    </p>
</div>
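
As a follow-up to the Web API section, the sketch below wraps the documented `POST /api/chat/prompt` request in a small TypeScript helper that also checks the HTTP status. `askCatai` and `FetchLike` are hypothetical names introduced here for illustration and are not part of catai. The URL, request body and plain-text response follow the Web API example; treating a non-2xx status as an error is an assumption about what a caller would want, not documented server behavior.

```ts
// Hypothetical helper (not part of catai): wraps the documented
// POST /api/chat/prompt request and adds an HTTP status check.

// Minimal shape of the fetch function this helper needs.
// The global `fetch` in Node.js 18+ and browsers satisfies it.
type FetchLike = (
    url: string,
    init: {method: string; body: string; headers: Record<string, string>}
) => Promise<{ok: boolean; status: number; text(): Promise<string>}>;

async function askCatai(
    fetchImpl: FetchLike,
    prompt: string,
    baseUrl: string = "http://127.0.0.1:3000" // default address from the Web API example
): Promise<string> {
    const response = await fetchImpl(`${baseUrl}/api/chat/prompt`, {
        method: "POST",
        body: JSON.stringify({prompt}),
        headers: {"Content-Type": "application/json"}
    });

    // Assumption: any non-2xx status should be surfaced to the caller as an error.
    if (!response.ok) {
        throw new Error(`CatAI request failed with HTTP status ${response.status}`);
    }

    // The Web API example reads the answer as plain text.
    return await response.text();
}
```

With the server started by `catai up`, a call could look like `const story = await askCatai(fetch, "Write me 100 words story");`. Passing the fetch function in as an argument keeps the helper easy to stub in tests.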