About esp-ai

ESP-AI is a low-cost, simple solution for integrating AI voice dialogue capabilities into hardware development boards, particularly the ESP32 series including ESP32-S3 and ESP32-C3. It provides a complete end-to-end pipeline combining Automatic Speech Recognition, Large Language Models, and Text-to-Speech to enable natural language interaction for robots and smart devices. The system is designed to be injected as a dependency without disrupting existing projects, supporting both Node.js server-side and Arduino or ESP-IDF hardware code frameworks. Key features include customizable offline wake words, full conversation chains with streaming data interaction, and support for conversation interruption. It allows dynamic command recognition for tasks like appliance control and offers a plugin architecture to integrate with any LLM, TTS, or IAT service. The platform supports one-to-many client management with independent configurations per device, authentication, and high concurrency handling via load balancing.

Published by

wangzongming

ESP-AI

The simplest, lowest-cost way to connect any hardware to AI

Changelog · Chinese Docs · English Docs

Language

👉Simplified Chinese

👉Japanese

Intro

ESP-AI provides a complete AI dialogue solution for your development board, including but not limited to an IAT (ASR) + LLM + TTS integration for the ESP32 series. It is added to a project as a dependency and does not affect existing code.

To build dialogue functionality for a robot, you only need to provide the IAT (ASR), LLM, and TTS services; ESP-AI handles the rest.

The server-side code of this project is based on Node.js, and the hardware code is based on Arduino/IDF.
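To picture how the three services hand off to one another, here is a minimal sketch of an IAT (ASR) → LLM → TTS chain with an interruption hook. It is not ESP-AI's actual API: `recognize`, `generateReply`, `synthesize`, and `converse` are hypothetical stand-ins, and synchronous generators are used only to keep the example short (real services would be asynchronous).

```javascript
"use strict";

// ASR stand-in (hypothetical): pretend each audio chunk decodes to one word.
function recognize(audioChunks) {
  return audioChunks.map((chunk) => chunk.toString("utf8")).join(" ");
}

// LLM stand-in (hypothetical): yields the reply token by token.
function* generateReply(prompt) {
  yield* ["You", "said:", ...prompt.split(" ")];
}

// TTS stand-in (hypothetical): converts each token to an "audio" buffer
// as soon as it arrives, without waiting for the full reply.
function* synthesize(tokens) {
  for (const token of tokens) {
    yield Buffer.from(`<audio:${token}>`);
  }
}

// Full chain: ASR -> LLM -> TTS. `signal` is an AbortSignal; once it is
// aborted, no further audio is delivered.
function converse(audioChunks, onAudio, signal) {
  const text = recognize(audioChunks);
  for (const audio of synthesize(generateReply(text))) {
    if (signal && signal.aborted) return { text, interrupted: true };
    onAudio(audio);
  }
  return { text, interrupted: false };
}

// Usage: deliver two audio chunks, then simulate the user interrupting.
const controller = new AbortController();
const played = [];
const result = converse(
  [Buffer.from("hello"), Buffer.from("robot")],
  (audio) => {
    played.push(audio.toString("utf8"));
    if (played.length === 2) controller.abort();
  },
  controller.signal
);
console.log(result, played);
```

Because each stage pulls from the previous one lazily, the first audio buffer is produced before the reply has been fully generated, and aborting the signal stops the chain before the remaining tokens are synthesized.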

Maintaining open source is not easy. Click the Star button in the upper right corner to show your support~

🖥 Website

🖥 Open Platform

The Open Platform is built on ESP-AI and provides end services and management services to businesses and individuals. It offers free ASR (Automatic Speech Recognition), TTS (Text-to-Speech), and LLM (Large Language Model) services, and it can clone a custom voice from just a 15-second audio clip. Visit the Open Platform.

✨ Features

✔️ Customizable offline wake words with multiple built-in wake-up methods (voice, button, serial port, Tianwen ASRPro)
✔️ Complete conversation chain: IAT (ASR) ➡️ LLM/RAG ➡️ TTS
✔️ Fast response algorithms for TTS/LLM, designed to balance service cost while providing the quickest response time
✔️ Supports conversation interruption
✔️ Recognizes user commands (appliance control, singing, etc.) and can dynamically respond based on context
✔️ Configurable
✔️ Plugin-based, allowing integration with any LLM/TTS/IAT using plugins
✔️ One-to-many relationship between service and clients, with independent configuration for each client (hardware)
✔️ Connection supports authentication
✔️ Full-chain streaming data interaction
✔️ Developer platform offers: free services, visual configuration, etc.
✔️ Client configuration webpage provided
✔️ Easily handles high concurrency scenarios (requires Nginx for load balancing)
✔️ Ready to use out of the box
✔️ Supports esp32s3/esp32c3
✔️ Supports OPEN API
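
Three of the features above (plugins, per-client configuration, and connection authentication) can be sketched together as follows. This is an illustration under assumptions, not ESP-AI's real interface: `registerPlugin`, `configFor`, `authenticate`, `handle`, the device ids, and the keys are all hypothetical names invented for this example.

```javascript
"use strict";

// Hypothetical plugin registry: one map of named handlers per service kind.
const plugins = { LLM: new Map(), TTS: new Map(), IAT: new Map() };

function registerPlugin(kind, name, handler) {
  if (!plugins[kind]) throw new Error(`Unknown plugin kind: ${kind}`);
  plugins[kind].set(name, handler);
}

// Hypothetical per-device settings layered over shared defaults, so one
// server can serve many clients with independent configuration.
const defaults = { llm: "echo", voice: "default" };
const perDevice = { "dev-001": { voice: "child" } };

function configFor(deviceId) {
  return { ...defaults, ...(perDevice[deviceId] || {}) };
}

// Hypothetical authentication check performed when a device connects.
const apiKeys = new Map([
  ["dev-001", "secret-1"],
  ["dev-002", "secret-2"],
]);

function authenticate(deviceId, apiKey) {
  return apiKeys.has(deviceId) && apiKeys.get(deviceId) === apiKey;
}

// Handle one recognized utterance from a device.
function handle(deviceId, apiKey, text) {
  if (!authenticate(deviceId, apiKey)) {
    return { ok: false, error: "unauthorized" };
  }
  const config = configFor(deviceId);
  const llm = plugins.LLM.get(config.llm);
  if (!llm) return { ok: false, error: `no LLM plugin named ${config.llm}` };
  return { ok: true, voice: config.voice, reply: llm(text) };
}

// Usage: register a trivial LLM plugin, then serve two devices.
registerPlugin("LLM", "echo", (text) => `echo: ${text}`);
console.log(handle("dev-001", "secret-1", "hi"));
console.log(handle("dev-002", "wrong", "hi"));
```

Keeping the registry keyed by service kind means a different LLM, TTS, or IAT provider can be swapped in by registering another handler and changing one configuration value, without touching the request-handling code.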

🧐 Next Steps

[ ] 🤔 Improve the accuracy of the built-in offline wake-up (Tianwen ASRPro is currently recommended instead)
[ ] 🤔 Online wake word generation

📦 Install

Server

docker run -itd -p 8088:8088 -v /esp-ai-server/index.js:/server/index.js --name esp-ai-server registry.cn-shanghai.aliyuncs.com/xiaomingio/esp-ai:1.0.0

Client

Download the dependency from the release page and flash it to the development board. For details, see Client Install.

🔨 Inject Soul into Your Robot with Just a Few Lines of Code

Below is the Node.js and Arduino code you need to write if you only require dialogue functionality.

🏪 Discussion Group

QQ discussion group 1: 854445223

QQ discussion group 2: 952051286

🎥 Case Study Video

【Life isn't easy, so Doro performs for a living!】 https://www.bilibili.com/video/BV1uvbKzREYP/?share_source=copy_web&vd_source=041c9610a29750f498de1bafe953086b

【Make your own AI animated desktop pet in one click (free online tool)】 https://www.bilibili.com/video/BV1xut4zuEf8/?share_source=copy_web&vd_source=041c9610a29750f498de1bafe953086b

【ESP-AI doll solution board】 https://www.bilibili.com/video/BV1YTbDzQEk8/?share_source=copy_web&vd_source=041c9610a29750f498de1bafe953086b

【Dialogue under heavy noise and a TFT screen (ESP-AI new version preview)】 https://www.bilibili.com/video/BV1KD7KzsEoc/?share_source=copy_web&vd_source=041c9610a29750f498de1bafe953086b

🤝 Contributing

Let's build a better esp-ai together.

We warmly invite contributions from everyone. Feel free to share your ideas through Pull Requests or GitHub Issues.

🌍 Star geographical distribution

Citation

If this project has helped your research, please cite us:

@software{ESP-AI,
    title        = {{ESP-AI}},
    author       = {小明IO},
    year         = 2024,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/wangzongming/esp-ai}}
}
