godot-whisper

Open source, MIT, C. 119 stars, 12 forks, 2 issues, 4 watchers. Last commit: 4 weeks ago.

About godot-whisper

A GDExtension addon for the Godot Engine that enables realtime audio transcription. It supports OpenCL on most platforms and Metal on Apple devices, and runs on a separate thread.

Platforms

Web Self-hosted

Languages

C

Godot Whisper

Features

Realtime audio transcription
Offline audio transcription
GPU acceleration
Flash Attention
Voice Activity Detection (VAD)
Quantized models
99 languages
Model downloader

Platforms

Platform    GPU Backend
macOS       Metal + Accelerate
iOS         Metal + Accelerate
Windows     OpenCL + Vulkan
Linux       OpenCL + Vulkan
Android     OpenCL
Web         CPU (WebGPU disabled until Godot supports it)

How to install

GitHub Release

Download a GitHub release and copy its addons folder into the samples folder.

Godot Assets

Download the addon directly from the Godot Asset Library.

Afterwards:

Activate the extension in Project -> Project Settings -> Godot Whisper. Restart the Godot editor.

Models

To download models manually, use Hugging Face.

Model             Size
tiny              78 MB
base              148 MB
small             244M
medium            769M
large-v1          1550M
large-v2          1550M
large-v3          1550M
large-v3-turbo    809M

Global settings

Go to Project -> Project Settings -> General -> Audio -> Input (enable Advanced Settings).

The audio input settings are listed there.

Microphone transcription feeds Whisper at 16000 Hz. The addon resamples captured audio from the actual runtime mix rate reported by AudioServer.get_mix_rate().

Optional: set Project Settings -> Audio -> Driver -> Mix Rate (audio/driver/mix_rate) to 16000 to avoid resampling overhead. This may reduce overall game audio quality, so only use it if speech transcription is the main audio workload. Godot may still use a different runtime mix rate on some platforms or devices; verify with AudioServer.get_mix_rate(). If the runtime mix rate is not 16000, the addon will resample.
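The resampling step described above can be illustrated with a minimal C sketch. It is not the addon's actual resampler, which this README does not show. It assumes mono float samples and uses plain linear interpolation; the names `resample_linear`, `resampled_length` and `WHISPER_SAMPLE_RATE` are hypothetical. A production resampler would normally low-pass filter before downsampling to limit aliasing.

```c
#include <assert.h>
#include <stddef.h>

/* Target rate taken from the text above: Whisper is fed 16000 Hz audio. */
#define WHISPER_SAMPLE_RATE 16000

/* Number of output samples when converting in_len samples from
 * src_rate to dst_rate (rounded down). Hypothetical helper. */
static size_t resampled_length(size_t in_len, int src_rate, int dst_rate) {
    return (size_t)(((unsigned long long)in_len * (unsigned long long)dst_rate)
                    / (unsigned long long)src_rate);
}

/* Linear-interpolation resampler for mono float samples (illustrative only).
 * Writes at most out_cap samples to out and returns the number written. */
static size_t resample_linear(const float *in, size_t in_len, int src_rate,
                              float *out, size_t out_cap, int dst_rate) {
    if (in_len == 0 || src_rate <= 0 || dst_rate <= 0) {
        return 0;
    }
    assert(in != NULL && out != NULL);

    size_t out_len = resampled_length(in_len, src_rate, dst_rate);
    if (out_len > out_cap) {
        out_len = out_cap;
    }

    /* Distance travelled in the input for each output sample. */
    double step = (double)src_rate / (double)dst_rate;

    for (size_t i = 0; i < out_len; i++) {
        double pos = (double)i * step;
        size_t i0 = (size_t)pos;
        if (i0 >= in_len) {
            i0 = in_len - 1; /* guard against floating-point rounding */
        }
        size_t i1 = (i0 + 1 < in_len) ? i0 + 1 : in_len - 1;
        double frac = pos - (double)i0;
        /* Blend the two neighbouring input samples. */
        out[i] = (float)((1.0 - frac) * in[i0] + frac * in[i1]);
    }
    return out_len;
}
```

With a runtime mix rate of 48000 Hz, for example, a call such as `resample_linear(in, 48, 48000, out, 16, WHISPER_SAMPLE_RATE)` produces 16 output samples from 48 input samples. When the mix rate is already 16000 Hz, the step is 1 and the output is a copy of the input, which is the case where resampling can be skipped.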
