chidiwilliams

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Open Source

chidiwilliams/buzz

[[简体中文](readme/README.zh_CN.md)] <- Click to view the Chinese version.

# Buzz

[Documentation](https://chidiwilliams.github.io/buzz/)

Transcribe and translate audio offline on your personal computer. Powered by OpenAI's [Whisper](https://github.com/openai/whisper).

![MIT License](https://img.shields.io/badge/license-MIT-green)
[![CI](https://github.com/chidiwilliams/buzz/actions/workflows/ci.yml/badge.svg)](https://github.com/chidiwilliams/buzz/actions/workflows/ci.yml)
[![codecov](https://codecov.io/github/chidiwilliams/buzz/branch/main/graph/badge.svg?token=YJSB8S2VEP)](https://codecov.io/github/chidiwilliams/buzz)
![GitHub release (latest by date)](https://img.shields.io/github/v/release/chidiwilliams/buzz)
[![Github all releases](https://img.shields.io/github/downloads/chidiwilliams/buzz/total.svg)](https://GitHub.com/chidiwilliams/buzz/releases/)

![Buzz](https://raw.githubusercontent.com/chidiwilliams/buzz/refs/heads/main/buzz/assets/buzz-banner.jpg)

## Features

- Transcribe audio and video files or YouTube links
- Live real-time audio transcription from a microphone
- Presentation window for easy accessibility during events and presentations
- Speech separation before transcription for better accuracy on noisy audio
- Speaker identification in transcribed media
- Support for multiple Whisper backends
  - CUDA acceleration support for Nvidia GPUs
  - Apple Silicon support for Macs
  - Vulkan acceleration support for Whisper.cpp on most GPUs, including integrated GPUs
- Export transcripts to TXT, SRT, and VTT
- Advanced Transcription Viewer with search, playback controls, and speed adjustment
- Keyboard shortcuts for efficient navigation
- Watch folder for automatic transcription of new files
- Command-Line Interface for scripting and automation

## Installation

### macOS

Download the `.dmg` from [SourceForge](https://sourceforge.net/projects/buzz-captions/files/).

### Windows

Get the installation files from [SourceForge](https://sourceforge.net/projects/buzz-captions/files/).
The app is not signed, so you will get a warning when you install it. Select `More info` -> `Run anyway`.

### Linux

Buzz is available as a [Flatpak](https://flathub.org/apps/io.github.chidiwilliams.Buzz) or a [Snap](https://snapcraft.io/buzz).

To install the Flatpak, run:

```shell
flatpak install flathub io.github.chidiwilliams.Buzz
```

[![Download on Flathub](https://flathub.org/api/badge?svg&locale=en)](https://flathub.org/en/apps/io.github.chidiwilliams.Buzz)

To install the Snap, run:

```shell
sudo apt-get install libportaudio2 libcanberra-gtk-module libcanberra-gtk3-module
sudo snap install buzz
```

[![Get it from the Snap Store](https://snapcraft.io/static/images/badges/en/snap-store-black.svg)](https://snapcraft.io/buzz)

### PyPI

Install [ffmpeg](https://www.ffmpeg.org/download.html).

Ensure you use a Python 3.12 environment.

Install Buzz:

```shell
pip install buzz-captions
python -m buzz
```

**GPU support for PyPI**

For GPU support on Nvidia GPUs on Windows with the PyPI-installed version, ensure CUDA support for [torch](https://pytorch.org/get-started/locally/):

```
pip3 install -U torch==2.8.0+cu129 torchaudio==2.8.0+cu129 --index-url https://download.pytorch.org/whl/cu129
pip3 install nvidia-cublas-cu12==12.9.1.4 nvidia-cuda-cupti-cu12==12.9.79 nvidia-cuda-runtime-cu12==12.9.79 --extra-index-url https://pypi.ngc.nvidia.com
```

### Latest development version

For information on how to get the latest development version with the latest features and bug fixes, see the [FAQ](https://chidiwilliams.github.io/buzz/docs/faq#9-where-can-i-get-latest-development-version).

### Support Buzz

You can help Buzz by starring 🌟 the repo and sharing it with your friends.
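For the PyPI route above, a small preflight check can catch the two stated prerequisites (a Python 3.12 interpreter and `ffmpeg` on the `PATH`) before running `pip install`. The sketch below is not part of Buzz: `check_environment` is a hypothetical helper, and it assumes the README means exactly Python 3.12 rather than "3.12 or newer".

```python
import shutil
import sys


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means the prerequisites look fine.

    Hypothetical helper, not part of Buzz. The parameters exist only so the
    check can be exercised without changing the real interpreter or PATH.
    """
    problems = []
    # Assumption: the README's "Python 3.12 environment" means exactly 3.12.
    if tuple(version_info[:2]) != (3, 12):
        problems.append(
            f"Python 3.12 expected, found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("warning:", issue)
    if not issues:
        print("Environment looks ready for: pip install buzz-captions")
```

Running the script in the target environment prints one warning per unmet prerequisite, or a single confirmation line when both are satisfied.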
### Screenshots

<div style="display: flex; flex-wrap: wrap;">
  <img alt="File import" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-1-import.png" style="max-width: 18%; margin-right: 1%;" />
  <img alt="Main screen" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-2-main_screen.png" style="max-width: 18%; margin-right: 1%; height:auto;" />
  <img alt="Preferences" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-3-preferences.png" style="max-width: 18%; margin-right: 1%; height:auto;" />
  <img alt="Model preferences" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-3.2-model-preferences.png" style="max-width: 18%; margin-right: 1%; height:auto;" />
  <img alt="Transcript" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-4-transcript.png" style="max-width: 18%; margin-right: 1%; height:auto;" />
  <img alt="Live recording" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-5-live_recording.png" style="max-width: 18%; margin-right: 1%; height:auto;" />
  <img alt="Resize" src="https://github.com/chidiwilliams/buzz/raw/main/share/screenshots/buzz-6-resize.png" style="max-width: 18%;" />
</div>

LLM Tools & Chat UIs, Audio Editing & DAW

14.4K GitHub Stars

Open Source

GPT-Automator

# GPT Automator

![App](assets/app.png)

Your voice-controlled Mac assistant. GPT Automator lets you perform tasks on your Mac using your voice. For example, opening applications, looking up restaurants, and synthesizing information.

Made by [Luke Harries](https://harries.co/) and [Chidi Williams](https://chidiwilliams.com/) at the [London EA Hackathon, February 2023](https://forum.effectivealtruism.org/events/gTSwA8RoGidjpLnf6/london-ea-hackathon).

[![GPT Automator demo](https://cdn.loom.com/sessions/thumbnails/7bfa82c604f3412fbbb04191ce2ae12f-00001.gif)](https://www.loom.com/share/7bfa82c604f3412fbbb04191ce2ae12f "GPT Automator demo")

## Requirements

* [FFmpeg](https://ffmpeg.org/)

```shell
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
```

## Instructions

1. Install the dependencies from the `requirements.txt` or `pyproject.toml` files.
2. Create a `.env` file from the `.env.example` file and fill in the OpenAI API key.
3. Run `python gui.py` to run the GUI and click 'Record' to say your prompt. Alternatively, run `python main.py [prompt]` to run the CLI.

## How it works

GPT Automator converts your audio input to text using OpenAI's Whisper. Then, it uses a [LangChain](https://github.com/hwchase17/langchain) Agent to choose a set of actions, including generating AppleScript (for desktop automation) and JavaScript (for browser automation) commands from your prompt using OpenAI's GPT-3 ("text-davinci-003"), and then executes the resulting script.

## Example prompts

* Find the result of a calculation. Prompt: "What is 2 + 2?" -> It will write AppleScript to open up a calculator and type in 5 * 5.
* Find restaurants nearby. Prompt: "Find restaurants near me" -> It will open up Google search, read the text on the page, and say the best restaurants.
* Play a game of chess. Prompt: "Play a game of chess" -> It will open up Chess.com and start clicking around.

## Learn more

Check out our blog posts for more information:

- [Chidi's blog post](https://chidiwilliams.com/post/gpt-automator/)
- [Luke's blog post](https://harries.co/ea-hackathon-gpt-automator-and-langchain/)

## Disclaimer

This project executes code generated from natural language and may be susceptible to [prompt injection](https://en.wikipedia.org/wiki/Prompt_engineering#Prompt_injection) and similar attacks. This work was made as a proof-of-concept and is not intended for production use.
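To make the last step of "How it works" concrete, here is a minimal sketch of how a generated AppleScript could be handed to macOS's `osascript` command. It is an illustration under stated assumptions, not the project's actual code: `build_osascript_command` and `run_applescript` are hypothetical names, the script is hard-coded rather than produced by a language model, and execution defaults to a dry run because, as the disclaimer notes, running generated code carries risk.

```python
import shlex
import subprocess


def build_osascript_command(script: str) -> list[str]:
    """Build the argv for macOS's osascript, one -e flag per non-empty script line.

    Hypothetical helper for illustration; not taken from GPT Automator.
    """
    command = ["osascript"]
    for line in script.strip().splitlines():
        line = line.strip()
        if line:
            command += ["-e", line]
    return command


def run_applescript(script: str, dry_run: bool = True) -> str:
    """Return the shell-quoted command (dry run) or execute it and return stdout."""
    command = build_osascript_command(script)
    if dry_run:
        # Show what would be executed without touching the desktop.
        return shlex.join(command)
    # Real execution only works on macOS, where osascript is available.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in for a script that the language model would generate.
    generated = 'tell application "Calculator" to activate'
    print(run_applescript(generated))
```

Keeping the dry run as the default gives a natural place to log or review a generated script before it is allowed to act on the machine.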

AI Agents, RPA

256 GitHub Stars
