AI Explorer

Speech

Speech recognition, text-to-speech, and audio AI projects.

Data source

Category lists combine GitHub search queries, repository topics, descriptions, and sync snapshots.

Ranking logic

Projects are filtered for category relevance, then ordered by stars and quality signals.
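
The filter-then-order step described above can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the `Project` type, the `rank_category` function, the numeric `quality` field, and the tie-breaking rule (stars first, then quality) are all assumptions. The `quality` values and the `example/image-only` entry are made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    repo: str
    stars: int
    categories: frozenset[str]
    quality: int  # hypothetical quality signal; higher is better


def rank_category(projects, category):
    """Keep projects tagged with `category`, then sort by stars (ties broken by quality)."""
    relevant = [p for p in projects if category in p.categories]
    return sorted(relevant, key=lambda p: (p.stars, p.quality), reverse=True)


projects = [
    Project("SYSTRAN/faster-whisper", 23_905, frozenset({"Speech", "Infra"}), 2),
    Project("openai/whisper", 103_797, frozenset({"Speech"}), 3),
    Project("example/image-only", 50_000, frozenset({"Image"}), 5),  # made-up entry
    Project("ggml-org/whisper.cpp", 51_118, frozenset({"Speech", "Infra"}), 4),
]

ranked = rank_category(projects, "Speech")
for position, project in enumerate(ranked, start=1):
    print(f"#{position} {project.repo} ({project.stars:,} stars)")
```

The image-only entry is dropped by the relevance filter even though it has more stars than two of the speech projects, which is why filtering comes before ordering.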

Best for

Use category pages when you already know the AI workflow or tool type you want to evaluate.

48 projects

#1

whisper · openai/whisper · 43

Robust Speech Recognition via Large-Scale Weak Supervision

Speech

Stars: 103,797
Growth: -
Language: Python
Created: 2022-09-16

#2

unsloth · unslothai/unsloth · 45

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Speech

Stars: 67,544
Growth: -
Language: Python
Created: 2023-11-29

#3

GPT-SoVITS · RVC-Boss/GPT-SoVITS · 45

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Speech

Stars: 59,130
Growth: -
Language: Python
Created: 2024-01-14

#4

whisper.cpp · ggml-org/whisper.cpp · 44

Port of OpenAI's Whisper model in C/C++

Speech, Infra

Stars: 51,118
Growth: -
Language: C++
Created: 2022-09-25

#5

VibeVoice · microsoft/VibeVoice · 72

Open-Source Frontier Voice AI

Speech

Stars: 48,612
Growth: -
Language: Python
Created: 2025-08-25

#6

LocalAI · mudler/LocalAI · 44

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

Image, Speech

Stars: 47,211
Growth: -
Language: Go
Created: 2023-03-18

#7

TTS · coqui-ai/TTS · 38

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Speech

Stars: 45,641
Growth: -
Language: Python
Created: 2020-05-20

#8

airi · moeru-ai/airi · 65

💖🧸 Self hosted, you-owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minecraft, Factorio playing. Web / macOS / Windows supported.

Speech

Stars: 40,427
Growth: -
Language: TypeScript
Created: 2024-12-01

#9

ChatTTS · 2noise/ChatTTS · 36

A generative speech model for daily dialogue.

Speech, Infra

Stars: 39,520
Growth: -
Language: Python
Created: 2024-05-27

#10

MockingBird · babysor/MockingBird · 33

🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

Speech

Stars: 36,908
Growth: -
Language: Python
Created: 2021-08-07

#11

OpenVoice · myshell-ai/OpenVoice · 38

Instant voice cloning by MIT and MyShell. Audio foundation model.

Speech

Stars: 36,804
Growth: -
Language: Python
Created: 2023-11-29

#12

khoj · khoj-ai/khoj · 39

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Image, RAG, Search

Stars: 35,380
Growth: -
Language: Python
Created: 2021-08-16

#13

VoxCPM · OpenBMB/VoxCPM · 42

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

Speech

Stars: 32,024
Growth: -
Language: Python
Created: 2025-09-16

#14

free-claude-code · Alishahryar1/free-claude-code · 82

Use claude-code for free in the terminal, VSCode extension or discord like OpenClaw (voice supported)

Coding, Speech

Stars: 30,078
Growth: -
Language: Python
Created: 2026-01-28

#15

faster-whisper · SYSTRAN/faster-whisper · 35

Faster Whisper transcription with CTranslate2

Speech, Infra

Stars: 23,905
Growth: -
Language: Python
Created: 2023-02-11

#16

Pixelle-Video · AIDC-AI/Pixelle-Video · 42

🚀 AI Fully Automated Short Video Engine

Image, Video, Speech

Stars: 23,760
Growth: -
Language: Python
Created: 2025-11-07

#17

whisperX · m-bain/whisperX · 42

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Speech

Stars: 22,761
Growth: -
Language: Python
Created: 2022-12-09

#18

CosyVoice · FunAudioLLM/CosyVoice · 36

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Speech

Stars: 21,876
Growth: -
Language: Python
Created: 2024-07-03

#19

index-tts · index-tts/index-tts · 35

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Speech

Stars: 21,480
Growth: -
Language: Python
Created: 2025-02-06

#20

buzz · chidiwilliams/buzz · 42

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Speech

Stars: 19,871
Growth: -
Language: Python
Created: 2022-09-24

#21

dia · nari-labs/dia · 36

A TTS model capable of generating ultra-realistic dialogue in one pass.

Speech

Stars: 19,326
Growth: -
Language: Python
Created: 2025-04-19

#22

FunASR · modelscope/FunASR · 43

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Speech, MCP

Stars: 18,673
Growth: -
Language: Python
Created: 2022-11-24

#23

pyvideotrans · jianchang512/pyvideotrans · 40

Translate the video from one language to another and embed dubbing & subtitles.

Speech

Stars: 18,131
Growth: -
Language: Python
Created: 2023-10-02

#24

Speech · NVIDIA-NeMo/Speech · 43

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Speech

Stars: 17,634
Growth: -
Language: Python
Created: 2019-08-05

#25

NeMo · NVIDIA-NeMo/NeMo · 43

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Speech

Stars: 17,560
Growth: -
Language: Python
Created: 2019-08-05

#26

leon · leon-ai/leon · 41

🧠 Leon is your open-source personal assistant.

Agents, Speech, Automation

Stars: 17,344
Growth: -
Language: TypeScript
Created: 2019-02-10

#27

vosk-api · alphacep/vosk-api · 36

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Speech

Stars: 14,887
Growth: -
Language: Jupyter Notebook
Created: 2019-09-03

#28

sherpa-onnx · k2-fsa/sherpa-onnx · 38

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

Speech

Stars: 13,237
Growth: -
Language: C++
Created: 2022-09-01

#29

meetily · Zackriya-Solutions/meetily · 38

Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization, built on Rust. 100% local processing. No cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 self-hosted, open-source AI meeting note taker for macOS & Windows.

Speech

Stars: 12,943
Growth: -
Language: Rust
Created: 2024-12-26

#30

supertonic · supertone-inc/supertonic · 37

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

Speech

Stars: 12,789
Growth: -
Language: Swift
Created: 2025-11-18
