AI Explorer

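Speech

Speech recognition, text-to-speech, and audio AI projects.

Data sources

The category list is generated by combining GitHub search rules, repository topics, project descriptions, and synced snapshots.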

Ranking logic

Projects are first filtered for category relevance, then sorted by stars and quality signals.
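
The filter-then-sort order described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual implementation: the page does not say how relevance is decided or how the quality signal is weighed against stars, so the `Project` fields, the set-membership relevance test, and the use of quality as a tie-breaker are all assumptions. The star counts are taken from the list; the quality values and the `example/image-tool` entry are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    full_name: str
    stars: int
    quality: int                # hypothetical quality signal (placeholder values)
    categories: frozenset       # hypothetical category tags


def rank(projects, category):
    """Keep projects tagged with `category`, then sort by stars.

    Assumption: stars are the primary key and the quality signal only
    breaks ties; the page does not document the real weighting.
    """
    relevant = [p for p in projects if category in p.categories]
    return sorted(relevant, key=lambda p: (p.stars, p.quality), reverse=True)


projects = [
    Project("ggml-org/whisper.cpp", 49_893, 44, frozenset({"speech", "infrastructure"})),
    Project("example/image-tool", 80_000, 40, frozenset({"image"})),  # hypothetical entry
    Project("openai/whisper", 99_785, 45, frozenset({"speech"})),
]

ranked = rank(projects, "speech")
for position, project in enumerate(ranked, start=1):
    print(f"#{position} {project.full_name} {project.stars:,}")
```

The hypothetical image-only project is dropped by the relevance filter even though it has more stars than whisper.cpp, which is the point of filtering before sorting.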

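Who it's for

Best for users who already know which AI workflow or tool type they need and want to compare candidate projects quickly.

39 projects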

#1
whisper · openai/whisper · 45

Robust Speech Recognition via Large-Scale Weak Supervision

Speech
Stars: 99,785
Growth: -
Language: Python
Created: 2022-09-16
#2
unsloth · unslothai/unsloth · 45

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Speech
Stars: 64,731
Growth: -
Language: Python
Created: 2023-11-29
#3
GPT-SoVITS · RVC-Boss/GPT-SoVITS · 43

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Speech
Stars: 57,598
Growth: -
Language: Python
Created: 2024-01-14
#4
whisper.cpp · ggml-org/whisper.cpp · 44

Port of OpenAI's Whisper model in C/C++

Speech, Infrastructure
Stars: 49,893
Growth: -
Language: C++
Created: 2022-09-25
#5
LocalAI · mudler/LocalAI · 45

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

Image, Speech
Stars: 46,362
Growth: -
Language: Go
Created: 2023-03-18
#6
TTS · coqui-ai/TTS · 38

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Speech
Stars: 45,334
Growth: -
Language: Python
Created: 2020-05-20
#7
ChatTTS · 2noise/ChatTTS · 37

A generative speech model for daily dialogue.

Speech, Infrastructure
Stars: 39,289
Growth: -
Language: Python
Created: 2024-05-27
#8
MockingBird · babysor/MockingBird · 33

🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time

Speech
Stars: 36,902
Growth: -
Language: Python
Created: 2021-08-07
#9
OpenVoice · myshell-ai/OpenVoice · 38

Instant voice cloning by MIT and MyShell. Audio foundation model.

Speech
Stars: 36,535
Growth: -
Language: Python
Created: 2023-11-29
#10
khoj · khoj-ai/khoj · 34

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Image, RAG, Search
Stars: 34,618
Growth: -
Language: Python
Created: 2021-08-16
#11
voicebox · jamiepine/voicebox · 40

The open-source AI voice studio. Clone, dictate, create.

Speech
Stars: 26,978
Growth: -
Language: TypeScript
Created: 2026-01-25
#12
free-claude-code · Alishahryar1/free-claude-code · 81

Use claude-code for free in the terminal, VSCode extension or discord like OpenClaw (voice supported)

Coding, Speech
Stars: 26,387
Growth: -
Language: Python
Created: 2026-01-28
#13
faster-whisper · SYSTRAN/faster-whisper · 35

Faster Whisper transcription with CTranslate2

Speech, Infrastructure
Stars: 23,001
Growth: -
Language: Python
Created: 2023-02-11
#14
whisperX · m-bain/whisperX · 37

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Speech
Stars: 21,979
Growth: -
Language: Python
Created: 2022-12-09
#15
CosyVoice · FunAudioLLM/CosyVoice · 38

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Speech
Stars: 21,126
Growth: -
Language: Python
Created: 2024-07-03
#16
index-tts · index-tts/index-tts · 30

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Speech
Stars: 20,623
Growth: -
Language: Python
Created: 2025-02-06
#17
buzz · chidiwilliams/buzz · 42

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

Speech
Stars: 19,305
Growth: -
Language: Python
Created: 2022-09-24
#18
dia · nari-labs/dia · 36

A TTS model capable of generating ultra-realistic dialogue in one pass.

Speech
Stars: 19,293
Growth: -
Language: Python
Created: 2025-04-19
#19
VoxCPM · OpenBMB/VoxCPM · 41

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

Speech
Stars: 19,222
Growth: -
Language: Python
Created: 2025-09-16
#20
Pixelle-Video · AIDC-AI/Pixelle-Video · 73

🚀 AI Fully Automated Short Video Engine

Image, Video, Speech
Stars: 18,427
Growth: -
Language: Python
Created: 2025-11-07
#21
pyvideotrans · jianchang512/pyvideotrans · 38

Translate the video from one language to another and embed dubbing & subtitles.

Speech
Stars: 17,455
Growth: -
Language: Python
Created: 2023-10-02
#22
leon · leon-ai/leon · 42

🧠 Leon is your open-source personal assistant.

Agents, Speech, Automation
Stars: 17,243
Growth: -
Language: TypeScript
Created: 2019-02-10
#23
NeMo · NVIDIA-NeMo/NeMo · 43

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Speech
Stars: 17,237
Growth: -
Language: Python
Created: 2019-08-05
#24
FunASR · modelscope/FunASR · 42

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Speech
Stars: 16,136
Growth: -
Language: Python
Created: 2022-11-24
#25
vosk-api · alphacep/vosk-api · 33

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Speech
Stars: 14,739
Growth: -
Language: Jupyter Notebook
Created: 2019-09-03
#26
PaddleSpeech · PaddlePaddle/PaddleSpeech · 39

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Speech
Stars: 12,601
Growth: -
Language: Python
Created: 2017-11-14
#27
sherpa-onnx · k2-fsa/sherpa-onnx · 38

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

Speech
Stars: 12,341
Growth: -
Language: C++
Created: 2022-09-01
#28
meetily · Zackriya-Solutions/meetily · 40

Privacy first, AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization built on Rust. 100% local processing. no cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 Self-hosted, Open-source Ai meeting note taker for macOS & Windows.

Speech
Stars: 12,161
Growth: -
Language: Rust
Created: 2024-12-26
#29
edge-tts · rany2/edge-tts · 30

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Speech
Stars: 10,975
Growth: -
Language: Python
Created: 2021-05-10
#30
piper · rhasspy/piper · 32

A fast, local neural text to speech system

Speech
Stars: 10,969
Growth: -
Language: C++
Created: 2023-01-10