AI Explorer
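Voice
Speech recognition, text-to-speech, and audio AI projects.
Data sources
The category list is generated from GitHub search rules, repository topics, project descriptions, and synced snapshots.
Ranking logic
Projects are first filtered for category relevance, then sorted by stars and quality signals.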
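The filter-then-sort order described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual implementation: the `Project` fields `relevance` and `quality`, the `min_relevance` threshold, and the choice to use quality only as a tie-breaker after stars are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    stars: int
    relevance: float  # hypothetical category-relevance score in [0, 1]
    quality: float    # hypothetical aggregate quality signal in [0, 1]


def rank(projects, min_relevance=0.5):
    """Drop projects below the relevance threshold, then sort the rest."""
    # Step 1: category relevance filter (threshold is an assumed value).
    relevant = [p for p in projects if p.relevance >= min_relevance]
    # Step 2: sort by stars, highest first; quality breaks ties (assumed rule).
    return sorted(relevant, key=lambda p: (p.stars, p.quality), reverse=True)
```

With this ordering, a highly starred repository that is off-topic for the category never reaches the sort step, which may be why a relevance filter runs first.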
Who it's for
For users who already know which AI workflow or tool type they need and want to compare candidate projects quickly.
48 projects
Robust Speech Recognition via Large-Scale Weak Supervision
- Stars: 103,797
- Growth: -
- Language: Python
- Created: 2022-09-16
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
- Stars: 67,544
- Growth: -
- Language: Python
- Created: 2023-11-29
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
- Stars: 59,130
- Growth: -
- Language: Python
- Created: 2024-01-14
Port of OpenAI's Whisper model in C/C++
- Stars: 51,118
- Growth: -
- Language: C++
- Created: 2022-09-25
Open-Source Frontier Voice AI
- Stars: 48,612
- Growth: -
- Language: Python
- Created: 2025-08-25
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
- Stars: 47,211
- Growth: -
- Language: Go
- Created: 2023-03-18
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
- Stars: 45,641
- Growth: -
- Language: Python
- Created: 2020-05-20
💖🧸 Self hosted, you-owned Grok Companion, a container of souls of waifu, cyber livings to bring them into our worlds, wishing to achieve Neuro-sama's altitude. Capable of realtime voice chat, Minecraft, Factorio playing. Web / macOS / Windows supported.
- Stars: 40,427
- Growth: -
- Language: TypeScript
- Created: 2024-12-01
A generative speech model for daily dialogue.
- Stars: 39,520
- Growth: -
- Language: Python
- Created: 2024-05-27
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
- Stars: 36,908
- Growth: -
- Language: Python
- Created: 2021-08-07
Instant voice cloning by MIT and MyShell. Audio foundation model.
- Stars: 36,804
- Growth: -
- Language: Python
- Created: 2023-11-29
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
- Stars: 35,380
- Growth: -
- Language: Python
- Created: 2021-08-16
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
- Stars: 32,024
- Growth: -
- Language: Python
- Created: 2025-09-16
Use claude-code for free in the terminal, VSCode extension or discord like OpenClaw (voice supported)
- Stars: 30,078
- Growth: -
- Language: Python
- Created: 2026-01-28
Faster Whisper transcription with CTranslate2
- Stars: 23,905
- Growth: -
- Language: Python
- Created: 2023-02-11
🚀 AI Fully Automated Short Video Engine
- Stars: 23,760
- Growth: -
- Language: Python
- Created: 2025-11-07
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- Stars: 22,761
- Growth: -
- Language: Python
- Created: 2022-12-09
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
- Stars: 21,876
- Growth: -
- Language: Python
- Created: 2024-07-03
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
- Stars: 21,480
- Growth: -
- Language: Python
- Created: 2025-02-06
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
- Stars: 19,871
- Growth: -
- Language: Python
- Created: 2022-09-24
A TTS model capable of generating ultra-realistic dialogue in one pass.
- Stars: 19,326
- Growth: -
- Language: Python
- Created: 2025-04-19
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
- Stars: 18,673
- Growth: -
- Language: Python
- Created: 2022-11-24
Translate the video from one language to another and embed dubbing & subtitles.
- Stars: 18,131
- Growth: -
- Language: Python
- Created: 2023-10-02
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- Stars: 17,634
- Growth: -
- Language: Python
- Created: 2019-08-05
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- Stars: 17,560
- Growth: -
- Language: Python
- Created: 2019-08-05
🧠 Leon is your open-source personal assistant.
- Stars: 17,344
- Growth: -
- Language: TypeScript
- Created: 2019-02-10
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
- Stars: 14,887
- Growth: -
- Language: Jupyter Notebook
- Created: 2019-09-03
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages
- Stars: 13,237
- Growth: -
- Language: C++
- Created: 2022-09-01
Privacy first, AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization built on Rust. 100% local processing. no cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 Self-hosted, Open-source Ai meeting note taker for macOS & Windows.
- Stars: 12,943
- Growth: -
- Language: Rust
- Created: 2024-12-26
Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.
- Stars: 12,789
- Growth: -
- Language: Swift
- Created: 2025-11-18