Text-to-Speech

Open-source text-to-speech models and voice synthesis engines

Text-to-Speech: comparison of fish-speech, CosyVoice, VoxCPM, and sherpa-onnx
fish-speech: SOTA open-source TTS.
CosyVoice: Multilingual large voice generation model with full-stack support for inference, training, and deployment.
VoxCPM: VoxCPM2, tokenizer-free TTS for multilingual speech generation, creative voice design, and true-to-life cloning.
sherpa-onnx: Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, with no Internet connection required. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a WebSocket server/client, with support for 12 programming languages.
Popularity
Metric          fish-speech   CosyVoice     VoxCPM        sherpa-onnx
Stars           30,337        20,992        18,821        12,171
Global Rank     #1046         #1938         #2251         #4074
Weekly Activity (May 8 – May 14)
Metric          fish-speech   CosyVoice     VoxCPM        sherpa-onnx
New Stars       +32           +14           +105          +19
Pushes          1             0             0             5
Issues Closed   0             2             0             0
Community
Metric          fish-speech   CosyVoice     VoxCPM        sherpa-onnx
Forks           2,571         2,417         2,236         1,384
Contributors955226206
Open Issues2481296567
Project Info
Field           fish-speech   CosyVoice     VoxCPM        sherpa-onnx
Owner           fishaudio     FunAudioLLM   OpenBMB       k2-fsa
License         NOASSERTION   Apache-2.0    Apache-2.0    Apache-2.0
Language        Python        Python        Python        C++
Created         Oct 2023      Jul 2024      Sep 2025      Sep 2022