🎙️ VoxCPM Text-to-Speech

Generate highly expressive speech using the VoxCPM-0.5B model. Optionally, clone a voice by providing reference audio.


Text to Synthesize

Generated Speech

Tips:

For voice cloning, upload a clear reference audio clip (3-10 seconds recommended).
Higher CFG values follow the prompt more closely but may reduce naturalness.
Increasing the number of inference timesteps improves quality at the cost of speed.
The retry mechanism helps handle edge cases automatically.
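
Two of the tips above can be illustrated with a small, library-independent sketch: checking that a reference clip falls in the recommended 3-10 second range, and retrying generation when a result looks wrong. This is not the app's actual code. The function names (`wav_duration_seconds`, `is_recommended_reference`, `generate_with_retry`) are hypothetical, the duration check assumes an uncompressed PCM WAV file, and the `generate` and `is_acceptable` callables stand in for whatever synthesis call and quality check the app really uses.

```python
import wave
from typing import Callable, TypeVar

T = TypeVar("T")


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_recommended_reference(
    path: str, min_seconds: float = 3.0, max_seconds: float = 10.0
) -> bool:
    """Check a reference clip against the 3-10 second recommendation.

    The default bounds come from the tip above; they are a guideline,
    not a hard limit enforced by the model.
    """
    return min_seconds <= wav_duration_seconds(path) <= max_seconds


def generate_with_retry(
    generate: Callable[[], T],
    is_acceptable: Callable[[T], bool],
    max_attempts: int = 3,
) -> T:
    """Call `generate` until `is_acceptable` passes or attempts run out.

    Hypothetical stand-in for the app's retry mechanism: `generate` would
    wrap the synthesis call, and `is_acceptable` would flag edge cases
    such as implausibly long or empty audio. The last result is returned
    even if it never passed the check.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = generate()
    for _ in range(max_attempts - 1):
        if is_acceptable(result):
            break
        result = generate()
    return result
```

In use, `generate` might be a closure over the text, CFG value, and timestep count chosen in the interface, so each retry reruns synthesis with the same settings.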