🎙️ VoxCPM Text-to-Speech
Generate highly expressive speech using the VoxCPM-0.5B model. Optionally clone voices by providing reference audio.
CFG value (range: 0.5 to 5)
Inference timesteps (range: 5 to 50)
Enable external text normalization (TN) tool
Enable external Denoise tool
Enable retrying for bad cases
Tips:
- For voice cloning, upload a clear reference audio clip (3 to 10 seconds recommended)
- Higher CFG values provide better prompt adherence but may reduce naturalness
- Increase inference timesteps for better quality at the cost of speed
- The retry mechanism helps handle edge cases automatically
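
The controls and tips above can be sketched in code. This is a minimal illustration, not the demo's actual implementation. It assumes the two numeric ranges on this page (0.5 to 5 and 5 to 50) are the bounds of the CFG and inference-timestep sliders. `SynthesisSettings`, `synthesize_with_retry`, the default values, and the `synthesize` / `is_bad_case` callables are all hypothetical names introduced here, not part of the VoxCPM API.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumed slider bounds, read off the demo page; not confirmed by the model's docs.
CFG_RANGE = (0.5, 5.0)
TIMESTEP_RANGE = (5, 50)


@dataclass(frozen=True)
class SynthesisSettings:
    """Hypothetical container mirroring the demo's sliders and checkboxes."""

    cfg_value: float = 2.0            # illustrative default, not the demo's
    inference_timesteps: int = 10     # illustrative default, not the demo's
    use_text_normalization: bool = False
    use_denoiser: bool = False
    retry_bad_cases: bool = True
    max_retries: int = 3              # hypothetical retry budget

    def __post_init__(self) -> None:
        lo, hi = CFG_RANGE
        if not lo <= self.cfg_value <= hi:
            raise ValueError(f"cfg_value must be in [{lo}, {hi}]")
        lo, hi = TIMESTEP_RANGE
        if not lo <= self.inference_timesteps <= hi:
            raise ValueError(f"inference_timesteps must be in [{lo}, {hi}]")
        if self.max_retries < 0:
            raise ValueError("max_retries must be non-negative")


def synthesize_with_retry(
    text: str,
    settings: SynthesisSettings,
    synthesize: Callable[[str, SynthesisSettings], Any],
    is_bad_case: Callable[[Any], bool],
) -> Any:
    """Call `synthesize` until the output passes `is_bad_case` or attempts run out.

    Both callables are supplied by the caller; the last output is returned
    even if it is still flagged as a bad case.
    """
    attempts = 1 + (settings.max_retries if settings.retry_bad_cases else 0)
    audio = None
    for _ in range(attempts):
        audio = synthesize(text, settings)
        if not is_bad_case(audio):
            break
    return audio
```

Validating the settings up front keeps out-of-range slider values from reaching the model, and passing the synthesis and bad-case checks in as callables keeps the retry loop independent of any particular backend.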