🎙️ VoxCPM Text-to-Speech

Generate highly expressive speech with the VoxCPM-0.5B model. Optionally, clone a voice by providing reference audio.


CFG value (slider): 0.5 to 5
Inference timesteps (slider): 5 to 50

Enable external text normalization (TN) tool

Enable external Denoise tool

Enable retrying for bad cases

Tips:

  • For voice cloning, upload a clear reference audio (3-10 seconds recommended)
  • Higher CFG (classifier-free guidance) values improve prompt adherence but may reduce naturalness
  • Increase inference timesteps for better quality at the cost of speed
  • The retry mechanism helps handle edge cases automatically
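
The settings and retry behaviour described in the tips can be sketched in Python. This is only an illustration, not the demo's actual code. It assumes that the two bare ranges shown on the page (0.5 to 5 and 5 to 50) belong to the CFG value and the inference timesteps, in that order. The names SynthesisSettings, synthesize_with_retry, synthesize, is_bad_case, and max_retries are hypothetical, and the defaults are placeholders. The real model call is passed in as a function so that the sketch does not depend on any particular library API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SynthesisSettings:
    """Hypothetical container for the controls shown on the page."""
    cfg_value: float = 2.0          # assumed slider range: 0.5 to 5
    inference_timesteps: int = 10   # assumed slider range: 5 to 50
    retry_bad_cases: bool = True    # mirrors the "retrying" checkbox
    max_retries: int = 3            # assumption: the page states no limit

    def __post_init__(self) -> None:
        # Reject values outside the assumed slider ranges.
        if not 0.5 <= self.cfg_value <= 5:
            raise ValueError("cfg_value must be between 0.5 and 5")
        if not 5 <= self.inference_timesteps <= 50:
            raise ValueError("inference_timesteps must be between 5 and 50")
        if self.max_retries < 0:
            raise ValueError("max_retries must not be negative")


def synthesize_with_retry(
    text: str,
    settings: SynthesisSettings,
    synthesize: Callable[[str, SynthesisSettings], List[float]],
    is_bad_case: Callable[[str, List[float]], bool],
) -> List[float]:
    """Call `synthesize` until the output passes `is_bad_case` or attempts run out.

    Both callables are supplied by the caller; the last attempt is returned
    even if it is still judged bad.
    """
    attempts = 1 + (settings.max_retries if settings.retry_bad_cases else 0)
    audio: List[float] = []
    for _ in range(attempts):
        audio = synthesize(text, settings)
        if not is_bad_case(text, audio):
            break
    return audio
```

Raising inference_timesteps in such a settings object corresponds to the quality-versus-speed trade-off noted above, and switching retry_bad_cases off limits the helper to a single attempt.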