Simple Space for the Kokoro Model
Kokoro is a speech synthesis tool that converts text into natural-sounding speech. It provides a simple interface for generating high-quality audio from written content.
• Multiple Voice Options: Choose from a variety of voices to match your needs.
• Language Support: Generate speech in multiple languages for global accessibility.
• Engine Flexibility: Use different speech synthesis engines for varying output styles.
• SSML Support: Customize speech patterns, pitch, and speed using Speech Synthesis Markup Language.
• Real-Time Generation: Convert text to speech with minimal processing delay.
What engines does Kokoro support?
Kokoro supports several engines, including Google Text-to-Speech and Amazon Polly. Which engines are available depends on your setup.
Can I customize the speech output?
Yes. Kokoro accepts SSML (Speech Synthesis Markup Language), which gives you control over pitch, speed, and emphasis.
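As a rough illustration of the pitch, speed, and emphasis controls mentioned above, the sketch below builds a small SSML string with Python's standard library. The `<speak>`, `<prosody>`, and `<emphasis>` elements come from the W3C SSML specification. The helper name `build_ssml` and its default values are hypothetical. How the resulting string is passed to Kokoro is not shown here, since that depends on the interface you are using, and some engines also expect extra attributes on `<speak>`, such as a version or language.

```python
import xml.etree.ElementTree as ET


def build_ssml(text, pitch="+10%", rate="90%", emphasized=None):
    """Return an SSML string (hypothetical helper, not part of Kokoro).

    text       -- sentence spoken with the given pitch and rate
    pitch      -- relative pitch change, e.g. "+10%"
    rate       -- speaking rate, e.g. "90%" for slightly slower speech
    emphasized -- optional trailing phrase spoken with strong emphasis
    """
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", {"pitch": pitch, "rate": rate})
    # Add a separating space only when an emphasized phrase follows.
    prosody.text = f"{text} " if emphasized else text
    if emphasized:
        emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
        emphasis.text = emphasized
    # ElementTree escapes characters such as "&" and "<" in the text.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Welcome to the demo.", emphasized="Listen closely."))
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed even when the input text contains characters that are special in XML.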
Is Kokoro free to use?
Kokoro offers a free tier with basic features. Advanced options may require a subscription or payment.