Transcribe audio with emotions and events
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Generate natural-sounding speech from text using a voice you choose
Generate speech from text with adjustable speed
A demo of Indic Parler-TTS
MP-SENet is a speech enhancement model.
Moonshine ASR models running on-device, in your web browser.
A better AI-powered platform to purify your speech signal
Whisper model to transcribe Japanese audio to katakana.
Kokoro is an open-weight TTS model with 82 million parameters.
Spanish fine-tune of the original F5 model.
Identify speakers in an audio file
Ebook2audiobook docker space beta
SenseVoice is a speech recognition and transcription tool designed to analyze audio data. Beyond transcribing the spoken words, it identifies emotions, events, and key points within audio content, which makes it useful for understanding spoken data at a deeper level.
What languages does SenseVoice support?
SenseVoice supports multiple languages, including English, Spanish, French, Mandarin, and several others, making it accessible to a wide range of users.
Can I use SenseVoice for real-time transcription?
Yes, SenseVoice offers real-time transcription capabilities, allowing users to transcribe audio as it is being spoken.
Is SenseVoice free to use?
SenseVoice offers a free trial version with basic features. For advanced capabilities, users may need to subscribe to a paid plan.