Speech Corpus Creation Tool
Dhravani is a Speech Corpus Creation Tool for building high-quality speech datasets. It lets users record speech and transcribe it, which makes it useful for dataset creation in applications such as speech recognition, voice assistants, and language research.
• Multi-Language Support: Record and transcribe speech in multiple languages, catering to diverse linguistic needs.
• AI-Powered Transcription: Utilizes advanced AI algorithms for accurate and rapid transcription of recorded audio.
• Collaborative Workspace: Enables team collaboration for efficient dataset creation and management.
• Audio Quality Control: Includes tools to analyze and enhance audio quality for optimal dataset performance.
• Customizable Metadata: Allows users to add and manage metadata for better organization and search functionality.
What languages does Dhravani support?
Dhravani supports a wide range of languages, including popular ones like English, Spanish, Mandarin, and many others. Check the app for the full list of supported languages.
Do I need an internet connection to use Dhravani?
Yes, an internet connection is required for AI transcription and feature updates, but audio recording can be done offline.
What formats can I export my dataset in?
Dhravani allows you to export your speech corpus in common formats such as WAV, MP3, and XML, depending on your project requirements.
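For readers who post-process an exported corpus, here is a minimal sketch of how WAV recordings and their transcripts might be paired in an XML manifest. It uses only the Python standard library. The element and attribute names (`corpus`, `utterance`, `file`, `lang`, `duration`), the file name `rec_0001.wav`, and the helper functions are all hypothetical and chosen for illustration; they are not Dhravani's actual export schema, so check a real export before relying on any of them.

```python
import io
import wave
import xml.etree.ElementTree as ET


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a WAV recording held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def build_manifest(entries) -> str:
    """Build an XML manifest from (filename, language, transcript, wav_bytes) tuples.

    The element and attribute names are assumptions, not Dhravani's schema.
    """
    root = ET.Element("corpus")
    for filename, language, transcript, wav_bytes in entries:
        utt = ET.SubElement(root, "utterance", file=filename, lang=language)
        utt.set("duration", f"{wav_duration_seconds(wav_bytes):.2f}")
        utt.text = transcript
    return ET.tostring(root, encoding="unicode")


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Create a mono 16-bit silent WAV clip, used here only as stand-in audio."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


# One stand-in recording; a real corpus would read the exported WAV files instead.
manifest = build_manifest([
    ("rec_0001.wav", "en", "Hello world", make_silent_wav(1.5)),
])
print(manifest)
```

Keeping the transcript, language, and duration next to each file name in one manifest makes it straightforward to filter or split the corpus later without reopening every audio file.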