Speech Corpus Creation Tool
Dhravani is a speech corpus creation tool for building high-quality speech datasets. Users record voice samples and transcribe them, producing data for applications such as speech recognition, voice assistants, and language research.
• Multi-Language Support: Record and transcribe speech in multiple languages, catering to diverse linguistic needs.
• AI-Powered Transcription: Utilizes advanced AI algorithms for accurate and rapid transcription of recorded audio.
• Collaborative Workspace: Enables team collaboration for efficient dataset creation and management.
• Audio Quality Control: Includes tools to analyze and enhance audio quality for optimal dataset performance.
• Customizable Metadata: Allows users to add and manage metadata for better organization and search functionality.
What languages does Dhravani support?
Dhravani supports many languages, including English, Spanish, and Mandarin. Check the app for the full list of supported languages.
Do I need an internet connection to use Dhravani?
Partly. AI transcription and feature updates require an internet connection, but audio can be recorded offline.
What formats can I export my dataset in?
Dhravani allows you to export your speech corpus in common formats such as WAV, MP3, and XML, depending on your project requirements.
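Once recordings are exported as WAV, a common next step is to pair each file with its transcript in a manifest. The sketch below is only an illustration of that idea using Python's standard library; it is not Dhravani's own export format or API. The function names, the column names (`file_name`, `duration_s`, `language`, `transcript`), and the silent demo clip are all assumptions made for this example.

```python
import csv
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a WAV recording given its raw bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_manifest(entries) -> str:
    """Build a CSV manifest from (file_name, wav_bytes, transcript, language) tuples.

    The column layout is hypothetical, chosen only for illustration.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["file_name", "duration_s", "language", "transcript"])
    for file_name, wav_bytes, transcript, language in entries:
        duration = wav_duration_seconds(wav_bytes)
        writer.writerow([file_name, f"{duration:.2f}", language, transcript])
    return out.getvalue()


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Create a silent mono 16-bit WAV in memory, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    clip = make_silent_wav(1.5)
    print(build_manifest([("sample_001.wav", clip, "hello world", "en")]))
```

In practice the bytes would come from the exported files on disk rather than from `make_silent_wav`, and the manifest columns would follow whatever the downstream training pipeline expects.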