List of French datasets not referenced on the Hub
Generate synthetic datasets for AI training
Browse and search datasets
Convert PDFs to a dataset and upload to Hugging Face
Explore datasets on a Nomic Atlas map
Create Reddit dataset
ReWrite datasets with a text instruction
Browse and view Hugging Face datasets from a collection
Browse TheBloke models' history
Speech Corpus Creation Tool
Access NLPre-PL dataset and pre-trained models
Manage and label datasets for your projects
Jeux de données en français mal référencés sur le Hub ("French-language datasets poorly referenced on the Hub") is a curated list of French datasets that are poorly referenced or hard to find on popular data hubs. The collection highlights datasets that are valuable but have been overlooked because of insufficient documentation or low visibility. It spans several domains, including natural language processing (NLP), computer vision, and data science applications. The goal is to give researchers and developers high-quality French-language datasets for their projects and research.
What types of datasets are included in Jeux de données en français mal référencés sur le Hub?
The collection includes a variety of datasets, such as text corpora, image datasets, and structured data, all primarily in French.
Why is this collection useful for researchers?
It provides easy access to French datasets that are often difficult to find, saving time and effort for researchers working with French data.
How can I contribute a dataset to this collection?
You can submit your dataset through the platform's submission process, which usually involves a form or a pull request on the repository. The submission is then reviewed for inclusion.