List of French datasets poorly referenced on the Hub
Supports Parquet, CSV, JSONL, and XLS
Create Reddit dataset
Manage and analyze datasets with AI tools
Search for Hugging Face Hub models
Explore, annotate, and manage datasets
Save user inputs to datasets on Hugging Face
Rewrite datasets with a text instruction
Create datasets with FAQs and SFT prompts
Browse and view Hugging Face datasets
Validate JSONL format for fine-tuning
Organize and process datasets using AI
Explore and manage datasets for machine learning
Jeux de données en français mal référencés sur le Hub ("French datasets poorly referenced on the Hub") is a curated list of French datasets that are poorly referenced or hard to find on popular data hubs. The collection highlights datasets that are valuable but have been overlooked because of insufficient documentation or low visibility. It covers a range of domains, including natural language processing (NLP), computer vision, and data science applications. The goal is to give researchers and developers high-quality French-language datasets for their projects and research.
What types of datasets are included in Jeux de données en français mal référencés sur le Hub?
The collection includes text corpora, image datasets, and structured data, all primarily in French.
Why is this collection useful for researchers?
It gathers French datasets that are often difficult to find in one place, saving time and effort for researchers working with French data.
How can I contribute a dataset to this collection?
You can submit your dataset through the platform's submission process, which usually involves a form or a repository pull request. Submissions are reviewed before inclusion.