Submit models for evaluation and view leaderboard
Explore and submit models using the LLM Leaderboard
Evaluate open LLMs in the languages of LATAM and Spain.
GIFT-Eval: A Benchmark for General Time Series Forecasting
View RL Benchmark Reports
Find recent, highly liked Hugging Face models
Pergel: A Unified Benchmark for Evaluating Turkish LLMs
Evaluate reward models for math reasoning
Evaluate and submit AI model results for Frugal AI Challenge
Request model evaluation on COCO val 2017 dataset
Measure over-refusal in LLMs using OR-Bench
Display genomic embedding leaderboard
Calculate GPU requirements for running LLMs
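The last item above estimates GPU requirements for running LLMs. As an illustration of the kind of arithmetic involved, here is a minimal sketch: it assumes memory is dominated by the weights plus a fixed overhead factor for activations and KV cache, which is an assumption of this sketch, not the method used by that Space.

```python
# Rough GPU memory estimate for serving an LLM: weights plus a fixed
# overhead factor (assumed here; real requirements depend on context length,
# batch size, and framework).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}

def estimate_gpu_memory_gb(num_params_billion: float, dtype: str = "fp16",
                           overhead: float = 1.2) -> float:
    """Return an approximate inference memory footprint in GiB."""
    weight_bytes = num_params_billion * 1e9 * BYTES_PER_PARAM[dtype]
    return weight_bytes * overhead / 1024**3

# Example: a 7B-parameter model in fp16 needs roughly 15-16 GiB.
print(f"{estimate_gpu_memory_gb(7, 'fp16'):.1f} GiB")
```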
GAIA Leaderboard is a model-benchmarking platform where users submit their AI models for evaluation. It provides a transparent, collaborative environment for comparing model performance across datasets and metrics, helping researchers and developers identify top-performing models and improve their own.
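Submissions themselves go through the leaderboard interface, but a common preliminary step is to confirm your model repository exists and is publicly accessible on the Hugging Face Hub. Below is a minimal sketch using the `huggingface_hub` client; the repository id `your-org/your-model` is a placeholder, and the exact submission requirements are defined by the leaderboard's own guidelines.

```python
from huggingface_hub import HfApi

# Sanity-check that a model repo exists and is public on the Hub before
# submitting it through the leaderboard's web form.
# "your-org/your-model" is a placeholder, not a real submission target.
api = HfApi()
info = api.model_info("your-org/your-model")

print(info.id, info.private)  # most leaderboards require a public repo
print([f.rfilename for f in info.siblings][:5])  # confirm weight files are present
```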
What types of models can I submit to GAIA Leaderboard?
GAIA Leaderboard supports a wide range of AI models, including computer vision, natural language processing, and other machine learning models. Check the submission guidelines for specific requirements.
How long does model evaluation take?
Evaluation time varies depending on the complexity of your model and the dataset size. You will receive a confirmation email once your model is processed.
Can I customize the evaluation metrics?
Yes, GAIA Leaderboard allows you to define custom benchmarks and metrics to tailor evaluations to your specific needs. Contact support for detailed instructions.
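The exact mechanism for registering a custom metric is not described here (the FAQ points to support), but as an illustration of what such a metric might look like, here is a minimal sketch of a leaderboard-style exact-match score; the function name and normalization rules are assumptions for this example.

```python
from typing import List

def exact_match_rate(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions matching the reference string after basic
    whitespace/case normalization (a common leaderboard-style metric)."""
    assert len(predictions) == len(references), "inputs must be aligned"
    hits = sum(p.strip().lower() == r.strip().lower()
               for p, r in zip(predictions, references))
    return hits / len(references) if references else 0.0

# Example usage with toy data
print(exact_match_rate(["Paris", " 42 "], ["paris", "42"]))  # -> 1.0
```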