View and submit language model evaluations
ContextualBench-Leaderboard is a platform designed for benchmarking and evaluating language models. It provides a centralized space to view and compare the performance of different models across various tasks and datasets. Users can submit their own model evaluations and track progress in the field of natural language processing.
• Comprehensive Leaderboard: Displays performance metrics of language models in a sorted and searchable format.
• Submission Portal: Allows researchers to upload their model evaluations for inclusion in the leaderboard.
• Comparison Tools: Enables side-by-side comparison of models on specific benchmarks or datasets (a brief illustrative sketch of this workflow follows this list).
• Filtering Options: Users can filter results by model type, dataset, or performance metric.
• Frequent Updates: The leaderboard is updated regularly to reflect the latest submissions and advancements.
• Documentation and Guides: Provides resources for understanding evaluation metrics and submission processes.
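The comparison and filtering features live in the Space's web interface; if you want to reproduce a similar workflow locally, the minimal sketch below assumes you have exported the leaderboard entries to a tabular form. The column names (`model`, `type`, `dataset`, `score`), the model names, and all score values are illustrative placeholders, not actual leaderboard data.

```python
import pandas as pd

# Illustrative stand-in for an exported leaderboard table.
# Columns and values are assumptions for this sketch, not real results.
results = pd.DataFrame(
    [
        {"model": "model-a", "type": "chat", "dataset": "dataset-1", "score": 71.2},
        {"model": "model-b", "type": "base", "dataset": "dataset-1", "score": 64.5},
        {"model": "model-a", "type": "chat", "dataset": "dataset-2", "score": 83.0},
        {"model": "model-b", "type": "base", "dataset": "dataset-2", "score": 79.4},
    ]
)

# Filtering: keep only one model type (mirrors the leaderboard's filter options).
chat_models = results[results["type"] == "chat"]
print(chat_models)

# Ranking: sort within each dataset by score, best first.
ranked = results.sort_values(["dataset", "score"], ascending=[True, False])
print(ranked)

# Side-by-side comparison: pivot so each dataset becomes a column, one row per model.
comparison = results.pivot(index="model", columns="dataset", values="score")
print(comparison)
```

Pivoting on the dataset column yields one row per model with one column per benchmark, which is a convenient shape for the kind of side-by-side comparison the leaderboard presents.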
What models are included in ContextualBench-Leaderboard?
The leaderboard covers a wide range of language models, from large state-of-the-art systems to smaller, specialized ones. The exact list is updated regularly.
How do I submit my model for evaluation?
To submit your model, navigate to the submission portal on the ContextualBench-Leaderboard website and follow the detailed guidelines provided. Ensure your submission includes all required metrics and information.
Why should I use ContextualBench-Leaderboard?
ContextualBench-Leaderboard offers a user-friendly interface and comprehensive tools for comparing and analyzing language models. It is an excellent resource for researchers and developers looking to benchmark their models or stay informed about the latest advancements in the field.