Measure over-refusal in LLMs using OR-Bench
Evaluate AI-generated results for accuracy
Retrain models for new data at edge devices
View and compare language model evaluations
Merge machine learning models using a YAML configuration file
Convert PaddleOCR models to ONNX format
Predict customer churn based on input details
Multilingual Text Embedding Model Pruner
Pergel: A Unified Benchmark for Evaluating Turkish LLMs
Browse and submit evaluations for CaselawQA benchmarks
Download a TriplaneGaussian model checkpoint
Determine GPU requirements for large language models
Measure BERT model performance using WASM and WebGPU
OR-Bench Leaderboard is a tool designed to measure and benchmark over-refusal in Large Language Models (LLMs). It provides a standardized framework to evaluate when and how models refuse to answer questions or generate text. This leaderboard helps researchers and developers understand the limitations and safety mechanisms of LLMs by comparing their performance across different scenarios.
• Model Comparison: Allows users to compare multiple models based on their refusal patterns.
• Refusal Trigger Evaluation: Tests models against a curated set of triggers to assess their refusal thresholds.
• Metric Aggregation: Provides aggregated metrics such as refusal rates and response patterns.
• Result Sharing: Enables sharing of benchmark results for community collaboration.
• Documented Methodology: Offers transparent documentation of evaluation methods and datasets.
1. What is the purpose of OR-Bench Leaderboard?
The purpose of OR-Bench Leaderboard is to provide a standardized way to measure and compare over-refusal behaviors in LLMs, helping to identify models with balanced safety and utility.
2. Why is measuring over-refusal important?
Measuring over-refusal is important because it helps assess a model's ability to avoid harmful or inappropriate responses while still providing useful answers.
3. How can I interpret the results from OR-Bench Leaderboard?
Results show how often a model refuses to answer and under what conditions. Lower refusal rates may indicate a model that is more permissive, while higher rates suggest stricter safety mechanisms.