Evaluate model accuracy using Fbeta score
FBeta_Score is a model-benchmarking tool that evaluates the accuracy of classification models using the Fbeta score. The Fbeta score combines precision and recall into a single metric, allowing for a balanced evaluation of model performance. It is particularly useful when the data classes are imbalanced, or when precision and recall are not equally important for the task at hand.
1. What is the Fbeta score?
The Fbeta score is a metric that combines precision and recall, with a parameter beta that weights their relative importance: Fbeta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall). A beta value greater than 1 emphasizes recall, while a value less than 1 emphasizes precision; beta = 1 gives the familiar F1 score, the harmonic mean of the two.
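As a minimal sketch of the formula above (not the tool's actual implementation), the score can be computed directly from precision and recall in a few lines of Python:

```python
def fbeta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Fbeta = (1 + beta^2) * P * R / (beta^2 * P + R); 0.0 when both are 0."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# beta = 1 reduces to the harmonic mean of precision and recall (F1).
print(fbeta(0.5, 1.0, beta=1.0))  # ≈ 0.667
```

With beta = 2 on the same precision/recall pair, the score rises toward the (higher) recall value, illustrating the recall emphasis described above.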
2. When should I use a specific beta value?
Choose a beta value based on your problem's requirements. For example, if recall is more critical (e.g., detecting rare events), use beta > 1. If precision matters more (e.g., avoiding false positives), use beta < 1.
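To see the effect of beta concretely, here is a small illustration (with made-up precision and recall values) scoring one hypothetical classifier under three beta settings:

```python
def fbeta(p: float, r: float, beta: float) -> float:
    """Fbeta from precision p and recall r."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

# Hypothetical classifier: high precision, low recall.
p, r = 0.9, 0.5
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: {fbeta(p, r, beta):.3f}")
```

Because recall is this classifier's weak point, the recall-weighted score (beta = 2) comes out lowest and the precision-weighted score (beta = 0.5) highest, so the choice of beta directly changes which models a benchmark favors.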
3. Does FBeta_Score support multi-class classification?
Yes, FBeta_Score can handle multi-class classification problems by computing a score for each class and then combining the per-class scores into an overall average.
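One common way to combine per-class scores is macro averaging: compute Fbeta for each class (treating it as the positive class) and take the unweighted mean. A sketch under that assumption, not necessarily how FBeta_Score averages internally:

```python
def macro_fbeta(y_true: list, y_pred: list, beta: float = 1.0) -> float:
    """Macro-averaged Fbeta: per-class one-vs-rest Fbeta, then the mean."""
    b2 = beta * beta
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        score = (1 + b2) * prec * rec / (b2 * prec + rec) if prec + rec else 0.0
        scores.append(score)
    return sum(scores) / len(scores)

print(macro_fbeta([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
```

Macro averaging weights every class equally regardless of its frequency; a weighted average (by class support) is the usual alternative when rare classes should count less.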