Explore and filter model evaluation results
GTBench is a data visualization tool for exploring and filtering model evaluation results. Its interactive interface lets you analyze and compare performance metrics across different models.
• Interactive Visualization: Explore model performance through dynamic and customizable visualizations.
• Advanced Filtering: Apply filters to narrow down results based on specific criteria such as model type, dataset, or performance metrics.
• Real-Time Updates: Get instant feedback as you adjust filters or visualization settings.
• Multi-Model Support: Compare results from multiple models in a single interface.
• Customizable Dashboards: Tailor the layout to focus on the metrics that matter most.
• Export Capabilities: Save and share visualizations or raw data for further analysis.
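The filtering and multi-model comparison described above can be pictured with a short Python sketch. It is not GTBench's own code: the `EvalResult` record, the `filter_results` helper, and every row of data are hypothetical, made up only to show what narrowing results by model type, dataset, or a metric threshold looks like.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    # Hypothetical record layout; GTBench's real fields may differ.
    model: str
    model_type: str
    dataset: str
    score: float


def filter_results(results, model_type=None, dataset=None, min_score=None):
    """Return the results that match every criterion that is not None."""
    selected = []
    for r in results:
        if model_type is not None and r.model_type != model_type:
            continue
        if dataset is not None and r.dataset != dataset:
            continue
        if min_score is not None and r.score < min_score:
            continue
        selected.append(r)
    return selected


# Made-up rows for illustration only; these are not real evaluation results.
results = [
    EvalResult("model-a", "open", "task-1", 0.62),
    EvalResult("model-b", "closed", "task-1", 0.71),
    EvalResult("model-a", "open", "task-2", 0.48),
    EvalResult("model-b", "closed", "task-2", 0.55),
]

# Compare all models on one dataset, best score first.
task1 = filter_results(results, dataset="task-1")
ranked = sorted(task1, key=lambda r: r.score, reverse=True)
for r in ranked:
    print(f"{r.model:10s} {r.score:.2f}")
```

Criteria left as `None` are ignored, so the same helper covers a single filter or several combined, much like toggling filters on and off in an interface.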
What does GTBench stand for?
GTBench stands for Graph Tool Benchmark, a utility for analyzing and visualizing model evaluation data.
Can I use GTBench for models other than graphs?
Yes, GTBench supports a variety of model types, including but not limited to graph-based models.
How do I export visualization results from GTBench?
To export results, use the "Export" button in the toolbar, which allows you to save visualizations as images or raw data as CSV files.
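Once raw data has been exported as CSV, it can be analyzed outside the tool. The sketch below uses only Python's standard library. The column names (`model`, `dataset`, `score`) and the sample rows are assumptions for illustration; check the header row of an actual export before relying on them.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported file. The column names and values are assumed,
# not taken from a real GTBench export.
sample_csv = """model,dataset,score
model-a,task-1,0.62
model-b,task-1,0.71
model-a,task-2,0.48
model-b,task-2,0.55
"""


def mean_score_by_model(csv_file):
    """Average the 'score' column per model from an open CSV file object."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["model"]] += float(row["score"])
        counts[row["model"]] += 1
    return {model: totals[model] / counts[model] for model in totals}


means = mean_score_by_model(io.StringIO(sample_csv))
for model, mean in sorted(means.items()):
    print(f"{model}: {mean:.2f}")
```

For a file on disk, pass an open file object instead of the `io.StringIO` stand-in, for example `with open(path, newline="") as f: mean_score_by_model(f)`, where `path` points to the exported CSV.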