The OpenLLM Turkish leaderboard v0.2 is a benchmarking and evaluation platform for Turkish large language models (LLMs). It lets users browse, compare, and submit evaluations of models on Turkish benchmarks, supporting analysis of model performance across tasks and datasets so that researchers and practitioners can identify the best-performing models for their needs.
• Model Benchmarking: Evaluate and compare the performance of different Turkish language models.
• Submission Interface: Easily submit your own model evaluations for inclusion in the leaderboard.
• Filtering and Sorting: Filter models by performance metric, dataset, or task type (see the sketch after this list).
• Detailed Model Comparisons: View side-by-side comparisons of model performance across multiple benchmarks.
• Visualizations: Access charts and graphs to understand performance trends and differences.
• Documentation: Get access to resources and guides for using the leaderboard effectively.
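The filtering, sorting, and side-by-side comparison described above can also be reproduced offline once you have a copy of the leaderboard table. The sketch below is illustrative only: the model names, task names, and scores are placeholders, not actual leaderboard results, and the real column names will depend on the leaderboard's export format.

```python
import pandas as pd

# Illustrative placeholder rows; real column names and scores come from the
# leaderboard export and will differ.
rows = [
    {"model": "example-org/turkish-llm-a", "task": "mmlu_tr", "accuracy": 0.61},
    {"model": "example-org/turkish-llm-b", "task": "mmlu_tr", "accuracy": 0.57},
    {"model": "example-org/turkish-llm-a", "task": "truthfulqa_tr", "accuracy": 0.48},
    {"model": "example-org/turkish-llm-b", "task": "truthfulqa_tr", "accuracy": 0.52},
]
df = pd.DataFrame(rows)

# Filter to a single task, then sort by the chosen metric (descending).
mmlu = df[df["task"] == "mmlu_tr"].sort_values("accuracy", ascending=False)
print(mmlu)

# Side-by-side comparison: pivot so each task becomes a column per model.
comparison = df.pivot(index="model", columns="task", values="accuracy")
print(comparison)
```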
1. What types of models are included in the leaderboard?
The leaderboard includes a variety of Turkish language models, ranging from small-scale to state-of-the-art models, evaluated on diverse tasks and datasets.
2. How are models evaluated on the leaderboard?
Models are evaluated on standard benchmarks and metrics relevant to Turkish language tasks, such as perplexity, BLEU score, or accuracy on specific datasets; a small example of computing such metrics follows this FAQ.
3. Can I submit my own model for evaluation?
Yes, the leaderboard provides a submission interface where you can upload your model's evaluation results after preparing them according to the platform's guidelines; a sketch of preparing such a submission file also follows this FAQ.
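To make the metrics mentioned in question 2 concrete, the following sketch computes accuracy for a classification-style task and a corpus BLEU score for a translation-style task. The labels and sentences are invented for illustration, and the sacrebleu package is assumed to be installed; this is not the leaderboard's own evaluation code.

```python
import sacrebleu

# Accuracy on a classification-style task (placeholder labels).
gold = ["A", "B", "A", "C"]
pred = ["A", "B", "C", "C"]
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(f"accuracy = {accuracy:.2f}")

# Corpus BLEU for a translation-style task (placeholder Turkish sentences).
hypotheses = ["kedi masanın üzerinde", "bugün hava güneşli"]
references = [["kedi masanın üstünde", "bugün hava güneşli"]]  # one reference stream
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")
```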
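For question 3, a submission typically amounts to preparing a results file and uploading it to a repository associated with the leaderboard. The exact schema, repository name, and file layout are defined by the platform's guidelines; every identifier below (the result fields, repo_id, and file paths) is a hypothetical placeholder, and a valid Hugging Face token with write access is assumed.

```python
import json
from huggingface_hub import HfApi

# Hypothetical results payload; the real schema comes from the leaderboard's guidelines.
results = {
    "model": "example-org/turkish-llm-a",
    "precision": "bfloat16",
    "results": {"mmlu_tr": {"accuracy": 0.61}},
}

with open("results.json", "w", encoding="utf-8") as f:
    json.dump(results, f, ensure_ascii=False, indent=2)

# Upload to a (hypothetical) results dataset repository.
api = HfApi()
api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="example-org/turkish-llm-a/results.json",
    repo_id="example-org/turkish-leaderboard-results",  # placeholder repo_id
    repo_type="dataset",
)
```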