Evaluate Persian LLMs on various tasks
Evaluate code generation with diverse feedback types
Evaluate RAG systems with visual analytics
Launch web-based model application
View LLM Performance Leaderboard
Compare LLM performance across benchmarks
Determine GPU requirements for large language models
Benchmark models using PyTorch and OpenVINO
Calculate memory needed to train AI models
Create and upload a Hugging Face model card
Evaluate model predictions with TruLens
Measure execution times of BERT models using WebGPU and WASM
Browse and submit evaluations for CaselawQA benchmarks
The 🤗 Persian LLM Leaderboard is a comprehensive platform designed to evaluate and compare Persian language models across various tasks. It provides a centralized hub for researchers and developers to assess the performance of different models in the Persian language, fostering innovation and transparency in the field of natural language processing.
⢠Model Comparison: Evaluate and compare the performance of multiple Persian LLMs on different tasks. ⢠Task-Specific Benchmarks: Assess models on a variety of tasks tailored to the Persian language, such as text classification, summarization, and translation. ⢠Detailed Metrics: Access detailed performance metrics to understand model strengths and weaknesses. ⢠Visualizations: Interactive charts and graphs to visualize model performance and trends over time. ⢠Regular Updates: Stay informed with the latest developments and updates in Persian LLMs. ⢠Community-Driven: Submit your own models or results to contribute to the leaderboard.
What models are included on the leaderboard?
The leaderboard features a variety of Persian language models, including both state-of-the-art and emerging models from researchers and developers.
How are models evaluated?
Models are evaluated based on their performance on a range of tasks specific to the Persian language, using standard benchmarks and metrics.
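As a rough illustration of how per-task results might be combined into a ranking, here is a minimal Python sketch. The model names, task names, scores, and the unweighted-average aggregation are all hypothetical assumptions for illustration; the leaderboard's actual benchmarks and scoring method are not described here.

```python
from statistics import mean

# Hypothetical per-task scores on a 0-1 scale. The model and task names are
# placeholders, not actual leaderboard entries.
scores = {
    "model-a": {"classification": 0.82, "summarization": 0.61, "translation": 0.70},
    "model-b": {"classification": 0.78, "summarization": 0.68, "translation": 0.74},
}


def rank_models(scores):
    """Return (model, average score) pairs sorted from best to worst.

    Assumes an unweighted mean across tasks; a real leaderboard may weight
    tasks differently or report them separately.
    """
    averages = {
        model: mean(task_scores.values())
        for model, task_scores in scores.items()
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


for model, avg in rank_models(scores):
    print(f"{model}: {avg:.3f}")
```

Averaging hides per-task differences (here, model-a is stronger on classification while model-b leads on the other two tasks), which is why detailed per-task metrics are worth checking alongside any overall rank.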
Can I submit my own model?
Yes, you can submit your Persian LLM for evaluation. Follow the submission guidelines provided on the platform to include your model on the leaderboard.