Persian Text Embedding Benchmark
The PTEB (Persian Text Embedding Benchmark) Leaderboard is a benchmarking platform for evaluating and comparing Persian text embedding models. It assesses how well these models handle Persian language tasks, which makes it useful for NLP researchers and developers. Users can view and analyze each model's results across different metrics and datasets.
• Comprehensive Benchmarking: Evaluates models on multiple Persian language tasks and datasets.
• Model Comparison: Enables side-by-side comparison of different embedding models.
• Customizable Metrics: Supports a variety of evaluation metrics tailored for Persian text.
• Interactive Visualizations: Presents results in easy-to-understand charts and graphs.
• Regular Updates: Maintains up-to-date results as new models are released.
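As a rough illustration of the model comparison described above, the following minimal sketch ranks models by their average score across tasks. The model names, task names, and score values are hypothetical placeholders, not real PTEB results, and averaging per-task scores is an assumption rather than the leaderboard's documented ranking method.

```python
from statistics import mean

# Hypothetical per-task scores (placeholder values, not real PTEB results).
scores = {
    "model-a": {"classification": 0.70, "retrieval": 0.50, "sts": 0.60},
    "model-b": {"classification": 0.65, "retrieval": 0.65, "sts": 0.65},
}


def rank_models(scores):
    """Return (model, average score) pairs sorted from best to worst."""
    averages = {name: mean(per_task.values()) for name, per_task in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_models(scores)
for position, (name, avg) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {avg:.3f}")
```

A single average hides per-task differences: here the hypothetical model-a scores higher on classification but lower overall, which is why a leaderboard typically shows per-task columns alongside any aggregate.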
What is the purpose of the PTEB Leaderboard?
The PTEB Leaderboard is designed to provide standardized benchmarks for Persian text embedding models, helping researchers and developers identify top-performing models for their specific use cases.
Can I add my own model to the leaderboard?
Yes, the PTEB Leaderboard allows submissions of new models. Visit the official documentation for guidelines on how to prepare and submit your model for evaluation.
How often are the benchmarks updated?
The benchmarks are updated regularly as new models are released and existing models are fine-tuned. Check the leaderboard for the latest results.