Track, rank and evaluate open LLMs and chatbots
The Open LLM Leaderboard is a platform for tracking, ranking, and evaluating open-source Large Language Models (LLMs) and chatbots. It lets users compare model performance across benchmarks and use cases, giving a transparent view of what each open-source model can do and helping users choose a model for their specific needs.
Key features:
• Model Tracking: Continuously updated list of open-source LLMs and chatbots
• Performance Benchmarking: Standardized tests to evaluate models on various tasks
• Custom Comparisons: Ability to compare models based on specific criteria
• Community Contributions: Input from the community to ensure diverse perspectives
• Regular Updates: New models and benchmark results added periodically
What types of models are included on the Open LLM Leaderboard?
The leaderboard includes a wide range of open-source Large Language Models and chatbots, covering various architectures and use cases.
How are the models ranked?
Models are ranked based on their performance on standardized benchmarks, which evaluate tasks such as text generation, question answering, and conversational dialogue.
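As a rough illustration of benchmark-based ranking, the sketch below sorts models by their mean score across benchmarks. The model names, benchmark names, scores, and the choice of a simple unweighted average are all assumptions made for the example; the leaderboard's actual scoring method may differ.

```python
from statistics import mean

# Hypothetical scores (0-100) per benchmark; names and numbers are illustrative only.
scores = {
    "model-a": {"text_generation": 62.0, "question_answering": 71.0, "dialogue": 65.0},
    "model-b": {"text_generation": 70.0, "question_answering": 64.0, "dialogue": 73.0},
    "model-c": {"text_generation": 55.0, "question_answering": 58.0, "dialogue": 61.0},
}


def rank_models(scores):
    """Return (model, average) pairs sorted from highest to lowest average."""
    # Assumption: every benchmark carries equal weight in the average.
    averages = {name: mean(results.values()) for name, results in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_models(scores)
for position, (name, average) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {average:.1f}")
```

An unweighted mean is the simplest aggregation; a weighted average or per-benchmark normalization would slot into `rank_models` without changing the sorting step.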
Can I contribute to the Open LLM Leaderboard?
Yes, the leaderboard encourages community contributions, including suggestions for new models, benchmarks, or features. Visit the website for details on how to participate.