Uncensored General Intelligence Leaderboard
VLMEvalKit Evaluation Results Collection
What happened in open-source AI this year, and what's next?
Calculate VRAM requirements for running large language models
Explore and analyze RewardBench leaderboard data
Search and save datasets generated with an LLM in real time
NSFW Text Generator for Detecting NSFW Text
More advanced and challenging multi-task evaluation
Check your progress in a Deep RL course
Need to analyze data? Let a Llama-3.1 agent do it for you!
Try the Hugging Face API through the playground
Explore and compare LLMs through interactive leaderboards and submissions
Browse and filter LLM benchmark results
Display and analyze PyTorch Image Models leaderboard
Generate synthetic dataset files (JSON Lines)
Embed and use ZeroEval for evaluation tasks
Browse and submit evaluation results for AI benchmarks
Explore token probability distributions with sliders
https://huggingface.co/spaces/VIDraft/mouse-webgen
Launch Argilla for data labeling and annotation