Evaluation Guidebook
Display evaluation metrics for LLM benchmarks
LLM evaluation
A space to view and inspect all the tasks in lighteval
Explore and discover all leaderboards from the HF community
Display and inspect log files
Launch and monitor model evaluation jobs
Generate a command to run model evaluations
Compare tokenization lengths across languages