llm-leaderboard

Here are 8 public repositories matching this topic...

VILA-Lab / Open-LLM-Leaderboard

Open-LLM-Leaderboard: Open-Style Question Evaluation. Paper at https://arxiv.org/abs/2406.07545

leaderboard llms open-ended-question-marker llm-evaluation open-ended-evaluation llm-leaderboard

Updated Jun 27, 2024
Python

AgileWoW / best-ai-models-leaderboard

A technical guide and live-tracking repository for the world's top AI models, specialized by coding, reasoning, and multimodal performance.

llm-leaderboard multimodal-ai ai-benchmark chatbot-arena agentic-coding swe-bench swe-bench-pro

Updated Feb 26, 2026

tatn / awesome-ai-benchmarks

A curated collection of AI model benchmarks and leaderboards — covering general rankings, coding, agents, reasoning, embeddings, and more

benchmark machine-learning awesome ai leaderboard embeddings speech-recognition awesome-list ai-benchmarks llm coding-agents llm-leaderboard

Updated Apr 18, 2026

georgejeffers / Wordle-AI-Benchmark

WordleBench — Deterministic AI Wordle benchmark. Compare 34+ LLMs (GPT-5, Claude 4.5, Gemini, Grok, Llama) head-to-head on accuracy, speed, and cost across 50 standardized words.

typescript nextjs gemini wordle language-models claude wordle-solver gpt-5 vercel-ai-sdk llm-leaderboard ai-benchmark llm-benchmark ai-comparison deterministic-testing

Updated Feb 6, 2026
TypeScript

LARIkoz / ai-model-benchmarks

119 AI models × 55 benchmarks with per-score freshness dates, auto-updated pricing, task routing. Every score has a date and source URL. Daily CI.

embeddings gemini model-selection benchmarks awesome-list gpt ai-agents claude model-comparison ai-pipeline ai-benchmarks ai-models llm openrouter llm-pricing llm-leaderboard llm-routing model-routing

Updated May 21, 2026
HTML

leoncuhk / awesome-llm-bench

Daily-synced Top 10 LLM leaderboards (SWE-bench Verified, Terminal-Bench, OSWorld, ARC-AGI-2, HLE) from benchlm.ai, plus a curated AI coding tools landscape.

benchmark leaderboard awesome-list ai-agents awesome-llm llm-leaderboard ai-coding-tools arc-agi coding-agent swe-bench llm-benchmark

Updated May 21, 2026
Python

butiploka / mimo-bench-arena

Open LLM leaderboard featuring Xiaomi MiMo v2.5 & MiMo 100T head-to-head with GPT-5, Claude, Gemini, DeepSeek, Llama 4. ARC-AGI · SWE-Bench · MMLU-Pro · GPQA · HumanEval · BFCL.

Updated May 18, 2026
TypeScript

PsychoSmiley / RP-Leaderboard

E/RP benchmark leaderboard for LLMs

machine-learning erp leaderboard lms roleplay large-language-models open-ended-question-marker llm-evaluation open-ended-evaluation llm-leaderboard

Updated Feb 2, 2026
Python
