Large Language Model Rankings

LLM Capability Leaderboard

Compare leading language models by coding, writing, reasoning, math, and multimodal capability.

Snapshot updated 2026-05-20. Scores are AI Explorer's normalized 0-100 ratings, derived from public leaderboard signals.

Data source

Model data is compiled from public leaderboards, provider information, and curated capability signals.

Ranking logic

Scores normalize public signals across coding, writing, reasoning, math, and multimodal capability.
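
The page does not publish the formula behind these scores, so the following is only a minimal sketch of one plausible approach: min-max normalize each category's raw signal to 0-100, then average the five categories per model. The function names, the equal weighting, and the placeholder model names are all assumptions made for illustration, not AI Explorer's actual method or data.

```python
from statistics import mean

# Categories named on the page; equal weighting is an assumption.
CATEGORIES = ("coding", "writing", "reasoning", "math", "multimodal")


def min_max_normalize(values):
    """Rescale one category's raw signals to the 0-100 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All models tie on this signal, so give each the same score.
        return {name: 100.0 for name in values}
    return {name: 100.0 * (v - lo) / (hi - lo) for name, v in values.items()}


def overall_scores(raw):
    """raw maps category -> {model: raw_signal}; returns model -> 0-100 score."""
    per_category = {cat: min_max_normalize(raw[cat]) for cat in CATEGORIES}
    models = raw[CATEGORIES[0]].keys()
    return {
        m: round(mean(per_category[cat][m] for cat in CATEGORIES), 1)
        for m in models
    }


def rank(scores):
    """Sort models by overall score, highest first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical raw signals for placeholder models (not real leaderboard data).
raw = {cat: {"model_a": 10, "model_b": 20, "model_c": 15} for cat in CATEGORIES}
print(rank(overall_scores(raw)))
```

A real pipeline would likely weight categories differently and handle models that are missing a signal in some category; this sketch assumes every model has a value for every category.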

Best for

Use the LLM page to compare model strengths before choosing a provider or model family.

Overall: 18 models

#1

GPT-5.5

OpenAI · Proprietary

OpenAI frontier work model for complex real-world tasks, agentic coding, research, data analysis, and cross-tool execution.

Agentic coding · Real-world work · Research workflows
Overall: 99
Context: 1M+
Release: 2026-04

#2

Gemini 3.1 Pro

Google · Proprietary

General model with strong long-context, multimodal, and reasoning capability.

Long-context analysis · Multimodal reasoning · General task quality
Overall: 98
Context: 1M+
Release: 2026

#3

GPT-5.4

OpenAI · Proprietary

Frontier general model for complex reasoning, coding, tool use, and professional writing.

Software engineering · Reasoning workflows · Structured writing
Overall: 97
Context: 1M+
Release: 2026

#4

Claude Opus 4.7

Anthropic · Proprietary

Premium model with strong writing, code review, and agentic workflow behavior.

Writing quality · Code review · Agent workflows
Overall: 96
Context: 200K+
Release: 2026

#5

GLM-5.1

Z.AI · Open weights

Z.AI flagship model for long-horizon agent tasks, real-world engineering delivery, coding, and complex reasoning.

Agentic coding · Long-horizon tasks · Tool use
Overall: 95
Context: 200K
Release: 2026-04

#6

DeepSeek V4

DeepSeek · Open weights

DeepSeek next-generation V4 family model for long-context work, coding, math reasoning, and cost-efficient production APIs.

Long-context reasoning · Coding · Cost-efficient inference
Overall: 94
Context: 1M
Release: 2026-04

#7

GLM-5

Z.AI · Open weights

Z.AI GLM-5 foundation model for general reasoning, writing, coding, and agentic engineering workflows.

General reasoning · Agent workflows · Coding
Overall: 94
Context: 200K+
Release: 2026

#8

GPT-5.3 Codex

OpenAI · Proprietary

Coding-optimized model for repository-scale editing, testing, and debugging.

Repository-scale coding · Debugging · Tool use
Overall: 94
Context: 1M+
Release: 2026

#9

Grok 4.1 Thinking

xAI · Proprietary

Thinking model for complex Q&A and fresh-information workflows.

Reasoning mode · Conversational tasks · Fresh knowledge workflows
Overall: 93
Context: 256K+
Release: 2026

#10

DeepSeek V4 Flash

DeepSeek · Open weights

Low-latency DeepSeek V4 variant for high-throughput, cost-sensitive, and real-time product scenarios.

Fast inference · Low cost · Long context
Overall: 92
Context: 1M
Release: 2026-04

#11

Qwen 3 Max

Alibaba · Proprietary

Balanced multilingual and coding model with broad API availability.

Multilingual work · Coding · Cost-sensitive apps
Overall: 90
Context: 1M
Release: 2026

#12

DeepSeek R2

DeepSeek · Open weights

Open-weight reasoning model focused on math, coding, and cost-efficient deployment.

Math reasoning · Open deployment · Coding tasks
Overall: 89
Context: 128K+
Release: 2026

#13

GLM-4.6

Z.AI · Open weights

Z.AI GLM-4.6 improves real-world coding, long-context processing, reasoning, search, writing, and agentic applications.

Real-world coding · Long context · Agent workflows
Overall: 89
Context: 200K
Release: 2025

#14

Kimi K2

Moonshot AI · Proprietary

General model with strong long-context and Chinese-language performance.

Long context · Chinese tasks · Research workflows
Overall: 88
Context: 1M+
Release: 2026

#15

Mistral Large 3

Mistral AI · Proprietary

European model for enterprise API, coding, and multilingual workloads.

Enterprise API · Multilingual work · Coding
Overall: 88
Context: 256K+
Release: 2026

#16

Llama 4 405B

Meta · Open weights

Large open-weight model for self-hosted enterprise and research workflows.

Open ecosystem · Self-hosting · General generation
Overall: 86
Context: 128K+
Release: 2026

#17

Yi Large

01.AI · Proprietary

Stable model for Chinese, writing, and general knowledge tasks.

Chinese writing · General knowledge · API apps
Overall: 84
Context: 200K+
Release: 2026

#18

Mixtral 8x22B

Mistral AI · Open weights

Open-weight MoE model for self-hosting and cost-sensitive inference.

Open weights · Self-hosting · Inference cost
Overall: 78
Context: 64K+
Release: 2024