Large Language Model Rankings

LLM Capability Leaderboard

Compare leading language models by coding, writing, reasoning, math, and multimodal capability.

Snapshot updated 2026-05-20. Scores are AI Explorer ratings, normalized to a 0-100 scale from public leaderboard signals.

Data source

Model data is organized from public leaderboards, provider information, and curated capability signals.

Ranking logic

Scores normalize public signals across coding, writing, reasoning, math, and multimodal capability.
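The page does not state how signals are normalized. As a minimal sketch of one common approach, the Python below rescales raw values to 0-100 with min-max scaling and then ranks models within a category. The scaling method, the function names `min_max_normalize` and `rank_by_category`, and the sample values are all assumptions for illustration, not AI Explorer's actual pipeline or data.

```python
def min_max_normalize(raw_scores):
    """Rescale raw values to 0-100 (assumed method, not confirmed by the page)."""
    lo, hi = min(raw_scores.values()), max(raw_scores.values())
    if hi == lo:
        # Every model ties on this signal, so place them all at the midpoint.
        return {name: 50.0 for name in raw_scores}
    return {
        name: 100.0 * (value - lo) / (hi - lo)
        for name, value in raw_scores.items()
    }


def rank_by_category(normalized, category):
    """Return model names ordered from highest to lowest score in a category."""
    scores = normalized[category]
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical raw benchmark values, for illustration only.
raw_math = {"model_a": 0.62, "model_b": 0.91, "model_c": 0.78}

normalized = {"math": min_max_normalize(raw_math)}
ranking = rank_by_category(normalized, "math")
```

Plain min-max scaling pins the weakest model at 0, so a leaderboard whose lowest listed score is well above 0 presumably uses fixed reference bounds or a different scheme.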

Best for

Use the LLM page to compare model strengths before choosing a provider or model family.

Math: 18 models

#1

GPT-5.5

OpenAI (Proprietary)

OpenAI frontier work model for complex real-world tasks, agentic coding, research, data analysis, and cross-tool execution.

Agentic coding, Real-world work, Research workflows
Math: 98
Context: 1M+
Release: 2026-04
#2

Gemini 3.1 Pro

Google (Proprietary)

General model with strong long-context, multimodal, and reasoning capability.

Long-context analysis, Multimodal reasoning, General task quality
Math: 97
Context: 1M+
Release: 2026
#3

DeepSeek V4

DeepSeek (Open weights)

DeepSeek next-generation V4 family model for long-context work, coding, math reasoning, and cost-efficient production APIs.

Long-context reasoning, Coding, Cost-efficient inference
Math: 96
Context: 1M
Release: 2026-04
#4

GPT-5.4

OpenAI (Proprietary)

Frontier general model for complex reasoning, coding, tool use, and professional writing.

Software engineering, Reasoning workflows, Structured writing
Math: 96
Context: 1M+
Release: 2026
#5

DeepSeek R2

DeepSeek (Open weights)

Open-weight reasoning model focused on math, coding, and cost-efficient deployment.

Math reasoning, Open deployment, Coding tasks
Math: 95
Context: 128K+
Release: 2026
#6

DeepSeek V4 Flash

DeepSeek (Open weights)

Low-latency DeepSeek V4 variant for high-throughput, cost-sensitive, and real-time product scenarios.

Fast inference, Low cost, Long context
Math: 94
Context: 1M
Release: 2026-04
#7

GLM-5.1

Z.AI (Open weights)

Z.AI flagship model for long-horizon agent tasks, real-world engineering delivery, coding, and complex reasoning.

Agentic coding, Long-horizon tasks, Tool use
Math: 94
Context: 200K
Release: 2026-04
#8

Grok 4.1 Thinking

xAI (Proprietary)

Thinking model for complex Q&A and fresh-information workflows.

Reasoning mode, Conversational tasks, Fresh knowledge workflows
Math: 94
Context: 256K+
Release: 2026
#9

Claude Opus 4.7

Anthropic (Proprietary)

Premium model with strong writing, code review, and agentic workflow behavior.

Writing quality, Code review, Agent workflows
Math: 93
Context: 200K+
Release: 2026
#10

GLM-5

Z.AI (Open weights)

Z.AI GLM-5 foundation model for general reasoning, writing, coding, and agentic engineering workflows.

General reasoning, Agent workflows, Coding
Math: 93
Context: 200K+
Release: 2026
#11

GPT-5.3 Codex

OpenAI (Proprietary)

Coding-optimized model for repository-scale editing, testing, and debugging.

Repository-scale coding, Debugging, Tool use
Math: 92
Context: 1M+
Release: 2026
#12

Qwen 3 Max

Alibaba (Proprietary)

Balanced multilingual and coding model with broad API availability.

Multilingual work, Coding, Cost-sensitive apps
Math: 91
Context: 1M
Release: 2026
#13

GLM-4.6

Z.AI (Open weights)

Z.AI GLM-4.6 improves real-world coding, long-context processing, reasoning, search, writing, and agentic applications.

Real-world coding, Long context, Agent workflows
Math: 88
Context: 200K
Release: 2025
#14

Mistral Large 3

Mistral AI (Proprietary)

European model for enterprise API, coding, and multilingual workloads.

Enterprise API, Multilingual work, Coding
Math: 86
Context: 256K+
Release: 2026
#15

Kimi K2

Moonshot AI (Proprietary)

General model with strong long-context and Chinese-language performance.

Long context, Chinese tasks, Research workflows
Math: 85
Context: 1M+
Release: 2026
#16

Llama 4 405B

Meta (Open weights)

Large open-weight model for self-hosted enterprise and research workflows.

Open ecosystem, Self-hosting, General generation
Math: 82
Context: 128K+
Release: 2026
#17

Yi Large

01.AI (Proprietary)

Stable model for Chinese-language, writing, and general knowledge tasks.

Chinese writing, General knowledge, API apps
Math: 82
Context: 200K+
Release: 2026
#18

Mixtral 8x22B

Mistral AI (Open weights)

Open-weight MoE model for self-hosting and cost-sensitive inference.

Open weights, Self-hosting, Inference cost
Math: 75
Context: 64K+
Release: 2024
