Large Language Model Rankings

LLM Capability Leaderboard

Compare leading language models by coding, writing, reasoning, math, and multimodal capability.

Snapshot updated 2026-07-04. Scores are AI Explorer ratings, normalized to a 0-100 scale from public leaderboard signals.

Data source

Model data is organized from public leaderboards, provider information, and curated capability signals.

Ranking logic

Scores normalize public signals across coding, writing, reasoning, math, and multimodal capability.
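
The page does not say how signals are normalized, so the following is only an illustrative sketch. It assumes min-max scaling per source followed by an equal-weight average; both choices, along with the function names and the placeholder data, are assumptions rather than the site's documented method.

```python
from statistics import mean


def min_max_scale(values, lo=0.0, hi=100.0):
    """Linearly rescale values so the smallest maps to lo and the largest to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # All inputs are equal, so there is no spread to rescale.
        return [hi for _ in values]
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]


def combine_signals(signals):
    """Average per-source normalized scores for each model.

    signals maps a source name to a dict of model name -> raw score.
    Every source is assumed to cover the same set of models.
    """
    per_model = {}
    for raw in signals.values():
        models = list(raw)
        scaled = min_max_scale([raw[m] for m in models])
        for model, score in zip(models, scaled):
            per_model.setdefault(model, []).append(score)
    return {model: mean(scores) for model, scores in per_model.items()}


# Placeholder sources and models, invented purely for illustration.
example_signals = {
    "bench_x": {"model_a": 40, "model_b": 60, "model_c": 50},
    "bench_y": {"model_a": 1200, "model_b": 1300, "model_c": 1400},
}
print(combine_signals(example_signals))
```

Min-max scaling makes sources with different raw ranges comparable, but it is sensitive to outliers; a rank-based or z-score normalization would be an equally plausible alternative.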

Best for

Use the LLM page to compare model strengths before choosing a provider or model family.
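
As a rough sketch of that comparison step, the snippet below filters a handful of Coding entries copied from this page by license and minimum score. The `ModelEntry` type and the `shortlist` helper are hypothetical names introduced here for illustration; the page itself exposes no such API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    provider: str
    license_type: str
    coding: int


# A subset of the Coding entries listed on this page.
ENTRIES = [
    ModelEntry("GPT-5.5", "OpenAI", "Proprietary", 99),
    ModelEntry("Claude Opus 4.8", "Anthropic", "Proprietary", 97),
    ModelEntry("DeepSeek V4", "DeepSeek", "Open weights", 94),
    ModelEntry("GLM-5", "Z.AI", "Open weights", 91),
    ModelEntry("Llama 4 405B", "Meta", "Open weights", 84),
    ModelEntry("Mixtral 8x22B", "Mistral AI", "Open weights", 76),
]


def shortlist(entries, license_type=None, min_coding=0):
    """Return entries matching the license filter, sorted by coding score (highest first)."""
    picked = [
        e for e in entries
        if (license_type is None or e.license_type == license_type)
        and e.coding >= min_coding
    ]
    return sorted(picked, key=lambda e: e.coding, reverse=True)


open_models = shortlist(ENTRIES, license_type="Open weights", min_coding=90)
print([e.name for e in open_models])  # ['DeepSeek V4', 'GLM-5']
```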

Coding: 20 models


#1

GPT-5.5

OpenAI | Proprietary

OpenAI's frontier model for complex real-world work, including agentic coding, research, data analysis, and cross-tool execution.

Agentic coding, Real-world work, Research workflows

Coding: 99

Context: 1M+
Release: 2026-04

#2

GPT-5.4

OpenAI | Proprietary

Frontier general model for complex reasoning, coding, tool use, and professional writing.

Software engineering, Reasoning workflows, Structured writing

Coding: 98

Context: 1M+
Release: 2026

#3

Claude Opus 4.8

Anthropic | Proprietary

Anthropic's latest Opus-class model, with strong writing, code review, complex reasoning, and agentic workflow behavior.

Writing quality, Code review, Agent workflows

Coding: 97

Context: 200K+
Release: 2026-05

#4

Gemini 3.5 Pro

Google | Proprietary

Google's latest Pro-family general model for complex reasoning, long-context work, multimodal tasks, and high-quality generation.

Complex reasoning, Long-context analysis, Multimodal reasoning

Coding: 95

Context: 1M+
Release: 2026-05

#5

DeepSeek V4

DeepSeek | Open weights

DeepSeek's next-generation V4-family model for long-context work, coding, math reasoning, and cost-efficient production APIs.

Long-context reasoning, Coding, Cost-efficient inference

Coding: 94

Context: 1M
Release: 2026-04

#6

Gemini 3.1 Pro

Google | Proprietary

General model with strong long-context, multimodal, and reasoning capability.

Long-context analysis, Multimodal reasoning, General task quality

Coding: 93

Context: 1M+
Release: 2026

#7

DeepSeek V4 Flash

DeepSeek | Open weights

Low-latency DeepSeek V4 variant for high-throughput, cost-sensitive, and real-time product scenarios.

Fast inference, Low cost, Long context

Coding: 93

Context: 1M
Release: 2026-04

#8

Gemini 3.5 Flash

Google | Proprietary

Google's latest Flash-family model for low-latency, multimodal, long-context, and high-throughput applications.

Fast response, Multimodal workflows, Long-context analysis

Coding: 92

Context: 1M+
Release: 2026-05

#9

Qwen 3 Max

Alibaba | Proprietary

Balanced multilingual and coding model with broad API availability.

Multilingual work, Coding, Cost-sensitive apps

Coding: 92

Context: 1M
Release: 2026

#10

GLM-5

Z.AI | Open weights

Z.AI's GLM-5 foundation model for general reasoning, writing, coding, and agentic engineering workflows.

General reasoning, Agent workflows, Coding

Coding: 91

Context: 200K+
Release: 2026

#11

DeepSeek R2

DeepSeek | Open weights

Open-weight reasoning model focused on math, coding, and cost-efficient deployment.

Math reasoning, Open deployment, Coding tasks

Coding: 91

Context: 128K+
Release: 2026

#12

GLM-5.1

Z.AI | Open weights

Z.AI's flagship model for long-horizon agent tasks, real-world engineering delivery, coding, and complex reasoning.

Agentic coding, Long-horizon tasks, Tool use

Coding: 90

Context: 200K
Release: 2026-04

#13

Grok 4.1 Thinking

xAI | Proprietary

Thinking model for complex Q&A and fresh-information workflows.

Reasoning mode, Conversational tasks, Fresh knowledge workflows

Coding: 90

Context: 256K+
Release: 2026

#14

Mistral Large 3

Mistral AI | Proprietary

European model for enterprise API, coding, and multilingual workloads.

Enterprise API, Multilingual work, Coding

Coding: 88

Context: 256K+
Release: 2026

#15

GPT-5.3 Codex

OpenAI | Proprietary

Coding-optimized model for repository-scale editing, testing, and debugging.

Repository-scale coding, Debugging, Tool use

Coding: 86

Context: 1M+
Release: 2026

#16

Llama 4 405B

Meta | Open weights

Large open-weight model for self-hosted enterprise and research workflows.

Open ecosystem, Self-hosting, General generation

Coding: 84

Context: 128K+
Release: 2026

#17

GLM-4.6

Z.AI | Open weights

Z.AI's GLM-4.6 improves on real-world coding, long-context processing, reasoning, search, writing, and agentic applications.

Real-world coding, Long context, Agent workflows

Coding: 82

Context: 200K
Release: 2025

#18

Yi Large

01.AI | Proprietary

Stable model for Chinese-language, writing, and general-knowledge tasks.

Chinese writing, General knowledge, API apps

Coding: 80

Context: 200K+
Release: 2026

#19

Mixtral 8x22B

Mistral AI | Open weights

Open-weight MoE model for self-hosting and cost-sensitive inference.

Open weights, Self-hosting, Inference cost

Coding: 76

Context: 64K+
Release: 2024

#20

Kimi K2

Moonshot AI | Proprietary

General model with strong long-context and Chinese-language performance.

Long context, Chinese tasks, Research workflows

Coding: 74

Context: 1M+
Release: 2026

Leaderboard Sources

Anthropic Claude model docs
Artificial Analysis
Google Gemini model docs
Hugging Face Open LLM Leaderboard
LiveCodeBench
LMArena Chatbot Arena
SWE-bench