Large Language Model Rankings

LLM Capability Leaderboard

Compare leading language models by coding, writing, reasoning, math, and multimodal capability.

Snapshot updated 2026-07-04. Scores are AI Explorer ratings, normalized to a 0-100 scale from public leaderboard signals.

Data source

Model data is organized from public leaderboards, provider information, and curated capability signals.

Ranking logic

Scores normalize public signals across coding, writing, reasoning, math, and multimodal capability.
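
The page does not publish its exact formula, so the following is only a minimal sketch of one common way to normalize signals to a 0-100 scale. Min-max rescaling, the unweighted average, and the names `min_max_normalize` and `overall_score` are all assumptions made for illustration, not AI Explorer's actual method.

```python
from statistics import mean

# Capability categories named on the page.
CATEGORIES = ("coding", "writing", "reasoning", "math", "multimodal")


def min_max_normalize(raw: dict[str, float]) -> dict[str, float]:
    """Rescale one category's raw signals to 0-100 across models.

    Assumption: min-max scaling. The leaderboard's real method is not
    documented and may differ (for example, anchoring to fixed bounds).
    """
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # All models tie; give every model the top of the scale.
        return {name: 100.0 for name in raw}
    return {name: 100.0 * (value - lo) / (hi - lo) for name, value in raw.items()}


def overall_score(per_category: dict[str, float]) -> float:
    """Combine normalized category scores into one rating.

    Assumption: an unweighted mean over the five categories.
    """
    return round(mean(per_category[c] for c in CATEGORIES), 1)


# Hypothetical raw signals for three placeholder models in one category.
raw_coding = {"model_a": 10.0, "model_b": 20.0, "model_c": 30.0}
normalized_coding = min_max_normalize(raw_coding)
```

Plain min-max scaling always maps the weakest model to 0, which the published scores do not do, so the real pipeline presumably uses different bounds or weights.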

Best for

Use the LLM page to compare model strengths before choosing a provider or model family.

Overall: 20 models


#1

GPT-5.5

OpenAI, Proprietary

OpenAI's frontier work model for complex real-world tasks, agentic coding, research, data analysis, and cross-tool execution.

Agentic coding, Real-world work, Research workflows

Overall: 99

Context: 1M+
Release: 2026-04

#2

Gemini 3.5 Pro

Google, Proprietary

Google's latest Pro-family general model for complex reasoning, long-context work, multimodal tasks, and high-quality generation.

Complex reasoning, Long-context analysis, Multimodal reasoning

Overall: 98

Context: 1M+
Release: 2026-05

#3

Claude Opus 4.8

Anthropic, Proprietary

Anthropic's latest Opus-class model with strong writing, code review, complex reasoning, and agentic workflow behavior.

Writing quality, Code review, Agent workflows

Overall: 97

Context: 200K+
Release: 2026-05

#4

GPT-5.4

OpenAI, Proprietary

Frontier general model for complex reasoning, coding, tool use, and professional writing.

Software engineering, Reasoning workflows, Structured writing

Overall: 97

Context: 1M+
Release: 2026

#5

Gemini 3.1 Pro

Google, Proprietary

General model with strong long-context, multimodal, and reasoning capability.

Long-context analysis, Multimodal reasoning, General task quality

Overall: 96

Context: 1M+
Release: 2026

#6

DeepSeek V4

DeepSeek, Open weights

DeepSeek's next-generation V4-family model for long-context work, coding, math reasoning, and cost-efficient production APIs.

Long-context reasoning, Coding, Cost-efficient inference

Overall: 95

Context: 1M
Release: 2026-04

#7

GLM-5.1

Z.AI, Open weights

Z.AI's flagship model for long-horizon agent tasks, real-world engineering delivery, coding, and complex reasoning.

Agentic coding, Long-horizon tasks, Tool use

Overall: 95

Context: 200K
Release: 2026-04

#8

Gemini 3.5 Flash

Google, Proprietary

Google's latest Flash-family model for low-latency, multimodal, long-context, and high-throughput applications.

Fast response, Multimodal workflows, Long-context analysis

Overall: 94

Context: 1M+
Release: 2026-05

#9

GLM-5

Z.AI, Open weights

Z.AI's GLM-5 foundation model for general reasoning, writing, coding, and agentic engineering workflows.

General reasoning, Agent workflows, Coding

Overall: 94

Context: 200K+
Release: 2026

#10

GPT-5.3 Codex

OpenAI, Proprietary

Coding-optimized model for repository-scale editing, testing, and debugging.

Repository-scale coding, Debugging, Tool use

Overall: 94

Context: 1M+
Release: 2026

#11

Grok 4.1 Thinking

xAI, Proprietary

Thinking model for complex Q&A and fresh-information workflows.

Reasoning mode, Conversational tasks, Fresh knowledge workflows

Overall: 93

Context: 256K+
Release: 2026

#12

DeepSeek V4 Flash

DeepSeek, Open weights

Low-latency DeepSeek V4 variant for high-throughput, cost-sensitive, and real-time product scenarios.

Fast inference, Low cost, Long context

Overall: 92

Context: 1M
Release: 2026-04

#13

Qwen 3 Max

Alibaba, Proprietary

Balanced multilingual and coding model with broad API availability.

Multilingual work, Coding, Cost-sensitive apps

Overall: 90

Context: 1M
Release: 2026

#14

DeepSeek R2

DeepSeek, Open weights

Open-weight reasoning model focused on math, coding, and cost-efficient deployment.

Math reasoning, Open deployment, Coding tasks

Overall: 89

Context: 128K+
Release: 2026

#15

GLM-4.6

Z.AI, Open weights

Z.AI GLM-4.6 improves real-world coding, long-context processing, reasoning, search, writing, and agentic applications.

Real-world coding, Long context, Agent workflows

Overall: 89

Context: 200K
Release: 2025

#16

Kimi K2

Moonshot AI, Proprietary

General model with strong long-context and Chinese-language performance.

Long context, Chinese tasks, Research workflows

Overall: 88

Context: 1M+
Release: 2026

#17

Mistral Large 3

Mistral AI, Proprietary

European model for enterprise API, coding, and multilingual workloads.

Enterprise API, Multilingual work, Coding

Overall: 88

Context: 256K+
Release: 2026

#18

Llama 4 405B

Meta, Open weights

Large open-weight model for self-hosted enterprise and research workflows.

Open ecosystem, Self-hosting, General generation

Overall: 86

Context: 128K+
Release: 2026

#19

Yi Large

01.AI, Proprietary

Stable model for Chinese-language, writing, and general-knowledge tasks.

Chinese writing, General knowledge, API apps

Overall: 84

Context: 200K+
Release: 2026

#20

Mixtral 8x22B

Mistral AI, Open weights

Open-weight MoE model for self-hosting and cost-sensitive inference.

Open weights, Self-hosting, Inference cost

Overall: 78

Context: 64K+
Release: 2024

Leaderboard Sources

Anthropic Claude model docs, Artificial Analysis, Google Gemini model docs, Hugging Face Open LLM Leaderboard, LiveCodeBench, LMArena Chatbot Arena, SWE-bench