Claude Mythos Preview vs Claude Opus 4.6

Across 7 shared benchmarks, Claude Mythos Preview leads overall: it wins 7, Claude Opus 4.6 wins 0, and there are 0 ties. The average score difference is +9.68 points in favor of Claude Mythos Preview.

Claude Mythos Preview
Anthropic · 2026-04-07 · Chat model

Claude Opus 4.6
Anthropic · 2026-02-05 · Reasoning model

Claude Mythos Preview: 7 wins (100%) · Claude Opus 4.6: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 7 shared benchmarks.

AI Agent - Tool Usage

Claude Mythos Preview 2/2
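| Benchmark | Claude Mythos Preview | Rank | Mode | Claude Opus 4.6 | Rank | Mode | Diff |
|---|---|---|---|---|---|---|---|
| Terminal Bench 2.0 | 82 | 2 / 46 | Extended (with tools) | 65.40 | 11 / 46 | Extended (with tools) | +16.60 |
| OSWorld-Verified | 79.60 | 3 / 18 | Extended (with tools) | 72.70 | 9 / 18 | Extended (with tools) | +6.90 |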

Coding and Software Engineer

Claude Mythos Preview 2/2
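| Benchmark | Claude Mythos Preview | Rank | Mode | Claude Opus 4.6 | Rank | Mode | Diff |
|---|---|---|---|---|---|---|---|
| SWE-bench Multilingual | 87.30 | 1 / 20 | Extended (with tools) | 72 | 12 / 20 | Extended (with tools) | +15.30 |
| SWE-bench Verified | 93.90 | 3 / 108 | Extended (with tools) | 80.84 | 9 / 108 | Extended (with tools) | +13.06 |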

General Knowledge

Claude Mythos Preview 2/2
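| Benchmark | Claude Mythos Preview | Rank | Mode | Claude Opus 4.6 | Rank | Mode | Diff |
|---|---|---|---|---|---|---|---|
| HLE | 64.70 | 1 / 157 | Extended (with tools) | 53 | 11 / 157 | Extended (with tools, internet) | +11.70 |
| GPQA Diamond | 94.60 | 1 / 178 | Extended (no tools) | 91.31 | 14 / 178 | Extended (no tools) | +3.29 |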

AI Agent - Information Search

Claude Mythos Preview 1/1
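| Benchmark | Claude Mythos Preview | Rank | Mode | Claude Opus 4.6 | Rank | Mode | Diff |
|---|---|---|---|---|---|---|---|
| BrowseComp | 84.90 | 4 / 45 | Extended (with tools) | 84 | 7 / 45 | Thinking (With Tools + Internet) | +0.90 |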

Specs

| Field | Claude Mythos Preview | Claude Opus 4.6 |
|---|---|---|
| Publisher | Anthropic | Anthropic |
| Release date | 2026-04-07 | 2026-02-05 |
| Model type | Chat model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | Not available | 1000K |
| Max output | 8K | 64K |

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

| Item | Claude Mythos Preview | Claude Opus 4.6 |
|---|---|---|
| Text input | $25 / 1M tokens | $0.5 / 1M tokens |
| Text output | $125 / 1M tokens | $25 / 1M tokens |
| Cache read | Not public | $0.5 / 1M tokens |
| Cache write | Not public | $10 / 1M tokens |
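
To make the per-token prices easier to compare, here is a minimal sketch of a cost estimate for a single workload. It uses only the text input and text output prices exactly as listed in the pricing table on this page and ignores cache reads and writes. The `PRICES` dictionary, the `estimate_cost` function, and the token counts in the usage lines are hypothetical names and values chosen for illustration.

```python
# USD per 1M tokens, copied from the pricing table on this page.
# Cache pricing is left out because it is not public for both models.
PRICES = {
    "Claude Mythos Preview": {"input": 25.0, "output": 125.0},
    "Claude Opus 4.6": {"input": 0.5, "output": 25.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one workload (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000  # listed prices are per 1M tokens


if __name__ == "__main__":
    # Illustrative workload: 200K input tokens and 50K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 200_000, 50_000):.2f}")
```

With the listed prices, this illustrative workload comes to $11.25 for Claude Mythos Preview and $1.35 for Claude Opus 4.6.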

Summary

  • Claude Mythos Preview leads in: AI Agent - Tool Usage (2/2), Coding and Software Engineer (2/2), General Knowledge (2/2), AI Agent - Information Search (1/1)

On average across the 7 shared benchmarks, Claude Mythos Preview scores 9.68 points higher.

Largest single-benchmark gap: Terminal Bench 2.0, where Claude Mythos Preview scores 82 vs 65.40 for Claude Opus 4.6 (+16.60).
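
The summary figures can be reproduced from the per-benchmark scores. The sketch below is one way to do that, assuming the scores listed in the benchmark tables on this page. The `SCORES` list and the `summarize` function are hypothetical names, not part of any DataLearner tooling.

```python
# (benchmark, Claude Mythos Preview score, Claude Opus 4.6 score),
# copied from the benchmark tables on this page.
SCORES = [
    ("Terminal Bench 2.0", 82.0, 65.40),
    ("OSWorld-Verified", 79.60, 72.70),
    ("SWE-bench Multilingual", 87.30, 72.0),
    ("SWE-bench Verified", 93.90, 80.84),
    ("HLE", 64.70, 53.0),
    ("GPQA Diamond", 94.60, 91.31),
    ("BrowseComp", 84.90, 84.0),
]


def summarize(scores):
    """Count wins and ties and compute the mean and largest score gap."""
    # Round each difference to 2 decimals to avoid floating-point noise.
    diffs = {name: round(a - b, 2) for name, a, b in scores}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_mythos": sum(1 for d in diffs.values() if d > 0),
        "wins_opus": sum(1 for d in diffs.values() if d < 0),
        "ties": sum(1 for d in diffs.values() if d == 0),
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": (largest, diffs[largest]),
    }


if __name__ == "__main__":
    print(summarize(SCORES))
```

Run on these scores, it gives 7 wins to 0 with no ties, a mean difference of 9.68, and Terminal Bench 2.0 as the largest gap at 16.6, matching the figures in the summary.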

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.