Claude Mythos Preview vs Claude Opus 4.6

Across 7 shared benchmarks, Claude Mythos Preview leads overall: Claude Mythos Preview wins 7, Claude Opus 4.6 wins 0, with 0 ties and an average score difference of +9.68.

Claude Mythos Preview

Anthropic · 2026-04-07 · Chat model

Claude Opus 4.6

Anthropic · 2026-02-05 · Reasoning model

Claude Mythos Preview 7 wins (100%) · Claude Opus 4.6 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 7 shared benchmarks.

AI Agent - Tool Usage

Claude Mythos Preview 2/2

Benchmark	Claude Mythos Preview	Claude Opus 4.6	Diff
Terminal Bench 2.0	82 · rank 2 / 47 · Extended (with tools)	65.40 · rank 11 / 47 · Extended (with tools)	+16.60
OSWorld-Verified	79.60 · rank 7 / 24 · Extended (with tools)	72.70 · rank 15 / 24 · Extended (with tools)	+6.90

Coding and Software Engineer

Claude Mythos Preview 2/2

Benchmark	Claude Mythos Preview	Claude Opus 4.6	Diff
SWE-bench Multilingual	87.30 · rank 2 / 23 · Extended (with tools)	72 · rank 14 / 23 · Extended (with tools)	+15.30
SWE-bench Verified	93.90 · rank 4 / 112 · Extended (with tools)	80.84 · rank 10 / 112 · Extended (with tools)	+13.06

General Knowledge

Claude Mythos Preview 2/2

Benchmark	Claude Mythos Preview	Claude Opus 4.6	Diff
HLE	64.70 · rank 1 / 172 · Extended (with tools)	53 · rank 18 / 172 · Extended (with tools, internet)	+11.70
GPQA Diamond	94.60 · rank 1 / 187 · Extended (no tools)	91.31 · rank 15 / 187 · Extended (no tools)	+3.29

AI Agent - Information Search

Claude Mythos Preview 1/1

Benchmark	Claude Mythos Preview	Claude Opus 4.6	Diff
BrowseComp	84.90 · rank 6 / 53 · Extended (with tools)	84 · rank 11 / 53 · Thinking (With Tools + Internet)	+0.90
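
The grouping described above (by capability, largest gap first within each group) can be reproduced with a short Python sketch. The scores are copied from the tables above. The names `ROWS` and `group_by_capability` are illustrative, not part of any DataLearner tooling, and sorting by the absolute gap is an assumption about how the page orders rows.

```python
from collections import defaultdict

# (capability, benchmark, Claude Mythos Preview score, Claude Opus 4.6 score),
# copied from the benchmark tables on this page.
ROWS = [
    ("AI Agent - Tool Usage", "OSWorld-Verified", 79.60, 72.70),
    ("AI Agent - Tool Usage", "Terminal Bench 2.0", 82.00, 65.40),
    ("Coding and Software Engineer", "SWE-bench Verified", 93.90, 80.84),
    ("Coding and Software Engineer", "SWE-bench Multilingual", 87.30, 72.00),
    ("General Knowledge", "GPQA Diamond", 94.60, 91.31),
    ("General Knowledge", "HLE", 64.70, 53.00),
    ("AI Agent - Information Search", "BrowseComp", 84.90, 84.00),
]


def group_by_capability(rows):
    """Group rows by capability and sort each group by largest gap first."""
    groups = defaultdict(list)
    for capability, benchmark, score_a, score_b in rows:
        # Round to two decimals to avoid floating-point noise in the diff.
        groups[capability].append((benchmark, round(score_a - score_b, 2)))
    for entries in groups.values():
        # Assumption: "largest gap" means largest absolute difference.
        entries.sort(key=lambda entry: abs(entry[1]), reverse=True)
    return dict(groups)


if __name__ == "__main__":
    for capability, entries in group_by_capability(ROWS).items():
        print(capability)
        for benchmark, diff in entries:
            print(f"  {benchmark}: {diff:+.2f}")
```

The input rows are deliberately listed out of order so that the sort step does visible work.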

Specs

Field	Claude Mythos Preview	Claude Opus 4.6
Publisher	Anthropic	Anthropic
Release date	2026-04-07	2026-02-05
Model type	Chat model	Reasoning model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	Not available	1000K
Max output	8K	64K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Claude Mythos Preview	Claude Opus 4.6
Text input	$25 / 1M tokens	$0.5 / 1M tokens
Text output	$125 / 1M tokens	$25 / 1M tokens
Cache read	Not public	$0.5 / 1M tokens
Cache write	Not public	$10 / 1M tokens
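
As a rough illustration of what per-1M-token prices mean for a single request, the sketch below estimates cost from input and output token counts. It uses the text input and output prices exactly as listed in the table above. It assumes simple linear billing and ignores caching, batch discounts, and any long-context surcharges. `PRICES` and `estimate_cost` are hypothetical names introduced for this example.

```python
# USD per 1M tokens, copied as listed in the pricing table on this page.
PRICES = {
    "Claude Mythos Preview": {"input": 25.0, "output": 125.0},
    "Claude Opus 4.6": {"input": 0.5, "output": 25.0},
}

TOKENS_PER_UNIT = 1_000_000


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate request cost in USD, assuming linear per-token billing."""
    price = PRICES[model]
    input_cost = input_tokens / TOKENS_PER_UNIT * price["input"]
    output_cost = output_tokens / TOKENS_PER_UNIT * price["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request size chosen only for illustration.
    for model in PRICES:
        cost = estimate_cost(model, input_tokens=200_000, output_tokens=10_000)
        print(f"{model}: ${cost:.2f}")
```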

Summary

Claude Mythos Preview leads in: AI Agent - Tool Usage (2/2), Coding and Software Engineer (2/2), General Knowledge (2/2), AI Agent - Information Search (1/1)

On average across the 7 shared benchmarks, Claude Mythos Preview scores 9.68 higher.

Largest single-benchmark gap: Terminal Bench 2.0 — Claude Mythos Preview 82 vs Claude Opus 4.6 65.40 (+16.60).
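
The summary figures above follow directly from the Diff column of the benchmark tables. A minimal Python sketch of that arithmetic is shown here. The name `summarize` and the returned dictionary keys are illustrative, and treating a positive diff as a win for Claude Mythos Preview is an assumption based on the sign convention of the Diff column.

```python
# Diff column (Claude Mythos Preview minus Claude Opus 4.6) from this page.
DIFFS = {
    "Terminal Bench 2.0": 16.60,
    "OSWorld-Verified": 6.90,
    "SWE-bench Multilingual": 15.30,
    "SWE-bench Verified": 13.06,
    "HLE": 11.70,
    "GPQA Diamond": 3.29,
    "BrowseComp": 0.90,
}


def summarize(diffs):
    """Count wins and ties, average the diffs, and find the largest gap."""
    values = list(diffs.values())
    return {
        "mythos_wins": sum(1 for d in values if d > 0),
        "opus_wins": sum(1 for d in values if d < 0),
        "ties": sum(1 for d in values if d == 0),
        # Arithmetic mean of the per-benchmark differences, two decimals.
        "average_diff": round(sum(values) / len(values), 2),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


if __name__ == "__main__":
    for key, value in summarize(DIFFS).items():
        print(f"{key}: {value}")
```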

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
