GPT-5.4 nano Benchmark Details

GPT-5.4 nano's currently highlighted benchmark results are LiveBench in Deep Thinking Mode (score 70.13, rank 38 / 115), Claw Bench (score 89.70, rank 10 / 29), and GPQA Diamond (score 82.80, rank 63 / 179).
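The page does not state how these three results were selected. One reading that happens to reproduce them, offered as an assumption rather than the site's documented method, is to order every entry by rank divided by leaderboard size (lower is better). The sketch below uses the rank/total pairs listed on this page; the names `RESULTS`, `relative_rank`, and `top_placements` are hypothetical helpers introduced only for illustration.

```python
# (benchmark, mode, rank, total) copied from the rank/total values on this page.
RESULTS = [
    ("GPQA Diamond", "Extra-High", 63, 179),
    ("LiveBench", "Standard Mode", 115, 115),
    ("LiveBench", "Low", 96, 115),
    ("LiveBench", "Medium", 75, 115),
    ("LiveBench", "High", 57, 115),
    ("LiveBench", "Deep Thinking Mode", 38, 115),
    ("HLE", "Extra-High", 92, 159),
    ("HLE", "Extra-High (Tools)", 57, 159),
    ("MMMU", "Extra-High", 26, 28),
    ("MMMU", "Extra-High (Tools)", 24, 28),
    ("FrontierMath - Tier 4", "High", 35, 80),
    ("SWE-Bench Pro - Public", "Extra-High (Tools)", 27, 44),
    ("τ²-Bench - Telecom", "Extra-High (Tools)", 19, 35),
    ("Terminal Bench 2.0", "Extra-High (Tools)", 40, 46),
    ("OSWorld-Verified", "Extra-High (Tools)", 17, 18),
    ("Tool Decathlon", "Extra-High (Tools)", 6, 7),
    ("Claw Bench", "Thinking Enabled (Tools)", 10, 29),
]


def relative_rank(rank: int, total: int) -> float:
    """Rank as a fraction of the leaderboard size; lower means a better placement."""
    return rank / total


def top_placements(results, n=3):
    """Return the n entries with the lowest rank/total ratio (assumed ordering)."""
    return sorted(results, key=lambda r: relative_rank(r[2], r[3]))[:n]


if __name__ == "__main__":
    for name, mode, rank, total in top_placements(RESULTS):
        print(f"{name} ({mode}): {rank} / {total} = {relative_rank(rank, total):.3f}")
```

Raw scores are not comparable across benchmarks (a 6.30 on FrontierMath - Tier 4 and an 89.70 on Claw Bench sit on different scales), which is why this sketch compares placements rather than scores.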

Benchmark Results

General Knowledge

8 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| GPQA Diamond | Extra-High | 82.80 | 63 / 179 |
| LiveBench | Standard Mode | 32.39 | 115 / 115 |
| LiveBench | Low | 48.67 | 96 / 115 |
| LiveBench | Medium | 58.46 | 75 / 115 |
| LiveBench | High | 62.75 | 57 / 115 |
| LiveBench | Deep Thinking Mode | 70.13 | 38 / 115 |
| HLE | Extra-High | 24.30 | 92 / 159 |
| HLE | Extra-High (Tools) | 37.70 | 57 / 159 |

Multimodal Understanding

2 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| MMMU | Extra-High | 66.10 | 26 / 28 |
| MMMU | Extra-High (Tools) | 69.50 | 24 / 28 |

Math and Reasoning

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| FrontierMath - Tier 4 | High | 6.30 | 35 / 80 |

Coding and Software Engineering

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| SWE-Bench Pro - Public | Extra-High (Tools) | 52.40 | 27 / 44 |

Agent Level Benchmark

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| τ²-Bench - Telecom | Extra-High (Tools) | 92.50 | 19 / 35 |

AI Agent - Tool Usage

3 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| Terminal Bench 2.0 | Extra-High (Tools) | 46.30 | 40 / 46 |
| OSWorld-Verified | Extra-High (Tools) | | 17 / 18 |
| Tool Decathlon | Extra-High (Tools) | 35.50 | 6 / 7 |

Claw-style Agent Evaluation

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| Claw Bench | Thinking Enabled (Tools) | 89.70 | 10 / 29 |
