Comparing GPT-5.4 and Claude Opus 4.6: LLM benchmark results