Comparing GPT-5.5 and GPT-5.4 - LLM benchmark results