[AI] STORY TIMELINE

LATEST AI MODELS FAIL ON REASONING TASKS

Analysis of OpenAI's GPT-5.5 and Anthropic's Opus 4.7 on the ARC-AGI-3 benchmark reveals three systematic reasoning errors that keep both models below 1 percent accuracy on tasks humans solve routinely.

1 SOURCE · FIRST SEEN MAY 2, 01:31 PM
The Decoder (+0m)

The ARC Prize Foundation analyzed 160 game runs of OpenAI's GPT-5.5 and Anthropic's Opus 4.7 on the ARC-AGI-3 benchmark.…