GOOGLE'S GEMINI-SQL2 DOMINATES TEXT-TO-SQL BENCHMARKS
■ AI-SUMMARIZED FROM 1 SOURCE
Google Research's Gemini-SQL2 achieves 80.04% accuracy on the BIRD benchmark, significantly outperforming competing systems from OpenAI and Anthropic. The system converts natural-language queries into executable SQL.
■ MORE FROM THE AI DESK
The New Yorker's profile of OpenAI CEO Sam Altman featured an AI-generated illustration, raising questions about whether AI coverage should rely on AI tools.
A new AI model called "Count Anything" can identify and count objects in any image using only text prompts, halving error rates compared to existing systems. The breakthrough addresses a persistent challenge in computer vision, though dense crowds and ambiguous terms still pose problems.
Major AI systems from Google, OpenAI, Anthropic, and xAI perform poorly when predicting Premier League match outcomes. xAI's Grok shows particularly weak performance.
A tester who previously dismissed Siri and Apple Intelligence is reconsidering after 24 hours with the redesigned Siri AI in macOS 27 Golden Gate's developer beta.