GOOGLE'S GEMINI-SQL2 LEADS BIRD TEXT-TO-SQL BENCHMARK

AI DESK ■ 1 MIN READ

SAT, JUN 13, 2026

■ AI-SUMMARIZED FROM 1 SOURCE

Google Research's Gemini-SQL2 achieves 80.04% accuracy on the BIRD benchmark, significantly outperforming competitors from OpenAI and Anthropic. The system converts natural language queries into executable SQL code.

Gemini-SQL2 is built on Gemini 3.1 Pro, and its 80.04% BIRD score puts it ahead of existing text-to-SQL systems.

Text-to-SQL systems translate a plain-language request directly into a SQL query, so non-technical users can query complex databases conversationally without knowing SQL.

Google Research indicates the advance could improve natural language capabilities across its data services portfolio. The result also suggests potential applications in enterprise data analytics, business intelligence platforms, and consumer-facing database tools. Google has not announced deployment timelines or product integration plans for Gemini-SQL2.

■ SOURCES

► The Decoder

■ SUMMARY WRITTEN BY AI FROM THE LINKS ABOVE

