Google Unveils Gemini Deep Think AI: A Revolutionary Reasoning Model for Concurrent Idea Evaluation

Google DeepMind is rolling out Gemini 2.5 Deep Think, which the company describes as its most advanced AI reasoning model. The model answers questions by exploring multiple ideas in parallel and then using those results to choose the best answer.

As of Friday, those subscribed to Google’s $250 monthly Ultra plan will gain access to Gemini 2.5 Deep Think via the Gemini app.

First unveiled in May at Google I/O 2025, Gemini 2.5 Deep Think is Google’s first publicly available multi-agent model. Multi-agent systems run several AI agents on a question at the same time, which uses far more computing resources than a single agent but tends to produce better answers.
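
Google has not published how Deep Think coordinates its agents, so the following is only a toy sketch of the general idea of running several agents in parallel and keeping the best candidate. Everything in it is an assumption made for illustration: the `make_agent` and `answer_in_parallel` helpers are hypothetical, and the fixed scores stand in for whatever ranking a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Hypothetical stand-in for one agent: takes a question and
# returns (candidate answer, score for that answer).
Agent = Callable[[str], Tuple[str, float]]


def make_agent(name: str, score: float) -> Agent:
    """Build a toy agent that always reports the same fixed score (illustrative only)."""
    def agent(question: str) -> Tuple[str, float]:
        return (f"{name}'s answer to: {question}", score)
    return agent


def answer_in_parallel(question: str, agents: List[Agent]) -> str:
    """Run every agent on the question concurrently and keep the top-scoring answer."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        candidates = list(pool.map(lambda a: a(question), agents))
    best_answer, _ = max(candidates, key=lambda c: c[1])
    return best_answer


agents = [make_agent("A", 0.4), make_agent("B", 0.9), make_agent("C", 0.7)]
best = answer_in_parallel("What is 2 + 2?", agents)
print(best)
```

Even in this toy version, every agent does a full pass over the question, so the work grows with the number of agents. That is the trade-off the article describes: more compute in exchange for a better chance that one candidate is strong.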

A version of Gemini 2.5 Deep Think contributed to a gold medal victory at this year’s International Math Olympiad (IMO).

Moreover, Google plans to share the model used in the IMO with a select group of mathematicians and researchers. The company notes that this AI model “requires hours to reason,” unlike the seconds or minutes typical of most consumer AI models. Google seeks feedback from the research community to further refine the multi-agent framework for academic purposes.

Google says the Gemini 2.5 Deep Think model is a significant improvement over the version it announced at I/O. The company has also introduced “innovative reinforcement learning techniques” to bolster the model’s reasoning abilities.

“Deep Think can assist users in addressing problems that necessitate creativity, strategic planning, and iterative advancements,” Google mentioned in a blog post shared with TechCrunch.

The company says Gemini 2.5 Deep Think achieves top-tier performance on Humanity’s Last Exam (HLE), a challenging test of an AI’s ability to answer thousands of crowdsourced questions across disciplines such as mathematics, humanities, and science. Google claims its model scored 34.8% on HLE (without tools), compared with 25.4% for xAI’s Grok 4 and 20.3% for OpenAI’s o3.

Google further states that Gemini 2.5 Deep Think surpasses models from OpenAI, xAI, and Anthropic on LiveCodeBench 6, a rigorous test of competitive coding tasks. Google’s model scored 87.6%, while Grok 4 scored 79% and OpenAI’s o3 scored 72%.

Benchmark scores. Image Credits: Google

Gemini 2.5 Deep Think works with tools such as code execution and Google Search, and the company says it can produce “much longer responses” than standard AI models.

In Google’s tests, the model produced more detailed and visually appealing results on web development tasks than competing AI models. The company claims this model could assist researchers and “potentially speed up the path to discovery.”

Art scenes created by Google’s AI (Credit: Google)

It seems multiple top AI labs are leaning towards the multi-agent approach.

Elon Musk’s xAI recently launched its own multi-agent system, Grok 4 Heavy, which it claims achieves industry-leading performance on several benchmarks. OpenAI’s Noam Brown mentioned in a podcast that the company’s unreleased AI model, which contributed to a gold medal victory at the recent International Math Olympiad (IMO), was also built on a multi-agent framework. Similarly, Anthropic’s Research agent, which produces detailed research briefs, operates on a multi-agent basis.

Despite their strong performance, multi-agent systems appear to be significantly more costly to run than standard AI models. As a result, tech companies may choose to limit access to these systems to premium subscription plans, a strategy already adopted by both xAI and Google.

In the coming weeks, Google plans to offer Gemini 2.5 Deep Think to a select group of testers through the Gemini API, to learn how developers and businesses might use its multi-agent system.