Google has launched its new Gemini 3.1 Pro model, which will take over as the default model in the Gemini app and NotebookLM. The company says the new model is designed for complex problem-solving and advanced reasoning tasks.
Alphabet CEO Sundar Pichai, who is in India for the AI Impact Summit, wrote in a post on X (formerly Twitter), “With a more capable baseline, it’s great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life.”
How does Gemini 3.1 Pro compare to other models?
Gemini 3.1 Pro scored 77.1% on ARC-AGI-2, a benchmark that evaluates a model’s ability to solve entirely new logic patterns. Notably, this is over double the reasoning performance of Gemini 3 Pro and well above the 52.9% score of GPT-5.2 and the 68.8% score of Claude Opus 4.6.
On the closely watched Humanity’s Last Exam benchmark, Gemini 3.1 Pro also leads with a score of 44.4%, compared with 40.0% for Opus 4.6 and 34.5% for GPT-5.2. However, when tools like search and code were allowed, Claude Opus 4.6 took a slight lead at 53.1% versus Gemini’s 51.4%.
Gemini 3.1 Pro remains slightly behind Opus 4.6 on the SWE-Bench Verified benchmark, which evaluates agentic coding performance. Opus 4.6 scored 80.8%, compared with 80.6% for Gemini 3.1 Pro and 80% for GPT-5.2.
Gemini 3.1 Pro availability:
Google says that Gemini 3.1 Pro is rolling out to consumers via the Gemini app and NotebookLM. The model is available for free in the Gemini app, with higher limits for Pro and Ultra users. The NotebookLM rollout, however, is limited to Pro and Ultra users.
The model is also available in preview to developers via the Gemini API in Google AI Studio, Gemini CLI, the Google Antigravity agentic development platform, and Android Studio. Enterprise users can access the model through Vertex AI and Gemini Enterprise.