Google made waves with the release of Gemini 2.5 last month, as the model rocketed to the top of AI leaderboards after the company had previously struggled to keep up with the likes of OpenAI. That first experimental model was just the beginning. Google is deploying its improved AI in more places across its ecosystem, from the developer-centric Vertex AI to the consumer Gemini app.
Gemini models have been dropping so quickly that it can be hard to grasp Google's intended lineup. Things are becoming clearer now that the company is beginning to move its products to the new branch. At the Google Cloud Next conference, Google announced initial availability of Gemini 2.5 Flash. This model is based on the same code as Gemini 2.5 Pro, but it's faster and cheaper to run.
You won't see Gemini 2.5 Flash in the Gemini app just yet—it's starting out in the Vertex AI development platform and AI Studio. The wide experimental release of 2.5 Pro let Google gather data on how people interacted with the new model, and that feedback informed the development of 2.5 Flash.
The Flash versions of Gemini are smaller than the Pro versions, though Google doesn't like to talk about specific parameter counts. Flash models provide faster answers for simpler prompts, which has the side effect of reducing costs. We do know that 2.5 Pro (Experimental) was the first Gemini model to implement dynamic thinking, a technique that allows the model to modulate the amount of simulated reasoning that goes into an answer. 2.5 Flash is also a thinking model, but its approach is a bit more advanced: developers can control how much "thinking" the model puts into a given response.
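For developers poking at the model in AI Studio or Vertex AI, that control surfaces as a thinking budget on API calls. Here's a minimal sketch of what that might look like, assuming the google-genai Python SDK; the model ID, API key placeholder, and budget value are illustrative rather than a definitive recipe.

```python
# Minimal sketch: calling Gemini 2.5 Flash with a capped "thinking" budget.
# Assumes the google-genai Python SDK (pip install google-genai) and an API
# key from AI Studio; model ID and budget value below are assumptions.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # AI Studio key (placeholder)

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # preview model ID (assumption)
    contents="Summarize why smaller 'Flash' models are cheaper to serve.",
    config=types.GenerateContentConfig(
        # Cap how many tokens the model may spend on simulated reasoning
        # before it starts writing the answer.
        thinking_config=types.ThinkingConfig(thinking_budget=512),
    ),
)
print(response.text)
```

Dialing the budget down trades some reasoning depth for lower latency and cost, which is the whole pitch of the Flash tier.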