Google is making its AI models cheaper for third-party developers
Google claims this new implicit caching can cut costs by up to 75% when developers use its AI models.

Google is introducing a new feature to its Gemini API called implicit caching, which it says will help developers save money when using its latest AI models, Gemini 2.5 Pro and 2.5 Flash.
The company claims the feature can cut costs by up to 75% when developers repeatedly send the models the same information, which Google calls repetitive context.
To understand this, think of caching like saving a copy of a frequently asked question and its answer.
Instead of making the AI figure out the answer every single time, it just pulls up the saved version. This reduces the computing power needed, which in turn lowers the cost.
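For the curious, the idea looks roughly like this in code. This is a generic, illustrative Python sketch of caching in general, not Google's implementation, and the function names are made up for the example:

```python
# Minimal caching sketch (illustrative only, not Google's implementation).
answer_cache = {}

def answer(question: str) -> str:
    """Return a saved answer if we have one; otherwise compute and store it."""
    if question in answer_cache:
        return answer_cache[question]        # cheap: reuse the saved answer
    result = expensive_model_call(question)  # expensive: run the model
    answer_cache[question] = result          # save it for next time
    return result

def expensive_model_call(question: str) -> str:
    # Stand-in for the costly step a cache lets you skip on repeat questions.
    return f"Answer to: {question}"
```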
Google already had a caching option before, but it was manual—developers had to identify and set up which parts of their requests should be cached.
That system wasn’t very user-friendly and, in some cases, ended up costing developers more money than they expected. This led to complaints, and Google recently apologized for those issues.
The new implicit caching system is different because it’s automatic and turned on by default.
If a new request starts with the same content as an earlier one, Google reuses the work it already did on that shared portion and automatically applies a discount to the developer’s bill.
The amount of text required to activate this feature is fairly small: 1,024 tokens (roughly 750 words) for Gemini 2.5 Flash and 2,048 tokens (roughly 1,500 words) for Gemini 2.5 Pro.
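For developers who want to check whether they are actually getting the discount, here is a rough sketch using Google's google-genai Python SDK. Treat it as illustrative: it assumes an API key in the GEMINI_API_KEY environment variable, the placeholder prompt is made up, and the usage-metadata field names may differ between SDK versions.

```python
# Sketch: checking how much of a request was served from Gemini's implicit cache.
# Assumes the google-genai Python SDK and an API key in GEMINI_API_KEY.
import os
from google import genai

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# A long block of instructions reused across requests (placeholder text here),
# large enough to clear the minimum token threshold.
shared_instructions = "You are a helpful assistant for ExampleCo support. " * 200

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=shared_instructions + "\nSummarize this week's release notes.",
)

usage = response.usage_metadata
print("Prompt tokens:", usage.prompt_token_count)
print("Tokens served from cache:", usage.cached_content_token_count)
```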
However, there are some things developers should watch out for. To increase the chances of triggering savings, Google suggests placing repeated content at the start of a request and putting changing content at the end.
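In practice, that advice amounts to keeping the big, unchanging block of text at the front of every request and appending only the part that varies. Here is a quick illustrative sketch that reuses the client from the example above; STABLE_CONTEXT and ask are made-up names for the example:

```python
# Sketch: stable, repeated content first; per-request content last.
# This is the ordering Google recommends for improving implicit cache hits.
STABLE_CONTEXT = (
    "You are a support assistant for ExampleCo.\n"
    "Here is the full product manual your answers must be based on:\n"
    "(same long reference text included with every request)\n"
)

def ask(client, user_question: str) -> str:
    # Shared prefix first, the changing question last.
    prompt = STABLE_CONTEXT + "\nCustomer question: " + user_question
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt,
    )
    return response.text
```

Because every request built this way starts with the same prefix, later calls can benefit from the cache that earlier calls created.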
Also, Google hasn’t published any third-party verification that the savings will materialize as claimed, so developers will have to see how the feature performs in real-world use.
If it works as promised, this feature could help cut costs significantly for apps and services that rely on repeating the same instructions or data when using Google’s AI.
What are your thoughts on this new feature from Google? Do you think it’ll make any difference to us as end users? Tell us your thoughts below in the comments, or via our Twitter or Facebook.
