Seattle Daily News


Google may have fixed the issue that was exhausting your Gemini usage limits

May 29, 2026  Twila Rosenbaum

Google has faced growing criticism from users of its Gemini AI platform, particularly those on the paid Pro plan, who reported that their usage limits were being exhausted far too quickly, sometimes after just a few prompts. In response, Josh Woodward, Vice President at Google, has detailed a series of fixes now being rolled out. The changes span the quota system, from bug fixes to new per-prompt caps and improved transparency.

One of the most significant issues involved Omni video generation. Users who experimented with short clips or different video styles found that a single prompt could consume a disproportionately large portion of their monthly quota. Google has now fixed this bug and is also increasing allowances for heavier users: Ultra subscribers will receive double the number of Omni video generations, starting immediately. The adjustment tackles a key pain point, since video generation is a compute-intensive task that can quickly eat up allowances.

Another area of complaint was the handling of complex 3.1 Pro prompts: long, detailed instructions often accompanied by large file uploads or multi-step reasoning tasks. These prompts consumed quota aggressively, leaving users feeling shortchanged. Google is now introducing per-prompt caps to prevent extreme outliers. Instead of one very heavy request draining a large chunk of a user's monthly allowance, the system will limit how much a single prompt can consume, which should make usage more predictable.

Failed requests have also been a source of frustration. Approximately 1 in 10 requests can fail due to system errors, yet previously, these failed attempts counted against a user's quota. Now, if a request fails, it will not be charged. This correction addresses a common sense of unfairness and should provide some relief for users who encounter technical glitches.

Perhaps the most user-friendly change is that Flash-Lite prompts will no longer count against quotas at all. This effectively turns Flash-Lite into a free layer for lighter tasks, encouraging users to rely on lighter models when full reasoning power is not needed. This should help stretch the limits of higher tiers further, as users can offload simple requests to the free tier while reserving their paid quota for more demanding tasks.
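
For readers who want to see how these three accounting rules fit together, here is a minimal sketch in Python. It is purely illustrative and not Google's implementation: the `Quota` class, the `charge` method, the unit values, and the cap size are all hypothetical, since Google has not published how quota is measured internally.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical quota tracker; units and cap values are made up."""
    remaining: float       # allowance left, in arbitrary units
    per_prompt_cap: float  # most a single prompt may consume

    def charge(self, model: str, raw_cost: float, succeeded: bool) -> float:
        """Deduct the cost of one request and return the amount charged."""
        # Rule 1: failed requests are not charged.
        if not succeeded:
            return 0.0
        # Rule 2: Flash-Lite prompts do not count against the quota.
        if model == "flash-lite":
            return 0.0
        # Rule 3: a single prompt cannot consume more than the cap
        # (and never more than what is left).
        cost = min(raw_cost, self.per_prompt_cap, self.remaining)
        self.remaining -= cost
        return cost


quota = Quota(remaining=100.0, per_prompt_cap=10.0)
heavy = quota.charge("pro", raw_cost=40.0, succeeded=True)        # capped
failed = quota.charge("pro", raw_cost=5.0, succeeded=False)       # free
lite = quota.charge("flash-lite", raw_cost=3.0, succeeded=True)   # free
print(heavy, failed, lite, quota.remaining)
```

In this toy model, a heavy prompt that would have cost 40 units is charged only the 10-unit cap, while the failed request and the Flash-Lite request cost nothing.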

Google is also improving transparency regarding Deep Research usage. These compute-heavy tasks involve processing large inputs or multi-step analysis, and many users have had little visibility into why their quotas drop faster on some days. Google will now provide more detailed breakdowns and notifications, allowing users to see which types of tasks are expensive and which are not. This transparency should empower users to manage their usage more effectively.

Finally, model selection will now persist across sessions. If a user chooses a specific model inside Gemini, the app will remember that choice the next time it opens. The only exception is when a usage cap is hit, in which case the system may automatically switch to a lighter model to keep things running. This convenience eliminates the need to reselect the preferred model every time the app is launched.

These changes represent a concerted effort by Google to smooth out a quota system that had become inconsistent for many users. The fixes address both bugs and design choices that led to unpredictable consumption of allowances. By capping heavy prompts, excusing failed requests, offering free Flash-Lite usage, and improving transparency, Google aims to make the Gemini experience more logical and user-friendly.

In the broader context of AI services, usage limits are a common mechanism to manage server load and prevent abuse, but they can also be a source of user frustration when they feel arbitrary. Google's adjustments show a willingness to listen to user feedback and iterate on its offerings. For existing users, these changes may tip the scales in favor of staying with Gemini or even upgrading to a higher tier. For potential customers, a more predictable quota system could lower the barrier to adoption.

Despite these improvements, limits still exist, and heavy users may still encounter constraints depending on their workload. However, the direction is clearly more user-centric. The fixes are being rolled out gradually, and users should start noticing improvements in the coming days. By focusing on predictability, fairness, and transparency, Google is addressing the core concerns that have been circulating in online communities and reviews. It remains to be seen whether these changes fully resolve the frustration, but they certainly mark a positive step forward.


Source: Android Authority News

