feat: comprehensive rate limit handling overhaul (inspired by opencode-antigravity-auth)
This commit addresses "Max retries exceeded" errors during stress testing where all accounts would become exhausted simultaneously due to short per-second rate limits triggering cascading failures. ## Rate Limit Parser (`rate-limit-parser.js`) - Remove 2s buffer enforcement that caused cascading failures when API returned short reset times (200-600ms). Now adds 200ms buffer for sub-500ms resets - Add `parseRateLimitReason()` for smart backoff based on error type: QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR ## Message/Streaming Handlers - Add per-account+model rate limit state tracking with exponential backoff - For short rate limits (< 1 second), wait and retry on same account instead of switching - prevents thundering herd when all accounts hit per-second limits - Add throttle wait support for fallback modes (emergency/lastResort) - Add `calculateSmartBackoff()` with progressive tiers by error type ## HybridStrategy (`hybrid-strategy.js`) - Refactor `#getCandidates()` to return 4 fallback levels: - `normal`: All filters pass (health, tokens, quota) - `quota`: Bypass critical quota check - `emergency`: Bypass health check when ALL accounts unhealthy - `lastResort`: Bypass BOTH health AND token bucket checks - Add throttle wait times: 500ms for lastResort, 250ms for emergency - Fix LRU calculation to use seconds (matches opencode-antigravity-auth) ## Health Tracker - Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours) ## Account Manager - Add consecutive failure tracking: `getConsecutiveFailures()`, `incrementConsecutiveFailures()`, `resetConsecutiveFailures()` - Add cooldown mechanism separate from rate limits with `CooldownReason` - Reset consecutive failures on successful request ## Base Strategy - Add `isAccountCoolingDown()` check in `isAccountUsable()` ## Constants - Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS` - Add `BACKOFF_BY_ERROR_TYPE` for smart backoff - Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff - Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops - Increase `MAX_CAPACITY_RETRIES` from 3 to 5 - Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s ## Frontend - Remove `capacityRetryDelayMs` config (replaced by progressive tiers) - Update default `maxCapacityRetries` display from 3 to 5 ## Testing - Add `tests/stress-test.cjs` for concurrent request stress testing Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
@@ -1226,47 +1226,23 @@
|
||||
x-text="$store.global.t('extendedCooldownDesc')">Applied after max consecutive failures.</p>
|
||||
</div>
|
||||
|
||||
<div class="form-control">
|
||||
<label class="label pt-0">
|
||||
<span class="label-text text-gray-400 text-xs"
|
||||
x-text="$store.global.t('capacityRetryDelay')">Capacity Retry Delay</span>
|
||||
<span class="label-text-alt font-mono text-neon-cyan text-xs font-semibold"
|
||||
x-text="Math.round((serverConfig.capacityRetryDelayMs || 2000) / 1000) + 's'"></span>
|
||||
</label>
|
||||
<div class="flex gap-3 items-center">
|
||||
<input type="range" min="500" max="10000" step="500"
|
||||
class="custom-range custom-range-cyan flex-1"
|
||||
:value="serverConfig.capacityRetryDelayMs || 2000"
|
||||
:style="`background-size: ${((serverConfig.capacityRetryDelayMs || 2000) - 500) / 95}% 100%`"
|
||||
@input="toggleCapacityRetryDelayMs($event.target.value)"
|
||||
aria-label="Capacity retry delay slider">
|
||||
<input type="number" min="500" max="10000" step="500"
|
||||
class="input input-xs input-bordered w-20 bg-space-800 border-space-border text-white font-mono text-center"
|
||||
:value="serverConfig.capacityRetryDelayMs || 2000"
|
||||
@change="toggleCapacityRetryDelayMs($event.target.value)"
|
||||
aria-label="Capacity retry delay value">
|
||||
</div>
|
||||
<p class="text-[9px] text-gray-600 mt-1 leading-tight"
|
||||
x-text="$store.global.t('capacityRetryDelayDesc')">Delay for capacity (not quota) issues.</p>
|
||||
</div>
|
||||
|
||||
<div class="form-control">
|
||||
<label class="label pt-0">
|
||||
<span class="label-text text-gray-400 text-xs"
|
||||
x-text="$store.global.t('maxCapacityRetries')">Max Capacity Retries</span>
|
||||
<span class="label-text-alt font-mono text-neon-cyan text-xs font-semibold"
|
||||
x-text="serverConfig.maxCapacityRetries || 3"></span>
|
||||
x-text="serverConfig.maxCapacityRetries || 5"></span>
|
||||
</label>
|
||||
<div class="flex gap-3 items-center">
|
||||
<input type="range" min="1" max="10" step="1"
|
||||
class="custom-range custom-range-cyan flex-1"
|
||||
:value="serverConfig.maxCapacityRetries || 3"
|
||||
:style="`background-size: ${((serverConfig.maxCapacityRetries || 3) - 1) / 0.09}% 100%`"
|
||||
:value="serverConfig.maxCapacityRetries || 5"
|
||||
:style="`background-size: ${((serverConfig.maxCapacityRetries || 5) - 1) / 0.09}% 100%`"
|
||||
@input="toggleMaxCapacityRetries($event.target.value)"
|
||||
aria-label="Max capacity retries slider">
|
||||
<input type="number" min="1" max="10" step="1"
|
||||
class="input input-xs input-bordered w-16 bg-space-800 border-space-border text-white font-mono text-center"
|
||||
:value="serverConfig.maxCapacityRetries || 3"
|
||||
:value="serverConfig.maxCapacityRetries || 5"
|
||||
@change="toggleMaxCapacityRetries($event.target.value)"
|
||||
aria-label="Max capacity retries value">
|
||||
</div>
|
||||
|
||||
Reference in New Issue
Block a user