This commit addresses "Max retries exceeded" errors during stress testing where
all accounts would become exhausted simultaneously due to short per-second rate
limits triggering cascading failures.
## Rate Limit Parser (`rate-limit-parser.js`)
- Remove 2s buffer enforcement that caused cascading failures when API returned
short reset times (200-600ms). Now adds 200ms buffer for sub-500ms resets
- Add `parseRateLimitReason()` for smart backoff based on error type:
QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR
## Message/Streaming Handlers
- Add per-account+model rate limit state tracking with exponential backoff
- For short rate limits (< 1 second), wait and retry on same account instead
of switching - prevents thundering herd when all accounts hit per-second limits
- Add throttle wait support for fallback modes (emergency/lastResort)
- Add `calculateSmartBackoff()` with progressive tiers by error type
## HybridStrategy (`hybrid-strategy.js`)
- Refactor `#getCandidates()` to return 4 fallback levels:
- `normal`: All filters pass (health, tokens, quota)
- `quota`: Bypass critical quota check
- `emergency`: Bypass health check when ALL accounts unhealthy
- `lastResort`: Bypass BOTH health AND token bucket checks
- Add throttle wait times: 500ms for lastResort, 250ms for emergency
- Fix LRU calculation to use seconds (matches opencode-antigravity-auth)
## Health Tracker
- Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours)
## Account Manager
- Add consecutive failure tracking: `getConsecutiveFailures()`,
`incrementConsecutiveFailures()`, `resetConsecutiveFailures()`
- Add cooldown mechanism separate from rate limits with `CooldownReason`
- Reset consecutive failures on successful request
## Base Strategy
- Add `isAccountCoolingDown()` check in `isAccountUsable()`
## Constants
- Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS`
- Add `BACKOFF_BY_ERROR_TYPE` for smart backoff
- Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff
- Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops
- Increase `MAX_CAPACITY_RETRIES` from 3 to 5
- Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s
## Frontend
- Remove `capacityRetryDelayMs` config (replaced by progressive tiers)
- Update default `maxCapacityRetries` display from 3 to 5
## Testing
- Add `tests/stress-test.cjs` for concurrent request stress testing
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: add manual OAuth flow support in WebUI
* fix: reset add account modal state on close
* feat: display custom API key in startup banner
* fix: move translations to separate files and optimize import API
* fix: remove orphaned model-manager.js and cleanup callback server on manual auth
---------
Co-authored-by: Badri Narayanan S <59133612+badrisnarayanan@users.noreply.github.com>
Refactor account selection into a strategy pattern with three options:
- Sticky: cache-optimized, stays on same account until rate-limited
- Round-robin: load-balanced, rotates every request
- Hybrid (default): smart distribution using health scores, token buckets, and LRU
The hybrid strategy uses multiple signals for optimal account selection:
health tracking for reliability, client-side token buckets for rate limiting,
and LRU freshness to prefer rested accounts.
Includes WebUI settings for strategy selection and unit tests.
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: add i18n support with separate translation files
- Extract translations from store.js to separate files for easier management
- Add translation files for English (en.js), Indonesian (id.js), Turkish (tr.js), and Chinese (zh.js)
- Load translations via window.translations object before Alpine store initialization
- Add Bahasa Indonesia option to language selector
* feat: translate remaining hardcoded UI strings
- Update index.html to use t() for Menu and GitHub labels
- Update views to translate Tier, Quota, Live, tier badges, and close button
- Update components to use translated error messages and confirmation dialogs
- Update utils to use translated validation and error messages
- Update app-init.js to use translated OAuth success/error messages