When all accounts fail with HTTP 500/503 errors (e.g., Google API returning
'Unknown Error' for Claude models on large conversations), the proxy now
attempts to use a fallback model if --fallback is enabled.
This enables graceful degradation when:
- All accounts are exhausted due to 5xx errors (not just rate limits)
- Claude models fail on very large conversations
- The API has temporary issues with specific models
The fallback uses the existing MODEL_FALLBACK_MAP configuration:
- claude-opus-4-5-thinking -> gemini-3-pro-high
- claude-sonnet-4-5-thinking -> gemini-3-flash
Relates to #88
Fixes issue #71 - 'No accounts available' error when API returns 429
The Google Cloud Code API can return 429 RESOURCE_EXHAUSTED errors even
when accounts have quota available due to:
- Temporary API load/throttling
- Per-minute request limits (not per-day quota)
- Transient backend issues
This fix adds:
1. A 500ms buffer after waiting for rate limits to expire
2. Optimistic rate limit reset when all accounts appear stuck
3. Retry logic that clears rate limits and tries again
The fix works in conjunction with the server-level optimistic reset
that already exists, providing multiple layers of protection against
false 'No accounts available' errors.
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: apply local user changes and fixes
* ;D
* Implement OpenAI support, model-specific rate limiting, and robustness fixes
* docs: update pr title
* feat: ensure unique openai models endpoint
* fix: startup banner alignment and removed duplicates
* feat: add model fallback system with --fallback flag
* fix: accounts cli hanging after completion
* feat: add exit option to accounts cli menu
* fix: remove circular dependency warning for fallback flag
* feat: show active modes in banner and hide their flags
* Remove OpenAI compatibility and fallback features from PR #35
Cherry-picked selective fixes from PR #35 while removing:
- OpenAI-compatible API endpoints (/openai/v1/*)
- Model fallback system (fallback-config.js)
- Thinking block skip for Gemini models
- Unnecessary files (pullrequest.md, test-fix.js, test-openai.js)
Retained improvements:
- Network error handling with retry logic
- Model-specific rate limiting
- Enhanced health check with quota info
- CLI fixes (exit option, process.exit)
- Startup banner alignment (debug mode only)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* banner alignment fix
* Refactor: Model-specific rate limits and cleanup deprecated code
- Remove global rate limit fields (isRateLimited, rateLimitResetTime) in favor of model-specific limits (modelRateLimits[modelId])
- Remove deprecated wrapper functions (is429Error, isAuthInvalidError) from handlers
- Filter fetchAvailableModels to only return Claude and Gemini models
- Fix getCurrentStickyAccount() to pass model param after waiting
- Update /account-limits endpoint to show model-specific limits
- Remove multi-account OAuth flow to avoid state mismatch errors
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: show (x/y) limited status in account-limits table
- Status is now "ok" only when all models are available
- Shows "(x/y) limited" when x out of y models are exhausted
- Provides better visibility into partial rate limiting
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: update CLAUDE.md with model-specific rate limiting
- Document modelRateLimits[modelId] for per-model rate tracking
- Add isNetworkError() helper to utilities section
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: M1noa <minoa@minoa.cat>
Co-authored-by: Minoa <altgithub@minoa.cat>
Co-authored-by: Claude <noreply@anthropic.com>