When Claude Code sends requests with large thinking_budget values,
the model may spend all tokens on "thinking" and return empty responses,
causing Claude Code to stop mid-conversation.
This commit adds a retry mechanism that:
- Throws EmptyResponseError instead of emitting a fake message on an empty response
- Retries up to 2 times before giving up
- Emits fallback message only after all retries are exhausted
Changes:
- src/errors.js: Added EmptyResponseError class and isEmptyResponseError()
- src/cloudcode/sse-streamer.js: Throw error instead of yielding fake message
- src/cloudcode/streaming-handler.js: Added retry loop with fallback
Tested for 6+ hours across 1,884 API requests, with an 88% recovery rate
on empty responses.
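The retry behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in src/errors.js or src/cloudcode/streaming-handler.js: `withEmptyResponseRetry`, `MAX_EMPTY_RETRIES`, the `attempt` callback and the fallback text are hypothetical names chosen for the example. Only `EmptyResponseError`, `isEmptyResponseError()` and the limit of 2 retries come from the commit text.

```javascript
// Hypothetical sketch of "throw on empty, retry, then fall back".
class EmptyResponseError extends Error {
  constructor(message = 'Model returned an empty response') {
    super(message);
    this.name = 'EmptyResponseError';
  }
}

// Recognise the error by class or by name (e.g. if it was re-created elsewhere).
function isEmptyResponseError(error) {
  return error instanceof EmptyResponseError || error?.name === 'EmptyResponseError';
}

// "Retries up to 2 times" per the commit text: 1 initial attempt + 2 retries.
const MAX_EMPTY_RETRIES = 2;

// `attempt` is an assumed async function that resolves with the response text
// or throws EmptyResponseError when the model produced no content.
async function withEmptyResponseRetry(attempt, fallbackText = '[No response received]') {
  for (let retry = 0; retry <= MAX_EMPTY_RETRIES; retry++) {
    try {
      return await attempt(retry);
    } catch (error) {
      // Errors unrelated to empty responses are not retried here.
      if (!isEmptyResponseError(error)) throw error;
      // Emit the fallback only once every retry has been used up.
      if (retry === MAX_EMPTY_RETRIES) return fallbackText;
    }
  }
}
```

Throwing a typed error rather than yielding a placeholder message keeps the decision about retrying in one place (the caller), which is presumably why the streamer and the handler were changed together.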
Fixes #61
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* feat: apply local user changes and fixes
* ;D
* Implement OpenAI support, model-specific rate limiting, and robustness fixes
* docs: update pr title
* feat: ensure unique openai models endpoint
* fix: startup banner alignment and removed duplicates
* feat: add model fallback system with --fallback flag
* fix: accounts cli hanging after completion
* feat: add exit option to accounts cli menu
* fix: remove circular dependency warning for fallback flag
* feat: show active modes in banner and hide their flags
* Remove OpenAI compatibility and fallback features from PR #35
Cherry-picked selective fixes from PR #35 while removing:
- OpenAI-compatible API endpoints (/openai/v1/*)
- Model fallback system (fallback-config.js)
- Thinking block skip for Gemini models
- Unnecessary files (pullrequest.md, test-fix.js, test-openai.js)
Retained improvements:
- Network error handling with retry logic
- Model-specific rate limiting
- Enhanced health check with quota info
- CLI fixes (exit option, process.exit)
- Startup banner alignment (debug mode only)
Co-Authored-By: Claude <noreply@anthropic.com>
* banner alignment fix
* Refactor: Model-specific rate limits and cleanup deprecated code
- Remove global rate limit fields (isRateLimited, rateLimitResetTime) in favor of model-specific limits (modelRateLimits[modelId])
- Remove deprecated wrapper functions (is429Error, isAuthInvalidError) from handlers
- Filter fetchAvailableModels to only return Claude and Gemini models
- Fix getCurrentStickyAccount() to pass model param after waiting
- Update /account-limits endpoint to show model-specific limits
- Remove multi-account OAuth flow to avoid state mismatch errors
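As a rough illustration of the per-model tracking described above, the sketch below keeps limits in `modelRateLimits[modelId]` instead of on the account as a whole. The helper names (`markRateLimited`, `isRateLimitedForModel`), the shape of each entry and the model ids are assumptions made for this example, not the project's actual API.

```javascript
// Hypothetical helpers; only the modelRateLimits[modelId] layout comes from the commit text.

// Record that `modelId` is rate limited on this account until `resetTime` (ms since epoch).
function markRateLimited(account, modelId, resetTime) {
  account.modelRateLimits ??= {};
  account.modelRateLimits[modelId] = { isRateLimited: true, resetTime };
}

// True if this account is currently limited for `modelId`.
// Expired entries are cleared so the model becomes usable again.
function isRateLimitedForModel(account, modelId, now = Date.now()) {
  const limit = account.modelRateLimits?.[modelId];
  if (!limit || !limit.isRateLimited) return false;
  if (typeof limit.resetTime === 'number' && limit.resetTime <= now) {
    delete account.modelRateLimits[modelId];
    return false;
  }
  return true;
}

// Usage: a limit on one model leaves the account usable for another.
const account = { email: 'user@example.com' };
markRateLimited(account, 'model-a', Date.now() + 60_000);
console.log(isRateLimitedForModel(account, 'model-a')); // true
console.log(isRateLimitedForModel(account, 'model-b')); // false
```

With a single global flag, exhausting one model would have sidelined the whole account; keying by model id avoids that.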
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: show (x/y) limited status in account-limits table
- Status is now "ok" only when all models are available
- Shows "(x/y) limited" when x out of y models are exhausted
- Provides better visibility into partial rate limiting
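The status rule above can be expressed as a small function. This is a minimal sketch under stated assumptions: `summarizeAccountStatus` and the `isExhausted` predicate are hypothetical names, and the real table code may derive the counts differently.

```javascript
// Hypothetical sketch of the "(x/y) limited" status rule.
// `modelIds` lists every model shown for the account; `isExhausted(id)` is an
// assumed predicate that reports whether that model is currently rate limited.
function summarizeAccountStatus(modelIds, isExhausted) {
  const limited = modelIds.filter((id) => isExhausted(id)).length;
  // "ok" only when no model is exhausted; otherwise show x out of y.
  return limited === 0 ? 'ok' : `(${limited}/${modelIds.length}) limited`;
}

// Usage with placeholder model ids:
const models = ['model-a', 'model-b', 'model-c'];
console.log(summarizeAccountStatus(models, () => false));            // ok
console.log(summarizeAccountStatus(models, (id) => id === 'model-b')); // (1/3) limited
```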
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: update CLAUDE.md with model-specific rate limiting
- Document modelRateLimits[modelId] for per-model rate tracking
- Add isNetworkError() helper to utilities section
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: M1noa <minoa@minoa.cat>
Co-authored-by: Minoa <altgithub@minoa.cat>
Co-authored-by: Claude <noreply@anthropic.com>