torbjorn/antigravity-claude-proxy

Author	SHA1	Message	Date
jgor20	a43d2332ca	feat: per-account quota threshold protection (#212) Resolves #135 - Adds configurable quota protection with three-tier threshold resolution (per-model → per-account → global) - New global Minimum Quota Level slider in Settings - Per-account threshold settings via Account Settings modal - Draggable per-account threshold markers on model quota bars - Backend: PATCH /api/accounts/:email endpoint, globalQuotaThreshold config - i18n: quota protection keys for all 5 languages	2026-02-01 17:15:46 +05:30
Badri Narayanan S	5a85f0cfcc	feat: comprehensive rate limit handling overhaul (inspired by opencode-antigravity-auth) This commit addresses "Max retries exceeded" errors during stress testing where all accounts would become exhausted simultaneously due to short per-second rate limits triggering cascading failures. ## Rate Limit Parser (`rate-limit-parser.js`) - Remove 2s buffer enforcement that caused cascading failures when API returned short reset times (200-600ms). Now adds 200ms buffer for sub-500ms resets - Add `parseRateLimitReason()` for smart backoff based on error type: QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR ## Message/Streaming Handlers - Add per-account+model rate limit state tracking with exponential backoff - For short rate limits (< 1 second), wait and retry on same account instead of switching - prevents thundering herd when all accounts hit per-second limits - Add throttle wait support for fallback modes (emergency/lastResort) - Add `calculateSmartBackoff()` with progressive tiers by error type ## HybridStrategy (`hybrid-strategy.js`) - Refactor `#getCandidates()` to return 4 fallback levels: - `normal`: All filters pass (health, tokens, quota) - `quota`: Bypass critical quota check - `emergency`: Bypass health check when ALL accounts unhealthy - `lastResort`: Bypass BOTH health AND token bucket checks - Add throttle wait times: 500ms for lastResort, 250ms for emergency - Fix LRU calculation to use seconds (matches opencode-antigravity-auth) ## Health Tracker - Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours) ## Account Manager - Add consecutive failure tracking: `getConsecutiveFailures()`, `incrementConsecutiveFailures()`, `resetConsecutiveFailures()` - Add cooldown mechanism separate from rate limits with `CooldownReason` - Reset consecutive failures on successful request ## Base Strategy - Add `isAccountCoolingDown()` check in `isAccountUsable()` ## Constants - Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS` - Add `BACKOFF_BY_ERROR_TYPE` for smart backoff - Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff - Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops - Increase `MAX_CAPACITY_RETRIES` from 3 to 5 - Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s ## Frontend - Remove `capacityRetryDelayMs` config (replaced by progressive tiers) - Update default `maxCapacityRetries` display from 3 to 5 ## Testing - Add `tests/stress-test.cjs` for concurrent request stress testing Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-24 22:43:53 +05:30
Badri Narayanan S	2175118f9f	feat: align project discovery with opencode-antigravity-auth reference - Store project IDs in composite refresh token format (refreshToken|projectId|managedProjectId) - Add parseRefreshParts() and formatRefreshParts() for token handling - Extract and persist subscription tier during project discovery - Fetch subscription in blocking mode when missing from cached accounts - Fix conditional duetProject setting to match reference implementation - Export parseTierId() for reuse across modules Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-19 14:21:30 +05:30
Badri Narayanan S	5ae19a5b72	feat: add configurable account selection strategies Refactor account selection into a strategy pattern with three options: - Sticky: cache-optimized, stays on same account until rate-limited - Round-robin: load-balanced, rotates every request - Hybrid (default): smart distribution using health scores, token buckets, and LRU The hybrid strategy uses multiple signals for optimal account selection: health tracking for reliability, client-side token buckets for rate limiting, and LRU freshness to prefer rested accounts. Includes WebUI settings for strategy selection and unit tests. Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-18 03:48:43 +05:30
Badri Narayanan S	77363c679e	fix: accurate quota reporting with project ID and improved rate limit handling - Pass project ID to fetchAvailableModels for accurate per-project quota - Treat missing remainingFraction with resetTime as 0% (exhausted) - Fix double-escaped regex in rate-limit-parser.js (\\d -> \d) - Use ANTIGRAVITY_HEADERS for loadCodeAssist consistency - Store actual reset time from API instead of capping at default - Add getRateLimitInfo() for detailed rate limit state - Handle disabled accounts in rate limit checks Fixes issue where free tier accounts showed 100% quota but were actually exhausted. Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-15 16:18:13 +05:30
Badri Narayanan S	632536e2d7	fix: use configured cooldown as cap for rate limit wait times - Cooldown now caps API-provided reset times instead of being a fallback - Fixed misleading UI descriptions for cooldown settings - Removed unused cooldownDurationMs from settings object - Updated default fallback values in frontend to 10s Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-13 18:28:52 +05:30
Wha1eChai	c9c5e7d486	feat(webui): add hot-reload account management with OAuth support	2026-01-08 23:52:31 +08:00
Badri Narayanan S	9c4a712a9a	Selective fixes from PR #35: Model-specific rate limits & robustness improvements (#37) * feat: apply local user changes and fixes * ;D * Implement OpenAI support, model-specific rate limiting, and robustness fixes * docs: update pr title * feat: ensure unique openai models endpoint * fix: startup banner alignment and removed duplicates * feat: add model fallback system with --fallback flag * fix: accounts cli hanging after completion * feat: add exit option to accounts cli menu * fix: remove circular dependency warning for fallback flag * feat: show active modes in banner and hide their flags * Remove OpenAI compatibility and fallback features from PR #35 Cherry-picked selective fixes from PR #35 while removing: - OpenAI-compatible API endpoints (/openai/v1/) - Model fallback system (fallback-config.js) - Thinking block skip for Gemini models - Unnecessary files (pullrequest.md, test-fix.js, test-openai.js) Retained improvements: - Network error handling with retry logic - Model-specific rate limiting - Enhanced health check with quota info - CLI fixes (exit option, process.exit) - Startup banner alignment (debug mode only) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> banner alignment fix * Refactor: Model-specific rate limits and cleanup deprecated code - Remove global rate limit fields (isRateLimited, rateLimitResetTime) in favor of model-specific limits (modelRateLimits[modelId]) - Remove deprecated wrapper functions (is429Error, isAuthInvalidError) from handlers - Filter fetchAvailableModels to only return Claude and Gemini models - Fix getCurrentStickyAccount() to pass model param after waiting - Update /account-limits endpoint to show model-specific limits - Remove multi-account OAuth flow to avoid state mismatch errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: show (x/y) limited status in account-limits table - Status is now "ok" only when all models are available - Shows "(x/y) limited" when x out of y models are exhausted - Provides better visibility into partial rate limiting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * docs: update CLAUDE.md with model-specific rate limiting - Document modelRateLimits[modelId] for per-model rate tracking - Add isNetworkError() helper to utilities section 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: M1noa <minoa@minoa.cat> Co-authored-by: Minoa <altgithub@minoa.cat> Co-authored-by: Claude <noreply@anthropic.com>	2026-01-03 15:33:49 +05:30
Badri Narayanan S	f02364d4ef	refactor: Reorganize src/ into modular folder structure Split large monolithic files into focused modules: - cloudcode-client.js (1,107 lines) → src/cloudcode/ (9 files) - account-manager.js (639 lines) → src/account-manager/ (5 files) - Move auth files to src/auth/ (oauth, token-extractor, database) - Move CLI to src/cli/accounts.js Update all import paths and documentation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-01 15:13:43 +05:30
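
Note on commit 2175118f9f: the log names a composite refresh token made of three pipe-separated fields (refresh token, project ID, managed project ID) and two helpers, parseRefreshParts() and formatRefreshParts(), but does not show their bodies. The following is a minimal sketch of how such a pair could behave, not the repository's implementation. The function bodies, the returned object shape, the handling of empty fields, and the assumption that a refresh token never contains a pipe character are all illustrative guesses.

```javascript
// Hypothetical sketch only: the function names come from the commit message,
// but the bodies and the returned shape are assumptions, not the repo's code.

// Assumed field separator; also assumes refresh tokens never contain '|'.
const REFRESH_PART_SEPARATOR = '|';

/**
 * Split a stored composite token into its three fields.
 * Missing or empty optional fields are returned as undefined.
 */
function parseRefreshParts(composite) {
  const [refreshToken = '', projectId, managedProjectId] =
    String(composite ?? '').split(REFRESH_PART_SEPARATOR);
  return {
    refreshToken,
    projectId: projectId || undefined,
    managedProjectId: managedProjectId || undefined,
  };
}

/**
 * Join the three fields back into a single storable string.
 * Empty trailing fields are dropped so a bare token round-trips unchanged.
 */
function formatRefreshParts({ refreshToken, projectId, managedProjectId }) {
  const parts = [refreshToken, projectId ?? '', managedProjectId ?? ''];
  while (parts.length > 1 && parts[parts.length - 1] === '') {
    parts.pop();
  }
  return parts.join(REFRESH_PART_SEPARATOR);
}
```

Under these assumptions, an account saved before the change (a bare refresh token with no project IDs) parses with both IDs undefined and formats back to the same string, so older stored entries would not need a migration step.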