antigravity-claude-proxy

Author	SHA1	Message	Date
Badri Narayanan S	77363c679e	fix: accurate quota reporting with project ID and improved rate limit handling - Pass project ID to fetchAvailableModels for accurate per-project quota - Treat missing remainingFraction with resetTime as 0% (exhausted) - Fix double-escaped regex in rate-limit-parser.js (\\d -> \d) - Use ANTIGRAVITY_HEADERS for loadCodeAssist consistency - Store actual reset time from API instead of capping at default - Add getRateLimitInfo() for detailed rate limit state - Handle disabled accounts in rate limit checks Fixes issue where free tier accounts showed 100% quota but were actually exhausted. Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-15 16:18:13 +05:30
Badri Narayanan S	896bf81a36	revert: remove count_tokens endpoint (caused regression) Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-14 23:43:16 +05:30
Badri Narayanan S	dee7512bd8	fix: improve subscription tier detection for Pro accounts The tier detection was incorrectly showing Pro accounts as "free" due to: 1. paidTier field being flaky/missing from some API responses 2. standard-tier not being recognized as a Pro tier 3. No fallback to allowedTiers when currentTier is missing Changes: - Add parseTierId() helper to centralize tier ID parsing - Recognize "standard-tier" as Pro (Gemini Code Assist paid tier) - Add fallback chain: paidTier > currentTier > allowedTiers - Return "unknown" instead of incorrectly defaulting to "free" Fixes #121 Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-14 21:27:24 +05:30
behemoth-phucnm	d33de409d4	docs: fix misleading tokenizer comments	2026-01-14 19:31:43 +07:00
minhphuc429	7da7e887bf	feat: use official tokenizers for 99.99% accuracy Replace gpt-tokenizer with model-specific official tokenizers: - Claude models: @anthropic-ai/tokenizer (official Anthropic tokenizer) - Gemini models: @lenml/tokenizer-gemini (GemmaTokenizer) Changes: - Add @anthropic-ai/tokenizer and @lenml/tokenizer-gemini dependencies - Remove gpt-tokenizer dependency - Update count-tokens.js with model-aware tokenization - Use getModelFamily() to select appropriate tokenizer - Lazy-load Gemini tokenizer (138MB) on first use - Default to local estimation for all content types (no API calls) Tested with all supported models: - claude-sonnet-4-5, claude-opus-4-5-thinking, claude-sonnet-4-5-thinking - gemini-3-flash, gemini-3-pro-low, gemini-3-pro-high	2026-01-14 16:04:13 +07:00
minhphuc429	2bdecf6e96	fix: ensure account manager initialized for count_tokens - Add ensureInitialized() call before count_tokens handler - Use hybrid approach: local estimation for text, API for images/docs - This prevents "No accounts available" error on first request	2026-01-14 15:43:25 +07:00
minhphuc429	df81ba5632	feat: use API-based token counting for 100% accuracy Switch from local estimation (gpt-tokenizer) to API-based counting via Google Cloud Code API for accurate token counts. Falls back to local estimation if API call fails.	2026-01-14 15:36:47 +07:00
minhphuc429	acc228b920	feat: implement /v1/messages/count_tokens endpoint Add Anthropic-compatible token counting endpoint using hybrid approach: - Local estimation with gpt-tokenizer for text content (~95% accuracy) - API-based counting for complex content (images, documents) - Automatic fallback to local estimation on API errors This resolves warnings in LiteLLM and other clients that rely on pre-request token counting.	2026-01-14 15:32:27 +07:00
Badri Narayanan S	70fd1baaa8	fix: improve loadCodeAssist for Google One AI Pro accounts - Add separate LOAD_CODE_ASSIST_ENDPOINTS (prod first) and LOAD_CODE_ASSIST_HEADERS (google-api-nodejs-client User-Agent) - Add duetProject to metadata for project discovery - Silent fallback when API returns success but no project (matches opencode-antigravity-auth behavior) - Only warn when all endpoints fail with actual errors Fixes #114, addresses discussion #113 Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-13 18:11:45 +05:30
liuwuyu118	860c0d6c2d	fix: add image response support Convert Google's inlineData format to Anthropic's image format: - response-converter.js: Handle inlineData in non-streaming responses - sse-parser.js: Parse inlineData for thinking models - sse-streamer.js: Stream inlineData as image content blocks Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-13 11:17:17 +08:00
Badri Narayanan S	325acdba8c	fix: preserve tool_use stop reason from being overwritten by finishReason When a tool call is made, stopReason is set to 'tool_use'. However, when finishReason: STOP arrives later, it was overwriting stopReason back to 'end_turn', breaking multi-turn tool conversations in clients like OpenCode. Fix: Initialize stopReason to null and only set it from finishReason if not already set. This ensures tool_use is preserved once detected. Fixes #96 Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-11 11:52:35 +05:30
Badri Narayanan S	6868bf217c	Merge pull request #47 from Wha1eChai/feature/webui feat: Add Web UI for account and quota management	2026-01-10 22:21:57 +05:30
Tiago Rodrigues	0b477c2552	feat: fallback to alternate model when max retries exceeded with 5xx errors When all accounts fail with HTTP 500/503 errors (e.g., Google API returning 'Unknown Error' for Claude models on large conversations), the proxy now attempts to use a fallback model if --fallback is enabled. This enables graceful degradation when: - All accounts are exhausted due to 5xx errors (not just rate limits) - Claude models fail on very large conversations - The API has temporary issues with specific models The fallback uses the existing MODEL_FALLBACK_MAP configuration: - claude-opus-4-5-thinking -> gemini-3-pro-high - claude-sonnet-4-5-thinking -> gemini-3-flash Relates to #88	2026-01-10 12:08:53 +00:00
Wha1eChai	ee6d222e4d	feat(webui): add subscription tier and quota visualization Backend:	2026-01-10 06:05:17 +08:00
Badri Narayanan S	4c5236d4b3	fix: filter Antigravity system prompt from model responses - Add [ignore] tags around system instruction to prevent model from identifying as "Antigravity" when asked "Who are you?" - Replace full system instruction with minimal version used by CLIProxyAPI/gcli2api to reduce token usage and improve response quality Fixes #76 Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-09 14:10:41 +05:30
SvDp	45755bfa18	fix: add optimistic reset for transient 429 rate limit errors Fixes issue #71 - 'No accounts available' error when API returns 429 The Google Cloud Code API can return 429 RESOURCE_EXHAUSTED errors even when accounts have quota available due to: - Temporary API load/throttling - Per-minute request limits (not per-day quota) - Transient backend issues This fix adds: 1. A 500ms buffer after waiting for rate limits to expire 2. Optimistic rate limit reset when all accounts appear stuck 3. Retry logic that clears rate limits and tries again The fix works in conjunction with the server-level optimistic reset that already exists, providing multiple layers of protection against false 'No accounts available' errors. Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-08 21:31:50 +05:30
Badri Narayanan S	a696ed0872	Merge pull request #64 from BrunoMarc/fix/empty-response-retry fix: add retry mechanism for empty API responses	2026-01-08 17:38:18 +05:30
Badri Narayanan S	f34aa50ba4	Antigravity compatibility to fix antigravity usage	2026-01-08 10:24:54 +05:30
BrunoMarc	1c80c8ba52	fix: address second round code review feedback Issues found by Claude Opus 4.5 + Gemini 3 Pro: HIGH PRIORITY FIXES: - Mark account rate-limited when 429 occurs during retry (was losing resetMs) - Add exponential backoff between retries (500ms, 1000ms, 2000ms) - Fix 5xx handling: don't pass error response to streamer, refetch instead - Use recognizable error messages (429/401) for isRateLimitError/isAuthError MEDIUM PRIORITY FIXES: - Refactor while loop to for loop for clearer retry semantics - Simplify logic flow with early returns Code review by: Claude Opus 4.5 + Gemini 3 Pro Preview 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 18:21:25 -03:00
BrunoMarc	05cd80ebb5	fix: address code review feedback - Move MAX_EMPTY_RESPONSE_RETRIES to constants.js for consistency - Handle 429/401/5xx errors properly during retry fetch - Use proper message ID format (crypto.randomBytes) instead of Date.now() - Add crypto import for UUID generation Code review by: Gemini 3 Pro Preview + Claude Opus 4.5 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 18:16:09 -03:00
BrunoMarc	49480847b6	fix: add retry mechanism for empty API responses When Claude Code sends requests with large thinking_budget values, the model may spend all tokens on "thinking" and return empty responses, causing Claude Code to stop mid-conversation. This commit adds a retry mechanism that: - Throws EmptyResponseError instead of emitting fake message on empty response - Retries up to 2 times before giving up - Emits fallback message only after all retries are exhausted Changes: - src/errors.js: Added EmptyResponseError class and isEmptyResponseError() - src/cloudcode/sse-streamer.js: Throw error instead of yielding fake message - src/cloudcode/streaming-handler.js: Added retry loop with fallback Tested for 6+ hours with 1,884 API requests and 88% recovery rate on empty responses. Fixes #61 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 18:11:03 -03:00
Badri Narayanan S	ac9ec6b358	Signature handling for fallback	2026-01-03 22:01:57 +05:30
Badri Narayanan S	df6625b531	fallback changes from PR #35	2026-01-03 18:01:21 +05:30
Badri Narayanan S	9c4a712a9a	Selective fixes from PR #35: Model-specific rate limits & robustness improvements (#37) * feat: apply local user changes and fixes * ;D * Implement OpenAI support, model-specific rate limiting, and robustness fixes * docs: update pr title * feat: ensure unique openai models endpoint * fix: startup banner alignment and removed duplicates * feat: add model fallback system with --fallback flag * fix: accounts cli hanging after completion * feat: add exit option to accounts cli menu * fix: remove circular dependency warning for fallback flag * feat: show active modes in banner and hide their flags * Remove OpenAI compatibility and fallback features from PR #35 Cherry-picked selective fixes from PR #35 while removing: - OpenAI-compatible API endpoints (/openai/v1/) - Model fallback system (fallback-config.js) - Thinking block skip for Gemini models - Unnecessary files (pullrequest.md, test-fix.js, test-openai.js) Retained improvements: - Network error handling with retry logic - Model-specific rate limiting - Enhanced health check with quota info - CLI fixes (exit option, process.exit) - Startup banner alignment (debug mode only) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> banner alignment fix * Refactor: Model-specific rate limits and cleanup deprecated code - Remove global rate limit fields (isRateLimited, rateLimitResetTime) in favor of model-specific limits (modelRateLimits[modelId]) - Remove deprecated wrapper functions (is429Error, isAuthInvalidError) from handlers - Filter fetchAvailableModels to only return Claude and Gemini models - Fix getCurrentStickyAccount() to pass model param after waiting - Update /account-limits endpoint to show model-specific limits - Remove multi-account OAuth flow to avoid state mismatch errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: show (x/y) limited status in account-limits table - Status is now "ok" only when all models are available - Shows "(x/y) limited" when x out of y models are exhausted - Provides better visibility into partial rate limiting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * docs: update CLAUDE.md with model-specific rate limiting - Document modelRateLimits[modelId] for per-model rate tracking - Add isNetworkError() helper to utilities section 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: M1noa <minoa@minoa.cat> Co-authored-by: Minoa <altgithub@minoa.cat> Co-authored-by: Claude <noreply@anthropic.com>	2026-01-03 15:33:49 +05:30
Badri Narayanan S	f02364d4ef	refactor: Reorganize src/ into modular folder structure Split large monolithic files into focused modules: - cloudcode-client.js (1,107 lines) → src/cloudcode/ (9 files) - account-manager.js (639 lines) → src/account-manager/ (5 files) - Move auth files to src/auth/ (oauth, token-extractor, database) - Move CLI to src/cli/accounts.js Update all import paths and documentation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-01-01 15:13:43 +05:30
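
Note on commit 325acdba8c: the message describes initializing stopReason to null and setting it from finishReason only when it is still unset, so that 'tool_use' survives a later finishReason: STOP. The following is a minimal sketch of that rule, not the proxy's actual code. The function name resolveStopReason, the simplified event shape (hasToolCall, finishReason), and the MAX_TOKENS mapping are all assumptions made for illustration.

```javascript
// Hypothetical helper illustrating the rule from commit 325acdba8c.
// The event shape ({ hasToolCall, finishReason }) is an assumption,
// not the structure used in the repository.
function resolveStopReason(events) {
  let stopReason = null; // start unset instead of defaulting to 'end_turn'

  for (const event of events) {
    // A tool call always marks the turn as tool_use.
    if (event.hasToolCall) {
      stopReason = 'tool_use';
    }

    // finishReason may only fill in a stop reason that is still unset,
    // so a later STOP cannot overwrite 'tool_use'.
    if (event.finishReason && stopReason === null) {
      // Assumed mapping: MAX_TOKENS -> 'max_tokens', anything else -> 'end_turn'.
      stopReason = event.finishReason === 'MAX_TOKENS' ? 'max_tokens' : 'end_turn';
    }
  }

  // If the stream never reported anything, fall back to 'end_turn'.
  return stopReason ?? 'end_turn';
}

// Usage: a tool call followed by finishReason STOP keeps 'tool_use'.
console.log(resolveStopReason([{ hasToolCall: true }, { finishReason: 'STOP' }]));
```

The point of the design is ordering independence: once a tool call has been seen, no later chunk can downgrade the stop reason.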

25 Commits
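
Note on commit dee7512bd8: the message describes a fallback chain (paidTier > currentTier > allowedTiers), a parseTierId() helper, treating "standard-tier" as Pro, and returning "unknown" rather than defaulting to "free". A rough sketch of that chain follows. Only those four points come from the commit message. The response shape (objects with an id field), the "free-tier" identifier, and the detectTier name are assumptions, and the real parseTierId() may recognize more tier IDs than shown here.

```javascript
// Sketch of the tier fallback chain described in commit dee7512bd8.
// Assumed response shape: { paidTier: { id }, currentTier: { id }, allowedTiers: [{ id }] }.

// Map a raw tier ID to a normalized tier name, or null if unrecognized.
function parseTierId(tierId) {
  if (typeof tierId !== 'string' || tierId === '') return null;
  const id = tierId.toLowerCase();
  if (id === 'standard-tier') return 'pro'; // from the commit: standard-tier counts as Pro
  if (id === 'free-tier') return 'free';    // assumed identifier for the free tier
  return null;
}

// Hypothetical wrapper: try paidTier, then currentTier, then each allowed tier.
function detectTier(response) {
  const candidates = [
    response?.paidTier?.id,
    response?.currentTier?.id,
    ...(response?.allowedTiers ?? []).map((tier) => tier?.id),
  ];

  for (const candidate of candidates) {
    const tier = parseTierId(candidate);
    if (tier) return tier;
  }

  // Report "unknown" instead of guessing "free".
  return 'unknown';
}

// Usage: paidTier is missing, so currentTier decides.
console.log(detectTier({ currentTier: { id: 'standard-tier' } }));
```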