Commit Graph

149 Commits

Author SHA1 Message Date
Badri Narayanan S
896bf81a36 revert: remove count_tokens endpoint (caused regression)
Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-14 23:43:16 +05:30
Badri Narayanan S
fa29de7183 fix: handle unsigned thinking blocks in tool loops (#120)
When Claude Code strips thinking signatures it doesn't recognize,
the proxy would drop unsigned thinking blocks, causing the error
"Expected thinking but found text". This fix detects unsigned
thinking blocks and triggers recovery to close the tool loop.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-14 23:01:13 +05:30
Badri Narayanan S
dee7512bd8 fix: improve subscription tier detection for Pro accounts
The tier detection was incorrectly showing Pro accounts as "free" due to:
1. paidTier field being flaky/missing from some API responses
2. standard-tier not being recognized as a Pro tier
3. No fallback to allowedTiers when currentTier is missing

Changes:
- Add parseTierId() helper to centralize tier ID parsing
- Recognize "standard-tier" as Pro (Gemini Code Assist paid tier)
- Add fallback chain: paidTier > currentTier > allowedTiers
- Return "unknown" instead of incorrectly defaulting to "free"

Fixes #121

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-14 21:27:24 +05:30
Badri Narayanan S
772dabe7e4 fix: defer inlineData parts to end of array for parallel tool calls (#91)
When multiple tool_results contain images, the inlineData parts were
being interleaved between functionResponse parts, breaking Claude's API
requirement that functionResponse parts be consecutive.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-14 21:04:36 +05:30
behemoth-phucnm
d33de409d4 docs: fix misleading tokenizer comments 2026-01-14 19:31:43 +07:00
minhphuc429
7da7e887bf feat: use official tokenizers for 99.99% accuracy
Replace gpt-tokenizer with model-specific official tokenizers:
- Claude models: @anthropic-ai/tokenizer (official Anthropic tokenizer)
- Gemini models: @lenml/tokenizer-gemini (GemmaTokenizer)

Changes:
- Add @anthropic-ai/tokenizer and @lenml/tokenizer-gemini dependencies
- Remove gpt-tokenizer dependency
- Update count-tokens.js with model-aware tokenization
- Use getModelFamily() to select appropriate tokenizer
- Lazy-load Gemini tokenizer (138MB) on first use
- Default to local estimation for all content types (no API calls)

Tested with all supported models:
- claude-sonnet-4-5, claude-opus-4-5-thinking, claude-sonnet-4-5-thinking
- gemini-3-flash, gemini-3-pro-low, gemini-3-pro-high
2026-01-14 16:04:13 +07:00
minhphuc429
2bdecf6e96 fix: ensure account manager initialized for count_tokens
- Add ensureInitialized() call before count_tokens handler
- Use hybrid approach: local estimation for text, API for images/docs
- This prevents "No accounts available" error on first request
2026-01-14 15:43:25 +07:00
minhphuc429
df81ba5632 feat: use API-based token counting for 100% accuracy
Switch from local estimation (gpt-tokenizer) to API-based counting
via Google Cloud Code API for accurate token counts. Falls back to
local estimation if API call fails.
2026-01-14 15:36:47 +07:00
minhphuc429
acc228b920 feat: implement /v1/messages/count_tokens endpoint
Add Anthropic-compatible token counting endpoint using hybrid approach:
- Local estimation with gpt-tokenizer for text content (~95% accuracy)
- API-based counting for complex content (images, documents)
- Automatic fallback to local estimation on API errors

This resolves warnings in LiteLLM and other clients that rely on
pre-request token counting.
2026-01-14 15:32:27 +07:00
Badri Narayanan S
12d196f6a0 refactor: centralize TEST_MODELS and DEFAULT_PRESETS in constants.js
- Move TEST_MODELS and DEFAULT_PRESETS to src/constants.js as single source of truth
- Update test-models.cjs helper to use dynamic import from constants
- Make getTestModels() and getModels() async functions
- Update all test files to await async model config loading
- Remove duplicate THINKING_MODELS and getThinkingModels() from test helper
- Make thinking tests more lenient for Gemini (doesn't always produce thinking blocks)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 19:20:57 +05:30
Badri Narayanan S
632536e2d7 fix: use configured cooldown as cap for rate limit wait times
- Cooldown now caps API-provided reset times instead of being a fallback
- Fixed misleading UI descriptions for cooldown settings
- Removed unused cooldownDurationMs from settings object
- Updated default fallback values in frontend to 10s

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 18:28:52 +05:30
Badri Narayanan S
49e536e9a9 fix: reduce default cooldown from 60s to 10s
Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 18:16:30 +05:30
Badri Narayanan S
70fd1baaa8 fix: improve loadCodeAssist for Google One AI Pro accounts
- Add separate LOAD_CODE_ASSIST_ENDPOINTS (prod first) and
  LOAD_CODE_ASSIST_HEADERS (google-api-nodejs-client User-Agent)
- Add duetProject to metadata for project discovery
- Silent fallback when API returns success but no project
  (matches opencode-antigravity-auth behavior)
- Only warn when all endpoints fail with actual errors

Fixes #114, addresses discussion #113

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 18:11:45 +05:30
Badri Narayanan S
99a06632ea Merge pull request #102 from liuwuyu118/fix/image-response-and-ui-performance
fix: add image response support
2026-01-13 17:17:39 +05:30
董飞祥
6172f5ef10 feat: add API key authentication for /v1/* endpoints 2026-01-13 16:46:31 +08:00
liuwuyu118
860c0d6c2d fix: add image response support
Convert Google's inlineData format to Anthropic's image format:
- response-converter.js: Handle inlineData in non-streaming responses
- sse-parser.js: Parse inlineData for thinking models
- sse-streamer.js: Stream inlineData as image content blocks

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 11:17:17 +08:00
simon-ami
e24dff279c feat(webui): add configuration presets for Claude CLI
- Add backend storage logic in `src/utils/claude-config.js` to save/load/delete presets
- Add API endpoints (`GET`, `POST`, `DELETE`) for presets in `src/webui/index.js`
- Update `public/views/settings.html` with new Presets UI card and modals
- Update `public/js/components/claude-config.js` with auto-load logic and unsaved changes protection
- Add translations (EN/ZH) for new UI elements in `public/js/store.js`
- Add integration tests in `tests/frontend/test-frontend-settings.cjs`
- Update compiled CSS

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-12 11:59:32 +01:00
copilot-swe-agent[bot]
cd594f6e82 Address code review: use constant array for proxy env vars and clean empty env
Co-authored-by: simon-ami <102378134+simon-ami@users.noreply.github.com>
2026-01-11 11:56:01 +00:00
copilot-swe-agent[bot]
8eba68e47a Add Restore Default Claude CLI button to settings page
Co-authored-by: simon-ami <102378134+simon-ami@users.noreply.github.com>
2026-01-11 11:54:13 +00:00
Badri Narayanan S
325acdba8c fix: preserve tool_use stop reason from being overwritten by finishReason
When a tool call is made, stopReason is set to 'tool_use'. However, when
finishReason: STOP arrives later, it was overwriting stopReason back to
'end_turn', breaking multi-turn tool conversations in clients like OpenCode.

Fix: Initialize stopReason to null and only set it from finishReason if
not already set. This ensures tool_use is preserved once detected.

Fixes #96

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-11 11:52:35 +05:30
Wha1eChai
14ef814c1a merge: sync with upstream/main (5xx error fallback feature) 2026-01-11 02:57:59 +08:00
Wha1eChai
bda9623f3a feat(webui): optimize CSS build system and enhance UI quality 2026-01-11 02:57:42 +08:00
Badri Narayanan S
6868bf217c Merge pull request #47 from Wha1eChai/feature/webui
feat: Add Web UI for account and quota management
2026-01-10 22:21:57 +05:30
Tiago Rodrigues
0b477c2552 feat: fallback to alternate model when max retries exceeded with 5xx errors
When all accounts fail with HTTP 500/503 errors (e.g., Google API returning
'Unknown Error' for Claude models on large conversations), the proxy now
attempts to use a fallback model if --fallback is enabled.

This enables graceful degradation when:
- All accounts are exhausted due to 5xx errors (not just rate limits)
- Claude models fail on very large conversations
- The API has temporary issues with specific models

The fallback uses the existing MODEL_FALLBACK_MAP configuration:
- claude-opus-4-5-thinking -> gemini-3-pro-high
- claude-sonnet-4-5-thinking -> gemini-3-flash

Relates to #88
2026-01-10 12:08:53 +00:00
Wha1eChai
ee6d222e4d feat(webui): add subscription tier and quota visualization
Backend:
2026-01-10 06:05:17 +08:00
Wha1eChai
369a66e8cf Merge branch 'main' into feature/webui 2026-01-10 04:46:30 +08:00
Wha1eChai
71c7c2e423 feat(webui): enhance settings UI, persistence and documentation
- Update CLAUDE.md with comprehensive WebUI architecture and API documentation
- Improve settings UI with searchable model dropdowns and visual family indicators
- Migrate usage statistics persistence to user config directory with auto-migration
- Refactor server request handling and fix model suffix logic
2026-01-10 04:22:59 +08:00
Badri Narayanan S
f1e945a7e6 change cleanSchemaForGemini to cleanSchema 2026-01-10 00:35:50 +05:30
Tiago Rodrigues
90214c43b0 fix: convert schema types to Google uppercase format (fixes #82)
The /compact command was failing with 'Proto field is not repeating,
cannot start list' error for Claude models because tool schemas were
sent with lowercase JSON Schema types (array, object, string) but
Google's Cloud Code API expects uppercase protobuf types (ARRAY,
OBJECT, STRING).

Changes:
- Add toGoogleType() function to convert JSON Schema types to Google format
- Add Phase 5 to cleanSchemaForGemini() for type conversion
- Apply cleanSchemaForGemini() for ALL models (not just Gemini) since
  all requests go through Cloud Code API which validates schema format
- Add comprehensive test suite with 10 tests covering nested arrays,
  complex schemas, and real-world Claude Code tool scenarios

Fixes #82
2026-01-09 18:36:19 +00:00
Wha1eChai
169e18402f merge: sync with origin/feature/webui after upstream merge 2026-01-09 18:23:52 +08:00
Wha1eChai
f2f0a7452e merge: integrate upstream/main (v1.2.15) into feature/webui
- Resolved conflict in src/constants.js: kept config-driven approach

- Adopted upstream 10-second cooldown default

- Added MAX_EMPTY_RESPONSE_RETRIES constant from upstream

- Incorporated new test files and GitHub issue templates
2026-01-09 18:08:45 +08:00
Wha1eChai
a914821d49 perf(webui): refactor dashboard modules and optimize API performance 2026-01-09 17:58:09 +08:00
Badri Narayanan S
4c5236d4b3 fix: filter Antigravity system prompt from model responses
- Add [ignore] tags around system instruction to prevent model from
  identifying as "Antigravity" when asked "Who are you?"
- Replace full system instruction with minimal version used by
  CLIProxyAPI/gcli2api to reduce token usage and improve response quality

Fixes #76

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 14:10:41 +05:30
SvDp
45755bfa18 fix: add optimistic reset for transient 429 rate limit errors
Fixes issue #71 - 'No accounts available' error when API returns 429

The Google Cloud Code API can return 429 RESOURCE_EXHAUSTED errors even
when accounts have quota available due to:
- Temporary API load/throttling
- Per-minute request limits (not per-day quota)
- Transient backend issues

This fix adds:
1. A 500ms buffer after waiting for rate limits to expire
2. Optimistic rate limit reset when all accounts appear stuck
3. Retry logic that clears rate limits and tries again

The fix works in conjunction with the server-level optimistic reset
that already exists, providing multiple layers of protection against
false 'No accounts available' errors.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-08 21:31:50 +05:30
Wha1eChai
c9c5e7d486 feat(webui): add hot-reload account management with OAuth support 2026-01-08 23:52:31 +08:00
Badri Narayanan S
5f6ce1b97d Update daily Cloud Code endpoint to production URL
Remove sandbox subdomain from daily-cloudcode-pa endpoint.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-08 20:24:23 +05:30
Badri Narayanan S
a696ed0872 Merge pull request #64 from BrunoMarc/fix/empty-response-retry
fix: add retry mechanism for empty API responses
2026-01-08 17:38:18 +05:30
Wha1eChai
217053839f feat(webui): Enhance dashboard, global styles, and settings module
## Dashboard Enhancements
- Add Request Volume trend chart with Chart.js line graph
  - Support Family/Model display modes for aggregation levels
  - Show Total/Today/1H usage statistics
  - Hierarchical filter dropdown with Smart select (Top 5 by 24h usage)
  - Persist chart preferences to localStorage
- Improve account health detection logic
  - Core models (sonnet/opus/pro/flash) require >5% quota to be healthy
  - Dynamic quota ring chart supporting any model family
- Unify table styles with standard-table class

## Global Style Refactoring
- Add CSS variable system for theming
  - Space color scale (950/900/850/800/border)
  - Neon accent colors (purple/green/cyan/yellow/red)
  - Text hierarchy (main/dim/muted/bright)
  - Chart palette (16 colors)
- Add unified component classes
  - .view-container for consistent page layouts
  - .section-header/.section-title/.section-desc
  - .standard-table for table styling
- Update scrollbar, nav-item, progress-bar to use theme variables

## Settings Module Extensions
- Add model mapping column in Models tab
- Enhance model selectors with family color indicators
- Support horizontal scroll for tabs on narrow screens
- Add defaultCooldownMs and maxWaitBeforeErrorMs config options

## New Module
- Add src/modules/usage-stats.js for request tracking
  - Track /v1/messages and /v1/chat/completions endpoints
  - Hierarchical storage: { hour: { family: { model: count } } }
  - Auto-save every minute, 30-day retention
  - GET /api/stats/history endpoint for dashboard chart

## Backend Changes
- Add direct account manipulation helpers (bypass AccountManager)
- Add POST /api/config/password endpoint for WebUI password change
- Auto-reload AccountManager after account operations
- Use CSS variables in OAuth callback pages

## Other
- Update .gitignore for runtime data directory
- Add i18n keys for new UI elements (EN/zh_CN)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-08 19:04:43 +08:00
Badri Narayanan S
f34aa50ba4 Antigravity compatibility to fix antigravity usage 2026-01-08 10:24:54 +05:30
BrunoMarc
1c80c8ba52 fix: address second round code review feedback
Issues found by Claude Opus 4.5 + Gemini 3 Pro:

HIGH PRIORITY FIXES:
- Mark account rate-limited when 429 occurs during retry (was losing resetMs)
- Add exponential backoff between retries (500ms, 1000ms, 2000ms)
- Fix 5xx handling: don't pass error response to streamer, refetch instead
- Use recognizable error messages (429/401) for isRateLimitError/isAuthError

MEDIUM PRIORITY FIXES:
- Refactor while loop to for loop for clearer retry semantics
- Simplify logic flow with early returns

Code review by: Claude Opus 4.5 + Gemini 3 Pro Preview

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 18:21:25 -03:00
BrunoMarc
05cd80ebb5 fix: address code review feedback
- Move MAX_EMPTY_RESPONSE_RETRIES to constants.js for consistency
- Handle 429/401/5xx errors properly during retry fetch
- Use proper message ID format (crypto.randomBytes) instead of Date.now()
- Add crypto import for UUID generation

Code review by: Gemini 3 Pro Preview + Claude Opus 4.5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 18:16:09 -03:00
BrunoMarc
49480847b6 fix: add retry mechanism for empty API responses
When Claude Code sends requests with large thinking_budget values,
the model may spend all tokens on "thinking" and return empty responses,
causing Claude Code to stop mid-conversation.

This commit adds a retry mechanism that:
- Throws EmptyResponseError instead of emitting fake message on empty response
- Retries up to 2 times before giving up
- Emits fallback message only after all retries are exhausted

Changes:
- src/errors.js: Added EmptyResponseError class and isEmptyResponseError()
- src/cloudcode/sse-streamer.js: Throw error instead of yielding fake message
- src/cloudcode/streaming-handler.js: Added retry loop with fallback

Tested for 6+ hours with 1,884 API requests and 88% recovery rate
on empty responses.

Fixes #61

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 18:11:03 -03:00
Badri Narayanan S
56e5042700 Merge branch 'main' into feature/webui 2026-01-07 02:01:23 +05:30
Badri Narayanan S
63fa90c04b set cooldown to 10 seconds 2026-01-07 00:33:42 +05:30
Badri Narayanan S
57ba5f9c1c setting cooldown back to 30 seconds 2026-01-07 00:20:22 +05:30
Badri Narayanan S
5b70b7703e Changed default cooldown time to 10 seconds 2026-01-06 22:22:32 +05:30
Badri Narayanan S
59d0a92b37 Merge pull request #55 from jgor20/fix/oauth-callback-utf8-encoding
fix(oauth): add UTF-8 encoding to callback HTML pages
2026-01-06 21:49:02 +05:30
jgor20
df9b935329 fix(auth): add UTF-8 charset to OAuth callback HTML responses
Ensure proper encoding for international characters in error and success pages
by specifying charset=utf-8 in Content-Type headers and adding meta charset tags.
2026-01-05 01:43:15 +00:00
jgor20
e29cd5fa9d refactor(auth): use NativeModuleError for native module load failures
Replace generic Error instances with NativeModuleError in loadDatabaseModule
to provide more structured error information, including rebuild status and
restart requirements. Update getAuthStatus to re-throw NativeModuleError
instances without wrapping.
2026-01-05 01:20:35 +00:00
jgor20
b90eb63f22 feat(errors): add NativeModuleError for native module version mismatches
Add a new error class to handle native module errors, including version mismatches and rebuild requirements. This supports the auto-rebuild functionality by providing structured error information for rebuild success and restart needs.
2026-01-05 01:20:28 +00:00