Add model validation cache with 5-minute TTL to reject invalid model IDs
upfront instead of sending them to the API. This provides better error
messages and avoids unnecessary API calls.
- Add MODEL_VALIDATION_CACHE_TTL_MS constant (5 min)
- Add isValidModel() with lazy cache population
- Warm cache when listModels() is called
- Validate model ID in /v1/messages before processing
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
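Note: a minimal sketch of the lazy, TTL-based validation described in this commit. Only `MODEL_VALIDATION_CACHE_TTL_MS` and `isValidModel()` are named in the message; `createModelValidator`, `fetchModelIds`, `warm`, and the injectable `now` clock are hypothetical names used for illustration, not the project's actual code.

```javascript
// 5-minute TTL, as stated in the commit message.
const MODEL_VALIDATION_CACHE_TTL_MS = 5 * 60 * 1000;

// fetchModelIds: async function returning an array of model ID strings (assumed shape).
// now: injectable clock so the expiry logic can be exercised without waiting.
function createModelValidator(fetchModelIds, now = Date.now) {
    let cache = { ids: new Set(), fetchedAt: null };

    // Replace the cached IDs; a listModels()-style caller could use this to warm the cache.
    function warm(ids) {
        cache = { ids: new Set(ids), fetchedAt: now() };
    }

    async function isValidModel(modelId) {
        const stale = cache.fetchedAt === null ||
            now() - cache.fetchedAt >= MODEL_VALIDATION_CACHE_TTL_MS;
        if (stale) {
            // Lazy population: fetch only on first use or after the TTL expires.
            warm(await fetchModelIds());
        }
        return cache.ids.has(modelId);
    }

    return { isValidModel, warm };
}
```

A request handler could call `isValidModel(body.model)` before doing any other work and return a 400-style error when it resolves to `false`.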
400 errors (INVALID_ARGUMENT) are client errors that won't be fixed by
switching accounts. Previously the proxy would cycle through all accounts
before returning the error. Now it fails immediately.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
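Note: a rough sketch of the fail-fast behavior described in this commit. `sendWithAccountFailover`, the `send` callback, and the `err.status` field are assumptions made for illustration; only the rule "a 400 is not retried on another account" comes from the message.

```javascript
// Try each account in turn, but stop immediately on a 400 (client error).
async function sendWithAccountFailover(accounts, send) {
    let lastError = null;
    for (const account of accounts) {
        try {
            return await send(account);
        } catch (err) {
            // A malformed request fails the same way on every account, so surface it now.
            if (err.status === 400) throw err;
            lastError = err; // other errors: fall through to the next account
        }
    }
    throw lastError ?? new Error('No accounts available');
}
```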
Use startsWith() for count_tokens URL check to match requests with
query parameters (e.g., ?beta=true).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
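Note: an illustration of why a prefix check is needed here. The path constant and the helper name `isCountTokensRequest` are assumptions for this sketch; the commit only states that `startsWith()` replaced the previous check.

```javascript
// Assumed endpoint path; adjust to whatever the router actually uses.
const COUNT_TOKENS_PATH = '/v1/messages/count_tokens';

// A strict equality check misses URLs that carry a query string.
function isCountTokensRequest(url) {
    return url.startsWith(COUNT_TOKENS_PATH);
}
```

With this check, `/v1/messages/count_tokens?beta=true` matches, whereas `url === COUNT_TOKENS_PATH` would not.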
- Make mode detection more robust (handle ::1, 0.0.0.0)
- Add getProxyPort() to parse port from ANTHROPIC_BASE_URL dynamically
- Add i18n translation keys for mode toggle in all 5 languages
- Update settings.html to use translation keys and dynamic port
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
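Note: one possible shape for the port parsing and local-host detection mentioned in this commit, using the standard `URL` class. `getProxyPort()` is named in the message; `isLocalBaseUrl`, `LOCAL_HOSTS`, and the `DEFAULT_PROXY_PORT` fallback value are hypothetical.

```javascript
const DEFAULT_PROXY_PORT = 8080; // assumed fallback, not taken from the commit
const LOCAL_HOSTS = new Set(['localhost', '127.0.0.1', '::1', '0.0.0.0']);

// Parse the port from a base URL such as the value of ANTHROPIC_BASE_URL.
function getProxyPort(baseUrl) {
    try {
        const url = new URL(baseUrl);
        if (url.port) return Number(url.port);
        // URL.port is empty when the scheme's default port is in use.
        return url.protocol === 'https:' ? 443 : 80;
    } catch {
        return DEFAULT_PROXY_PORT; // unset or malformed URL
    }
}

// Treat loopback and wildcard addresses as "pointing at the local proxy".
function isLocalBaseUrl(baseUrl) {
    try {
        // URL.hostname keeps the brackets around IPv6 literals, so strip them.
        const hostname = new URL(baseUrl).hostname.replace(/^\[|\]$/g, '');
        return LOCAL_HOSTS.has(hostname);
    } catch {
        return false;
    }
}
```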
Resolved merge conflicts in public/views/settings.html:
- Fixed HTML entity escaping for quote characters in presetHint text
- Fixed HTML entity escaping for pendingPresetName text
- Update src/index.js to use HOST environment variable for main server
- Update src/auth/oauth.js to use HOST environment variable for OAuth callback server
- Add diagnostic logging to show actual bound address on startup
- Update startup banner to reflect correct host URL
Co-Authored-By: Claude (gemini-3-flash[1m]) <noreply@anthropic.com>
- Add version display next to "CLAUDE PROXY SYSTEM" in navbar
- Replace version with GitHub icon link in footer
- Update footer styling with larger text and better visibility
- Add getPackageVersion() utility function in src/utils/helpers.js
- Display version in CLI startup banner (e.g., "v2.4.2")
- Display version in WebUI navbar next to "CLAUDE PROXY SYSTEM"
- Refactor WebUI to use shared getPackageVersion() utility
- Update footer with GitHub icon and link
- Change logging middleware to use req.originalUrl instead of req.path,
  which was mangled by Express wildcard catch-all path stripping
- Suppress Chrome DevTools /.well-known/ requests from logs (debug-only)
- Update data-store.js to filter out disabled accounts in computeQuotaRows()
- Update data-store.js to filter out disabled accounts in getUnfilteredQuotaData()
- Ensures Global Quota average and Account Distribution only reflect active accounts
Co-Authored-By: Claude <noreply@anthropic.com>
Claude Code CLI sends cache_control on text, thinking, tool_use, and
tool_result blocks for prompt caching. Cloud Code API rejects these
with "Extra inputs are not permitted".
- Add cleanCacheControl() to proactively strip cache_control at pipeline entry
- Add sanitizeTextBlock() and sanitizeToolUseBlock() for defense-in-depth
- Update reorderAssistantContent() to use block sanitizers
- Add test-cache-control.cjs with multi-model test coverage
- Update frontend dashboard tests to match current UI design
- Update strategy tests to match v2.4.0 fallback behavior
- Update CLAUDE.md and README.md with recent features
Inspired by Antigravity-Manager's clean_cache_control_from_messages() pattern.
Co-Authored-By: Claude <noreply@anthropic.com>
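Note: a minimal sketch of stripping `cache_control` at pipeline entry, as described in this commit. `cleanCacheControl()` is named in the message, but this body and the `stripBlock` helper are assumptions about how it could work, not the actual implementation. It assumes Anthropic-style messages whose `content` is either a string or an array of blocks.

```javascript
// Return a copy of a content block without cache_control, recursing into
// nested content arrays (e.g. the content of a tool_result block).
function stripBlock(block) {
    if (block === null || typeof block !== 'object') return block;
    const { cache_control, ...rest } = block;
    if (Array.isArray(rest.content)) {
        rest.content = rest.content.map(stripBlock);
    }
    return rest;
}

// Return new messages with cache_control removed; the input is not mutated.
function cleanCacheControl(messages) {
    return messages.map((message) => {
        if (!Array.isArray(message.content)) return message; // plain string content
        return { ...message, content: message.content.map(stripBlock) };
    });
}
```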
This commit addresses "Max retries exceeded" errors during stress testing where
all accounts would become exhausted simultaneously due to short per-second rate
limits triggering cascading failures.
## Rate Limit Parser (`rate-limit-parser.js`)
- Remove 2s buffer enforcement that caused cascading failures when API returned
  short reset times (200-600ms). Now adds 200ms buffer for sub-500ms resets.
- Add `parseRateLimitReason()` for smart backoff based on error type:
  QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR
## Message/Streaming Handlers
- Add per-account+model rate limit state tracking with exponential backoff
- For short rate limits (< 1 second), wait and retry on same account instead
  of switching; this prevents thundering herd when all accounts hit per-second limits
- Add throttle wait support for fallback modes (emergency/lastResort)
- Add `calculateSmartBackoff()` with progressive tiers by error type
## HybridStrategy (`hybrid-strategy.js`)
- Refactor `#getCandidates()` to return 4 fallback levels:
  - `normal`: All filters pass (health, tokens, quota)
  - `quota`: Bypass critical quota check
  - `emergency`: Bypass health check when ALL accounts unhealthy
  - `lastResort`: Bypass BOTH health AND token bucket checks
- Add throttle wait times: 500ms for lastResort, 250ms for emergency
- Fix LRU calculation to use seconds (matches opencode-antigravity-auth)
## Health Tracker
- Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours)
## Account Manager
- Add consecutive failure tracking: `getConsecutiveFailures()`,
  `incrementConsecutiveFailures()`, `resetConsecutiveFailures()`
- Add cooldown mechanism separate from rate limits with `CooldownReason`
- Reset consecutive failures on successful request
## Base Strategy
- Add `isAccountCoolingDown()` check in `isAccountUsable()`
## Constants
- Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS`
- Add `BACKOFF_BY_ERROR_TYPE` for smart backoff
- Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff
- Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops
- Increase `MAX_CAPACITY_RETRIES` from 3 to 5
- Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s
## Frontend
- Remove `capacityRetryDelayMs` config (replaced by progressive tiers)
- Update default `maxCapacityRetries` display from 3 to 5
## Testing
- Add `tests/stress-test.cjs` for concurrent request stress testing
Co-Authored-By: Claude <noreply@anthropic.com>
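Note: a sketch of how progressive backoff tiers by error type could fit together with the short-reset handling described in this commit. `calculateSmartBackoff()` and `MIN_BACKOFF_MS` are named in the message; every numeric tier value, the floor value, `EXAMPLE_BACKOFF_TIERS_MS`, `applyResetBuffer`, and `shouldWaitOnSameAccount` are hypothetical. Only the 200ms buffer for sub-500ms resets and the 1-second same-account threshold come from the text.

```javascript
// Hypothetical tier values for illustration; the commit does not list the real ones.
const EXAMPLE_BACKOFF_TIERS_MS = {
    QUOTA_EXHAUSTED: [60000, 300000, 1800000],
    RATE_LIMIT_EXCEEDED: [1000, 2000, 5000],
    MODEL_CAPACITY_EXHAUSTED: [5000, 10000, 20000],
    SERVER_ERROR: [2000, 5000, 10000],
};
const MIN_BACKOFF_MS = 1000; // assumed floor; guards against "Available in 0s" loops

// Pick a tier by consecutive failure count, clamped to the last tier.
// Unknown reasons fall back to the SERVER_ERROR tiers (an assumption).
function calculateSmartBackoff(reason, consecutiveFailures) {
    const tiers = EXAMPLE_BACKOFF_TIERS_MS[reason] ?? EXAMPLE_BACKOFF_TIERS_MS.SERVER_ERROR;
    const index = Math.min(Math.max(consecutiveFailures, 0), tiers.length - 1);
    return Math.max(tiers[index], MIN_BACKOFF_MS);
}

// Per the commit: add a 200ms buffer only when the server's reset time is under 500ms.
function applyResetBuffer(resetMs) {
    return resetMs < 500 ? resetMs + 200 : resetMs;
}

// Per the commit: rate limits under 1 second are waited out on the same account.
function shouldWaitOnSameAccount(resetMs) {
    return resetMs < 1000;
}
```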
- Update column names from identity/projectId to accountEmail/source/tier
- Change deleteAccount to confirmDeleteAccount (uses confirmation modal)
- Fix modal tests to check index.html instead of accounts.html partial
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: add manual OAuth flow support in WebUI
* fix: reset add account modal state on close
* feat: display custom API key in startup banner
* fix: move translations to separate files and optimize import API
* fix: remove orphaned model-manager.js and cleanup callback server on manual auth
---------
Co-authored-by: Badri Narayanan S <59133612+badrisnarayanan@users.noreply.github.com>
When all accounts are rate-limited or token-exhausted, the retry loop
was incorrectly counting the wait time as a failed attempt. This caused
premature "Max retries exceeded" errors when the proxy was only
waiting for accounts to become available.
- Add attempt-- after sleeping for rate limits or strategy waits
- Add #diagnoseNoCandidates() to hybrid strategy for better logging
- Add getTimeUntilNextToken() and getMinTimeUntilToken() to token tracker
- Return waitMs from hybrid strategy when all accounts are token-blocked
Co-Authored-By: Claude <noreply@anthropic.com>
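Note: a simplified sketch of the `attempt--` fix described in this commit. The function name `runWithRetries`, the injected `pickAccount`/`send`/`sleep` callbacks, and the `maxTotalWaitMs` guard are assumptions for illustration; the guard is added here only so the sketch cannot loop forever and is not claimed to exist in the project.

```javascript
// pickAccount() returns { account, waitMs }; account is null when every
// account is rate-limited or token-blocked and the caller should wait waitMs.
async function runWithRetries({ maxAttempts, pickAccount, send, sleep, maxTotalWaitMs = 120000 }) {
    let waitedMs = 0;
    let lastError = null;
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const { account, waitMs } = pickAccount();
        if (!account) {
            if (waitedMs + waitMs > maxTotalWaitMs) break; // hypothetical safety cap
            await sleep(waitMs);
            waitedMs += waitMs;
            attempt--; // waiting is not a failed attempt, so give the slot back
            continue;
        }
        try {
            return await send(account);
        } catch (err) {
            lastError = err; // a real failure consumes the attempt
        }
    }
    const detail = lastError ? `: ${lastError.message}` : '';
    throw new Error(`Max retries exceeded${detail}`);
}
```

Without the decrement, five consecutive waits with `maxAttempts` of 2 would exhaust the loop before any request was sent.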
The POST /api/config endpoint was missing validation for several
Advanced Server Settings fields, causing the sliders to fail silently.
Added support for: rateLimitDedupWindowMs, maxConsecutiveFailures,
extendedCooldownMs, capacityRetryDelayMs, maxCapacityRetries.
Fixes #181
Co-Authored-By: Claude <noreply@anthropic.com>
The data store's fetchVersion() was never called, so maxAccounts stayed
at the default value of 10. Consolidated into the global store's
fetchVersion() which is called on init.
Co-Authored-By: Claude <noreply@anthropic.com>
The hybrid strategy now considers account quota levels when selecting
accounts, preventing any single account from being drained to 0%.
- Add QuotaTracker class to track per-account quota levels
- Exclude accounts with critical quota (<5%) from selection
- Add quota component to scoring formula (weight: 3)
- Fall back to critical accounts when no alternatives exist
- Add 18 new tests for quota-aware selection
Scoring formula: Health×2 + Tokens×5 + Quota×3 + LRU×0.1
An attempt at resolving badrisnarayanan/antigravity-claude-proxy#171
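Note: a small sketch of the scoring formula and critical-quota exclusion described above. The weights and the 5% threshold come from the message; the account shape (`health`, `tokens`, `quota`, `lru`, `quotaFraction`), the assumption that each component is already a precomputed number, and the helper names are hypothetical.

```javascript
// Weights from the commit message: Health×2 + Tokens×5 + Quota×3 + LRU×0.1
const WEIGHTS = { health: 2, tokens: 5, quota: 3, lru: 0.1 };
const CRITICAL_QUOTA_FRACTION = 0.05; // accounts below 5% remaining are "critical"

// Each component is assumed to be precomputed by its tracker.
function scoreAccount(account) {
    return account.health * WEIGHTS.health +
        account.tokens * WEIGHTS.tokens +
        account.quota * WEIGHTS.quota +
        account.lru * WEIGHTS.lru;
}

// Prefer non-critical accounts; fall back to critical ones only when
// no alternative exists. Returns null for an empty list.
function selectAccount(accounts) {
    if (accounts.length === 0) return null;
    const healthyQuota = accounts.filter((a) => a.quotaFraction >= CRITICAL_QUOTA_FRACTION);
    const pool = healthyQuota.length > 0 ? healthyQuota : accounts;
    return pool.reduce((best, a) => (scoreAccount(a) > scoreAccount(best) ? a : best));
}
```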
Adds `maxAccounts` configuration parameter to control the maximum number of Google accounts.
**Changes:**
- New config field `maxAccounts` (default: 10, range: 1-100)
- Settings page: slider control for adjusting limit
- Accounts page: counter badge (e.g., "8/10") with visual feedback
- Add button disabled when limit reached
- Server-side validation on account creation
**Breaking Changes:** None
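Note: an illustrative sketch of the server-side limit check described above. The default of 10 and the 1-100 range come from the description; `normalizeMaxAccounts` and `canAddAccount` are hypothetical helper names, and the choice to throw on out-of-range values is an assumption.

```javascript
const MAX_ACCOUNTS_DEFAULT = 10;
const MAX_ACCOUNTS_MIN = 1;
const MAX_ACCOUNTS_MAX = 100;

// Validate a configured maxAccounts value, applying the default when unset.
function normalizeMaxAccounts(value) {
    if (value === undefined || value === null) return MAX_ACCOUNTS_DEFAULT;
    if (!Number.isInteger(value) || value < MAX_ACCOUNTS_MIN || value > MAX_ACCOUNTS_MAX) {
        throw new RangeError(`maxAccounts must be an integer between ${MAX_ACCOUNTS_MIN} and ${MAX_ACCOUNTS_MAX}`);
    }
    return value;
}

// Server-side guard for account creation; the UI can use the same rule
// to disable the Add button and render a badge such as "8/10".
function canAddAccount(currentCount, maxAccounts) {
    return currentCount < maxAccounts;
}
```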
Adds logging when 403/404 errors trigger endpoint fallback (daily → prod).
The retry behavior was already working but silently - now it's visible.
Matches opencode-antigravity-auth error handling behavior.
Co-Authored-By: Claude <noreply@anthropic.com>
The version was stuck at "1.0.0" because fetchVersion() was only called
when initialLoad was true, but loadFromCache() set initialLoad to false
before fetchData() ran. Now version is fetched unconditionally in the
global store's init().
Fixes #144
Co-Authored-By: Claude <noreply@anthropic.com>
Fixes regression where /v1/models returned 500 error because
pickNext() method was removed in v2.2.x refactor.
Fixes #164
Co-Authored-By: Claude <noreply@anthropic.com>
paidTier values like g1-pro-tier and g1-ultra-tier are rejected by
the onboardUser API with 400 INVALID_ARGUMENT. This matches the
opencode-antigravity-auth reference which only uses allowedTiers.
Co-Authored-By: Claude <noreply@anthropic.com>
- Store project IDs in composite refresh token format (refreshToken|projectId|managedProjectId)
- Add parseRefreshParts() and formatRefreshParts() for token handling
- Extract and persist subscription tier during project discovery
- Fetch subscription in blocking mode when missing from cached accounts
- Fix conditional duetProject setting to match reference implementation
- Export parseTierId() for reuse across modules
Co-Authored-By: Claude <noreply@anthropic.com>
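Note: a minimal sketch of the composite refresh token format (`refreshToken|projectId|managedProjectId`) described in this commit. `parseRefreshParts()` and `formatRefreshParts()` are named in the message, but these bodies are assumptions; in particular, treating empty segments as `undefined` and omitting a trailing empty `managedProjectId` are choices made for this sketch.

```javascript
// Split "refreshToken|projectId|managedProjectId" into its parts.
// A bare refresh token (no "|") yields undefined project IDs.
function parseRefreshParts(composite) {
    const [refreshToken = '', projectId = '', managedProjectId = ''] = (composite ?? '').split('|');
    return {
        refreshToken,
        projectId: projectId || undefined,
        managedProjectId: managedProjectId || undefined,
    };
}

// Join the parts back into the composite string.
function formatRefreshParts({ refreshToken, projectId, managedProjectId }) {
    const base = `${refreshToken}|${projectId ?? ''}`;
    return managedProjectId ? `${base}|${managedProjectId}` : base;
}
```

Round-tripping `formatRefreshParts(parseRefreshParts(s))` preserves a fully populated composite string, so project IDs survive a save and reload of the stored token.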