Count responses with thinking content (but no text) as successful,
and validate actual response status instead of hardcoding 200.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude Code CLI sends cache_control on text, thinking, tool_use, and
tool_result blocks for prompt caching. Cloud Code API rejects these
with "Extra inputs are not permitted".
- Add cleanCacheControl() to proactively strip cache_control at pipeline entry
- Add sanitizeTextBlock() and sanitizeToolUseBlock() for defense-in-depth
- Update reorderAssistantContent() to use block sanitizers
- Add test-cache-control.cjs with multi-model test coverage
- Update frontend dashboard tests to match current UI design
- Update strategy tests to match v2.4.0 fallback behavior
- Update CLAUDE.md and README.md with recent features
Inspired by Antigravity-Manager's clean_cache_control_from_messages() pattern.
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses "Max retries exceeded" errors during stress testing where
all accounts would become exhausted simultaneously due to short per-second rate
limits triggering cascading failures.
## Rate Limit Parser (`rate-limit-parser.js`)
- Remove 2s buffer enforcement that caused cascading failures when API returned
short reset times (200-600ms). Now adds 200ms buffer for sub-500ms resets
- Add `parseRateLimitReason()` for smart backoff based on error type:
QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR
## Message/Streaming Handlers
- Add per-account+model rate limit state tracking with exponential backoff
- For short rate limits (< 1 second), wait and retry on same account instead
of switching - prevents thundering herd when all accounts hit per-second limits
- Add throttle wait support for fallback modes (emergency/lastResort)
- Add `calculateSmartBackoff()` with progressive tiers by error type
## HybridStrategy (`hybrid-strategy.js`)
- Refactor `#getCandidates()` to return 4 fallback levels:
- `normal`: All filters pass (health, tokens, quota)
- `quota`: Bypass critical quota check
- `emergency`: Bypass health check when ALL accounts unhealthy
- `lastResort`: Bypass BOTH health AND token bucket checks
- Add throttle wait times: 500ms for lastResort, 250ms for emergency
- Fix LRU calculation to use seconds (matches opencode-antigravity-auth)
## Health Tracker
- Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours)
## Account Manager
- Add consecutive failure tracking: `getConsecutiveFailures()`,
`incrementConsecutiveFailures()`, `resetConsecutiveFailures()`
- Add cooldown mechanism separate from rate limits with `CooldownReason`
- Reset consecutive failures on successful request
## Base Strategy
- Add `isAccountCoolingDown()` check in `isAccountUsable()`
## Constants
- Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS`
- Add `BACKOFF_BY_ERROR_TYPE` for smart backoff
- Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff
- Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops
- Increase `MAX_CAPACITY_RETRIES` from 3 to 5
- Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s
## Frontend
- Remove `capacityRetryDelayMs` config (replaced by progressive tiers)
- Update default `maxCapacityRetries` display from 3 to 5
## Testing
- Add `tests/stress-test.cjs` for concurrent request stress testing
Co-Authored-By: Claude <noreply@anthropic.com>
- Update column names from identity/projectId to accountEmail/source/tier
- Change deleteAccount to confirmDeleteAccount (uses confirmation modal)
- Fix modal tests to check index.html instead of accounts.html partial
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: add manual OAuth flow support in WebUI
* fix: reset add account modal state on close
* feat: display custom API key in startup banner
* fix: move translations to separate files and optimize import API
* fix: remove orphaned model-manager.js and cleanup callback server on manual auth
---------
Co-authored-by: Badri Narayanan S <59133612+badrisnarayanan@users.noreply.github.com>
The hybrid strategy now considers account quota levels when selecting
accounts, preventing any single account from being drained to 0%.
- Add QuotaTracker class to track per-account quota levels
- Exclude accounts with critical quota (<5%) from selection
- Add quota component to scoring formula (weight: 3)
- Fall back to critical accounts when no alternatives exist
- Add 18 new tests for quota-aware selection
Scoring formula: Health×2 + Tokens×5 + Quota×3 + LRU×0.1
An attempt at resolving badrisnarayanan/antigravity-claude-proxy#171
Refactor account selection into a strategy pattern with three options:
- Sticky: cache-optimized, stays on same account until rate-limited
- Round-robin: load-balanced, rotates every request
- Hybrid (default): smart distribution using health scores, token buckets, and LRU
The hybrid strategy uses multiple signals for optimal account selection:
health tracking for reliability, client-side token buckets for rate limiting,
and LRU freshness to prefer rested accounts.
Includes WebUI settings for strategy selection and unit tests.
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: preserve whitespace-only chunks in SSE stream
Fixes issue #138 where Claude models would swallow spaces between words because whitespace-only chunks (e.g., " ") were being filtered out as empty.
Changes:
- Modified sse-streamer.js to only skip truly empty strings (""), preserving strings that contain only whitespace.
- Added regression test case in tests/test-streaming-whitespace.cjs to verify whitespace preservation.
* test: add streaming whitespace regression test to main suite
---------
Co-authored-by: walczak <walczak@ial.ruhr>
When Claude Code strips thinking signatures it doesn't recognize,
the proxy would drop unsigned thinking blocks, causing the error
"Expected thinking but found text". This fix detects unsigned
thinking blocks and triggers recovery to close the tool loop.
Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive test suite for /v1/messages/count_tokens endpoint:
- Simple text messages
- Multi-turn conversations
- System prompts (string and array format)
- Tool definitions and tool use/result blocks
- Thinking blocks
- Content arrays with text blocks
- Error handling for invalid requests
- Long text tokenization
Also adds npm script test:counttokens for running tests individually.
- Move TEST_MODELS and DEFAULT_PRESETS to src/constants.js as single source of truth
- Update test-models.cjs helper to use dynamic import from constants
- Make getTestModels() and getModels() async functions
- Update all test files to await async model config loading
- Remove duplicate THINKING_MODELS and getThinkingModels() from test helper
- Make thinking tests more lenient for Gemini (doesn't always produce thinking blocks)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add backend storage logic in `src/utils/claude-config.js` to save/load/delete presets
- Add API endpoints (`GET`, `POST`, `DELETE`) for presets in `src/webui/index.js`
- Update `public/views/settings.html` with new Presets UI card and modals
- Update `public/js/components/claude-config.js` with auto-load logic and unsaved changes protection
- Add translations (EN/ZH) for new UI elements in `public/js/store.js`
- Add integration tests in `tests/frontend/test-frontend-settings.cjs`
- Update compiled CSS
Co-Authored-By: Claude <noreply@anthropic.com>
- Add test-schema-sanitizer.cjs to run-all.cjs test runner
- Add test:sanitizer npm script for running it individually
- Update test to use renamed cleanSchema function
- Fix interleaved thinking test to not require thinking blocks after
tool result (model decides when to use visible thinking)
Co-Authored-By: Claude <noreply@anthropic.com>
The /compact command was failing with 'Proto field is not repeating,
cannot start list' error for Claude models because tool schemas were
sent with lowercase JSON Schema types (array, object, string) but
Google's Cloud Code API expects uppercase protobuf types (ARRAY,
OBJECT, STRING).
Changes:
- Add toGoogleType() function to convert JSON Schema types to Google format
- Add Phase 5 to cleanSchemaForGemini() for type conversion
- Apply cleanSchemaForGemini() for ALL models (not just Gemini) since
all requests go through Cloud Code API which validates schema format
- Add comprehensive test suite with 10 tests covering nested arrays,
complex schemas, and real-world Claude Code tool scenarios
Fixes#82
- Add time range selector (1H/6H/24H/7D/All) with preference persistence
- Implement smart X-axis label formatting for multi-day usage data
- Standardize global component spacing and fix CSS @apply limitations
- Add elegant empty state UI for charts when filtered data is absent
- Update i18n translations for all new dashboard features
- Add test:emptyretry script and include in test suite
- Fix test-interleaved-thinking: use complex prompt to force thinking
- Fix test-multiturn-thinking-tools: make Turn 2 lenient (thinking optional)
- Fix test-multiturn-thinking-tools-streaming: same lenient approach
- Use TEST_MODELS helper instead of hardcoded model ID
Models may skip thinking on obvious next steps - this is valid behavior.
Tests now only require thinking on first turn to verify signatures work.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds comprehensive test for the empty response retry mechanism:
- Verifies EmptyResponseError class exists and works correctly
- Tests basic requests still work (no regression)
- Validates error class behavior and detection
All tests pass successfully.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
Add an optional Web UI for managing accounts and monitoring quotas.
WebUI is implemented as a modular plugin with minimal changes to server.js (only 5 lines added).
## New Features
- Dashboard: Real-time model quota visualization with Chart.js
- Accounts: OAuth-based account management (add/enable/disable/refresh/remove)
- Logs: Live server log streaming via SSE with search and level filtering
- Settings: System configuration with 4 tabs
- Interface: Language (EN/zh_CN), polling interval, log buffer size, display options
- Claude CLI: Proxy connection config, model selection, alias overrides (~/.claude.json)
- Models: Model visibility and ordering management
- Server Info: Runtime info and account config reload
## Technical Changes
- Add src/webui/index.js as modular plugin (all WebUI routes encapsulated)
- Add src/config.js for centralized configuration (~/.config/antigravity-proxy/config.json)
- Add authMiddleware for optional password protection (WEBUI_PASSWORD env var)
- Enhance logger with EventEmitter for SSE log streaming
- Make constants configurable via config.json
- Merge with main v1.2.6 (model fallback, cross-model thinking)
- server.js changes: only 5 lines added to import and mount WebUI module
## Bug Fixes
- Fix Alpine.js $watch error in settings-store.js (not supported in store init)
- Fix "OK" label to "SUCCESS" in logs filter
- Add saveSettings() calls to settings toggles for proper persistence
- Improve Claude CLI config robustness (handle empty/invalid JSON files)
- Add safety check for empty config.env in claude-config component
- Improve config.example.json instructions with clear Windows/macOS/Linux paths
## New Files
- src/webui/index.js - WebUI module with all API routes
- public/ - Complete Web UI frontend (Alpine.js + TailwindCSS + DaisyUI)
- src/config.js - Configuration management
- src/utils/claude-config.js - Claude CLI settings helper
- tests/frontend/ - Frontend test suite
## API Endpoints Added
- GET/POST /api/config - Server configuration
- GET/POST /api/claude/config - Claude CLI configuration
- POST /api/models/config - Model alias/hidden settings
- GET /api/accounts - Account list with status
- POST /api/accounts/:email/toggle - Enable/disable account
- POST /api/accounts/:email/refresh - Refresh account token
- DELETE /api/accounts/:email - Remove account
- GET /api/logs - Log history
- GET /api/logs/stream - Live log streaming (SSE)
- GET /api/auth/url - OAuth URL generation
- GET /oauth/callback - OAuth callback handler
## Backward Compatibility
- Default port remains 8080
- All existing CLI/API functionality unchanged
- WebUI is entirely optional
- Can be disabled by removing mountWebUI() call
- Fix extractCodeFromInput destructuring: returns { code, state } not
{ code, extractedState }, so state validation was being bypassed
- Add --no-browser hint to CLI banner for discoverability
- Document --no-browser mode in README.md and CLAUDE.md
- Add test:oauth script to package.json
- Add OAuth test to run-all.cjs test suite
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Implement sticky account selection for prompt cache continuity
- Derive stable session ID from first user message (SHA256 hash)
- Return cache_read_input_tokens in usage metadata
- Add claude-sonnet-4-5 model without thinking
- Remove DEFAULT_THINKING_BUDGET (let API use its default)
- Add prompt caching test
- Update README and CLAUDE.md documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Create src/errors.js with custom error classes (RateLimitError, AuthError, ApiError, etc.)
- Create src/utils/helpers.js with shared utilities (formatDuration, sleep)
- Create tests/helpers/http-client.cjs with shared test utilities (~250 lines deduped)
- Centralize OAuth config and other constants in src/constants.js
- Add JSDoc types to all major exported functions
- Refactor all test files to use shared http-client utilities
- Update CLAUDE.md with new architecture documentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>