Claude Code CLI sends cache_control on text, thinking, tool_use, and
tool_result blocks for prompt caching. Cloud Code API rejects these
with "Extra inputs are not permitted".
- Add cleanCacheControl() to proactively strip cache_control at pipeline entry
- Add sanitizeTextBlock() and sanitizeToolUseBlock() for defense-in-depth
- Update reorderAssistantContent() to use block sanitizers
- Add test-cache-control.cjs with multi-model test coverage
- Update frontend dashboard tests to match current UI design
- Update strategy tests to match v2.4.0 fallback behavior
- Update CLAUDE.md and README.md with recent features
Inspired by Antigravity-Manager's clean_cache_control_from_messages() pattern.
Co-Authored-By: Claude <noreply@anthropic.com>
The hybrid strategy now considers account quota levels when selecting
accounts, preventing any single account from being drained to 0%.
- Add QuotaTracker class to track per-account quota levels
- Exclude accounts with critical quota (<5%) from selection
- Add quota component to scoring formula (weight: 3)
- Fall back to critical accounts when no alternatives exist
- Add 18 new tests for quota-aware selection
Scoring formula: Health×2 + Tokens×5 + Quota×3 + LRU×0.1
An attempt at resolving badrisnarayanan/antigravity-claude-proxy#171
Refactor account selection into a strategy pattern with three options:
- Sticky: cache-optimized, stays on same account until rate-limited
- Round-robin: load-balanced, rotates every request
- Hybrid (default): smart distribution using health scores, token buckets, and LRU
The hybrid strategy uses multiple signals for optimal account selection:
health tracking for reliability, client-side token buckets for rate limiting,
and LRU freshness to prefer rested accounts.
Includes WebUI settings for strategy selection and unit tests.
Co-Authored-By: Claude <noreply@anthropic.com>