fix: strip cache_control fields from content blocks (#189)

Claude Code CLI sends cache_control on text, thinking, tool_use, and
tool_result blocks for prompt caching. Cloud Code API rejects these
with "Extra inputs are not permitted".

- Add cleanCacheControl() to proactively strip cache_control at pipeline entry
- Add sanitizeTextBlock() and sanitizeToolUseBlock() for defense-in-depth
- Update reorderAssistantContent() to use block sanitizers
- Add test-cache-control.cjs with multi-model test coverage
- Update frontend dashboard tests to match current UI design
- Update strategy tests to match v2.4.0 fallback behavior
- Update CLAUDE.md and README.md with recent features

Inspired by Antigravity-Manager's clean_cache_control_from_messages() pattern.

Co-Authored-By: Claude <noreply@anthropic.com>
Badri Narayanan S
2026-01-25 03:27:05 +05:30
parent 6cadaee928
commit 683ca41480
9 changed files with 466 additions and 30 deletions


@@ -55,6 +55,7 @@ npm run test:images # Image processing
npm run test:caching # Prompt caching
npm run test:crossmodel # Cross-model thinking signatures
npm run test:oauth # OAuth no-browser mode
npm run test:cache-control # Cache control field stripping
# Run strategy unit tests (no server required)
node tests/test-strategies.cjs
@@ -102,7 +103,8 @@ src/
│ └── trackers/ # State trackers for hybrid strategy
│ ├── index.js # Re-exports trackers
│ ├── health-tracker.js # Account health scores
│ └── token-bucket-tracker.js # Client-side rate limiting
│ ├── token-bucket-tracker.js # Client-side rate limiting
│ └── quota-tracker.js # Quota-aware account selection
├── auth/ # Authentication
│ ├── oauth.js # Google OAuth with PKCE
@@ -211,11 +213,15 @@ public/
- Maximizes concurrent request distribution
3. **Hybrid Strategy** (default, smart distribution):
- Uses health scores, token buckets, and LRU for selection
- Scoring formula: `score = (Health × 2) + ((Tokens / MaxTokens × 100) × 5) + (LRU × 0.1)`
- Uses health scores, token buckets, quota awareness, and LRU for selection
- Scoring formula: `score = (Health × 2) + ((Tokens / MaxTokens × 100) × 5) + (Quota × 1) + (LRU × 0.1)`
- Health scores: Track success/failure patterns with passive recovery
- Token buckets: Client-side rate limiting (50 tokens, 6 per minute regeneration)
- Quota awareness: Accounts with critical quota (<5%) are deprioritized
- LRU freshness: Prefer accounts that have rested longer
- **Emergency/Last Resort Fallback**: When all accounts are exhausted:
- Emergency fallback: Bypasses health check, adds 250ms throttle delay
- Last resort fallback: Bypasses both health and token checks, adds 500ms throttle delay
- Configuration in `src/config.js` under `accountSelection`
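
The updated scoring formula can be illustrated with a minimal sketch. The weights come from the formula above; the field names (`health`, `tokens`, `maxTokens`, `quota`, `lru`), the helper names, and the assumption that every input is already a plain number are hypothetical and may not match the code in `src/`.

```javascript
// Weights copied from the documented formula; everything else is illustrative.
const WEIGHTS = { health: 2, tokens: 5, quota: 1, lru: 0.1 };

// Hypothetical helper: combines the four components into one score.
function hybridScore({ health, tokens, maxTokens, quota, lru }) {
  // Remaining tokens expressed as a percentage of bucket capacity.
  const tokenPercent = (tokens / maxTokens) * 100;
  return (health * WEIGHTS.health)
    + (tokenPercent * WEIGHTS.tokens)
    + (quota * WEIGHTS.quota)
    + (lru * WEIGHTS.lru);
}

// Hypothetical helper: returns the highest-scoring account, or null if none.
function pickBestAccount(accounts) {
  return accounts.reduce(
    (best, account) =>
      (best === null || hybridScore(account) > hybridScore(best) ? account : best),
    null
  );
}
```

With these weights the token term dominates: a full bucket contributes up to 500 points, so an account that still has tokens will normally outrank one that has only rested longer.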
**Account Data Model:**
@@ -251,6 +257,14 @@ Each account object in `accounts.json` contains:
- For Gemini targets: strict validation - drops unknown or mismatched signatures
- For Claude targets: lenient - lets Claude validate its own signatures
**Cache Control Handling (Issue #189):**
- Claude Code CLI sends `cache_control` fields on content blocks for prompt caching
- Cloud Code API rejects these with "Extra inputs are not permitted"
- `cleanCacheControl(messages)` strips cache_control from ALL block types at pipeline entry
- Called at the START of `convertAnthropicToGoogle()` before any other processing
- Additional sanitizers (`sanitizeTextBlock`, `sanitizeToolUseBlock`) provide defense-in-depth
- Pattern inspired by Antigravity-Manager's `clean_cache_control_from_messages()`
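
The stripping step described above might look roughly like the following. This is a sketch under assumptions, not the function added in this commit: the helper `stripBlock` is hypothetical, and the recursion into nested `content` arrays (as found on `tool_result` blocks) is an assumed behavior.

```javascript
// Hypothetical helper: returns a copy of one content block without cache_control.
function stripBlock(block) {
  if (!block || typeof block !== 'object') return block;
  const { cache_control, ...rest } = block; // drop the field, keep the rest
  // tool_result blocks may carry nested content blocks; clean those too.
  if (Array.isArray(rest.content)) {
    rest.content = rest.content.map(stripBlock);
  }
  return rest;
}

// Sketch of cleanCacheControl(messages): returns new messages, input is not mutated.
function cleanCacheControl(messages) {
  if (!Array.isArray(messages)) return messages;
  return messages.map((message) => {
    // String content has no blocks, so there is nothing to strip.
    if (!message || !Array.isArray(message.content)) return message;
    return { ...message, content: message.content.map(stripBlock) };
  });
}
```

Returning copies rather than deleting the field in place keeps the caller's request object unchanged, which makes it safe to run the cleaner first and still log or retry with the original payload.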
**Native Module Auto-Rebuild:**
- When Node.js is updated, native modules like `better-sqlite3` may become incompatible
- The proxy automatically detects `NODE_MODULE_VERSION` mismatch errors
@@ -284,7 +298,9 @@ Each account object in `accounts.json` contains:
- ARIA labels on search inputs and icon buttons
- Keyboard navigation support (Escape to clear search)
- **Security**: Optional password protection via `WEBUI_PASSWORD` env var
- **Config Redaction**: Sensitive values (passwords, tokens) are redacted in API responses
- **Smart Refresh**: Client-side polling with ±20% jitter and tab visibility detection (3x slower when hidden)
- **i18n Support**: English, Chinese (中文), Indonesian (Bahasa), Portuguese (PT-BR)
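
As an illustration of the Smart Refresh behavior, a delay calculation with ±20% jitter and a 3x slowdown for hidden tabs could be sketched as follows. The function name `nextPollDelay` and its parameters are hypothetical; in a browser the `hidden` flag would typically come from `document.hidden`.

```javascript
// Hypothetical helper: computes the delay before the next poll in milliseconds.
// `random` is injectable so the jitter can be made deterministic in tests.
function nextPollDelay(baseMs, { hidden = false, random = Math.random } = {}) {
  const jitter = 1 + (random() * 0.4 - 0.2); // uniform factor in [0.8, 1.2)
  const slowdown = hidden ? 3 : 1;           // poll 3x slower when the tab is hidden
  return Math.round(baseMs * slowdown * jitter);
}
```

Jitter spreads requests from many open dashboards so they do not all hit the server at the same instant.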
## Testing Notes