feat: add prompt caching, sticky account selection, and non-thinking model

- Implement sticky account selection for prompt cache continuity
- Derive stable session ID from first user message (SHA256 hash)
- Return cache_read_input_tokens in usage metadata
- Add claude-sonnet-4-5 model without thinking
- Remove DEFAULT_THINKING_BUDGET (let API use its default)
- Add prompt caching test
- Update README and CLAUDE.md documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
15
CLAUDE.md
@@ -35,6 +35,7 @@ npm run test:multiturn # Multi-turn with tools
 npm run test:streaming # Streaming SSE events
 npm run test:interleaved # Interleaved thinking
 npm run test:images # Image processing
+npm run test:caching # Prompt caching
 ```
 
 ## Architecture
@@ -57,11 +58,17 @@ Claude Code CLI → Express Server (server.js) → CloudCode Client → Antigrav
 - **src/utils/helpers.js**: Shared utility functions (`formatDuration`, `sleep`)
 
 **Multi-Account Load Balancing:**
-- Round-robin rotation across configured accounts
-- Automatic switch on 429 rate limit errors
-- Configurable cooldown period for rate-limited accounts
+- Sticky account selection for prompt caching (stays on same account across turns)
+- Automatic switch only when rate-limited for > 2 minutes
+- Session ID derived from first user message hash for cache continuity
 - Account state persisted to `~/.config/antigravity-proxy/accounts.json`
 
+**Prompt Caching:**
+- Cache is organization-scoped (requires same account + session ID)
+- Session ID is SHA256 hash of first user message content (stable across turns)
+- `cache_read_input_tokens` returned in usage metadata when cache hits
+- Token calculation: `input_tokens = promptTokenCount - cachedContentTokenCount`
+
 ## Testing Notes
 
 - Tests require the server to be running (`npm start` in separate terminal)
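
Reviewer note on the "Prompt Caching" bullets in the hunk above: the session ID derivation and the token arithmetic could be sketched roughly as follows. This is a minimal illustration, not the code from this commit. The helper names `deriveSessionId` and `toUsage` are hypothetical, and the handling of array-style message content is an assumption. Only the SHA256-of-first-user-message rule and the `input_tokens = promptTokenCount - cachedContentTokenCount` formula come from the documentation.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: derive a stable session ID from the first user message.
// Later turns do not change the first user message, so the hash stays constant.
function deriveSessionId(messages) {
  const firstUser = messages.find((m) => m.role === "user");
  if (!firstUser) return null;

  // Assumption: content is either a plain string or an array of content blocks.
  const text =
    typeof firstUser.content === "string"
      ? firstUser.content
      : firstUser.content
          .filter((block) => block.type === "text")
          .map((block) => block.text)
          .join("\n");

  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical helper: map upstream token counters to the usage fields
// named in the documentation.
function toUsage(meta) {
  const prompt = meta.promptTokenCount ?? 0;
  const cached = meta.cachedContentTokenCount ?? 0;
  return {
    input_tokens: prompt - cached, // uncached portion of the prompt
    cache_read_input_tokens: cached, // portion served from the cache
  };
}
```

With these helpers, a conversation that grows from one turn to several keeps the same session ID, and a response reporting 1200 prompt tokens with 1000 cached would surface as 200 `input_tokens` and 1000 `cache_read_input_tokens`.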
@@ -90,4 +97,4 @@ Claude Code CLI → Express Server (server.js) → CloudCode Client → Antigrav
 
 ## Maintenance
 
-When making significant changes to the codebase (new modules, refactoring, architectural changes), update this CLAUDE.md file to keep documentation in sync.
+When making significant changes to the codebase (new modules, refactoring, architectural changes), update this CLAUDE.md and the README.md file to keep documentation in sync.
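
Reviewer note on the "Multi-Account Load Balancing" bullets in the diff above: the sticky selection rule (stay on the same account, switch only when it is rate-limited for more than 2 minutes) might look something like the sketch below. It is an illustration under stated assumptions, not the implementation in this commit. The function name `pickAccount`, the `rateLimitedUntil` field, and the fallback of staying put when no other account is free are all hypothetical. Only the 2-minute threshold comes from the documentation.

```javascript
// From the documented rule: switch accounts only when rate-limited for > 2 minutes.
const MAX_STICKY_WAIT_MS = 2 * 60 * 1000;

// Hypothetical selector. `accounts` is an array of objects that may carry a
// `rateLimitedUntil` timestamp in milliseconds (assumed field name).
function pickAccount(accounts, currentIndex, now = Date.now()) {
  const current = accounts[currentIndex];
  const waitMs = Math.max(0, (current.rateLimitedUntil ?? 0) - now);

  // Short waits are tolerated so the prompt cache on this account stays usable.
  if (waitMs <= MAX_STICKY_WAIT_MS) {
    return { index: currentIndex, waitMs };
  }

  // Long wait: look for the next account that is not currently rate-limited.
  for (let step = 1; step < accounts.length; step++) {
    const index = (currentIndex + step) % accounts.length;
    if ((accounts[index].rateLimitedUntil ?? 0) <= now) {
      return { index, waitMs: 0 };
    }
  }

  // Assumption: if every account is limited, stay on the current one and wait.
  return { index: currentIndex, waitMs };
}
```

The caller would sleep for `waitMs` before sending the request. Switching accounts loses the cache, since the documentation describes it as requiring the same account and session ID, which is why the sketch prefers a short wait over an immediate switch.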