docs: refactor README by moving sections to docs/ folder
Move 11 documentation sections to separate markdown files in docs/:
- models.md, load-balancing.md, web-console.md, configuration.md
- menubar-app.md, api-endpoints.md, testing.md, troubleshooting.md
- safety-notices.md, legal.md, development.md

README now contains a Documentation section with links to each doc.
Also moved donation link to above Star History section.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
59
docs/load-balancing.md
Normal file
@@ -0,0 +1,59 @@
# Multi-Account Load Balancing

When you add multiple accounts, the proxy distributes requests across them using a configurable selection strategy.

## Account Selection Strategies

Choose a strategy based on your needs:

| Strategy | Best For | Description |
| --- | --- | --- |
| **Hybrid** (Default) | Most users | Combines health score, token bucket rate limiting, quota awareness, and LRU freshness to pick an account |
| **Sticky** | Prompt caching | Stays on the same account to maximize cache hits; switches only when rate-limited |
| **Round-Robin** | Even distribution | Cycles through accounts sequentially for balanced load |

**Configure via CLI:**

```bash
antigravity-claude-proxy start --strategy=hybrid       # Default: smart distribution
antigravity-claude-proxy start --strategy=sticky       # Cache-optimized
antigravity-claude-proxy start --strategy=round-robin  # Load-balanced
```

**Or via Web UI:** Settings → Server → Account Selection Strategy
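
To make the difference between the Sticky and Round-Robin rows concrete, here is a minimal Python sketch of the two selection rules as the table describes them. It is not the proxy's actual code: the `Account` record, both selector classes, and the choice to skip rate-limited accounts in round-robin mode are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Account:
    # Hypothetical account record; the field names are illustrative only.
    email: str
    rate_limited: bool = False


class RoundRobinSelector:
    """Cycle through accounts in order (assumed to skip rate-limited ones)."""

    def __init__(self, accounts: List[Account]) -> None:
        self.accounts = accounts
        self.next_index = 0

    def pick(self) -> Optional[Account]:
        for offset in range(len(self.accounts)):
            index = (self.next_index + offset) % len(self.accounts)
            if not self.accounts[index].rate_limited:
                # Advance the cursor so the next call starts after this account.
                self.next_index = (index + 1) % len(self.accounts)
                return self.accounts[index]
        return None  # no usable account


class StickySelector:
    """Stay on the current account; move on only when it is rate-limited."""

    def __init__(self, accounts: List[Account]) -> None:
        self.accounts = accounts
        self.current = 0

    def pick(self) -> Optional[Account]:
        for offset in range(len(self.accounts)):
            index = (self.current + offset) % len(self.accounts)
            if not self.accounts[index].rate_limited:
                # Remember the choice so later calls keep returning it.
                self.current = index
                return self.accounts[index]
        return None  # no usable account
```

With three healthy accounts, the round-robin selector in this sketch visits each in turn, while the sticky selector keeps returning the first one until its `rate_limited` flag is set.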
## How It Works

- **Health Score Tracking**: Accounts earn points for successful requests and lose points for failures and rate limits
- **Token Bucket Rate Limiting**: Client-side throttling with regenerating tokens (50 max, refilled at 6 per minute)
- **Quota Awareness**: Accounts below configurable quota thresholds are deprioritized; exhausted accounts trigger emergency fallback
- **Quota Protection**: Set minimum quota levels globally, per-account, or per-model to switch accounts before quota runs out
- **Emergency Fallback**: When all accounts appear exhausted, the proxy bypasses these checks and adds throttle delays (250-500 ms)
- **Automatic Cooldown**: Rate-limited accounts recover automatically once their reset time has passed
- **Invalid Account Detection**: Accounts needing re-authentication are marked and skipped
- **Prompt Caching Support**: Session IDs derived from the conversation enable cache hits across turns
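
The token bucket item above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the proxy's implementation: only the capacity of 50 and the refill rate of 6 per minute come from the list; the `TokenBucket` class, its method names, the continuous (rather than stepwise) refill, and the one-token cost per request are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side throttle: each request spends a token; tokens regenerate over time."""

    def __init__(
        self,
        capacity: float = 50.0,          # "50 max" from the list above
        refill_per_minute: float = 6.0,  # "6/minute" from the list above
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_minute / 60.0
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.last_refill = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now

    def try_consume(self, cost: float = 1.0) -> bool:
        # Return True and spend tokens if enough are available, else False.
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In this sketch an account with an empty bucket becomes usable again after ten seconds, since 6 tokens per minute is one token every 10 seconds. A selector could call `try_consume()` per account and move to the next account when it returns `False`.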
## Monitoring

Check account status, subscription tiers, and quota at any time:

```bash
# Web UI: http://localhost:8080/ (Accounts tab - shows tier badges and quota progress)
# CLI Table:
curl "http://localhost:8080/account-limits?format=table"
```
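
If you would rather poll the same endpoint from a script, the `curl` call above translates to the following Python sketch using only the standard library. The endpoint path and the `format=table` query parameter come from the command above; the helper names and the default timeout are assumptions, and the response is returned as plain text because its exact shape is not specified here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def account_limits_url(base_url: str = "http://localhost:8080", fmt: str = "table") -> str:
    # Build the same URL the curl example requests.
    return f"{base_url.rstrip('/')}/account-limits?{urlencode({'format': fmt})}"


def fetch_account_limits(
    base_url: str = "http://localhost:8080",
    fmt: str = "table",
    timeout: float = 5.0,  # assumed value; adjust to taste
) -> str:
    # Requires the proxy to be running; raises URLError if it is unreachable.
    with urlopen(account_limits_url(base_url, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `print(fetch_account_limits())` while the proxy is running should print the same table that `curl` shows.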
### CLI Management Reference

If you prefer to manage accounts from the terminal:

```bash
# List all accounts
antigravity-claude-proxy accounts list

# Verify account health
antigravity-claude-proxy accounts verify

# Interactive CLI menu
antigravity-claude-proxy accounts
```