docs: refactor README by moving sections to docs/ folder

Move 11 documentation sections to separate markdown files in docs/:
- models.md, load-balancing.md, web-console.md, configuration.md
- menubar-app.md, api-endpoints.md, testing.md, troubleshooting.md
- safety-notices.md, legal.md, development.md

README now contains a Documentation section with links to each doc.
Also moved donation link to above Star History section.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Badri Narayanan S
2026-02-01 21:00:53 +05:30
parent b72aa0e056
commit 2ea9f4ba8e
12 changed files with 366 additions and 383 deletions

59
docs/load-balancing.md Normal file
View File

@@ -0,0 +1,59 @@
# Multi-Account Load Balancing
When you add multiple accounts, the proxy intelligently distributes requests across them using configurable selection strategies.
## Account Selection Strategies
Choose a strategy based on your needs:
| Strategy | Best For | Description |
| --- | --- | --- |
| **Hybrid** (Default) | Most users | Smart selection combining health score, token bucket rate limiting, quota awareness, and LRU freshness |
| **Sticky** | Prompt caching | Stays on the same account to maximize cache hits, switches only when rate-limited |
| **Round-Robin** | Even distribution | Cycles through accounts sequentially for balanced load |
**Configure via CLI:**
```bash
antigravity-claude-proxy start --strategy=hybrid # Default: smart distribution
antigravity-claude-proxy start --strategy=sticky # Cache-optimized
antigravity-claude-proxy start --strategy=round-robin # Load-balanced
```
**Or via WebUI:** Settings → Server → Account Selection Strategy
## How It Works
- **Health Score Tracking**: Accounts earn points for successful requests and lose points for failures/rate-limits
- **Token Bucket Rate Limiting**: Client-side throttling with regenerating tokens (50 max, 6/minute)
- **Quota Awareness**: Accounts below configurable quota thresholds are deprioritized; exhausted accounts trigger emergency fallback
- **Quota Protection**: Set minimum quota levels globally, per-account, or per-model to switch accounts before quota runs out
- **Emergency Fallback**: When all accounts appear exhausted, bypasses checks with throttle delays (250-500ms)
- **Automatic Cooldown**: Rate-limited accounts recover automatically after reset time expires
- **Invalid Account Detection**: Accounts needing re-authentication are marked and skipped
- **Prompt Caching Support**: Session IDs derived from conversation enable cache hits across turns
## Monitoring
Check account status, subscription tiers, and quota anytime:
```bash
# Web UI: http://localhost:8080/ (Accounts tab - shows tier badges and quota progress)
# CLI Table:
curl "http://localhost:8080/account-limits?format=table"
```
### CLI Management Reference
If you prefer using the terminal for management:
```bash
# List all accounts
antigravity-claude-proxy accounts list
# Verify account health
antigravity-claude-proxy accounts verify
# Interactive CLI menu
antigravity-claude-proxy accounts
```