feat: add configurable account selection strategies

Refactor account selection into a strategy pattern with three options:
- Sticky: cache-optimized, stays on same account until rate-limited
- Round-robin: load-balanced, rotates every request
- Hybrid (default): smart distribution using health scores, token buckets, and LRU

The hybrid strategy uses multiple signals for optimal account selection:
health tracking for reliability, client-side token buckets for rate limiting,
and LRU freshness to prefer rested accounts.

Includes WebUI settings for strategy selection and unit tests.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-18 03:48:43 +05:30
parent 973234372b
commit 5ae19a5b72
31 changed files with 2721 additions and 353 deletions
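
Note on the hybrid strategy: the CLAUDE.md change in this commit documents a scoring formula and a client-side token bucket (50 tokens, regenerating 6 per minute). The following is a minimal sketch of how those two pieces could combine, not the code from `hybrid-strategy.js`. The function names, the account field names (`health`, `tokens`, `tokensUpdatedMs`, `lastUsedMs`), the 0-100 health range, the use of seconds for the LRU term, and the "at least one token" cutoff are all assumptions made for illustration.

```javascript
// Illustrative sketch only: names, units, and field layout are assumptions,
// not taken from the commit's source files.
const MAX_TOKENS = 50;        // bucket capacity (documented in CLAUDE.md)
const TOKENS_PER_MINUTE = 6;  // regeneration rate (documented in CLAUDE.md)

// Tokens available at nowMs, given the count stored at lastUpdatedMs.
// Regeneration is linear and capped at the bucket capacity.
function currentTokens(storedTokens, lastUpdatedMs, nowMs) {
  const elapsedMinutes = Math.max(0, nowMs - lastUpdatedMs) / 60000;
  return Math.min(MAX_TOKENS, storedTokens + elapsedMinutes * TOKENS_PER_MINUTE);
}

// score = (Health × 2) + ((Tokens / MaxTokens × 100) × 5) + (LRU × 0.1)
// Assumed units: health in 0-100, LRU as seconds since the account was last used.
function hybridScore({ health, tokens, lruSeconds }) {
  const tokenPercent = (tokens / MAX_TOKENS) * 100;
  return health * 2 + tokenPercent * 5 + lruSeconds * 0.1;
}

// Return the highest-scoring account, or null if none has a token to spend.
function pickAccount(accounts, nowMs) {
  let best = null;
  let bestScore = -Infinity;
  for (const account of accounts) {
    const tokens = currentTokens(account.tokens, account.tokensUpdatedMs, nowMs);
    if (tokens < 1) continue; // assumed cutoff: an empty bucket is skipped
    const lruSeconds = (nowMs - account.lastUsedMs) / 1000;
    const score = hybridScore({ health: account.health, tokens, lruSeconds });
    if (score > bestScore) {
      best = account;
      bestScore = score;
    }
  }
  return best;
}
```

Under these assumed units, a full bucket contributes 500 points and a perfect health score 200, so token availability would dominate the ranking and the LRU term would mostly act as a tie-breaker between otherwise similar accounts.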
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -17,6 +17,11 @@ npm install
 # Start server (runs on port 8080)
 npm start

+# Start with specific account selection strategy
+npm start -- --strategy=sticky      # Cache-optimized (stays on same account)
+npm start -- --strategy=round-robin # Load-balanced (rotates every request)
+npm start -- --strategy=hybrid      # Smart distribution (default)
+
 # Start with model fallback enabled (falls back to alternate model when quota exhausted)
 npm start -- --fallback

@@ -50,6 +55,9 @@ npm run test:images        # Image processing
 npm run test:caching       # Prompt caching
 npm run test:crossmodel    # Cross-model thinking signatures
 npm run test:oauth         # OAuth no-browser mode
+
+# Run strategy unit tests (no server required)
+node tests/test-strategies.cjs
 ```

 ## Architecture
@@ -83,9 +91,18 @@ src/
 ├── account-manager/            # Multi-account pool management
 │   ├── index.js                # AccountManager class facade
 │   ├── storage.js              # Config file I/O and persistence
-│   ├── selection.js            # Account picking (round-robin, sticky)
 │   ├── rate-limits.js          # Rate limit tracking and state
-│   └── credentials.js          # OAuth token and project handling
+│   ├── credentials.js          # OAuth token and project handling
+│   └── strategies/             # Account selection strategies
+│       ├── index.js            # Strategy factory (createStrategy)
+│       ├── base-strategy.js    # Abstract base class
+│       ├── sticky-strategy.js  # Cache-optimized sticky selection
+│       ├── round-robin-strategy.js  # Load-balanced rotation
+│       ├── hybrid-strategy.js  # Smart multi-signal distribution
+│       └── trackers/           # State trackers for hybrid strategy
+│           ├── index.js        # Re-exports trackers
+│           ├── health-tracker.js    # Account health scores
+│           └── token-bucket-tracker.js  # Client-side rate limiting
 │
 ├── auth/                       # Authentication
 │   ├── oauth.js                # Google OAuth with PKCE
@@ -161,7 +178,8 @@ public/
 - **src/webui/index.js**: WebUI backend handling API routes (`/api/*`) for config, accounts, and logs
 - **src/cloudcode/**: Cloud Code API client with retry/failover logic, streaming and non-streaming support
  - `model-api.js`: Model listing, quota retrieval (`getModelQuotas()`), and subscription tier detection (`getSubscriptionTier()`)
-- **src/account-manager/**: Multi-account pool with sticky selection, rate limit handling, and automatic cooldown
+- **src/account-manager/**: Multi-account pool with configurable selection strategies, rate limit handling, and automatic cooldown
+  - Strategies: `sticky` (cache-optimized), `round-robin` (load-balanced), `hybrid` (smart distribution)
 - **src/auth/**: Authentication including Google OAuth, token extraction, database access, and auto-rebuild of native modules
 - **src/format/**: Format conversion between Anthropic and Google Generative AI formats
 - **src/constants.js**: API endpoints, model mappings, fallback config, OAuth config, and all configuration values
@@ -170,12 +188,36 @@ public/
 - **src/errors.js**: Custom error classes (`RateLimitError`, `AuthError`, `ApiError`, etc.)

 **Multi-Account Load Balancing:**
-- Sticky account selection for prompt caching (stays on same account across turns)
+- Configurable selection strategy via `--strategy` flag or WebUI
+- Three strategies available:
+  - **Sticky** (`--strategy=sticky`): Best for prompt caching, stays on same account
+  - **Round-Robin** (`--strategy=round-robin`): Maximum throughput, rotates every request
+  - **Hybrid** (`--strategy=hybrid`, default): Smart selection using health + tokens + LRU
 - Model-specific rate limiting via `account.modelRateLimits[modelId]`
 - Automatic switch only when rate-limited for > 2 minutes on the current model
 - Session ID derived from first user message hash for cache continuity
 - Account state persisted to `~/.config/antigravity-proxy/accounts.json`

+**Account Selection Strategies:**
+
+1. **Sticky Strategy** (best for caching):
+   - Stays on current account until rate-limited or unavailable
+   - Waits up to 2 minutes for short rate limits before switching
+   - Maintains prompt cache continuity across requests
+
+2. **Round-Robin Strategy** (best for throughput):
+   - Rotates to next account on every request
+   - Skips rate-limited/disabled accounts
+   - Maximizes concurrent request distribution
+
+3. **Hybrid Strategy** (default, smart distribution):
+   - Uses health scores, token buckets, and LRU for selection
+   - Scoring formula: `score = (Health × 2) + ((Tokens / MaxTokens × 100) × 5) + (LRU × 0.1)`
+   - Health scores: Track success/failure patterns with passive recovery
+   - Token buckets: Client-side rate limiting (50 tokens, 6 per minute regeneration)
+   - LRU freshness: Prefer accounts that have rested longer
+   - Configuration in `src/config.js` under `accountSelection`
+
 **Account Data Model:**
 Each account object in `accounts.json` contains:
 - **Basic Info**: `email`, `source` (oauth/manual/database), `enabled`, `lastUsed`