fix: don't count rate limit waits as failed retry attempts

When all accounts are rate-limited or token-exhausted, the retry loop
was incorrectly counting the wait time as a failed attempt. This caused
premature "Max retries exceeded" errors when we were just patiently
waiting for accounts to become available.
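
The behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual loop: `retryWithWaits`, `pickAccount`, `send`, and the injected `sleep` are hypothetical names chosen for the example.

```javascript
// Hypothetical sketch: none of these names come from the real codebase.
async function retryWithWaits({ maxAttempts, pickAccount, send, sleep }) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { account, waitMs } = pickAccount();

    if (!account && waitMs > 0) {
      // Waiting is not a failure: sleep, then hand the attempt back so the
      // loop's attempt++ leaves the counter unchanged.
      await sleep(waitMs);
      attempt--;
      continue;
    }

    if (!account) {
      // No account and no wait hint: this iteration does consume an attempt.
      continue;
    }

    try {
      return await send(account);
    } catch (err) {
      // A real request failure consumes an attempt; loop around and retry.
    }
  }
  throw new Error('Max retries exceeded');
}
```

Note that with this pattern the loop only terminates if the wait source eventually yields an account or stops returning a positive `waitMs`; a cap on total wait time may be worth considering.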

- Add attempt-- after sleeping for rate limits or strategy waits
- Add #diagnoseNoCandidates() to hybrid strategy for better logging
- Add getTimeUntilNextToken() and getMinTimeUntilToken() to token tracker
- Return waitMs from hybrid strategy when all accounts are token-blocked
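
For illustration, the two token-tracker helpers might look roughly like this. Only the method names `getTimeUntilNextToken` and `getMinTimeUntilToken` come from this commit message; the class name, constructor options, signatures, and the simple refill model are assumptions made for the sketch.

```javascript
// Hypothetical token-bucket tracker. Signatures, fields, and the refill
// model are assumptions, not the project's actual implementation.
class TokenTracker {
  constructor({ maxTokens, refillIntervalMs, now = Date.now }) {
    this.maxTokens = maxTokens;
    this.refillIntervalMs = refillIntervalMs; // time needed to regain one token
    this.now = now; // injectable clock, useful for tests
    this.buckets = new Map(); // accountId -> { tokens, lastRefill }
  }

  // Fetch (or create) a bucket and credit any tokens earned since lastRefill.
  #bucket(id) {
    const t = this.now();
    let b = this.buckets.get(id);
    if (!b) {
      b = { tokens: this.maxTokens, lastRefill: t };
      this.buckets.set(id, b);
    }
    const gained = Math.floor((t - b.lastRefill) / this.refillIntervalMs);
    if (gained > 0) {
      b.tokens = Math.min(this.maxTokens, b.tokens + gained);
      b.lastRefill += gained * this.refillIntervalMs;
    }
    return b;
  }

  // Take one token if available; returns whether it succeeded.
  consume(id) {
    const b = this.#bucket(id);
    if (b.tokens < 1) return false;
    b.tokens -= 1;
    return true;
  }

  // Milliseconds until this account has at least one token (0 if it has one).
  getTimeUntilNextToken(id) {
    const b = this.#bucket(id);
    if (b.tokens >= 1) return 0;
    return b.lastRefill + this.refillIntervalMs - this.now();
  }

  // Shortest wait across the given accounts; a strategy could return this
  // as waitMs when every account is token-blocked.
  getMinTimeUntilToken(ids) {
    if (ids.length === 0) return 0;
    return Math.min(...ids.map((id) => this.getTimeUntilNextToken(id)));
  }
}
```

A selection strategy built on this could return `{ account: null, waitMs: tracker.getMinTimeUntilToken(ids) }` when no account has a token, which lets the caller sleep instead of failing.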

Co-Authored-By: Claude <noreply@anthropic.com>
Badri Narayanan S
2026-01-23 14:29:24 +05:30
parent 7aa1508b27
commit 0fa945b069
4 changed files with 104 additions and 4 deletions

@@ -158,6 +158,10 @@ export async function* sendMessageStream(anthropicRequest, accountManager, fallb
 logger.warn(`[CloudCode] All ${accountCount} account(s) rate-limited. Waiting ${formatDuration(minWaitMs)}...`);
 await sleep(minWaitMs + 500); // Add 500ms buffer
 accountManager.clearExpiredLimits();
+// CRITICAL FIX: Don't count waiting for rate limits as a failed attempt
+// This prevents "Max retries exceeded" when we are just patiently waiting
+attempt--;
 continue; // Retry the loop
 }
@@ -172,11 +176,13 @@ export async function* sendMessageStream(anthropicRequest, accountManager, fallb
 if (!account && waitMs > 0) {
 logger.info(`[CloudCode] Waiting ${formatDuration(waitMs)} for account...`);
 await sleep(waitMs + 500);
+attempt--; // CRITICAL FIX: Don't count strategy wait as failure
 continue;
 }
 if (!account) {
-continue; // Shouldn't happen, but safety check
+logger.warn(`[CloudCode] Strategy returned no account for ${model} (attempt ${attempt + 1}/${maxAttempts})`);
+continue;
 }
 try {