Commit Graph

38 Commits

Author SHA1 Message Date
Badri Narayanan S
90b38bbb56 feat: validate model IDs before processing requests
Add model validation cache with 5-minute TTL to reject invalid model IDs
upfront instead of sending them to the API. This provides better error
messages and avoids unnecessary API calls.

- Add MODEL_VALIDATION_CACHE_TTL_MS constant (5 min)
- Add isValidModel() with lazy cache population
- Warm cache when listModels() is called
- Validate model ID in /v1/messages before processing
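
A minimal sketch of the lazy TTL cache described above. Only `MODEL_VALIDATION_CACHE_TTL_MS` and `isValidModel()` are named in this commit; `createModelValidator`, `fetchModelIds`, `warm`, and the injectable `now` clock are hypothetical, and the real lookup is presumably asynchronous while this one is synchronous for brevity.

```javascript
// Hypothetical sketch: only MODEL_VALIDATION_CACHE_TTL_MS and isValidModel()
// are named in the commit; every other name here is assumed for illustration.
const MODEL_VALIDATION_CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes

function createModelValidator(fetchModelIds, now = Date.now) {
  let validIds = null; // Set of known model IDs, or null before the first load
  let loadedAt = 0;    // Timestamp of the last cache fill

  // Replace the cache contents; a model-listing call could use this to warm the cache.
  function warm(ids) {
    validIds = new Set(ids);
    loadedAt = now();
  }

  // Lazy population: fetch only when the cache is empty or older than the TTL.
  function isValidModel(modelId) {
    const expired = now() - loadedAt >= MODEL_VALIDATION_CACHE_TTL_MS;
    if (validIds === null || expired) {
      warm(fetchModelIds());
    }
    return validIds.has(modelId);
  }

  return { isValidModel, warm };
}
```

Injecting the clock keeps expiry testable without waiting five minutes.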

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 19:07:23 +05:30
Badri Narayanan S
ca6783f153 chore: bump user agent version to 1.15.8
Co-Authored-By: Claude (claude-opus-4-5-thinking) <noreply@anthropic.com>
2026-01-29 23:55:15 +05:30
Badri Narayanan S
b64809277c chore: update default haiku model to claude-sonnet-4-5 and gemini-3-flash
- Claude preset: gemini-2.5-flash-lite → claude-sonnet-4-5
- Gemini preset: gemini-2.5-flash-lite → gemini-3-flash
- Remove outdated quota warning about haiku model usage

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-26 14:17:11 +05:30
Badri Narayanan S
bc8d428107 Merge pull request #187 from quocthai0404/fix/issue-176-windows-callback-port
fix: Make OAuth callback port configurable for Windows compatibility (#176)
2026-01-26 13:45:13 +05:30
Badri Narayanan S
5a85f0cfcc feat: comprehensive rate limit handling overhaul (inspired by opencode-antigravity-auth)
This commit addresses "Max retries exceeded" errors during stress testing where
all accounts would become exhausted simultaneously due to short per-second rate
limits triggering cascading failures.

## Rate Limit Parser (`rate-limit-parser.js`)
- Remove 2s buffer enforcement that caused cascading failures when the API returned
  short reset times (200-600ms). The parser now adds a 200ms buffer for sub-500ms resets.
- Add `parseRateLimitReason()` for smart backoff based on error type:
  QUOTA_EXHAUSTED, RATE_LIMIT_EXCEEDED, MODEL_CAPACITY_EXHAUSTED, SERVER_ERROR

## Message/Streaming Handlers
- Add per-account+model rate limit state tracking with exponential backoff
- For short rate limits (< 1 second), wait and retry on same account instead
  of switching - prevents thundering herd when all accounts hit per-second limits
- Add throttle wait support for fallback modes (emergency/lastResort)
- Add `calculateSmartBackoff()` with progressive tiers by error type

## HybridStrategy (`hybrid-strategy.js`)
- Refactor `#getCandidates()` to return 4 fallback levels:
  - `normal`: All filters pass (health, tokens, quota)
  - `quota`: Bypass critical quota check
  - `emergency`: Bypass health check when ALL accounts unhealthy
  - `lastResort`: Bypass BOTH health AND token bucket checks
- Add throttle wait times: 500ms for lastResort, 250ms for emergency
- Fix LRU calculation to use seconds (matches opencode-antigravity-auth)
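
The four fallback levels can be read as a cascade of progressively looser filters. The sketch below is an illustration under assumptions, not the code in `hybrid-strategy.js`: the account shape (`healthy`, `hasTokens`, `quotaOk`) is hypothetical, each level is assumed to relax the previous one cumulatively, and the "ALL accounts unhealthy" condition is simplified to "no account passed the stricter levels". The level names and the 250ms/500ms throttle waits are taken from the list above.

```javascript
// Hypothetical account shape: { email, healthy, hasTokens, quotaOk }.
// Level names and throttle waits come from the commit message;
// the filter predicates are simplified stand-ins.
const FALLBACK_LEVELS = [
  { level: 'normal', waitMs: 0, test: (a) => a.healthy && a.hasTokens && a.quotaOk },
  { level: 'quota', waitMs: 0, test: (a) => a.healthy && a.hasTokens },
  { level: 'emergency', waitMs: 250, test: (a) => a.hasTokens },
  { level: 'lastResort', waitMs: 500, test: () => true },
];

// Return the first level that yields at least one candidate account.
function getCandidates(accounts) {
  for (const { level, waitMs, test } of FALLBACK_LEVELS) {
    const candidates = accounts.filter(test);
    if (candidates.length > 0) {
      return { level, candidates, waitMs };
    }
  }
  // Only reachable when the account list itself is empty.
  return { level: 'none', candidates: [], waitMs: 0 };
}
```

A caller would sleep for `waitMs` before dispatching, so the looser levels are also the slower ones.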

## Health Tracker
- Increase `recoveryPerHour` from 2 to 10 for faster recovery (1 hour vs 5 hours)

## Account Manager
- Add consecutive failure tracking: `getConsecutiveFailures()`,
  `incrementConsecutiveFailures()`, `resetConsecutiveFailures()`
- Add cooldown mechanism separate from rate limits with `CooldownReason`
- Reset consecutive failures on successful request

## Base Strategy
- Add `isAccountCoolingDown()` check in `isAccountUsable()`

## Constants
- Replace fixed `CAPACITY_RETRY_DELAY_MS` with progressive `CAPACITY_BACKOFF_TIERS_MS`
- Add `BACKOFF_BY_ERROR_TYPE` for smart backoff
- Add `QUOTA_EXHAUSTED_BACKOFF_TIERS_MS` for progressive quota backoff
- Add `MIN_BACKOFF_MS` floor to prevent "Available in 0s" loops
- Increase `MAX_CAPACITY_RETRIES` from 3 to 5
- Reduce `RATE_LIMIT_DEDUP_WINDOW_MS` from 5s to 2s
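
One plausible reading of "progressive tiers" with a floor, sketched under assumptions: the tier values and the `MIN_BACKOFF_MS` value are placeholders (the commit message does not list them), and `tieredBackoffMs` and `bufferedResetMs` are hypothetical helper names. The 200ms buffer for sub-500ms resets follows the Rate Limit Parser notes above.

```javascript
// Placeholder values; the project's actual tiers and floor are not shown
// in the commit message.
const CAPACITY_BACKOFF_TIERS_MS = [1000, 2000, 4000, 8000, 16000];
const MIN_BACKOFF_MS = 500;

// Pick a tier by 0-based retry attempt; attempts past the end reuse the
// last tier. The floor keeps the wait from collapsing to zero.
function tieredBackoffMs(tiers, attempt) {
  const index = Math.min(Math.max(attempt, 0), tiers.length - 1);
  return Math.max(tiers[index], MIN_BACKOFF_MS);
}

// Resets shorter than 500ms get a 200ms buffer; longer resets pass through.
function bufferedResetMs(resetMs) {
  return resetMs < 500 ? resetMs + 200 : resetMs;
}
```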

## Frontend
- Remove `capacityRetryDelayMs` config (replaced by progressive tiers)
- Update default `maxCapacityRetries` display from 3 to 5

## Testing
- Add `tests/stress-test.cjs` for concurrent request stress testing

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-24 22:43:53 +05:30
quocthai0404
54fc1da829 fix: make OAuth callback port configurable for Windows compatibility (#176)
- Add OAUTH_CALLBACK_PORT environment variable (default: 51121)
- Implement automatic port fallback (51122-51126) on EACCES/EADDRINUSE
- Add Windows-specific troubleshooting in error messages and README
- Document configuration in config.example.json
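
A rough sketch of the fallback logic, assuming the fallback range is counted upward from the configured port (the commit only states 51122-51126 for the default). `candidatePorts`, `pickPort`, and the `tryBind` callback are hypothetical; a real Node.js server reports bind failures asynchronously through the `error` event, which this synchronous version glosses over.

```javascript
// Default port and fallback count are taken from the commit message.
const DEFAULT_OAUTH_CALLBACK_PORT = 51121;
const FALLBACK_COUNT = 5; // 51122-51126 when the default port is used

// Build the ordered list of ports to try, honoring OAUTH_CALLBACK_PORT.
function candidatePorts(env = process.env) {
  const parsed = Number.parseInt(env.OAUTH_CALLBACK_PORT ?? '', 10);
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  const base = valid ? parsed : DEFAULT_OAUTH_CALLBACK_PORT;
  return Array.from({ length: FALLBACK_COUNT + 1 }, (_, i) => base + i);
}

// tryBind is a hypothetical callback that binds a port and throws on failure.
function pickPort(ports, tryBind) {
  for (const port of ports) {
    try {
      tryBind(port);
      return port;
    } catch (err) {
      // Only permission and in-use errors trigger fallback; rethrow the rest.
      if (err.code !== 'EACCES' && err.code !== 'EADDRINUSE') throw err;
    }
  }
  throw new Error(`No usable OAuth callback port in ${ports[0]}-${ports[ports.length - 1]}`);
}
```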

Closes #176
2026-01-24 14:28:31 +07:00
Badri Narayanan S
c8c7a5a8aa refactor: move STRATEGY_LABELS to constants.js and make banner dynamic
Consolidate strategy configuration by moving STRATEGY_LABELS from
strategies/index.js to constants.js. Update startup banner to
dynamically display strategy options from SELECTION_STRATEGIES.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-18 12:59:33 +05:30
Badri Narayanan S
5ae19a5b72 feat: add configurable account selection strategies
Refactor account selection into a strategy pattern with three options:
- Sticky: cache-optimized, stays on same account until rate-limited
- Round-robin: load-balanced, rotates every request
- Hybrid (default): smart distribution using health scores, token buckets, and LRU

The hybrid strategy uses multiple signals for optimal account selection:
health tracking for reliability, client-side token buckets for rate limiting,
and LRU freshness to prefer rested accounts.
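
To illustrate the strategy pattern, here is a minimal sketch of the two simpler strategies. The `select(accounts)` interface, the `{ email, rateLimited }` account shape, and `createStrategy` are assumptions for illustration; the hybrid strategy is omitted because its scoring is not specified in this message.

```javascript
// Hypothetical interface: each strategy exposes select(accounts), where
// accounts is an array of { email, rateLimited }.
class StickyStrategy {
  #current = null; // email of the account we are staying on

  // Keep the same account until it becomes rate-limited, then move on.
  select(accounts) {
    const usable = accounts.filter((a) => !a.rateLimited);
    const kept = usable.find((a) => a.email === this.#current);
    const chosen = kept ?? usable[0] ?? null;
    this.#current = chosen ? chosen.email : null;
    return chosen;
  }
}

class RoundRobinStrategy {
  #next = 0; // index of the next account to hand out

  // Rotate through usable accounts on every request.
  select(accounts) {
    const usable = accounts.filter((a) => !a.rateLimited);
    if (usable.length === 0) return null;
    const chosen = usable[this.#next % usable.length];
    this.#next += 1;
    return chosen;
  }
}

const STRATEGIES = { sticky: StickyStrategy, 'round-robin': RoundRobinStrategy };

// Look up a strategy by its configured name.
function createStrategy(name) {
  const Strategy = STRATEGIES[name];
  if (!Strategy) throw new Error(`Unknown strategy: ${name}`);
  return new Strategy();
}
```

Because both classes share one method, the caller can switch strategies from configuration without touching the request path.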

Includes WebUI settings for strategy selection and unit tests.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-18 03:48:43 +05:30
Badri Narayanan S
973234372b chore: remove unused code and suppress noisy Claude Code logs
- Delete unused files: retry.js, app-init.js, model-manager.js
- Remove duplicate error helpers from helpers.js (exist in errors.js)
- Remove unused exports from signature-cache.js, logger.js
- Remove unused frontend code: ErrorHandler methods, validators, canDelete, destroy
- Make internal functions private in thinking-utils.js
- Remove commented-out code from constants.js
- Remove deprecated .glass-panel CSS class
- Add silent handler for Claude Code event logging (/api/event_logging/batch)
- Suppress logging for /v1/messages/count_tokens (501 responses)
- Fix catch-all to use originalUrl (wildcard strips req.path)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-18 01:36:24 +05:30
Badri Narayanan S
77363c679e fix: accurate quota reporting with project ID and improved rate limit handling
- Pass project ID to fetchAvailableModels for accurate per-project quota
- Treat missing remainingFraction with resetTime as 0% (exhausted)
- Fix double-escaped regex in rate-limit-parser.js (\\d -> \d)
- Use ANTIGRAVITY_HEADERS for loadCodeAssist consistency
- Store actual reset time from API instead of capping at default
- Add getRateLimitInfo() for detailed rate limit state
- Handle disabled accounts in rate limit checks

Fixes issue where free tier accounts showed 100% quota but were actually exhausted.
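
The "missing `remainingFraction` with `resetTime`" rule could look roughly like this. `remainingPercent` is a hypothetical helper, and returning `null` when neither field is present is an assumption; only the two field names come from the list above.

```javascript
// Hypothetical helper; the field names remainingFraction and resetTime
// come from the commit message.
function remainingPercent(quotaInfo) {
  const { remainingFraction, resetTime } = quotaInfo ?? {};
  if (typeof remainingFraction === 'number') {
    return Math.round(remainingFraction * 100);
  }
  // A reset time without a fraction is treated as exhausted (0%).
  if (resetTime) return 0;
  return null; // nothing known about this quota
}
```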

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-15 16:18:13 +05:30
Badri Narayanan S
44632dc301 feat: add automatic user onboarding for accounts without projects
When loadCodeAssist returns no project, automatically call onboardUser API
to provision a managed project. This handles first-time setup for new accounts.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-15 12:27:37 +05:30
Badri Narayanan S
12d196f6a0 refactor: centralize TEST_MODELS and DEFAULT_PRESETS in constants.js
- Move TEST_MODELS and DEFAULT_PRESETS to src/constants.js as single source of truth
- Update test-models.cjs helper to use dynamic import from constants
- Make getTestModels() and getModels() async functions
- Update all test files to await async model config loading
- Remove duplicate THINKING_MODELS and getThinkingModels() from test helper
- Make thinking tests more lenient for Gemini (doesn't always produce thinking blocks)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 19:20:57 +05:30
Badri Narayanan S
70fd1baaa8 fix: improve loadCodeAssist for Google One AI Pro accounts
- Add separate LOAD_CODE_ASSIST_ENDPOINTS (prod first) and
  LOAD_CODE_ASSIST_HEADERS (google-api-nodejs-client User-Agent)
- Add duetProject to metadata for project discovery
- Silent fallback when API returns success but no project
  (matches opencode-antigravity-auth behavior)
- Only warn when all endpoints fail with actual errors

Fixes #114, addresses discussion #113

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-13 18:11:45 +05:30
Wha1eChai
71c7c2e423 feat(webui): enhance settings UI, persistence and documentation
- Update CLAUDE.md with comprehensive WebUI architecture and API documentation
- Improve settings UI with searchable model dropdowns and visual family indicators
- Migrate usage statistics persistence to user config directory with auto-migration
- Refactor server request handling and fix model suffix logic
2026-01-10 04:22:59 +08:00
Wha1eChai
169e18402f merge: sync with origin/feature/webui after upstream merge
2026-01-09 18:23:52 +08:00
Wha1eChai
f2f0a7452e merge: integrate upstream/main (v1.2.15) into feature/webui
- Resolved conflict in src/constants.js: kept config-driven approach
- Adopted upstream 10-second cooldown default
- Added MAX_EMPTY_RESPONSE_RETRIES constant from upstream
- Incorporated new test files and GitHub issue templates
2026-01-09 18:08:45 +08:00
Badri Narayanan S
4c5236d4b3 fix: filter Antigravity system prompt from model responses
- Add [ignore] tags around system instruction to prevent model from
  identifying as "Antigravity" when asked "Who are you?"
- Replace full system instruction with minimal version used by
  CLIProxyAPI/gcli2api to reduce token usage and improve response quality

Fixes #76

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-09 14:10:41 +05:30
Badri Narayanan S
5f6ce1b97d Update daily Cloud Code endpoint to production URL
Remove sandbox subdomain from daily-cloudcode-pa endpoint.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-08 20:24:23 +05:30
Badri Narayanan S
a696ed0872 Merge pull request #64 from BrunoMarc/fix/empty-response-retry
fix: add retry mechanism for empty API responses
2026-01-08 17:38:18 +05:30
Badri Narayanan S
f34aa50ba4 Antigravity compatibility to fix antigravity usage
2026-01-08 10:24:54 +05:30
BrunoMarc
05cd80ebb5 fix: address code review feedback
- Move MAX_EMPTY_RESPONSE_RETRIES to constants.js for consistency
- Handle 429/401/5xx errors properly during retry fetch
- Use proper message ID format (crypto.randomBytes) instead of Date.now()
- Add crypto import for UUID generation

Code review by: Gemini 3 Pro Preview + Claude Opus 4.5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 18:16:09 -03:00
Badri Narayanan S
56e5042700 Merge branch 'main' into feature/webui
2026-01-07 02:01:23 +05:30
Badri Narayanan S
63fa90c04b set cooldown to 10 seconds
2026-01-07 00:33:42 +05:30
Badri Narayanan S
57ba5f9c1c setting cooldown back to 30 seconds
2026-01-07 00:20:22 +05:30
Badri Narayanan S
5b70b7703e Changed default cooldown time to 10 seconds
2026-01-06 22:22:32 +05:30
Wha1eChai
85f7d3bae7 feat: Add Web UI for account and quota management
## Summary
Add an optional Web UI for managing accounts and monitoring quotas.
WebUI is implemented as a modular plugin with minimal changes to server.js (only 5 lines added).

## New Features
- Dashboard: Real-time model quota visualization with Chart.js
- Accounts: OAuth-based account management (add/enable/disable/refresh/remove)
- Logs: Live server log streaming via SSE with search and level filtering
- Settings: System configuration with 4 tabs
  - Interface: Language (EN/zh_CN), polling interval, log buffer size, display options
  - Claude CLI: Proxy connection config, model selection, alias overrides (~/.claude.json)
  - Models: Model visibility and ordering management
  - Server Info: Runtime info and account config reload

## Technical Changes
- Add src/webui/index.js as modular plugin (all WebUI routes encapsulated)
- Add src/config.js for centralized configuration (~/.config/antigravity-proxy/config.json)
- Add authMiddleware for optional password protection (WEBUI_PASSWORD env var)
- Enhance logger with EventEmitter for SSE log streaming
- Make constants configurable via config.json
- Merge with main v1.2.6 (model fallback, cross-model thinking)
- server.js changes: only 5 lines added to import and mount WebUI module

## Bug Fixes
- Fix Alpine.js $watch error in settings-store.js (not supported in store init)
- Fix "OK" label to "SUCCESS" in logs filter
- Add saveSettings() calls to settings toggles for proper persistence
- Improve Claude CLI config robustness (handle empty/invalid JSON files)
- Add safety check for empty config.env in claude-config component
- Improve config.example.json instructions with clear Windows/macOS/Linux paths

## New Files
- src/webui/index.js - WebUI module with all API routes
- public/ - Complete Web UI frontend (Alpine.js + TailwindCSS + DaisyUI)
- src/config.js - Configuration management
- src/utils/claude-config.js - Claude CLI settings helper
- tests/frontend/ - Frontend test suite

## API Endpoints Added
- GET/POST /api/config - Server configuration
- GET/POST /api/claude/config - Claude CLI configuration
- POST /api/models/config - Model alias/hidden settings
- GET /api/accounts - Account list with status
- POST /api/accounts/:email/toggle - Enable/disable account
- POST /api/accounts/:email/refresh - Refresh account token
- DELETE /api/accounts/:email - Remove account
- GET /api/logs - Log history
- GET /api/logs/stream - Live log streaming (SSE)
- GET /api/auth/url - OAuth URL generation
- GET /oauth/callback - OAuth callback handler

## Backward Compatibility
- Default port remains 8080
- All existing CLI/API functionality unchanged
- WebUI is entirely optional
- Can be disabled by removing mountWebUI() call
2026-01-04 18:35:29 +08:00
Badri Narayanan S
141558dd62 Improve cross-model thinking handling and add gemini-3-flash fallback
- Add gemini-3-flash to MODEL_FALLBACK_MAP for completeness
- Add hasGeminiHistory() to detect Gemini→Claude cross-model switch
- Trigger recovery for Claude only when Gemini history detected
- Remove unnecessary thinking block filtering for Claude-only conversations
- Add comments explaining '.' placeholder usage
- Remove unused filterUnsignedThinkingFromMessages function

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-04 00:11:14 +05:30
Badri Narayanan S
602d6ca0f8 move fallback map to constants
2026-01-03 22:05:16 +05:30
Badri Narayanan S
426acc494a Implement Gemini signature caching and thinking recovery
- Add in-memory signature cache to restore thoughtSignatures stripped by Claude Code
- Implement thinking recovery logic to handle interrupted tool loops for Gemini
- Enhance schema sanitizer to preserve constraints and enums as description hints
- Update CLAUDE.md with new architecture details

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-28 14:34:03 +05:30
Badri Narayanan S
c1e1dbb0ef Added support for Gemini models
2025-12-27 14:09:20 +05:30
Badri Narayanan S
9b7dcf3a6c removing restriction on available models, fixing max tokens issues in test
2025-12-27 12:17:45 +05:30
Badri Narayanan S
e55d3ccb20 Merge pull request #1 from 0FL01/linux-support
Add Linux support with cross-platform database path detection
2025-12-25 14:04:25 +05:30
Badri Narayanan S
01cda835d9 feat: add prompt caching, sticky account selection, and non-thinking model
- Implement sticky account selection for prompt cache continuity
- Derive stable session ID from first user message (SHA256 hash)
- Return cache_read_input_tokens in usage metadata
- Add claude-sonnet-4-5 model without thinking
- Remove DEFAULT_THINKING_BUDGET (let API use its default)
- Add prompt caching test
- Update README and CLAUDE.md documentation
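
A stable session ID derived from the first user message might be sketched as follows. `deriveSessionId` is a hypothetical name, and the message shape (string content or an array of `{ type: 'text', text }` blocks) is assumed from the Anthropic Messages format; only the use of a SHA256 hash comes from this commit.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hash the text of the first user message so that
// every request in the same conversation maps to the same session ID.
function deriveSessionId(messages) {
  const first = messages.find((m) => m.role === 'user');
  if (!first) return null;
  const text = typeof first.content === 'string'
    ? first.content
    : first.content
        .filter((block) => block.type === 'text')
        .map((block) => block.text)
        .join('\n');
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}
```

Since later turns do not change the first message, the ID stays constant across a conversation, which is what lets sticky account selection keep the prompt cache warm.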

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-25 13:26:48 +05:30
Andrey Bash
a777f37c3f Add Linux support with cross-platform database path detection
- Add platform-specific database path detection for macOS, Windows, and Linux
- Make User-Agent header dynamic based on current OS and architecture
- Maintain backward compatibility with existing macOS installations
- Tested on Linux: server now correctly finds ~/.config/Antigravity database
2025-12-25 09:35:21 +03:00
Badri Narayanan S
0edc718672 refactor: centralize constants, add error classes, and DRY test utilities
- Create src/errors.js with custom error classes (RateLimitError, AuthError, ApiError, etc.)
- Create src/utils/helpers.js with shared utilities (formatDuration, sleep)
- Create tests/helpers/http-client.cjs with shared test utilities (~250 lines deduped)
- Centralize OAuth config and other constants in src/constants.js
- Add JSDoc types to all major exported functions
- Refactor all test files to use shared http-client utilities
- Update CLAUDE.md with new architecture documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-24 18:11:45 +05:30
Badri Narayanan S
712da8f7f2 code quality improvements and refactoring
2025-12-21 22:11:44 +05:30
Badri Narayanan S
5ae29947b1 initial commit
2025-12-19 19:20:28 +05:30
Badri Narayanan S
52d72b7bff initial commit
2025-12-18 00:06:00 +05:30