diff --git a/CLAUDE.md b/CLAUDE.md
index 443c3ff..70c7fb2 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -17,6 +17,12 @@ npm install
 # Start server (runs on port 8080)
 npm start
 
+# Start with model fallback enabled (falls back to alternate model when quota exhausted)
+npm start -- --fallback
+
+# Start with debug logging
+npm start -- --debug
+
 # Start with file watching for development
 npm run dev
 
@@ -36,6 +42,7 @@ npm run test:streaming     # Streaming SSE events
 npm run test:interleaved   # Interleaved thinking
 npm run test:images        # Image processing
 npm run test:caching       # Prompt caching
+npm run test:crossmodel    # Cross-model thinking signatures
 ```
 
 ## Architecture
@@ -53,6 +60,7 @@ src/
 ├── server.js                   # Express server
 ├── constants.js                # Configuration values
 ├── errors.js                   # Custom error classes
+├── fallback-config.js          # Model fallback mappings and helpers
 │
 ├── cloudcode/                  # Cloud Code API client
 │   ├── index.js                # Public API exports
@@ -87,7 +95,7 @@ src/
 │   ├── content-converter.js    # Message content conversion
 │   ├── schema-sanitizer.js     # JSON Schema cleaning for Gemini
 │   ├── thinking-utils.js       # Thinking block validation/recovery
-│   └── signature-cache.js      # In-memory signature cache
+│   └── signature-cache.js      # Signature cache (tool_use + thinking signatures)
 │
 └── utils/                      # Utilities
     ├── helpers.js              # formatDuration, sleep
@@ -101,7 +109,8 @@ src/
 - **src/account-manager/**: Multi-account pool with sticky selection, rate limit handling, and automatic cooldown
 - **src/auth/**: Authentication including Google OAuth, token extraction, and database access
 - **src/format/**: Format conversion between Anthropic and Google Generative AI formats
-- **src/constants.js**: API endpoints, model mappings, OAuth config, and all configuration values
+- **src/constants.js**: API endpoints, model mappings, fallback config, OAuth config, and all configuration values
+- **src/fallback-config.js**: Model fallback mappings (`getFallbackModel()`, `hasFallback()`)
 - **src/errors.js**: Custom error classes (`RateLimitError`, `AuthError`, `ApiError`, etc.)
 
 **Multi-Account Load Balancing:**
@@ -117,6 +126,22 @@ src/
 - `cache_read_input_tokens` returned in usage metadata when cache hits
 - Token calculation: `input_tokens = promptTokenCount - cachedContentTokenCount`
 
+**Model Fallback (--fallback flag):**
+- When all accounts are exhausted for a model, automatically falls back to an alternate model
+- Fallback mappings defined in `MODEL_FALLBACK_MAP` in `src/constants.js`
+- Thinking models fall back to thinking models (e.g., `claude-sonnet-4-5-thinking` → `gemini-3-flash`)
+- Fallback is disabled on recursive calls to prevent infinite chains
+- Enable with `npm start -- --fallback` or `FALLBACK=true` environment variable
+
+**Cross-Model Thinking Signatures:**
+- Claude and Gemini use incompatible thinking signatures
+- When switching models mid-conversation, incompatible signatures are detected and dropped
+- Signature cache tracks model family ('claude' or 'gemini') for each signature
+- `hasGeminiHistory()` detects Gemini→Claude cross-model scenarios
+- Thinking recovery (`closeToolLoopForThinking()`) injects synthetic messages to close interrupted tool loops
+- For Gemini targets: strict validation - drops unknown or mismatched signatures
+- For Claude targets: lenient - lets Claude validate its own signatures
+
 ## Testing Notes
 
 - Tests require the server to be running (`npm start` in separate terminal)
@@ -129,6 +154,7 @@ src/
 **Constants:** All configuration values are centralized in `src/constants.js`:
 - API endpoints and headers
 - Model mappings and model family detection (`getModelFamily()`, `isThinkingModel()`)
+- Model fallback mappings (`MODEL_FALLBACK_MAP`)
 - OAuth configuration
 - Rate limit thresholds
 - Thinking model settings