Merge branch 'BeehiveInnovations:main' into fix/google-allowed-models-restriction
CLAUDE.local.md (new file)
@@ -0,0 +1 @@
+- Before any commit / push to GitHub, always run the code quality checks first and confirm they pass. Use @code_quality_checks.sh and confirm that 100% of unit tests pass.

CLAUDE.md
@@ -112,6 +112,11 @@ docker logs zen-mcp-redis
 
 ### Testing
 
+Simulation tests are available to test the MCP server in a 'live' scenario, using your configured
+API keys to ensure the models are working and the server is able to communicate back and forth.
+IMPORTANT: Any time any code is changed or updated, you MUST first restart it with ./run-server.sh OR
+pass `--rebuild` to the `communication_simulator_test.py` script (if running it for the first time after changes) so that it's able to restart and use the latest code.
+
 #### Run All Simulator Tests
 
 ```bash
 # Run the complete test suite

README.md
@@ -80,6 +80,7 @@ Claude is brilliant, but sometimes you need:
 - **Local model support** - Run models like Llama 3.2 locally via Ollama, vLLM, or LM Studio for privacy and cost control
 - **Dynamic collaboration** - Models can request additional context and follow-up replies from Claude mid-analysis
 - **Smart file handling** - Automatically expands directories, manages token limits based on model capacity
+- **Vision support** - Analyze images, diagrams, screenshots, and visual content with vision-capable models
 - **[Bypass MCP's token limits](docs/advanced-usage.md#working-with-large-prompts)** - Work around MCP's 25K limit automatically
 - **[Context revival across sessions](docs/context-revival.md)** - Continue conversations even after Claude's context resets, with other models maintaining full history
 
@@ -314,6 +315,7 @@ and then debate with the other models to give me a final verdict
 - Technology comparisons and best practices
 - Architecture and design discussions
 - Can reference files for context: `"Use gemini to explain this algorithm with context from algorithm.py"`
+- **Image support**: Include screenshots, diagrams, UI mockups for visual analysis: `"Chat with gemini about this error dialog screenshot to understand the user experience issue"`
 - **Dynamic collaboration**: Gemini can request additional files or context during the conversation if needed for a more thorough response
 - **Web search capability**: Analyzes when web searches would be helpful and recommends specific searches for Claude to perform, ensuring access to current documentation and best practices
 
@@ -337,6 +339,7 @@ with the best architecture for my project
 - Offers alternative perspectives and approaches
 - Validates architectural decisions and design patterns
 - Can reference specific files for context: `"Use gemini to think deeper about my API design with reference to api/routes.py"`
+- **Image support**: Analyze architectural diagrams, flowcharts, design mockups: `"Think deeper about this system architecture diagram with gemini pro using max thinking mode"`
 - **Enhanced Critical Evaluation (v2.10.0)**: After Gemini's analysis, Claude is prompted to critically evaluate the suggestions, consider context and constraints, identify risks, and synthesize a final recommendation - ensuring a balanced, well-considered solution
 - **Web search capability**: When enabled (default: true), identifies areas where current documentation or community solutions would strengthen the analysis and suggests specific searches for Claude
 
@@ -362,6 +365,7 @@ I need an actionable plan but break it down into smaller quick-wins that we can
 - Supports specialized reviews: security, performance, quick
 - Can enforce coding standards: `"Use gemini to review src/ against PEP8 standards"`
 - Filters by severity: `"Get gemini to review auth/ - only report critical vulnerabilities"`
+- **Image support**: Review code from screenshots, error dialogs, or visual bug reports: `"Review this error screenshot and the related auth.py file for potential security issues"`
 
 ### 4. `precommit` - Pre-Commit Validation
 **Comprehensive review of staged/unstaged git changes across multiple repositories**
@@ -408,6 +412,7 @@ Use zen and perform a thorough precommit ensuring there aren't any new regressio
 - `review_type`: full|security|performance|quick
 - `severity_filter`: Filter by issue severity
 - `max_depth`: How deep to search for nested repos
+- `images`: Screenshots of requirements, design mockups, or error states for validation context
 ### 5. `debug` - Expert Debugging Assistant
 **Root cause analysis for complex problems**
 
@@ -428,6 +433,7 @@ Use zen and perform a thorough precommit ensuring there aren't any new regressio
 - Supports runtime info and previous attempts
 - Provides structured root cause analysis with validation steps
 - Can request additional context when needed for thorough analysis
+- **Image support**: Include error screenshots, stack traces, console output: `"Debug this error using gemini with the stack trace screenshot and the failing test.py"`
 - **Web search capability**: When enabled (default: true), identifies when searching for error messages, known issues, or documentation would help solve the problem and recommends specific searches for Claude
 ### 6. `analyze` - Smart File Analysis
 **General-purpose code understanding and exploration**
@@ -447,6 +453,7 @@ Use zen and perform a thorough precommit ensuring there aren't any new regressio
 - Supports specialized analysis types: architecture, performance, security, quality
 - Uses file paths (not content) for clean terminal output
 - Can identify patterns, anti-patterns, and refactoring opportunities
+- **Image support**: Analyze architecture diagrams, UML charts, flowcharts: `"Analyze this system diagram with gemini to understand the data flow and identify bottlenecks"`
 - **Web search capability**: When enabled with `use_websearch` (default: true), the model can request Claude to perform web searches and share results back to enhance analysis with current documentation, design patterns, and best practices
 
 ### 7. `refactor` - Intelligent Code Refactoring
@@ -489,6 +496,7 @@ did *not* discover.
 - **Conservative approach** - Careful dependency analysis to prevent breaking changes
 - **Multi-file analysis** - Understands cross-file relationships and dependencies
 - **Priority sequencing** - Recommends implementation order for refactoring changes
+- **Image support**: Analyze code architecture diagrams, legacy system charts: `"Refactor this legacy module using gemini pro with the current architecture diagram"`
 
 **Refactor Types (Progressive Priority System):**
 
@@ -529,7 +537,8 @@ Claude can use to efficiently trace execution flows and map dependencies within
 - Creates structured instructions for call-flow graph generation
 - Provides detailed formatting requirements for consistent output
 - Supports any programming language with automatic convention detection
 - Output can be used as an input into another tool, such as `chat` along with related code files to perform a logical call-flow analysis
+- **Image support**: Analyze visual call flow diagrams, sequence diagrams: `"Generate tracer analysis for this payment flow using the sequence diagram"`
 
 #### Example Prompts:
 ```
@@ -564,6 +573,7 @@ suites that cover realistic failure scenarios and integration points that shorte
 - Prioritizes smallest test files for pattern detection
 - Can reference existing test files: `"Generate tests following patterns from tests/unit/"`
 - Specific code coverage - target specific functions/classes rather than testing everything
+- **Image support**: Test UI components, analyze visual requirements: `"Generate tests for this login form using the UI mockup screenshot"`
 
 ### 10. `version` - Server Information
 ```
@@ -626,6 +636,7 @@ This server enables **true AI collaboration** between Claude and multiple AI mod
 - **Automatic 25K limit bypass**: Each exchange sends only incremental context, allowing unlimited total conversation size
 - Up to 10 exchanges per conversation (configurable via `MAX_CONVERSATION_TURNS`) with 3-hour expiry (configurable via `CONVERSATION_TIMEOUT_HOURS`)
 - Thread-safe with Redis persistence across all tools
+- **Image context preservation** - Images and visual references are maintained across conversation turns and tool switches
 
 **Cross-tool & Cross-Model Continuation Example:**
 ```
@@ -659,7 +670,7 @@ DEFAULT_MODEL=auto # Claude picks the best model automatically
 
 # API Keys (at least one required)
 GEMINI_API_KEY=your-gemini-key # Enables Gemini Pro & Flash
-OPENAI_API_KEY=your-openai-key # Enables O3, O3mini, O4-mini, O4-mini-high
+OPENAI_API_KEY=your-openai-key # Enables O3, O3mini, O4-mini, O4-mini-high, GPT-4.1
 ```
 
 **Available Models:**
@@ -669,6 +680,7 @@ OPENAI_API_KEY=your-openai-key # Enables O3, O3mini, O4-mini, O4-mini-high
 - **`o3mini`**: Balanced speed/quality
 - **`o4-mini`**: Latest reasoning model, optimized for shorter contexts
 - **`o4-mini-high`**: Enhanced O4 with higher reasoning effort
+- **`gpt4.1`**: GPT-4.1 with 1M context window
 - **Custom models**: via OpenRouter or local APIs (Ollama, vLLM, etc.)
 
 For detailed configuration options, see the [Advanced Usage Guide](docs/advanced-usage.md).

custom_models.json
@@ -25,6 +25,8 @@
       "supports_extended_thinking": "Whether the model supports extended reasoning tokens (currently none do via OpenRouter or custom APIs)",
       "supports_json_mode": "Whether the model can guarantee valid JSON output",
       "supports_function_calling": "Whether the model supports function/tool calling",
+      "supports_images": "Whether the model can process images/visual input",
+      "max_image_size_mb": "Maximum total size in MB for all images combined (capped at 40MB max for custom models)",
       "is_custom": "Set to true for models that should ONLY be used with custom API endpoints (Ollama, vLLM, etc.). False or omitted for OpenRouter/cloud models.",
       "description": "Human-readable description of the model"
     },
@@ -35,6 +37,8 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
+      "supports_images": true,
+      "max_image_size_mb": 10.0,
       "is_custom": true,
       "description": "Example custom/local model for Ollama, vLLM, etc."
     }
@@ -47,7 +51,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
-      "description": "Claude 3 Opus - Most capable Claude model"
+      "supports_images": true,
+      "max_image_size_mb": 5.0,
+      "description": "Claude 3 Opus - Most capable Claude model with vision"
     },
     {
       "model_name": "anthropic/claude-3-sonnet",
@@ -56,7 +62,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
-      "description": "Claude 3 Sonnet - Balanced performance"
+      "supports_images": true,
+      "max_image_size_mb": 5.0,
+      "description": "Claude 3 Sonnet - Balanced performance with vision"
     },
     {
       "model_name": "anthropic/claude-3-haiku",
@@ -65,7 +73,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
-      "description": "Claude 3 Haiku - Fast and efficient"
+      "supports_images": true,
+      "max_image_size_mb": 5.0,
+      "description": "Claude 3 Haiku - Fast and efficient with vision"
     },
     {
       "model_name": "google/gemini-2.5-pro-preview",
@@ -74,7 +84,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": false,
-      "description": "Google's Gemini 2.5 Pro via OpenRouter"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "Google's Gemini 2.5 Pro via OpenRouter with vision"
     },
     {
       "model_name": "google/gemini-2.5-flash-preview-05-20",
@@ -83,7 +95,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": false,
-      "description": "Google's Gemini 2.5 Flash via OpenRouter"
+      "supports_images": true,
+      "max_image_size_mb": 15.0,
+      "description": "Google's Gemini 2.5 Flash via OpenRouter with vision"
     },
     {
       "model_name": "mistral/mistral-large",
@@ -92,7 +106,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "Mistral's largest model"
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
+      "description": "Mistral's largest model (text-only)"
     },
     {
       "model_name": "meta-llama/llama-3-70b",
@@ -101,7 +117,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
-      "description": "Meta's Llama 3 70B model"
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
+      "description": "Meta's Llama 3 70B model (text-only)"
     },
     {
       "model_name": "deepseek/deepseek-r1-0528",
@@ -110,7 +128,9 @@
       "supports_extended_thinking": true,
       "supports_json_mode": true,
       "supports_function_calling": false,
-      "description": "DeepSeek R1 with thinking mode - advanced reasoning capabilities"
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
+      "description": "DeepSeek R1 with thinking mode - advanced reasoning capabilities (text-only)"
     },
     {
       "model_name": "perplexity/llama-3-sonar-large-32k-online",
@@ -119,7 +139,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
-      "description": "Perplexity's online model with web search"
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
+      "description": "Perplexity's online model with web search (text-only)"
     },
     {
       "model_name": "openai/o3",
@@ -128,7 +150,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o3 model - well-rounded and powerful across domains"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o3 model - well-rounded and powerful across domains with vision"
     },
     {
       "model_name": "openai/o3-mini",
@@ -137,7 +161,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o3-mini model - balanced performance and speed"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o3-mini model - balanced performance and speed with vision"
     },
     {
       "model_name": "openai/o3-mini-high",
@@ -146,7 +172,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o3-mini with high reasoning effort - optimized for complex problems"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o3-mini with high reasoning effort - optimized for complex problems with vision"
     },
     {
       "model_name": "openai/o3-pro",
@@ -155,7 +183,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o3-pro model - professional-grade reasoning and analysis"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o3-pro model - professional-grade reasoning and analysis with vision"
     },
     {
       "model_name": "openai/o4-mini",
@@ -164,7 +194,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o4-mini model - optimized for shorter contexts with rapid reasoning"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o4-mini model - optimized for shorter contexts with rapid reasoning and vision"
     },
     {
       "model_name": "openai/o4-mini-high",
@@ -173,7 +205,9 @@
       "supports_extended_thinking": false,
       "supports_json_mode": true,
       "supports_function_calling": true,
-      "description": "OpenAI's o4-mini with high reasoning effort - enhanced for complex tasks"
+      "supports_images": true,
+      "max_image_size_mb": 20.0,
+      "description": "OpenAI's o4-mini with high reasoning effort - enhanced for complex tasks with vision"
     },
     {
       "model_name": "llama3.2",
@@ -182,8 +216,10 @@
       "supports_extended_thinking": false,
       "supports_json_mode": false,
       "supports_function_calling": false,
+      "supports_images": false,
+      "max_image_size_mb": 0.0,
       "is_custom": true,
-      "description": "Local Llama 3.2 model via custom endpoint (Ollama/vLLM) - 128K context window"
+      "description": "Local Llama 3.2 model via custom endpoint (Ollama/vLLM) - 128K context window (text-only)"
     }
   ]
 }
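For orientation, a minimal sketch of how a client might consume the new capability fields from this registry. The `conf/custom_models.json` path and the top-level `"models"` key are assumptions for illustration:

```python
import json

# Load the custom model registry (path and key assumed for illustration)
with open("conf/custom_models.json") as f:
    registry = json.load(f)

for model in registry.get("models", []):
    # Only vision-capable entries should ever receive image attachments
    if model.get("supports_images", False):
        budget = model.get("max_image_size_mb", 0.0)
        print(f"{model['model_name']}: accepts up to {budget} MB of images")
```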

config.py
@@ -14,7 +14,7 @@ import os
 # These values are used in server responses and for tracking releases
 # IMPORTANT: This is the single source of truth for version and author info
 # Semantic versioning: MAJOR.MINOR.PATCH
-__version__ = "4.7.5"
+__version__ = "4.8.0"
 # Last update date in ISO format
 __updated__ = "2025-06-16"
 # Primary maintainer

docker-compose.yml
@@ -8,13 +8,13 @@ services:
       - "6379:6379"
     volumes:
       - redis_data:/data
-    command: redis-server --save 60 1 --loglevel warning --maxmemory 64mb --maxmemory-policy allkeys-lru
+    command: redis-server --save 60 1 --loglevel warning --maxmemory 512mb --maxmemory-policy allkeys-lru
     deploy:
       resources:
         limits:
           memory: 1G
         reservations:
-          memory: 256M
+          memory: 128M
 
   zen-mcp:
     build: .

docs/advanced-usage.md
@@ -11,6 +11,7 @@ This guide covers advanced features, configuration options, and workflows for po
 - [Context Revival: AI Memory Beyond Context Limits](#context-revival-ai-memory-beyond-context-limits)
 - [Collaborative Workflows](#collaborative-workflows)
 - [Working with Large Prompts](#working-with-large-prompts)
+- [Vision Support](#vision-support)
 - [Web Search Integration](#web-search-integration)
 - [System Prompts](#system-prompts)
 
@@ -25,7 +26,7 @@ DEFAULT_MODEL=auto # Claude picks the best model automatically
 
 # API Keys (at least one required)
 GEMINI_API_KEY=your-gemini-key # Enables Gemini Pro & Flash
-OPENAI_API_KEY=your-openai-key # Enables O3, O3-mini, O4-mini, O4-mini-high
+OPENAI_API_KEY=your-openai-key # Enables O3, O3-mini, O4-mini, O4-mini-high, GPT-4.1
 ```
 
 **How Auto Mode Works:**
@@ -43,6 +44,7 @@ OPENAI_API_KEY=your-openai-key # Enables O3, O3-mini, O4-mini, O4-mini-high
 | **`o3-mini`** | OpenAI | 200K tokens | Balanced speed/quality | Moderate complexity tasks |
 | **`o4-mini`** | OpenAI | 200K tokens | Latest reasoning model | Optimized for shorter contexts |
 | **`o4-mini-high`** | OpenAI | 200K tokens | Enhanced reasoning | Complex tasks requiring deeper analysis |
+| **`gpt4.1`** | OpenAI | 1M tokens | Latest GPT-4 with extended context | Large codebase analysis, comprehensive reviews |
 | **`llama`** (Llama 3.2) | Custom/Local | 128K tokens | Local inference, privacy | On-device analysis, cost-free processing |
 | **Any model** | OpenRouter | Varies | Access to GPT-4, Claude, Llama, etc. | User-specified or based on task requirements |
 
@@ -57,6 +59,7 @@ You can specify a default model instead of auto mode:
 DEFAULT_MODEL=gemini-2.5-pro-preview-06-05 # Always use Gemini Pro
 DEFAULT_MODEL=flash # Always use Flash
 DEFAULT_MODEL=o3 # Always use O3
+DEFAULT_MODEL=gpt4.1 # Always use GPT-4.1
 ```
 
 **Important:** After changing any configuration in `.env` (including `DEFAULT_MODEL`, API keys, or other settings), restart the server with `./run-server.sh` to apply the changes.
@@ -67,10 +70,12 @@ Regardless of your default setting, you can specify models per request:
 - "Use **flash** to quickly format this code"
 - "Use **o3** to debug this logic error"
 - "Review with **o4-mini** for balanced analysis"
+- "Use **gpt4.1** for comprehensive codebase analysis"
 
 **Model Capabilities:**
 - **Gemini Models**: Support thinking modes (minimal to max), web search, 1M context
 - **O3 Models**: Excellent reasoning, systematic analysis, 200K context
+- **GPT-4.1**: Extended context window (1M tokens), general capabilities
 
 ## Model Usage Restrictions
 
@@ -186,7 +191,7 @@ All tools that work with files support **both individual files and entire direct
 **`analyze`** - Analyze files or directories
 - `files`: List of file paths or directories (required)
 - `question`: What to analyze (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `analysis_type`: architecture|performance|security|quality|general
 - `output_format`: summary|detailed|actionable
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
@@ -201,7 +206,7 @@ All tools that work with files support **both individual files and entire direct
 
 **`codereview`** - Review code files or directories
 - `files`: List of file paths or directories (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `review_type`: full|security|performance|quick
 - `focus_on`: Specific aspects to focus on
 - `standards`: Coding standards to enforce
@@ -217,7 +222,7 @@ All tools that work with files support **both individual files and entire direct
 
 **`debug`** - Debug with file context
 - `error_description`: Description of the issue (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `error_context`: Stack trace or logs
 - `files`: Files or directories related to the issue
 - `runtime_info`: Environment details
@@ -233,7 +238,7 @@ All tools that work with files support **both individual files and entire direct
 
 **`thinkdeep`** - Extended analysis with file context
 - `current_analysis`: Your current thinking (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `problem_context`: Additional context
 - `focus_areas`: Specific aspects to focus on
 - `files`: Files or directories for context
@@ -249,7 +254,7 @@ All tools that work with files support **both individual files and entire direct
 **`testgen`** - Comprehensive test generation with edge case coverage
 - `files`: Code files or directories to generate tests for (required)
 - `prompt`: Description of what to test, testing objectives, and scope (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `test_examples`: Optional existing test files as style/pattern reference
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
 
@@ -264,7 +269,7 @@ All tools that work with files support **both individual files and entire direct
 - `files`: Code files or directories to analyze for refactoring opportunities (required)
 - `prompt`: Description of refactoring goals, context, and specific areas of focus (required)
 - `refactor_type`: codesmells|decompose|modernize|organization (required)
-- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high (default: server default)
+- `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `focus_areas`: Specific areas to focus on (e.g., 'performance', 'readability', 'maintainability', 'security')
 - `style_guide_examples`: Optional existing code files to use as style/pattern reference
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
@@ -357,6 +362,47 @@ To help choose the right tool for your needs:
 - `refactor` vs `codereview`: refactor suggests structural improvements, codereview finds bugs/issues
 - `refactor` vs `analyze`: refactor provides actionable refactoring steps, analyze provides understanding
 
+## Vision Support
+
+The Zen MCP server supports vision-capable models for analyzing images, diagrams, screenshots, and visual content. Vision support works seamlessly with all tools and conversation threading.
+
+**Supported Models:**
+- **Gemini 2.5 Pro & Flash**: Excellent for diagrams, architecture analysis, UI mockups (up to 20MB total)
+- **OpenAI O3/O4 series**: Strong for visual debugging, error screenshots (up to 20MB total)
+- **Claude models via OpenRouter**: Good for code screenshots, visual analysis (up to 5MB total)
+- **Custom models**: Support varies by model, with 40MB maximum enforced for abuse prevention
+
+**Usage Examples:**
+```bash
+# Debug with error screenshots
+"Use zen to debug this error with the stack trace screenshot and error.py"
+
+# Architecture analysis with diagrams
+"Analyze this system architecture diagram with gemini pro for bottlenecks"
+
+# UI review with mockups
+"Chat with flash about this UI mockup - is the layout intuitive?"
+
+# Code review with visual context
+"Review this authentication code along with the error dialog screenshot"
+```
+
+**Image Formats Supported:**
+- **Images**: JPG, PNG, GIF, WebP, BMP, SVG, TIFF
+- **Documents**: PDF (where supported by model)
+- **Data URLs**: Base64-encoded images from Claude
+
+**Key Features:**
+- **Automatic validation**: File type, magic bytes, and size validation
+- **Conversation context**: Images persist across tool switches and continuation
+- **Budget management**: Automatic dropping of old images when limits exceeded
+- **Model capability-aware**: Only sends images to vision-capable models
+
+**Best Practices:**
+- Describe images when including them: "screenshot of login error", "system architecture diagram"
+- Use appropriate models: Gemini for complex diagrams, O3 for debugging visuals
+- Consider image sizes: Larger images consume more of the model's capacity
+
 ## Working with Large Prompts
 
 The MCP protocol has a combined request+response limit of approximately 25K tokens. This server intelligently works around this limitation by automatically handling large prompts as files:
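The "budget management" bullet above can be made concrete with a small sketch: given a model's combined image budget in MB, older images are dropped first so the most recent ones survive. The helper and data shapes below are illustrative, not the server's actual implementation:

```python
def enforce_image_budget(images: list[tuple[str, float]], budget_mb: float) -> list[str]:
    """Keep the newest images that fit within budget_mb.

    images: (path, size_mb) pairs ordered oldest -> newest (illustrative shape).
    """
    kept: list[str] = []
    used = 0.0
    for path, size_mb in reversed(images):  # walk newest -> oldest
        if used + size_mb <= budget_mb:
            kept.append(path)
            used += size_mb
    kept.reverse()  # restore chronological order
    return kept

# A 5 MB (Claude-sized) budget drops the oldest screenshot first:
print(enforce_image_budget([("old.png", 3.0), ("new.png", 4.0)], budget_mb=5.0))
# -> ['new.png']
```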

providers/base.py
@@ -112,6 +112,8 @@ class ModelCapabilities:
     supports_system_prompts: bool = True
     supports_streaming: bool = True
     supports_function_calling: bool = False
+    supports_images: bool = False  # Whether model can process images
+    max_image_size_mb: float = 0.0  # Maximum total size for all images in MB
 
     # Temperature constraint object - preferred way to define temperature limits
     temperature_constraint: TemperatureConstraint = field(
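A hedged sketch of how these two new fields might be consulted before attaching images to a request; the helper is illustrative and assumes only the fields added in this hunk:

```python
def images_allowed(caps, images: list[str], total_size_mb: float) -> bool:
    """Illustrative gate: attach images only to capable models within budget."""
    if not images:
        return True  # nothing to gate
    if not caps.supports_images:
        return False  # text-only model: images must be dropped
    return total_size_mb <= caps.max_image_size_mb  # combined-size budget
```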

providers/gemini.py
@@ -1,6 +1,8 @@
 """Gemini model provider implementation."""
 
+import base64
 import logging
+import os
 import time
 from typing import Optional
 
@@ -21,11 +23,15 @@ class GeminiModelProvider(ModelProvider):
             "context_window": 1_048_576,  # 1M tokens
             "supports_extended_thinking": True,
             "max_thinking_tokens": 24576,  # Flash 2.5 thinking budget limit
+            "supports_images": True,  # Vision capability
+            "max_image_size_mb": 20.0,  # Conservative 20MB limit for reliability
         },
         "gemini-2.5-pro-preview-06-05": {
             "context_window": 1_048_576,  # 1M tokens
             "supports_extended_thinking": True,
             "max_thinking_tokens": 32768,  # Pro 2.5 thinking budget limit
+            "supports_images": True,  # Vision capability
+            "max_image_size_mb": 32.0,  # Higher limit for Pro model
         },
         # Shorthands
         "flash": "gemini-2.5-flash-preview-05-20",
@@ -84,6 +90,8 @@ class GeminiModelProvider(ModelProvider):
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
+            supports_images=config.get("supports_images", False),
+            max_image_size_mb=config.get("max_image_size_mb", 0.0),
             temperature_constraint=temp_constraint,
         )
 
@@ -95,6 +103,7 @@ class GeminiModelProvider(ModelProvider):
         temperature: float = 0.7,
         max_output_tokens: Optional[int] = None,
         thinking_mode: str = "medium",
+        images: Optional[list[str]] = None,
         **kwargs,
     ) -> ModelResponse:
         """Generate content using Gemini model."""
@@ -102,12 +111,34 @@
         resolved_name = self._resolve_model_name(model_name)
         self.validate_parameters(model_name, temperature)
 
-        # Combine system prompt with user prompt if provided
+        # Prepare content parts (text and potentially images)
+        parts = []
+
+        # Add system and user prompts as text
         if system_prompt:
             full_prompt = f"{system_prompt}\n\n{prompt}"
         else:
             full_prompt = prompt
+
+        parts.append({"text": full_prompt})
+
+        # Add images if provided and model supports vision
+        if images and self._supports_vision(resolved_name):
+            for image_path in images:
+                try:
+                    image_part = self._process_image(image_path)
+                    if image_part:
+                        parts.append(image_part)
+                except Exception as e:
+                    logger.warning(f"Failed to process image {image_path}: {e}")
+                    # Continue with other images and text
+                    continue
+        elif images and not self._supports_vision(resolved_name):
+            logger.warning(f"Model {resolved_name} does not support images, ignoring {len(images)} image(s)")
+
+        # Create contents structure
+        contents = [{"parts": parts}]
 
         # Prepare generation config
         generation_config = types.GenerateContentConfig(
             temperature=temperature,
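For orientation, a hedged sketch of the `contents` structure the hunk above assembles when one image part is attached (values illustrative):

```python
# One user turn whose "parts" mix a text part with an inline_data image part
contents = [
    {
        "parts": [
            {"text": "You are a helpful assistant.\n\nDescribe this diagram."},
            {"inline_data": {"mime_type": "image/png", "data": "<base64 payload>"}},
        ]
    }
]
```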
@@ -139,7 +170,7 @@ class GeminiModelProvider(ModelProvider):
         # Generate content
         response = self.client.models.generate_content(
             model=resolved_name,
-            contents=full_prompt,
+            contents=contents,
             config=generation_config,
         )
 
@@ -274,3 +305,51 @@
         usage["total_tokens"] = usage["input_tokens"] + usage["output_tokens"]
 
         return usage
+
+    def _supports_vision(self, model_name: str) -> bool:
+        """Check if the model supports vision (image processing)."""
+        # Gemini 2.5 models support vision
+        vision_models = {
+            "gemini-2.5-flash-preview-05-20",
+            "gemini-2.5-pro-preview-06-05",
+            "gemini-2.0-flash",
+            "gemini-1.5-pro",
+            "gemini-1.5-flash",
+        }
+        return model_name in vision_models
+
+    def _process_image(self, image_path: str) -> Optional[dict]:
+        """Process an image for Gemini API."""
+        try:
+            if image_path.startswith("data:image/"):
+                # Handle data URL: data:image/png;base64,iVBORw0...
+                header, data = image_path.split(",", 1)
+                mime_type = header.split(";")[0].split(":")[1]
+                return {"inline_data": {"mime_type": mime_type, "data": data}}
+            else:
+                # Handle file path - translate for Docker environment
+                from utils.file_types import get_image_mime_type
+                from utils.file_utils import translate_path_for_environment
+
+                translated_path = translate_path_for_environment(image_path)
+                logger.debug(f"Translated image path from '{image_path}' to '{translated_path}'")
+
+                if not os.path.exists(translated_path):
+                    logger.warning(f"Image file not found: {translated_path} (original: {image_path})")
+                    return None
+
+                # Use translated path for all subsequent operations
+                image_path = translated_path
+
+                # Detect MIME type from file extension using centralized mappings
+                ext = os.path.splitext(image_path)[1].lower()
+                mime_type = get_image_mime_type(ext)
+
+                # Read and encode the image
+                with open(image_path, "rb") as f:
+                    image_data = base64.b64encode(f.read()).decode()
+
+                return {"inline_data": {"mime_type": mime_type, "data": image_data}}
+        except Exception as e:
+            logger.error(f"Error processing image {image_path}: {e}")
+            return None
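A hedged usage sketch of the two input forms `_process_image` accepts; constructor arguments and paths are illustrative:

```python
provider = GeminiModelProvider(api_key="...")  # constructor args assumed

# File path: translated for Docker, MIME-typed by extension, base64-encoded
part = provider._process_image("docs/architecture.png")
# -> {"inline_data": {"mime_type": "image/png", "data": "<base64>"}}

# Data URL: the declared MIME type and payload are passed through as-is
part = provider._process_image("data:image/png;base64,iVBORw0...")
```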

providers/openai_provider.py
@@ -23,22 +23,38 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
         "o3": {
             "context_window": 200_000,  # 200K tokens
             "supports_extended_thinking": False,
+            "supports_images": True,  # O3 models support vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
         },
         "o3-mini": {
             "context_window": 200_000,  # 200K tokens
             "supports_extended_thinking": False,
+            "supports_images": True,  # O3 models support vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
         },
         "o3-pro": {
             "context_window": 200_000,  # 200K tokens
             "supports_extended_thinking": False,
+            "supports_images": True,  # O3 models support vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
         },
         "o4-mini": {
             "context_window": 200_000,  # 200K tokens
             "supports_extended_thinking": False,
+            "supports_images": True,  # O4 models support vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
         },
         "o4-mini-high": {
             "context_window": 200_000,  # 200K tokens
             "supports_extended_thinking": False,
+            "supports_images": True,  # O4 models support vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
+        },
+        "gpt-4.1-2025-04-14": {
+            "context_window": 1_000_000,  # 1M tokens
+            "supports_extended_thinking": False,
+            "supports_images": True,  # GPT-4.1 supports vision
+            "max_image_size_mb": 20.0,  # 20MB per OpenAI docs
         },
         # Shorthands
         "mini": "o4-mini",  # Default 'mini' to latest mini model
@@ -46,6 +62,7 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
         "o4mini": "o4-mini",
         "o4minihigh": "o4-mini-high",
         "o4minihi": "o4-mini-high",
+        "gpt4.1": "gpt-4.1-2025-04-14",
     }
 
     def __init__(self, api_key: str, **kwargs):
@@ -76,7 +93,7 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             # O3 and O4 reasoning models only support temperature=1.0
             temp_constraint = FixedTemperatureConstraint(1.0)
         else:
-            # Other OpenAI models support 0.0-2.0 range
+            # Other OpenAI models (including GPT-4.1) support 0.0-2.0 range
             temp_constraint = RangeTemperatureConstraint(0.0, 2.0, 0.7)
 
         return ModelCapabilities(
@@ -88,6 +105,8 @@ class OpenAIModelProvider(OpenAICompatibleProvider):
             supports_system_prompts=True,
             supports_streaming=True,
             supports_function_calling=True,
+            supports_images=config.get("supports_images", False),
+            max_image_size_mb=config.get("max_image_size_mb", 0.0),
             temperature_constraint=temp_constraint,
         )
 
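A hedged sketch of what the new shorthand should yield, assuming OpenAIModelProvider resolves aliases with the same `_resolve_model_name` helper seen in the Gemini hunk:

```python
provider = OpenAIModelProvider(api_key="sk-...")  # key illustrative
assert provider._resolve_model_name("gpt4.1") == "gpt-4.1-2025-04-14"

# The full name maps to the new 1M-context, vision-capable entry above
caps = provider.get_capabilities("gpt-4.1-2025-04-14")
assert caps.supports_images and caps.max_image_size_mb == 20.0
```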

providers/openai_compatible.py
@@ -1,5 +1,6 @@
 """Base class for OpenAI-compatible API providers."""
 
+import base64
 import ipaddress
 import logging
 import os
@@ -229,6 +230,7 @@ class OpenAICompatibleProvider(ModelProvider):
         system_prompt: Optional[str] = None,
         temperature: float = 0.7,
         max_output_tokens: Optional[int] = None,
+        images: Optional[list[str]] = None,
         **kwargs,
     ) -> ModelResponse:
         """Generate content using the OpenAI-compatible API.
@@ -255,7 +257,32 @@
         messages = []
         if system_prompt:
             messages.append({"role": "system", "content": system_prompt})
-        messages.append({"role": "user", "content": prompt})
+
+        # Prepare user message with text and potentially images
+        user_content = []
+        user_content.append({"type": "text", "text": prompt})
+
+        # Add images if provided and model supports vision
+        if images and self._supports_vision(model_name):
+            for image_path in images:
+                try:
+                    image_content = self._process_image(image_path)
+                    if image_content:
+                        user_content.append(image_content)
+                except Exception as e:
+                    logging.warning(f"Failed to process image {image_path}: {e}")
+                    # Continue with other images and text
+                    continue
+        elif images and not self._supports_vision(model_name):
+            logging.warning(f"Model {model_name} does not support images, ignoring {len(images)} image(s)")
+
+        # Add user message
+        if len(user_content) == 1:
+            # Only text content, use simple string format for compatibility
+            messages.append({"role": "user", "content": prompt})
+        else:
+            # Text + images, use content array format
+            messages.append({"role": "user", "content": user_content})
 
         # Prepare completion parameters
         completion_params = {
@@ -424,3 +451,66 @@ class OpenAICompatibleProvider(ModelProvider):
|
|||||||
Default is False for OpenAI-compatible providers.
|
Default is False for OpenAI-compatible providers.
|
||||||
"""
|
"""
|
||||||
return False
|
return False
|
||||||
|
|
||||||
|
def _supports_vision(self, model_name: str) -> bool:
|
||||||
|
"""Check if the model supports vision (image processing).
|
||||||
|
|
||||||
|
Default implementation for OpenAI-compatible providers.
|
||||||
|
Subclasses should override with specific model support.
|
||||||
|
"""
|
||||||
|
# Common vision-capable models - only include models that actually support images
|
||||||
|
vision_models = {
|
||||||
|
"gpt-4o",
|
||||||
|
"gpt-4o-mini",
|
||||||
|
"gpt-4-turbo",
|
||||||
|
"gpt-4-vision-preview",
|
||||||
|
"gpt-4.1-2025-04-14", # GPT-4.1 supports vision
|
||||||
|
"o3",
|
||||||
|
"o3-mini",
|
||||||
|
"o3-pro",
|
||||||
|
"o4-mini",
|
||||||
|
"o4-mini-high",
|
||||||
|
# Note: Claude models would be handled by a separate provider
|
||||||
|
}
|
||||||
|
supports = model_name.lower() in vision_models
|
||||||
|
logging.debug(f"Model '{model_name}' vision support: {supports}")
|
||||||
|
return supports
|
||||||
|
|
||||||
|
def _process_image(self, image_path: str) -> Optional[dict]:
|
||||||
|
"""Process an image for OpenAI-compatible API."""
|
||||||
|
try:
|
||||||
|
if image_path.startswith("data:image/"):
|
||||||
|
# Handle data URL: data:image/png;base64,iVBORw0...
|
||||||
|
return {"type": "image_url", "image_url": {"url": image_path}}
|
||||||
|
else:
|
||||||
|
# Handle file path - translate for Docker environment
|
||||||
|
from utils.file_utils import translate_path_for_environment
|
||||||
|
|
||||||
|
translated_path = translate_path_for_environment(image_path)
|
||||||
|
logging.debug(f"Translated image path from '{image_path}' to '{translated_path}'")
|
||||||
|
|
||||||
|
if not os.path.exists(translated_path):
|
||||||
|
logging.warning(f"Image file not found: {translated_path} (original: {image_path})")
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Use translated path for all subsequent operations
|
||||||
|
image_path = translated_path
|
||||||
|
|
||||||
|
# Detect MIME type from file extension using centralized mappings
|
||||||
|
from utils.file_types import get_image_mime_type
|
||||||
|
|
||||||
|
ext = os.path.splitext(image_path)[1].lower()
|
||||||
|
mime_type = get_image_mime_type(ext)
|
||||||
|
logging.debug(f"Processing image '{image_path}' with extension '{ext}' as MIME type '{mime_type}'")
|
||||||
|
|
||||||
|
# Read and encode the image
|
||||||
|
with open(image_path, "rb") as f:
|
||||||
|
image_data = base64.b64encode(f.read()).decode()
|
||||||
|
|
||||||
|
# Create data URL for OpenAI API
|
||||||
|
data_url = f"data:{mime_type};base64,{image_data}"
|
||||||
|
|
||||||
|
return {"type": "image_url", "image_url": {"url": data_url}}
|
||||||
|
except Exception as e:
|
||||||
|
logging.error(f"Error processing image {image_path}: {e}")
|
||||||
|
return None
|
||||||
|
|||||||
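Taken together, `_supports_vision` and `_process_image` mean a text-plus-image request ends up as a standard Chat Completions content array. A minimal sketch of the resulting payload shape (field names follow the OpenAI API; the prompt text and file path are placeholder values):

```python
# Shape of the messages list produced above for a text + image request.
# The prompt and the base64 payload are hypothetical placeholders.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {
                "type": "image_url",
                # _process_image() inlines local files as base64 data URLs
                "image_url": {"url": "data:image/png;base64,iVBORw0KGgo..."},
            },
        ],
    },
]
```

Text-only requests keep the plain string `content` form, which preserves compatibility with OpenAI-compatible endpoints that do not accept content arrays.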
@@ -23,6 +23,8 @@ class OpenRouterModelConfig:
     supports_streaming: bool = True
     supports_function_calling: bool = False
     supports_json_mode: bool = False
+    supports_images: bool = False  # Whether model can process images
+    max_image_size_mb: float = 0.0  # Maximum total size for all images in MB
     is_custom: bool = False  # True for models that should only be used with custom endpoints
     description: str = ""

@@ -37,6 +39,8 @@ class OpenRouterModelConfig:
            supports_system_prompts=self.supports_system_prompts,
            supports_streaming=self.supports_streaming,
            supports_function_calling=self.supports_function_calling,
+           supports_images=self.supports_images,
+           max_image_size_mb=self.max_image_size_mb,
            temperature_constraint=RangeTemperatureConstraint(0.0, 2.0, 1.0),
        )

@@ -66,7 +70,8 @@ class OpenRouterModelRegistry:
             translated_path = translate_path_for_environment(env_path)
             self.config_path = Path(translated_path)
         else:
-            # Default to conf/custom_models.json (already in container)
+            # Default to conf/custom_models.json - use relative path from this file
+            # This works both in development and container environments
             self.config_path = Path(__file__).parent.parent / "conf" / "custom_models.json"

         # Load configuration
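For a concrete sense of what the two new config fields carry, here is a hypothetical custom-model entry expressed as the parsed dict it would arrive as. The model values are made up, and keyword construction is assumed from the dataclass-style field declarations above:

```python
# Hypothetical registry entry; keys assumed to mirror the dataclass fields.
entry = {
    "supports_images": True,       # model can accept image input
    "max_image_size_mb": 20.0,     # total budget across all attached images
}
config = OpenRouterModelConfig(**entry)  # other fields keep their defaults
assert config.supports_images and config.max_image_size_mb == 20.0
```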
@@ -24,6 +24,7 @@ from .test_redis_validation import RedisValidationTest
 from .test_refactor_validation import RefactorValidationTest
 from .test_testgen_validation import TestGenValidationTest
 from .test_token_allocation_validation import TokenAllocationValidationTest
+from .test_vision_capability import VisionCapabilityTest
 from .test_xai_models import XAIModelsTest

 # Test registry for dynamic loading
@@ -45,6 +46,7 @@ TEST_REGISTRY = {
     "testgen_validation": TestGenValidationTest,
     "refactor_validation": RefactorValidationTest,
     "conversation_chain_validation": ConversationChainValidationTest,
+    "vision_capability": VisionCapabilityTest,
     "xai_models": XAIModelsTest,
     # "o3_pro_expensive": O3ProExpensiveTest,  # COMMENTED OUT - too expensive to run by default
 }
@@ -69,6 +71,7 @@ __all__ = [
     "TestGenValidationTest",
     "RefactorValidationTest",
     "ConversationChainValidationTest",
+    "VisionCapabilityTest",
     "XAIModelsTest",
     "TEST_REGISTRY",
 ]
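The registry entry is what makes the new test discoverable by name. Roughly, the runner's dynamic loading amounts to the following sketch (the runner internals and constructor arguments are assumed from the registry shape, not shown in this diff):

```python
# Sketch of name-based dispatch implied by TEST_REGISTRY (runner not shown here).
from simulator_tests import TEST_REGISTRY

test = TEST_REGISTRY["vision_capability"]()  # constructor args, if any, assumed default
passed = test.run_test()
print(f"{test.test_name}: {'PASS' if passed else 'FAIL'}")
```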
163
simulator_tests/test_vision_capability.py
Normal file
@@ -0,0 +1,163 @@
+#!/usr/bin/env python3
+"""
+Vision Capability Test
+
+Tests vision capability with the chat tool using O3 model:
+- Test file path image (PNG triangle)
+- Test base64 data URL image
+- Use chat tool with O3 model to analyze the images
+- Verify the model correctly identifies shapes
+"""
+
+import base64
+import os
+
+from .base_test import BaseSimulatorTest
+
+
+class VisionCapabilityTest(BaseSimulatorTest):
+    """Test vision capability with chat tool and O3 model"""
+
+    @property
+    def test_name(self) -> str:
+        return "vision_capability"
+
+    @property
+    def test_description(self) -> str:
+        return "Vision capability test with chat tool and O3 model"
+
+    def get_triangle_png_path(self) -> str:
+        """Get the path to the triangle.png file in tests directory"""
+        # Get the project root and find the triangle.png in tests/
+        current_dir = os.getcwd()
+        triangle_path = os.path.join(current_dir, "tests", "triangle.png")
+
+        if not os.path.exists(triangle_path):
+            raise FileNotFoundError(f"triangle.png not found at {triangle_path}")
+
+        abs_path = os.path.abspath(triangle_path)
+        self.logger.debug(f"Using triangle PNG at host path: {abs_path}")
+        return abs_path
+
+    def create_base64_triangle_data_url(self) -> str:
+        """Create a base64 data URL from the triangle.png file"""
+        triangle_path = self.get_triangle_png_path()
+
+        with open(triangle_path, "rb") as f:
+            image_data = base64.b64encode(f.read()).decode()
+
+        data_url = f"data:image/png;base64,{image_data}"
+        self.logger.debug(f"Created base64 data URL with {len(image_data)} characters")
+        return data_url
+
+    def run_test(self) -> bool:
+        """Test vision capability with O3 model"""
+        try:
+            self.logger.info("Test: Vision capability with O3 model")
+
+            # Test 1: File path image
+            self.logger.info("  1.1: Testing file path image (PNG triangle)")
+            triangle_path = self.get_triangle_png_path()
+            self.logger.info(f"  ✅ Using triangle PNG at: {triangle_path}")
+
+            response1, continuation_id = self.call_mcp_tool(
+                "chat",
+                {
+                    "prompt": "What shape do you see in this image? Please be specific and only mention the shape name.",
+                    "images": [triangle_path],
+                    "model": "o3",
+                },
+            )
+
+            if not response1:
+                self.logger.error("Failed to get response from O3 model for file path test")
+                return False
+
+            # Check for error indicators first
+            response1_lower = response1.lower()
+            if any(
+                error_phrase in response1_lower
+                for error_phrase in [
+                    "don't have access",
+                    "cannot see",
+                    "no image",
+                    "clarification_required",
+                    "image you're referring to",
+                    "supply the image",
+                    "error",
+                ]
+            ):
+                self.logger.error(f"  ❌ O3 model cannot access file path image. Response: {response1[:300]}...")
+                return False
+
+            if "triangle" not in response1_lower:
+                self.logger.error(
+                    f"  ❌ O3 did not identify triangle in file path test. Response: {response1[:200]}..."
+                )
+                return False
+
+            self.logger.info("  ✅ O3 correctly identified file path image as triangle")
+
+            # Test 2: Base64 data URL image
+            self.logger.info("  1.2: Testing base64 data URL image")
+            data_url = self.create_base64_triangle_data_url()
+
+            response2, _ = self.call_mcp_tool(
+                "chat",
+                {
+                    "prompt": "What shape do you see in this image? Please be specific and only mention the shape name.",
+                    "images": [data_url],
+                    "model": "o3",
+                },
+            )
+
+            if not response2:
+                self.logger.error("Failed to get response from O3 model for base64 test")
+                return False
+
+            response2_lower = response2.lower()
+            if any(
+                error_phrase in response2_lower
+                for error_phrase in [
+                    "don't have access",
+                    "cannot see",
+                    "no image",
+                    "clarification_required",
+                    "image you're referring to",
+                    "supply the image",
+                    "error",
+                ]
+            ):
+                self.logger.error(f"  ❌ O3 model cannot access base64 image. Response: {response2[:300]}...")
+                return False
+
+            if "triangle" not in response2_lower:
+                self.logger.error(f"  ❌ O3 did not identify triangle in base64 test. Response: {response2[:200]}...")
+                return False
+
+            self.logger.info("  ✅ O3 correctly identified base64 image as triangle")
+
+            # Optional: Test continuation with same image
+            if continuation_id:
+                self.logger.info("  1.3: Testing continuation with same image")
+                response3, _ = self.call_mcp_tool(
+                    "chat",
+                    {
+                        "prompt": "What color is this triangle?",
+                        "images": [triangle_path],  # Same image should be deduplicated
+                        "continuation_id": continuation_id,
+                        "model": "o3",
+                    },
+                )
+
+                if response3:
+                    self.logger.info("  ✅ Continuation also working correctly")
+                else:
+                    self.logger.warning("  ⚠️ Continuation response not received")
+
+            self.logger.info("  ✅ Vision capability test completed successfully")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Vision capability test failed: {e}")
+            return False
@@ -1,126 +0,0 @@
-"""
-Test /app/ to ./ path translation for standalone mode.
-
-Tests that internal application paths work in both Docker and standalone modes.
-"""
-
-import os
-import tempfile
-from unittest.mock import patch
-
-from utils.file_utils import translate_path_for_environment
-
-
-class TestAppPathTranslation:
-    """Test translation of /app/ paths for different environments."""
-
-    def test_app_path_translation_in_standalone_mode(self):
-        """Test that /app/ paths are translated to ./ in standalone mode."""
-
-        # Mock standalone environment (no Docker)
-        with patch("utils.file_utils.CONTAINER_WORKSPACE") as mock_container_workspace:
-            mock_container_workspace.exists.return_value = False
-
-            # Clear WORKSPACE_ROOT to simulate standalone mode
-            with patch.dict(os.environ, {}, clear=True):
-
-                # Test translation of internal app paths
-                test_cases = [
-                    ("/app/conf/custom_models.json", "./conf/custom_models.json"),
-                    ("/app/conf/other_config.json", "./conf/other_config.json"),
-                    ("/app/logs/app.log", "./logs/app.log"),
-                    ("/app/data/file.txt", "./data/file.txt"),
-                ]
-
-                for input_path, expected_output in test_cases:
-                    result = translate_path_for_environment(input_path)
-                    assert result == expected_output, f"Expected {expected_output}, got {result}"
-
-    def test_allowed_app_path_unchanged_in_docker_mode(self):
-        """Test that allowed /app/ paths remain unchanged in Docker mode."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            # Mock Docker environment
-            with patch("utils.file_utils.CONTAINER_WORKSPACE") as mock_container_workspace:
-                mock_container_workspace.exists.return_value = True
-                mock_container_workspace.__str__.return_value = "/workspace"
-
-                # Set WORKSPACE_ROOT to simulate Docker environment
-                with patch.dict(os.environ, {"WORKSPACE_ROOT": tmpdir}):
-
-                    # Only specifically allowed internal app paths should remain unchanged in Docker
-                    allowed_path = "/app/conf/custom_models.json"
-                    result = translate_path_for_environment(allowed_path)
-                    assert (
-                        result == allowed_path
-                    ), f"Docker mode should preserve allowed path {allowed_path}, got {result}"
-
-    def test_non_allowed_app_paths_blocked_in_docker_mode(self):
-        """Test that non-allowed /app/ paths are blocked in Docker mode."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            # Mock Docker environment
-            with patch("utils.file_utils.CONTAINER_WORKSPACE") as mock_container_workspace:
-                mock_container_workspace.exists.return_value = True
-                mock_container_workspace.__str__.return_value = "/workspace"
-
-                # Set WORKSPACE_ROOT to simulate Docker environment
-                with patch.dict(os.environ, {"WORKSPACE_ROOT": tmpdir}):
-
-                    # Non-allowed internal app paths should be blocked in Docker for security
-                    blocked_paths = [
-                        "/app/conf/other_config.json",
-                        "/app/logs/app.log",
-                        "/app/server.py",
-                    ]
-
-                    for blocked_path in blocked_paths:
-                        result = translate_path_for_environment(blocked_path)
-                        assert result.startswith(
-                            "/inaccessible/"
-                        ), f"Docker mode should block non-allowed path {blocked_path}, got {result}"
-
-    def test_non_app_paths_unchanged_in_standalone(self):
-        """Test that non-/app/ paths are unchanged in standalone mode."""
-
-        # Mock standalone environment
-        with patch("utils.file_utils.CONTAINER_WORKSPACE") as mock_container_workspace:
-            mock_container_workspace.exists.return_value = False
-
-            with patch.dict(os.environ, {}, clear=True):
-
-                # Non-app paths should be unchanged
-                test_cases = [
-                    "/home/user/file.py",
-                    "/etc/config.conf",
-                    "./local/file.txt",
-                    "relative/path.py",
-                    "/workspace/file.py",
-                ]
-
-                for input_path in test_cases:
-                    result = translate_path_for_environment(input_path)
-                    assert result == input_path, f"Non-app path {input_path} should be unchanged, got {result}"
-
-    def test_edge_cases_in_app_translation(self):
-        """Test edge cases in /app/ path translation."""
-
-        # Mock standalone environment
-        with patch("utils.file_utils.CONTAINER_WORKSPACE") as mock_container_workspace:
-            mock_container_workspace.exists.return_value = False
-
-            with patch.dict(os.environ, {}, clear=True):
-
-                # Test edge cases
-                test_cases = [
-                    ("/app/", "./"),  # Root app directory
-                    ("/app", "/app"),  # Exact match without trailing slash - not translated
-                    ("/app/file", "./file"),  # File directly in app
-                    ("/app//double/slash", "./double/slash"),  # Handle double slashes
-                ]
-
-                for input_path, expected_output in test_cases:
-                    result = translate_path_for_environment(input_path)
-                    assert (
-                        result == expected_output
-                    ), f"Edge case {input_path}: expected {expected_output}, got {result}"
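Before they were removed, these tests pinned down a simple standalone-mode contract. A condensed sketch of just that branch, consistent with the edge cases asserted above (the real `translate_path_for_environment` in utils/file_utils.py also handles the Docker allow-list covered by the other deleted integration tests below):

```python
# Condensed sketch of the standalone-mode contract these tests encoded.
def translate_standalone(path: str) -> str:
    if path.startswith("/app/"):
        # "/app/conf/x.json" -> "./conf/x.json"; "/app/" -> "./";
        # "/app//double/slash" -> "./double/slash"
        return "./" + path[len("/app/"):].lstrip("/")
    return path  # "/app" without a trailing slash and non-/app/ paths pass through
```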
591
tests/test_image_support_integration.py
Normal file
@@ -0,0 +1,591 @@
+"""
+Integration tests for native image support feature.
+
+Tests the complete image support pipeline:
+- Conversation memory integration with images
+- Tool request validation and schema support
+- Provider image processing capabilities
+- Cross-tool image context preservation
+"""
+
+import json
+import os
+import tempfile
+import uuid
+from unittest.mock import Mock, patch
+
+import pytest
+
+from tools.chat import ChatTool
+from tools.debug import DebugIssueTool
+from utils.conversation_memory import (
+    ConversationTurn,
+    ThreadContext,
+    add_turn,
+    create_thread,
+    get_conversation_image_list,
+    get_thread,
+)
+
+
+class TestImageSupportIntegration:
+    """Integration tests for the complete image support feature."""
+
+    def test_conversation_turn_includes_images(self):
+        """Test that ConversationTurn can store and track images."""
+        turn = ConversationTurn(
+            role="user",
+            content="Please analyze this diagram",
+            timestamp="2025-01-01T00:00:00Z",
+            files=["code.py"],
+            images=["diagram.png", "flowchart.jpg"],
+            tool_name="chat",
+        )
+
+        assert turn.images == ["diagram.png", "flowchart.jpg"]
+        assert turn.files == ["code.py"]
+        assert turn.content == "Please analyze this diagram"
+
+    def test_get_conversation_image_list_newest_first(self):
+        """Test that image list prioritizes newest references."""
+        # Create thread context with multiple turns
+        context = ThreadContext(
+            thread_id=str(uuid.uuid4()),
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:00:00Z",
+            tool_name="chat",
+            turns=[
+                ConversationTurn(
+                    role="user",
+                    content="Turn 1",
+                    timestamp="2025-01-01T00:00:00Z",
+                    images=["old_diagram.png", "shared.png"],
+                ),
+                ConversationTurn(
+                    role="assistant", content="Turn 2", timestamp="2025-01-01T01:00:00Z", images=["middle.png"]
+                ),
+                ConversationTurn(
+                    role="user",
+                    content="Turn 3",
+                    timestamp="2025-01-01T02:00:00Z",
+                    images=["shared.png", "new_diagram.png"],  # shared.png appears again
+                ),
+            ],
+            initial_context={},
+        )
+
+        image_list = get_conversation_image_list(context)
+
+        # Should prioritize newest first, with duplicates removed (newest wins)
+        expected = ["shared.png", "new_diagram.png", "middle.png", "old_diagram.png"]
+        assert image_list == expected
+
+    @patch("utils.conversation_memory.get_redis_client")
+    def test_add_turn_with_images(self, mock_redis):
+        """Test adding a conversation turn with images."""
+        mock_client = Mock()
+        mock_redis.return_value = mock_client
+
+        # Mock the Redis operations to return success
+        mock_client.set.return_value = True
+
+        thread_id = create_thread("test_tool", {"initial": "context"})
+
+        # Set up initial thread context for add_turn to find
+        initial_context = ThreadContext(
+            thread_id=thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:00:00Z",
+            tool_name="test_tool",
+            turns=[],  # Empty initially
+            initial_context={"initial": "context"},
+        )
+        mock_client.get.return_value = initial_context.model_dump_json()
+
+        success = add_turn(
+            thread_id=thread_id,
+            role="user",
+            content="Analyze these screenshots",
+            files=["app.py"],
+            images=["screenshot1.png", "screenshot2.png"],
+            tool_name="debug",
+        )
+
+        assert success
+
+        # Mock thread context for get_thread call
+        updated_context = ThreadContext(
+            thread_id=thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:00:00Z",
+            tool_name="test_tool",
+            turns=[
+                ConversationTurn(
+                    role="user",
+                    content="Analyze these screenshots",
+                    timestamp="2025-01-01T00:00:00Z",
+                    files=["app.py"],
+                    images=["screenshot1.png", "screenshot2.png"],
+                    tool_name="debug",
+                )
+            ],
+            initial_context={"initial": "context"},
+        )
+        mock_client.get.return_value = updated_context.model_dump_json()
+
+        # Retrieve and verify the thread
+        context = get_thread(thread_id)
+        assert context is not None
+        assert len(context.turns) == 1
+
+        turn = context.turns[0]
+        assert turn.images == ["screenshot1.png", "screenshot2.png"]
+        assert turn.files == ["app.py"]
+        assert turn.content == "Analyze these screenshots"
+
+    def test_chat_tool_schema_includes_images(self):
+        """Test that ChatTool schema includes images field."""
+        tool = ChatTool()
+        schema = tool.get_input_schema()
+
+        assert "images" in schema["properties"]
+        images_field = schema["properties"]["images"]
+        assert images_field["type"] == "array"
+        assert images_field["items"]["type"] == "string"
+        assert "visual context" in images_field["description"].lower()
+
+    def test_debug_tool_schema_includes_images(self):
+        """Test that DebugIssueTool schema includes images field."""
+        tool = DebugIssueTool()
+        schema = tool.get_input_schema()
+
+        assert "images" in schema["properties"]
+        images_field = schema["properties"]["images"]
+        assert images_field["type"] == "array"
+        assert images_field["items"]["type"] == "string"
+        assert "error screens" in images_field["description"].lower()
+
+    def test_tool_image_validation_limits(self):
+        """Test that tools validate image size limits using real provider resolution."""
+        tool = ChatTool()
+
+        # Create small test images (each 0.5MB, total 1MB)
+        small_images = []
+        for _ in range(2):
+            with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as temp_file:
+                # Write 0.5MB of data
+                temp_file.write(b"\x00" * (512 * 1024))
+                small_images.append(temp_file.name)
+
+        try:
+            # Test with a model that should fail (no provider available in test environment)
+            result = tool._validate_image_limits(small_images, "mistral-large")
+            # Should return error because model not available
+            assert result is not None
+            assert result["status"] == "error"
+            assert "does not support image processing" in result["content"]
+
+            # Test that empty/None images always pass regardless of model
+            result = tool._validate_image_limits([], "any-model")
+            assert result is None
+
+            result = tool._validate_image_limits(None, "any-model")
+            assert result is None
+
+        finally:
+            # Clean up temp files
+            for img_path in small_images:
+                if os.path.exists(img_path):
+                    os.unlink(img_path)
+
+    def test_image_validation_model_specific_limits(self):
+        """Test that different models have appropriate size limits using real provider resolution."""
+        import importlib
+
+        tool = ChatTool()
+
+        # Test OpenAI O3 model (20MB limit) - Create 15MB image (should pass)
+        small_image_path = None
+        large_image_path = None
+
+        # Save original environment
+        original_env = {
+            "OPENAI_API_KEY": os.environ.get("OPENAI_API_KEY"),
+            "DEFAULT_MODEL": os.environ.get("DEFAULT_MODEL"),
+        }
+
+        try:
+            # Create 15MB image (under 20MB O3 limit)
+            with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as temp_file:
+                temp_file.write(b"\x00" * (15 * 1024 * 1024))  # 15MB
+                small_image_path = temp_file.name
+
+            # Set up environment for OpenAI provider
+            os.environ["OPENAI_API_KEY"] = "test-key-o3-validation-test-not-real"
+            os.environ["DEFAULT_MODEL"] = "o3"
+
+            # Clear other provider keys to isolate to OpenAI
+            for key in ["GEMINI_API_KEY", "XAI_API_KEY", "OPENROUTER_API_KEY"]:
+                os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            import config
+
+            importlib.reload(config)
+            from providers.registry import ModelProviderRegistry
+
+            ModelProviderRegistry._instance = None
+
+            result = tool._validate_image_limits([small_image_path], "o3")
+            assert result is None  # Should pass (15MB < 20MB limit)
+
+            # Create 25MB image (over 20MB O3 limit)
+            with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as temp_file:
+                temp_file.write(b"\x00" * (25 * 1024 * 1024))  # 25MB
+                large_image_path = temp_file.name
+
+            result = tool._validate_image_limits([large_image_path], "o3")
+            assert result is not None  # Should fail (25MB > 20MB limit)
+            assert result["status"] == "error"
+            assert "Image size limit exceeded" in result["content"]
+            assert "20.0MB" in result["content"]  # O3 limit
+            assert "25.0MB" in result["content"]  # Provided size
+
+        finally:
+            # Clean up temp files
+            if small_image_path and os.path.exists(small_image_path):
+                os.unlink(small_image_path)
+            if large_image_path and os.path.exists(large_image_path):
+                os.unlink(large_image_path)
+
+            # Restore environment
+            for key, value in original_env.items():
+                if value is not None:
+                    os.environ[key] = value
+                else:
+                    os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            importlib.reload(config)
+            ModelProviderRegistry._instance = None
+
+    @pytest.mark.asyncio
+    async def test_chat_tool_execution_with_images(self):
+        """Test that ChatTool can execute with images parameter using real provider resolution."""
+        import importlib
+
+        # Create a temporary image file for testing
+        with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as temp_file:
+            # Write a simple PNG header (minimal valid PNG)
+            png_header = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x08\x06\x00\x00\x00\x1f\x15\xc4\x89\x00\x00\x00\rIDATx\x9cc\x00\x01\x00\x00\x05\x00\x01\r\n-\xdb\x00\x00\x00\x00IEND\xaeB`\x82"
+            temp_file.write(png_header)
+            temp_image_path = temp_file.name
+
+        # Save original environment
+        original_env = {
+            "OPENAI_API_KEY": os.environ.get("OPENAI_API_KEY"),
+            "DEFAULT_MODEL": os.environ.get("DEFAULT_MODEL"),
+        }
+
+        try:
+            # Set up environment for real provider resolution
+            os.environ["OPENAI_API_KEY"] = "sk-test-key-images-test-not-real"
+            os.environ["DEFAULT_MODEL"] = "gpt-4o"
+
+            # Clear other provider keys to isolate to OpenAI
+            for key in ["GEMINI_API_KEY", "XAI_API_KEY", "OPENROUTER_API_KEY"]:
+                os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            import config
+
+            importlib.reload(config)
+            from providers.registry import ModelProviderRegistry
+
+            ModelProviderRegistry._instance = None
+
+            tool = ChatTool()
+
+            # Test with real provider resolution
+            try:
+                result = await tool.execute(
+                    {"prompt": "What do you see in this image?", "images": [temp_image_path], "model": "gpt-4o"}
+                )
+
+                # If we get here, check the response format
+                assert len(result) == 1
+                # Should be a valid JSON response
+                output = json.loads(result[0].text)
+                assert "status" in output
+                # Test passed - provider accepted images parameter
+
+            except Exception as e:
+                # Expected: API call will fail with fake key
+                error_msg = str(e)
+                # Should NOT be a mock-related error
+                assert "MagicMock" not in error_msg
+                assert "'<' not supported between instances" not in error_msg
+
+                # Should be a real provider error (API key or network)
+                assert any(
+                    phrase in error_msg
+                    for phrase in ["API", "key", "authentication", "provider", "network", "connection", "401", "403"]
+                )
+                # Test passed - provider processed images parameter before failing on auth
+
+        finally:
+            # Clean up temp file
+            os.unlink(temp_image_path)
+
+            # Restore environment
+            for key, value in original_env.items():
+                if value is not None:
+                    os.environ[key] = value
+                else:
+                    os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            importlib.reload(config)
+            ModelProviderRegistry._instance = None
+
+    @patch("utils.conversation_memory.get_redis_client")
+    def test_cross_tool_image_context_preservation(self, mock_redis):
+        """Test that images are preserved across different tools in conversation."""
+        mock_client = Mock()
+        mock_redis.return_value = mock_client
+
+        # Mock the Redis operations to return success
+        mock_client.set.return_value = True
+
+        # Create initial thread with chat tool
+        thread_id = create_thread("chat", {"initial": "context"})
+
+        # Set up initial thread context for add_turn to find
+        initial_context = ThreadContext(
+            thread_id=thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:00:00Z",
+            tool_name="chat",
+            turns=[],  # Empty initially
+            initial_context={"initial": "context"},
+        )
+        mock_client.get.return_value = initial_context.model_dump_json()
+
+        # Add turn with images from chat tool
+        add_turn(
+            thread_id=thread_id,
+            role="user",
+            content="Here's my UI design",
+            images=["design.png", "mockup.jpg"],
+            tool_name="chat",
+        )
+
+        add_turn(
+            thread_id=thread_id, role="assistant", content="I can see your design. It looks good!", tool_name="chat"
+        )
+
+        # Add turn with different images from debug tool
+        add_turn(
+            thread_id=thread_id,
+            role="user",
+            content="Now I'm getting this error",
+            images=["error_screen.png"],
+            files=["error.log"],
+            tool_name="debug",
+        )
+
+        # Mock complete thread context for get_thread call
+        complete_context = ThreadContext(
+            thread_id=thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:05:00Z",
+            tool_name="chat",
+            turns=[
+                ConversationTurn(
+                    role="user",
+                    content="Here's my UI design",
+                    timestamp="2025-01-01T00:01:00Z",
+                    images=["design.png", "mockup.jpg"],
+                    tool_name="chat",
+                ),
+                ConversationTurn(
+                    role="assistant",
+                    content="I can see your design. It looks good!",
+                    timestamp="2025-01-01T00:02:00Z",
+                    tool_name="chat",
+                ),
+                ConversationTurn(
+                    role="user",
+                    content="Now I'm getting this error",
+                    timestamp="2025-01-01T00:03:00Z",
+                    images=["error_screen.png"],
+                    files=["error.log"],
+                    tool_name="debug",
+                ),
+            ],
+            initial_context={"initial": "context"},
+        )
+        mock_client.get.return_value = complete_context.model_dump_json()
+
+        # Retrieve thread and check image preservation
+        context = get_thread(thread_id)
+        assert context is not None
+
+        # Get conversation image list (should prioritize newest first)
+        image_list = get_conversation_image_list(context)
+        expected = ["error_screen.png", "design.png", "mockup.jpg"]
+        assert image_list == expected
+
+        # Verify each turn has correct images
+        assert context.turns[0].images == ["design.png", "mockup.jpg"]
+        assert context.turns[1].images is None  # Assistant turn without images
+        assert context.turns[2].images == ["error_screen.png"]
+
+    def test_tool_request_base_class_has_images(self):
+        """Test that base ToolRequest class includes images field."""
+        from tools.base import ToolRequest
+
+        # Create request with images
+        request = ToolRequest(images=["test.png", "test2.jpg"])
+        assert request.images == ["test.png", "test2.jpg"]
+
+        # Test default value
+        request_no_images = ToolRequest()
+        assert request_no_images.images is None
+
+    def test_data_url_image_format_support(self):
+        """Test that tools can handle data URL format images."""
+        import importlib
+
+        tool = ChatTool()
+
+        # Test with data URL (base64 encoded 1x1 transparent PNG)
+        data_url = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
+        images = [data_url]
+
+        # Save original environment
+        original_env = {
+            "OPENAI_API_KEY": os.environ.get("OPENAI_API_KEY"),
+            "DEFAULT_MODEL": os.environ.get("DEFAULT_MODEL"),
+        }
+
+        try:
+            # Set up environment for OpenAI provider
+            os.environ["OPENAI_API_KEY"] = "test-key-data-url-test-not-real"
+            os.environ["DEFAULT_MODEL"] = "o3"
+
+            # Clear other provider keys to isolate to OpenAI
+            for key in ["GEMINI_API_KEY", "XAI_API_KEY", "OPENROUTER_API_KEY"]:
+                os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            import config
+
+            importlib.reload(config)
+            from providers.registry import ModelProviderRegistry
+
+            ModelProviderRegistry._instance = None
+
+            # Use a model that should be available - o3 from OpenAI
+            result = tool._validate_image_limits(images, "o3")
+            assert result is None  # Small data URL should pass validation
+
+            # Also test with a non-vision model to ensure validation works
+            result = tool._validate_image_limits(images, "mistral-large")
+            # This should fail because model not available with current setup
+            assert result is not None
+            assert result["status"] == "error"
+            assert "does not support image processing" in result["content"]
+
+        finally:
+            # Restore environment
+            for key, value in original_env.items():
+                if value is not None:
+                    os.environ[key] = value
+                else:
+                    os.environ.pop(key, None)
+
+            # Reload config and clear registry
+            importlib.reload(config)
+            ModelProviderRegistry._instance = None
+
+    def test_empty_images_handling(self):
+        """Test that tools handle empty images lists gracefully."""
+        tool = ChatTool()
+
+        # Empty list should not fail validation (no need for provider setup)
+        result = tool._validate_image_limits([], "test_model")
+        assert result is None
+
+        # None should not fail validation (no need for provider setup)
+        result = tool._validate_image_limits(None, "test_model")
+        assert result is None
+
+    @patch("utils.conversation_memory.get_redis_client")
+    def test_conversation_memory_thread_chaining_with_images(self, mock_redis):
+        """Test that images work correctly with conversation thread chaining."""
+        mock_client = Mock()
+        mock_redis.return_value = mock_client
+
+        # Mock the Redis operations to return success
+        mock_client.set.return_value = True
+
+        # Create parent thread with images
+        parent_thread_id = create_thread("chat", {"parent": "context"})
+
+        # Set up initial parent thread context for add_turn to find
+        parent_context = ThreadContext(
+            thread_id=parent_thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:00:00Z",
+            tool_name="chat",
+            turns=[],  # Empty initially
+            initial_context={"parent": "context"},
+        )
+        mock_client.get.return_value = parent_context.model_dump_json()
+        add_turn(
+            thread_id=parent_thread_id,
+            role="user",
+            content="Parent thread with images",
+            images=["parent1.png", "shared.png"],
+            tool_name="chat",
+        )
+
+        # Create child thread linked to parent
+        child_thread_id = create_thread("debug", {"child": "context"}, parent_thread_id=parent_thread_id)
+        add_turn(
+            thread_id=child_thread_id,
+            role="user",
+            content="Child thread with more images",
+            images=["child1.png", "shared.png"],  # shared.png appears again (should prioritize newer)
+            tool_name="debug",
+        )
+
+        # Mock child thread context for get_thread call
+        child_context = ThreadContext(
+            thread_id=child_thread_id,
+            created_at="2025-01-01T00:00:00Z",
+            last_updated_at="2025-01-01T00:02:00Z",
+            tool_name="debug",
+            turns=[
+                ConversationTurn(
+                    role="user",
+                    content="Child thread with more images",
+                    timestamp="2025-01-01T00:02:00Z",
+                    images=["child1.png", "shared.png"],
+                    tool_name="debug",
+                )
+            ],
+            initial_context={"child": "context"},
+            parent_thread_id=parent_thread_id,
+        )
+        mock_client.get.return_value = child_context.model_dump_json()
+
+        # Get child thread and verify image collection works across chain
+        child_context = get_thread(child_thread_id)
+        assert child_context is not None
+        assert child_context.parent_thread_id == parent_thread_id
+
+        # Test image collection for child thread only
+        child_images = get_conversation_image_list(child_context)
+        assert child_images == ["child1.png", "shared.png"]
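The ordering assertions in these tests imply one specific rule: walk turns newest to oldest and keep the first sighting of each image. A sketch of just that logic, consistent with the expected lists above (the real implementation lives in utils/conversation_memory.py and may differ in detail):

```python
# Ordering logic implied by the assertions: the newest turn wins, duplicates
# keep their newest position, and order within a turn is preserved.
def image_list_newest_first(turns) -> list[str]:
    seen, result = set(), []
    for turn in reversed(turns):  # newest turn first
        for image in turn.images or []:
            if image not in seen:
                seen.add(image)
                result.append(image)
    return result
```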
@@ -1,290 +0,0 @@
-"""
-Integration tests for internal application configuration file access.
-
-These tests verify that:
-1. Specific internal config files are accessible (exact path matching)
-2. Path variations and traversal attempts are blocked (security)
-3. The OpenRouter model configuration loads properly
-4. Normal workspace file operations continue to work
-
-This follows the established testing patterns from test_docker_path_integration.py
-by using actual file operations and module reloading instead of mocks.
-"""
-
-import importlib
-import os
-import tempfile
-from pathlib import Path
-from unittest.mock import patch
-
-import pytest
-
-from utils.file_utils import translate_path_for_environment
-
-
-class TestInternalConfigFileAccess:
-    """Test access to internal application configuration files."""
-
-    def test_allowed_internal_config_file_access(self):
-        """Test that the specific internal config file is accessible."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            # Set up Docker-like environment
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                # Reload modules to pick up environment
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                # Test with Docker environment simulation
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # The exact allowed path should pass through unchanged
-                    result = translate_path_for_environment("/app/conf/custom_models.json")
-                    assert result == "/app/conf/custom_models.json"
-
-            finally:
-                # Restore environment
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-    def test_blocked_config_file_variations(self):
-        """Test that variations of the config file path are blocked."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # Test blocked variations - these should return inaccessible paths
-                    blocked_paths = [
-                        "/app/conf/",  # Directory
-                        "/app/conf/other_file.json",  # Different file
-                        "/app/conf/custom_models.json.backup",  # Extra extension
-                        "/app/conf/custom_models.txt",  # Different extension
-                        "/app/conf/../server.py",  # Path traversal
-                        "/app/server.py",  # Application code
-                        "/etc/passwd",  # System file
-                    ]
-
-                    for path in blocked_paths:
-                        result = translate_path_for_environment(path)
-                        assert result.startswith(
-                            "/inaccessible/"
-                        ), f"Path {path} should be blocked but got: {result}"
-
-            finally:
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-    def test_workspace_files_continue_to_work(self):
-        """Test that normal workspace file operations are unaffected."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            # Create a test file in the workspace
-            test_file = host_workspace / "src" / "test.py"
-            test_file.parent.mkdir(parents=True)
-            test_file.write_text("# test file")
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # Normal workspace file should translate correctly
-                    result = translate_path_for_environment(str(test_file))
-                    expected = str(container_workspace / "src" / "test.py")
-                    assert result == expected
-
-            finally:
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-    def test_openrouter_config_loading_real_world(self):
-        """Test that OpenRouter configuration loading works in real container environment."""
-
-        # This test validates that our fix works in the actual Docker environment
-        # by checking that the translate_path_for_environment function handles
-        # the exact internal config path correctly
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # Test that the function correctly handles the config path
-                    result = translate_path_for_environment("/app/conf/custom_models.json")
-
-                    # The path should pass through unchanged (not be blocked)
-                    assert result == "/app/conf/custom_models.json"
-
-                    # Verify it's not marked as inaccessible
-                    assert not result.startswith("/inaccessible/")
-
-            finally:
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-    def test_security_boundary_comprehensive(self):
-        """Comprehensive test of all security boundaries in Docker environment."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            # Create a workspace file for testing
-            workspace_file = host_workspace / "project" / "main.py"
-            workspace_file.parent.mkdir(parents=True)
-            workspace_file.write_text("# workspace file")
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # Test cases: (path, should_be_allowed, description)
-                    test_cases = [
-                        # Allowed cases
-                        ("/app/conf/custom_models.json", True, "Exact allowed internal config"),
-                        (str(workspace_file), True, "Workspace file"),
-                        (str(container_workspace / "existing.py"), True, "Container path"),
-                        # Blocked cases
-                        ("/app/conf/", False, "Directory access"),
-                        ("/app/conf/other.json", False, "Different config file"),
-                        ("/app/conf/custom_models.json.backup", False, "Config with extra extension"),
-                        ("/app/server.py", False, "Application source"),
-                        ("/etc/passwd", False, "System file"),
-                        ("../../../etc/passwd", False, "Relative path traversal"),
-                        ("/app/conf/../server.py", False, "Path traversal through config dir"),
-                    ]
-
-                    for path, should_be_allowed, description in test_cases:
-                        result = translate_path_for_environment(path)
-
-                        if should_be_allowed:
-                            # Should either pass through unchanged or translate to container path
-                            assert not result.startswith(
-                                "/inaccessible/"
-                            ), f"{description}: {path} should be allowed but was blocked"
-                        else:
-                            # Should be blocked with inaccessible path
-                            assert result.startswith(
-                                "/inaccessible/"
-                            ), f"{description}: {path} should be blocked but got: {result}"
-
-            finally:
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-    def test_exact_path_matching_prevents_wildcards(self):
-        """Test that using exact path matching prevents any wildcard-like behavior."""
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            host_workspace = Path(tmpdir) / "host_workspace"
-            host_workspace.mkdir()
-            container_workspace = Path(tmpdir) / "container_workspace"
-            container_workspace.mkdir()
-
-            original_env = os.environ.copy()
-            try:
-                os.environ["WORKSPACE_ROOT"] = str(host_workspace)
-
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-                with patch("utils.file_utils.CONTAINER_WORKSPACE", container_workspace):
-                    # Even subtle variations should be blocked
-                    subtle_variations = [
-                        "/app/conf/custom_models.jsonx",  # Extra char
-                        "/app/conf/custom_models.jso",  # Missing char
-                        "/app/conf/custom_models.JSON",  # Different case
-                        "/app/conf/custom_models.json ",  # Trailing space
-                        " /app/conf/custom_models.json",  # Leading space
-                        "/app/conf/./custom_models.json",  # Current dir reference
-                        "/app/conf/subdir/../custom_models.json",  # Up and down
-                    ]
-
-                    for variation in subtle_variations:
-                        result = translate_path_for_environment(variation)
-                        assert result.startswith(
-                            "/inaccessible/"
-                        ), f"Variation {variation} should be blocked but got: {result}"
-
-            finally:
-                os.environ.clear()
-                os.environ.update(original_env)
-                import utils.security_config
-
-                importlib.reload(utils.security_config)
-                importlib.reload(utils.file_utils)
-
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
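These deleted tests documented the Docker-mode rule for internal paths: an exact string match against a small allow-list, with everything else pushed under an /inaccessible/ prefix. A condensed sketch of only that branch, with the set contents and prefix handling inferred from the assertions above (the real logic, including workspace translation, is in utils/file_utils.py):

```python
# Exact-match allow-list implied by the deleted assertions: no case folding,
# no whitespace trimming, no "."/".." normalization before the comparison.
ALLOWED_INTERNAL_PATHS = {"/app/conf/custom_models.json"}

def translate_internal_docker(path: str) -> str:
    if path in ALLOWED_INTERNAL_PATHS:
        return path  # passes through unchanged
    return "/inaccessible/" + path.lstrip("/")  # blocked, per the tests
```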
BIN
tests/triangle.png
Normal file
Binary file not shown. After: 51 KiB
@@ -87,7 +87,13 @@ class AnalyzeTool(BaseTool):
             },
             "use_websearch": {
                 "type": "boolean",
-                "description": "Enable web search for documentation, best practices, and current information. Particularly useful for: brainstorming sessions, architectural design discussions, exploring industry best practices, working with specific frameworks/technologies, researching solutions to complex problems, or when current documentation and community insights would enhance the analysis.",
+                "description": (
+                    "Enable web search for documentation, best practices, and current information. "
+                    "Particularly useful for: brainstorming sessions, architectural design discussions, "
+                    "exploring industry best practices, working with specific frameworks/technologies, "
+                    "researching solutions to complex problems, or when current documentation and "
+                    "community insights would enhance the analysis."
+                ),
                 "default": True,
             },
             "continuation_id": {
162
tools/base.py
@@ -27,6 +27,7 @@ if TYPE_CHECKING:

 from config import MCP_PROMPT_SIZE_LIMIT
 from providers import ModelProvider, ModelProviderRegistry
+from providers.base import ProviderType
 from utils import check_token_limit
 from utils.conversation_memory import (
     MAX_CONVERSATION_TURNS,
@@ -84,6 +85,17 @@ class ToolRequest(BaseModel):
             "additional findings, or answers to follow-up questions. Can be used across different tools."
         ),
     )
+    images: Optional[list[str]] = Field(
+        None,
+        description=(
+            "Optional image(s) for visual context. Accepts absolute file paths or "
+            "base64 data URLs. Only provide when user explicitly mentions images. "
+            "When including images, please describe what you believe each image contains "
+            "(e.g., 'screenshot of error dialog', 'architecture diagram', 'code snippet') "
+            "to aid with contextual understanding. Useful for UI discussions, diagrams, "
+            "visual problems, error screens, architecture mockups, and visual analysis tasks."
+        ),
+    )


 class BaseTool(ABC):
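For illustration, tool arguments exercising the new field might look like this (tool-agnostic; the paths and the truncated data URL are hypothetical):

```python
# Either form is accepted; the description above asks Claude to say what each image shows.
arguments = {
    "prompt": "The retry button overlaps the error message on narrow screens - screenshot attached.",
    "images": [
        "/Users/me/Desktop/error_dialog.png",               # absolute file path
        "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg...",  # base64 data URL (truncated)
    ],
}
```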
@@ -981,6 +993,141 @@ When recommending searches, be specific about what information you need and why
             }
         return None

+    def _validate_image_limits(
+        self, images: Optional[list[str]], model_name: str, continuation_id: Optional[str] = None
+    ) -> Optional[dict]:
+        """
+        Validate image size against model capabilities at the MCP boundary.
+
+        This performs strict validation to ensure we don't exceed model-specific
+        image size limits. Uses capability-based validation with actual model
+        configuration rather than hard-coded limits.
+
+        Args:
+            images: List of image paths/data URLs to validate
+            model_name: Name of the model to check limits against
+
+        Returns:
+            Optional[dict]: Error response if validation fails, None if valid
+        """
+        if not images:
+            return None
+
+        # Get model capabilities to check image support and size limits
+        try:
+            provider = self.get_model_provider(model_name)
+            capabilities = provider.get_capabilities(model_name)
+        except Exception as e:
+            logger.warning(f"Failed to get capabilities for model {model_name}: {e}")
+            # Fall back to checking custom models configuration
+            capabilities = None
+
+        # Check if the model supports images at all
+        supports_images = False
+        max_size_mb = 0.0
+
+        if capabilities:
+            supports_images = capabilities.supports_images
+            max_size_mb = capabilities.max_image_size_mb
+        else:
+            # Fall back to custom models configuration
+            try:
+                import json
+                from pathlib import Path
+
+                custom_models_path = Path(__file__).parent.parent / "conf" / "custom_models.json"
+                if custom_models_path.exists():
+                    with open(custom_models_path) as f:
+                        custom_config = json.load(f)
+
+                    # Check if the model is in the custom models list
+                    for model_config in custom_config.get("models", []):
+                        if model_config.get("model_name") == model_name or model_name in model_config.get(
+                            "aliases", []
+                        ):
+                            supports_images = model_config.get("supports_images", False)
+                            max_size_mb = model_config.get("max_image_size_mb", 0.0)
+                            break
+            except Exception as e:
+                logger.warning(f"Failed to load custom models config: {e}")
+
+        # If the model doesn't support images, reject
+        if not supports_images:
+            return {
+                "status": "error",
+                "content": (
+                    f"Image support not available: Model '{model_name}' does not support image processing. "
+                    f"Please use a vision-capable model such as 'gemini-2.5-flash-preview-05-20', 'o3', "
+                    f"or 'claude-3-opus' for image analysis tasks."
+                ),
+                "content_type": "text",
+                "metadata": {
+                    "error_type": "validation_error",
+                    "model_name": model_name,
+                    "supports_images": False,
+                    "image_count": len(images),
+                },
+            }
+
+        # Calculate the total size of all images
+        total_size_mb = 0.0
+        for image_path in images:
+            try:
+                if image_path.startswith("data:image/"):
+                    # Handle data URL: data:image/png;base64,iVBORw0...
+                    _, data = image_path.split(",", 1)
+                    # Base64 encoding inflates size by ~33%, so decode to get the actual size
+                    import base64
+
+                    actual_size = len(base64.b64decode(data))
+                    total_size_mb += actual_size / (1024 * 1024)
+                else:
+                    # Handle file path
+                    if os.path.exists(image_path):
+                        file_size = os.path.getsize(image_path)
+                        total_size_mb += file_size / (1024 * 1024)
+                    else:
+                        logger.warning(f"Image file not found: {image_path}")
+                        # Assume a reasonable size for missing files to avoid breaking validation
+                        total_size_mb += 1.0  # 1MB assumption
+            except Exception as e:
+                logger.warning(f"Failed to get size for image {image_path}: {e}")
+                # Assume a reasonable size for problematic files
+                total_size_mb += 1.0  # 1MB assumption
+
+        # Apply 40MB cap for custom models as requested
+        effective_limit_mb = max_size_mb
+        if hasattr(capabilities, "provider") and capabilities.provider == ProviderType.CUSTOM:
+            effective_limit_mb = min(max_size_mb, 40.0)
+        elif not capabilities:  # Fallback case for custom models
+            effective_limit_mb = min(max_size_mb, 40.0)
+
+        # Validate against the size limit
+        if total_size_mb > effective_limit_mb:
+            return {
+                "status": "error",
+                "content": (
+                    f"Image size limit exceeded: Model '{model_name}' supports maximum {effective_limit_mb:.1f}MB "
+                    f"for all images combined, but {total_size_mb:.1f}MB was provided. "
+                    f"Please reduce image sizes or count and try again."
+                ),
+                "content_type": "text",
+                "metadata": {
+                    "error_type": "validation_error",
+                    "model_name": model_name,
+                    "total_size_mb": round(total_size_mb, 2),
+                    "limit_mb": round(effective_limit_mb, 2),
+                    "image_count": len(images),
+                    "supports_images": supports_images,
+                },
+            }
+
+        # All validations passed
+        logger.debug(f"Image validation passed: {len(images)} images")
+        return None
+
     def estimate_tokens_smart(self, file_path: str) -> int:
         """
         Estimate tokens for a file using file-type aware ratios.
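A note on the size math in `_validate_image_limits`: base64 text is roughly 4/3 the size of the bytes it encodes, which is why the validator decodes data URLs before counting megabytes. A quick standalone check of that arithmetic:

```python
import base64

raw = b"\x89PNG" + b"\x00" * (1024 * 1024)  # ~1 MiB of stand-in image bytes
encoded = base64.b64encode(raw)

print(len(encoded) / len(raw))                         # ~1.33: base64 inflation factor
print(len(base64.b64decode(encoded)) / (1024 * 1024))  # ~1.0: the MiB that count toward the limit
```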
@@ -1131,6 +1278,9 @@ When recommending searches, be specific about what information you need and why
             )
             return [TextContent(type="text", text=error_output.model_dump_json())]

+        # Extract and validate images from request
+        images = getattr(request, "images", None) or []
+
         # Check if we have continuation_id - if so, conversation history is already embedded
         continuation_id = getattr(request, "continuation_id", None)
@@ -1215,6 +1365,12 @@ When recommending searches, be specific about what information you need and why
         # Only set this after auto mode validation to prevent "auto" being used as a model name
         self._current_model_name = model_name

+        # Validate images at MCP boundary if any were provided
+        if images:
+            image_validation_error = self._validate_image_limits(images, model_name, continuation_id)
+            if image_validation_error:
+                return [TextContent(type="text", text=json.dumps(image_validation_error))]
+
         temperature = getattr(request, "temperature", None)
         if temperature is None:
             temperature = self.get_default_temperature()
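When that check fails, the caller gets the serialized error instead of a model response. The shape mirrors the validator above; the model name and numbers here are illustrative only:

```python
# Illustrative payload; values are made up, keys match _validate_image_limits.
image_validation_error = {
    "status": "error",
    "content": (
        "Image size limit exceeded: Model 'some-model' supports maximum 20.0MB "
        "for all images combined, but 27.5MB was provided. "
        "Please reduce image sizes or count and try again."
    ),
    "content_type": "text",
    "metadata": {
        "error_type": "validation_error",
        "model_name": "some-model",
        "total_size_mb": 27.5,
        "limit_mb": 20.0,
        "image_count": 3,
        "supports_images": True,
    },
}
```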
@@ -1247,6 +1403,7 @@ When recommending searches, be specific about what information you need and why
                 system_prompt=system_prompt,
                 temperature=temperature,
                 thinking_mode=thinking_mode if provider.supports_thinking_mode(model_name) else None,
+                images=images if images else None,  # Pass images via kwargs
             )

             logger.info(f"Received response from {provider.get_provider_type().value} API for {self.name}")
@@ -1298,6 +1455,7 @@ When recommending searches, be specific about what information you need and why
                 system_prompt=system_prompt,
                 temperature=temperature,
                 thinking_mode=thinking_mode if provider.supports_thinking_mode(model_name) else None,
+                images=images if images else None,  # Pass images via kwargs in retry too
             )

             if retry_response.content:
@@ -1398,6 +1556,7 @@ When recommending searches, be specific about what information you need and why
         continuation_id = getattr(request, "continuation_id", None)
         if continuation_id:
             request_files = getattr(request, "files", []) or []
+            request_images = getattr(request, "images", []) or []
             # Extract model metadata for conversation tracking
             model_provider = None
             model_name = None
@@ -1417,6 +1576,7 @@ When recommending searches, be specific about what information you need and why
                 "assistant",
                 formatted_content,
                 files=request_files,
+                images=request_images,
                 tool_name=self.name,
                 model_provider=model_provider,
                 model_name=model_name,
@@ -1519,6 +1679,7 @@ When recommending searches, be specific about what information you need and why
         # Use actually processed files from file preparation instead of original request files
         # This ensures directories are tracked as their individual expanded files
         request_files = getattr(self, "_actually_processed_files", []) or getattr(request, "files", []) or []
+        request_images = getattr(request, "images", []) or []
         # Extract model metadata
         model_provider = None
         model_name = None
@@ -1538,6 +1699,7 @@ When recommending searches, be specific about what information you need and why
             "assistant",
             content,
             files=request_files,
+            images=request_images,
             tool_name=self.name,
             model_provider=model_provider,
             model_name=model_name,
@@ -20,12 +20,25 @@ class ChatRequest(ToolRequest):

     prompt: str = Field(
         ...,
-        description="Your question, topic, or current thinking to discuss",
+        description=(
+            "Your thorough, expressive question with as much context as possible. Remember: you're talking to "
+            "another Claude assistant who has deep expertise and can provide nuanced insights. Include your "
+            "current thinking, specific challenges, background context, what you've already tried, and what "
+            "kind of response would be most helpful. The more context and detail you provide, the more "
+            "valuable and targeted the response will be."
+        ),
     )
     files: Optional[list[str]] = Field(
         default_factory=list,
         description="Optional files for context (must be absolute paths)",
     )
+    images: Optional[list[str]] = Field(
+        default_factory=list,
+        description=(
+            "Optional images for visual context. Useful for UI discussions, diagrams, visual problems, "
+            "error screens, or architectural mockups."
+        ),
+    )


 class ChatTool(BaseTool):
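A quick sanity check that the new field round-trips through the Pydantic request model (the import path and file paths are assumed for illustration):

```python
from tools.chat import ChatRequest  # assumed module path

request = ChatRequest(
    prompt="The login dialog renders off-screen on small displays - screenshot attached for reference.",
    files=["/Users/me/project/ui/dialog.py"],
    images=["/Users/me/Desktop/login_dialog.png"],
)
assert request.images == ["/Users/me/Desktop/login_dialog.png"]
```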
@@ -42,7 +55,8 @@ class ChatTool(BaseTool):
             "Also great for: explanations, comparisons, general development questions. "
             "Use this when you want to ask questions, brainstorm ideas, get opinions, discuss topics, "
             "share your thinking, or need explanations about concepts and approaches. "
-            "Note: If you're not currently using a top-tier model such as Opus 4 or above, these tools can provide enhanced capabilities."
+            "Note: If you're not currently using a top-tier model such as Opus 4 or above, these tools can "
+            "provide enhanced capabilities."
         )

     def get_input_schema(self) -> dict[str, Any]:
@@ -51,13 +65,27 @@ class ChatTool(BaseTool):
             "properties": {
                 "prompt": {
                     "type": "string",
-                    "description": "Your question, topic, or current thinking to discuss",
+                    "description": (
+                        "Your thorough, expressive question with as much context as possible. Remember: you're "
+                        "talking to another Claude assistant who has deep expertise and can provide nuanced "
+                        "insights. Include your current thinking, specific challenges, background context, what "
+                        "you've already tried, and what kind of response would be most helpful. The more context "
+                        "and detail you provide, the more valuable and targeted the response will be."
+                    ),
                 },
                 "files": {
                     "type": "array",
                     "items": {"type": "string"},
                     "description": "Optional files for context (must be absolute paths)",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": (
+                        "Optional images for visual context. Useful for UI discussions, diagrams, visual "
+                        "problems, error screens, or architectural mockups."
+                    ),
+                },
                 "model": self.get_model_field_schema(),
                 "temperature": {
                     "type": "number",
@@ -68,16 +96,29 @@ class ChatTool(BaseTool):
                 "thinking_mode": {
                     "type": "string",
                     "enum": ["minimal", "low", "medium", "high", "max"],
-                    "description": "Thinking depth: minimal (0.5% of model max), low (8%), medium (33%), high (67%), max (100% of model max)",
+                    "description": (
+                        "Thinking depth: minimal (0.5% of model max), low (8%), medium (33%), high (67%), "
+                        "max (100% of model max)"
+                    ),
                 },
                 "use_websearch": {
                     "type": "boolean",
-                    "description": "Enable web search for documentation, best practices, and current information. Particularly useful for: brainstorming sessions, architectural design discussions, exploring industry best practices, working with specific frameworks/technologies, researching solutions to complex problems, or when current documentation and community insights would enhance the analysis.",
+                    "description": (
+                        "Enable web search for documentation, best practices, and current information. "
+                        "Particularly useful for: brainstorming sessions, architectural design discussions, "
+                        "exploring industry best practices, working with specific frameworks/technologies, "
+                        "researching solutions to complex problems, or when current documentation and "
+                        "community insights would enhance the analysis."
+                    ),
                     "default": True,
                 },
                 "continuation_id": {
                     "type": "string",
-                    "description": "Thread continuation ID for multi-turn conversations. Can be used to continue conversations across different tools. Only provide this if continuing a previous conversation thread.",
+                    "description": (
+                        "Thread continuation ID for multi-turn conversations. Can be used to continue "
+                        "conversations across different tools. Only provide this if continuing a previous "
+                        "conversation thread."
+                    ),
                 },
             },
             "required": ["prompt"] + (["model"] if self.is_effective_auto_mode() else []),
@@ -157,4 +198,7 @@ Please provide a thoughtful, comprehensive response:"""

     def format_response(self, response: str, request: ChatRequest, model_info: Optional[dict] = None) -> str:
         """Format the chat response"""
-        return f"{response}\n\n---\n\n**Claude's Turn:** Evaluate this perspective alongside your analysis to form a comprehensive solution and continue with the user's request and task at hand."
+        return (
+            f"{response}\n\n---\n\n**Claude's Turn:** Evaluate this perspective alongside your analysis to "
+            "form a comprehensive solution and continue with the user's request and task at hand."
+        )
@@ -41,6 +41,10 @@ class CodeReviewRequest(ToolRequest):
         ...,
         description="User's summary of what the code does, expected behavior, constraints, and review objectives",
     )
+    images: Optional[list[str]] = Field(
+        None,
+        description="Optional images of architecture diagrams, UI mockups, design documents, or visual references for code review context",
+    )
     review_type: str = Field("full", description="Type of review: full|security|performance|quick")
     focus_on: Optional[str] = Field(
         None,
@@ -94,6 +98,11 @@ class CodeReviewTool(BaseTool):
                     "type": "string",
                     "description": "User's summary of what the code does, expected behavior, constraints, and review objectives",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "Optional images of architecture diagrams, UI mockups, design documents, or visual references for code review context",
+                },
                 "review_type": {
                     "type": "string",
                     "enum": ["full", "security", "performance", "quick"],
@@ -24,6 +24,10 @@ class DebugIssueRequest(ToolRequest):
         None,
         description="Files or directories that might be related to the issue (must be absolute paths)",
     )
+    images: Optional[list[str]] = Field(
+        None,
+        description="Optional images showing error screens, UI issues, log displays, or visual debugging information",
+    )
     runtime_info: Optional[str] = Field(None, description="Environment, versions, or runtime information")
     previous_attempts: Optional[str] = Field(None, description="What has been tried already")

@@ -69,6 +73,11 @@ class DebugIssueTool(BaseTool):
                     "items": {"type": "string"},
                     "description": "Files or directories that might be related to the issue (must be absolute paths)",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "Optional images showing error screens, UI issues, log displays, or visual debugging information",
+                },
                 "runtime_info": {
                     "type": "string",
                     "description": "Environment, versions, or runtime information",
@@ -78,6 +78,10 @@ class PrecommitRequest(ToolRequest):
         None,
         description="Optional files or directories to provide as context (must be absolute paths). These files are not part of the changes but provide helpful context like configs, docs, or related code.",
     )
+    images: Optional[list[str]] = Field(
+        None,
+        description="Optional images showing expected UI changes, design requirements, or visual references for the changes being validated",
+    )


 class Precommit(BaseTool):
@@ -170,6 +174,11 @@ class Precommit(BaseTool):
                     "items": {"type": "string"},
                     "description": "Optional files or directories to provide as context (must be absolute paths). These files are not part of the changes but provide helpful context like configs, docs, or related code.",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "Optional images showing expected UI changes, design requirements, or visual references for the changes being validated",
+                },
                 "use_websearch": {
                     "type": "boolean",
                     "description": "Enable web search for documentation, best practices, and current information. Particularly useful for: brainstorming sessions, architectural design discussions, exploring industry best practices, working with specific frameworks/technologies, researching solutions to complex problems, or when current documentation and community insights would enhance the analysis.",
@@ -33,6 +33,10 @@ class ThinkDeepRequest(ToolRequest):
         None,
         description="Optional file paths or directories for additional context (must be absolute paths)",
     )
+    images: Optional[list[str]] = Field(
+        None,
+        description="Optional images for visual analysis - diagrams, charts, system architectures, or any visual information to analyze",
+    )


 class ThinkDeepTool(BaseTool):
@@ -60,7 +64,13 @@ class ThinkDeepTool(BaseTool):
             "properties": {
                 "prompt": {
                     "type": "string",
-                    "description": "Your current thinking/analysis to extend and validate. IMPORTANT: Before using this tool, Claude MUST first think deeply and establish a deep understanding of the topic and question by thinking through all relevant details, context, constraints, and implications. Share these extended thoughts and ideas in the prompt so the model has comprehensive information to work with for the best analysis.",
+                    "description": (
+                        "Your current thinking/analysis to extend and validate. IMPORTANT: Before using this tool, "
+                        "Claude MUST first think deeply and establish a deep understanding of the topic and question "
+                        "by thinking through all relevant details, context, constraints, and implications. Share "
+                        "these extended thoughts and ideas in the prompt so the model has comprehensive information "
+                        "to work with for the best analysis."
+                    ),
                 },
                 "model": self.get_model_field_schema(),
                 "problem_context": {
@@ -77,6 +87,11 @@ class ThinkDeepTool(BaseTool):
                     "items": {"type": "string"},
                     "description": "Optional file paths or directories for additional context (must be absolute paths)",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "Optional images for visual analysis - diagrams, charts, system architectures, or any visual information to analyze",
+                },
                 "temperature": {
                     "type": "number",
                     "description": "Temperature for creative thinking (0-1, default 0.7)",
@@ -22,11 +22,29 @@ class TracerRequest(ToolRequest):

     prompt: str = Field(
         ...,
-        description="Detailed description of what to trace and WHY you need this analysis. Include context about what you're trying to understand, debug, or analyze. For precision mode: describe the specific method/function and what aspect of its execution flow you need to understand. For dependencies mode: describe the class/module and what relationships you need to map. Example: 'I need to understand how BookingManager.finalizeInvoice method is called throughout the system and what side effects it has, as I'm debugging payment processing issues' rather than just 'BookingManager finalizeInvoice method'",
+        description=(
+            "Detailed description of what to trace and WHY you need this analysis. Include context about what "
+            "you're trying to understand, debug, or analyze. For precision mode: describe the specific "
+            "method/function and what aspect of its execution flow you need to understand. For dependencies "
+            "mode: describe the class/module and what relationships you need to map. Example: 'I need to "
+            "understand how BookingManager.finalizeInvoice method is called throughout the system and what "
+            "side effects it has, as I'm debugging payment processing issues' rather than just "
+            "'BookingManager finalizeInvoice method'"
+        ),
     )
     trace_mode: Literal["precision", "dependencies"] = Field(
         ...,
-        description="Trace mode: 'precision' (for methods/functions - shows execution flow and usage patterns) or 'dependencies' (for classes/modules/protocols - shows structural relationships)",
+        description=(
+            "Trace mode: 'precision' (for methods/functions - shows execution flow and usage patterns) or "
+            "'dependencies' (for classes/modules/protocols - shows structural relationships)"
+        ),
+    )
+    images: list[str] = Field(
+        default_factory=list,
+        description=(
+            "Optional images of system architecture diagrams, flow charts, or visual references to help "
+            "understand the tracing context"
+        ),
     )
@@ -44,11 +62,15 @@ class TracerTool(BaseTool):
     def get_description(self) -> str:
         return (
             "ANALYSIS PROMPT GENERATOR - Creates structured prompts for static code analysis. "
-            "Helps generate detailed analysis requests with specific method/function names, file paths, and component context. "
-            "Type 'precision': For methods/functions - traces execution flow, call chains, call stacks, and shows when/how they are used. "
-            "Type 'dependencies': For classes/modules/protocols - maps structural relationships and bidirectional dependencies. "
+            "Helps generate detailed analysis requests with specific method/function names, file paths, and "
+            "component context. "
+            "Type 'precision': For methods/functions - traces execution flow, call chains, call stacks, and "
+            "shows when/how they are used. "
+            "Type 'dependencies': For classes/modules/protocols - maps structural relationships and "
+            "bidirectional dependencies. "
             "Returns detailed instructions on how to perform the analysis and format the results. "
-            "Use this to create focused analysis requests that can be fed back to Claude with the appropriate code files. "
+            "Use this to create focused analysis requests that can be fed back to Claude with the appropriate "
+            "code files. "
         )

     def get_input_schema(self) -> dict[str, Any]:
@@ -57,13 +79,26 @@ class TracerTool(BaseTool):
             "properties": {
                 "prompt": {
                     "type": "string",
-                    "description": "Detailed description of what to trace and WHY you need this analysis. Include context about what you're trying to understand, debug, or analyze. For precision mode: describe the specific method/function and what aspect of its execution flow you need to understand. For dependencies mode: describe the class/module and what relationships you need to map. Example: 'I need to understand how BookingManager.finalizeInvoice method is called throughout the system and what side effects it has, as I'm debugging payment processing issues' rather than just 'BookingManager finalizeInvoice method'",
+                    "description": (
+                        "Detailed description of what to trace and WHY you need this analysis. Include context "
+                        "about what you're trying to understand, debug, or analyze. For precision mode: describe "
+                        "the specific method/function and what aspect of its execution flow you need to understand. "
+                        "For dependencies mode: describe the class/module and what relationships you need to map. "
+                        "Example: 'I need to understand how BookingManager.finalizeInvoice method is called "
+                        "throughout the system and what side effects it has, as I'm debugging payment processing "
+                        "issues' rather than just 'BookingManager finalizeInvoice method'"
+                    ),
                 },
                 "trace_mode": {
                     "type": "string",
                     "enum": ["precision", "dependencies"],
                     "description": "Trace mode: 'precision' (for methods/functions - shows execution flow and usage patterns) or 'dependencies' (for classes/modules/protocols - shows structural relationships)",
                 },
+                "images": {
+                    "type": "array",
+                    "items": {"type": "string"},
+                    "description": "Optional images of system architecture diagrams, flow charts, or visual references to help understand the tracing context",
+                },
             },
             "required": ["prompt", "trace_mode"],
         }
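For illustration, a tracer request using the new field; the import path and image path are assumptions, and the prompt reuses the description's own example:

```python
from tools.tracer import TracerRequest  # assumed module path

request = TracerRequest(
    prompt=(
        "I need to understand how BookingManager.finalizeInvoice is called throughout the system "
        "and what side effects it has, as I'm debugging payment processing issues"
    ),
    trace_mode="precision",
    images=["/Users/me/Desktop/booking_flow.png"],  # hypothetical architecture diagram
)
```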
@@ -142,6 +142,7 @@ class ConversationTurn(BaseModel):
         content: The actual message content/response
         timestamp: ISO timestamp when this turn was created
         files: List of file paths referenced in this specific turn
+        images: List of image paths referenced in this specific turn
         tool_name: Which tool generated this turn (for cross-tool tracking)
         model_provider: Provider used (e.g., "google", "openai")
         model_name: Specific model used (e.g., "gemini-2.5-flash-preview-05-20", "o3-mini")
@@ -152,6 +153,7 @@ class ConversationTurn(BaseModel):
     content: str
     timestamp: str
     files: Optional[list[str]] = None  # Files referenced in this turn
+    images: Optional[list[str]] = None  # Images referenced in this turn
     tool_name: Optional[str] = None  # Tool used for this turn
     model_provider: Optional[str] = None  # Model provider (google, openai, etc)
     model_name: Optional[str] = None  # Specific model used
@@ -300,6 +302,7 @@ def add_turn(
     role: str,
     content: str,
     files: Optional[list[str]] = None,
+    images: Optional[list[str]] = None,
     tool_name: Optional[str] = None,
     model_provider: Optional[str] = None,
     model_name: Optional[str] = None,
@@ -318,6 +321,7 @@ def add_turn(
         role: "user" (Claude) or "assistant" (Gemini/O3/etc)
         content: The actual message/response content
         files: Optional list of files referenced in this turn
+        images: Optional list of images referenced in this turn
         tool_name: Name of the tool adding this turn (for attribution)
         model_provider: Provider used (e.g., "google", "openai")
         model_name: Specific model used (e.g., "gemini-2.5-flash-preview-05-20", "o3-mini")
@@ -335,6 +339,7 @@ def add_turn(
         - Refreshes thread TTL to configured timeout on successful update
         - Turn limits prevent runaway conversations
         - File references are preserved for cross-tool access with atomic ordering
+        - Image references are preserved for cross-tool visual context
         - Model information enables cross-provider conversations
     """
     logger.debug(f"[FLOW] Adding {role} turn to {thread_id} ({tool_name})")
@@ -355,6 +360,7 @@ def add_turn(
         content=content,
         timestamp=datetime.now(timezone.utc).isoformat(),
         files=files,  # Preserved for cross-tool file context
+        images=images,  # Preserved for cross-tool visual context
         tool_name=tool_name,  # Track which tool generated this turn
         model_provider=model_provider,  # Track model provider
         model_name=model_name,  # Track specific model
@@ -489,6 +495,78 @@ def get_conversation_file_list(context: ThreadContext) -> list[str]:
     return file_list


+def get_conversation_image_list(context: ThreadContext) -> list[str]:
+    """
+    Extract all unique images from conversation turns with newest-first prioritization.
+
+    This function implements the identical prioritization logic as get_conversation_file_list()
+    to ensure consistency in how images are handled across conversation turns. It walks
+    backwards through conversation turns (from newest to oldest) and collects unique image
+    references, ensuring that when the same image appears in multiple turns, the reference
+    from the NEWEST turn takes precedence.
+
+    PRIORITIZATION ALGORITHM:
+    1. Iterate through turns in REVERSE order (index len-1 down to 0)
+    2. For each turn, process images in the order they appear in turn.images
+    3. Add image to result list only if not already seen (newest reference wins)
+    4. Skip duplicate images that were already added from newer turns
+
+    This ensures that:
+    - Images from newer conversation turns appear first in the result
+    - When the same image is referenced multiple times, only the newest reference is kept
+    - The order reflects the most recent conversation context
+
+    Example:
+        Turn 1: images = ["diagram.png", "flow.jpg"]
+        Turn 2: images = ["error.png"]
+        Turn 3: images = ["diagram.png", "updated.png"]  # diagram.png appears again
+
+        Result: ["diagram.png", "updated.png", "error.png", "flow.jpg"]
+        (diagram.png from Turn 3 takes precedence over Turn 1)
+
+    Args:
+        context: ThreadContext containing all conversation turns to process
+
+    Returns:
+        list[str]: Unique image paths ordered by newest reference first.
+                   Empty list if no turns exist or no images are referenced.
+
+    Performance:
+        - Time Complexity: O(n*m) where n=turns, m=avg images per turn
+        - Space Complexity: O(i) where i=total unique images
+        - Uses set for O(1) duplicate detection
+    """
+    if not context.turns:
+        logger.debug("[IMAGES] No turns found, returning empty image list")
+        return []
+
+    # Collect images by walking backwards (newest to oldest turns)
+    seen_images = set()
+    image_list = []
+
+    logger.debug(f"[IMAGES] Collecting images from {len(context.turns)} turns (newest first)")
+
+    # Process turns in reverse order (newest first) - this is the CORE of newest-first prioritization
+    # By iterating from len-1 down to 0, we encounter newer turns before older turns
+    # When we find a duplicate image, we skip it because the newer version is already in our list
+    for i in range(len(context.turns) - 1, -1, -1):  # REVERSE: newest turn first
+        turn = context.turns[i]
+        if turn.images:
+            logger.debug(f"[IMAGES] Turn {i + 1} has {len(turn.images)} images: {turn.images}")
+            for image_path in turn.images:
+                if image_path not in seen_images:
+                    # First time seeing this image - add it (this is the NEWEST reference)
+                    seen_images.add(image_path)
+                    image_list.append(image_path)
+                    logger.debug(f"[IMAGES] Added new image: {image_path} (from turn {i + 1})")
+                else:
+                    # Image already seen from a NEWER turn - skip this older reference
+                    logger.debug(f"[IMAGES] Skipping duplicate image: {image_path} (newer version already included)")
+
+    logger.debug(f"[IMAGES] Final image list ({len(image_list)}): {image_list}")
+    return image_list
+
+
 def _plan_file_inclusion_by_size(all_files: list[str], max_file_tokens: int) -> tuple[list[str], list[str], int]:
     """
     Plan which files to include based on size constraints.
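The docstring's example can be checked with a minimal stand-in for the turn objects; this is a sketch, not the real `ThreadContext`:

```python
from types import SimpleNamespace

turns = [
    SimpleNamespace(images=["diagram.png", "flow.jpg"]),     # Turn 1 (oldest)
    SimpleNamespace(images=["error.png"]),                   # Turn 2
    SimpleNamespace(images=["diagram.png", "updated.png"]),  # Turn 3 (newest)
]

seen, result = set(), []
for turn in reversed(turns):  # newest turn first
    for image in turn.images or []:
        if image not in seen:  # newest reference wins; older duplicates are skipped
            seen.add(image)
            result.append(image)

print(result)  # ['diagram.png', 'updated.png', 'error.png', 'flow.jpg']
```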
@@ -88,8 +88,9 @@ TEXT_DATA = {
     ".lock",  # Lock files
 }

-# Image file extensions
-IMAGES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".webp", ".ico", ".tiff", ".tif"}
+# Image file extensions - limited to what AI models actually support
+# Based on OpenAI and Gemini supported formats: PNG, JPEG, GIF, WebP
+IMAGES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}

 # Binary executable and library extensions
 BINARIES = {
@@ -240,3 +241,30 @@ def get_token_estimation_ratio(file_path: str) -> float:

     extension = Path(file_path).suffix.lower()
     return TOKEN_ESTIMATION_RATIOS.get(extension, 3.5)  # Conservative default
+
+
+# MIME type mappings for image files - limited to what AI models actually support
+# Based on OpenAI and Gemini supported formats: PNG, JPEG, GIF, WebP
+IMAGE_MIME_TYPES = {
+    ".jpg": "image/jpeg",
+    ".jpeg": "image/jpeg",
+    ".png": "image/png",
+    ".gif": "image/gif",
+    ".webp": "image/webp",
+}
+
+
+def get_image_mime_type(extension: str) -> str:
+    """
+    Get the MIME type for an image file extension.
+
+    Args:
+        extension: File extension (with or without leading dot)
+
+    Returns:
+        MIME type string (default: image/jpeg for unknown extensions)
+    """
+    if not extension.startswith("."):
+        extension = "." + extension
+    extension = extension.lower()
+    return IMAGE_MIME_TYPES.get(extension, "image/jpeg")
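Usage of the new helper, including the dot/case normalization and the documented JPEG fallback:

```python
from utils.file_types import get_image_mime_type

print(get_image_mime_type(".png"))   # image/png
print(get_image_mime_type("WEBP"))   # image/webp - leading dot added, case folded
print(get_image_mime_type(".tiff"))  # image/jpeg - unknown extension falls back to the default
```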
@@ -48,6 +48,36 @@ from .file_types import BINARY_EXTENSIONS, CODE_EXTENSIONS, IMAGE_EXTENSIONS, TE
 from .security_config import CONTAINER_WORKSPACE, EXCLUDED_DIRS, MCP_SIGNATURE_FILES, SECURITY_ROOT, WORKSPACE_ROOT
 from .token_utils import DEFAULT_CONTEXT_WINDOW, estimate_tokens


+def _is_builtin_custom_models_config(path_str: str) -> bool:
+    """
+    Check if path points to the server's built-in custom_models.json config file.
+
+    This only matches the server's internal config, not user-specified CUSTOM_MODELS_CONFIG_PATH.
+    We identify the built-in config by checking if it resolves to the server's conf directory.
+
+    Args:
+        path_str: Path to check
+
+    Returns:
+        True if this is the server's built-in custom_models.json config file
+    """
+    try:
+        path = Path(path_str)
+
+        # Get the server root by going up from this file: utils/file_utils.py -> server_root
+        server_root = Path(__file__).parent.parent
+        builtin_config = server_root / "conf" / "custom_models.json"
+
+        # Check if the path resolves to the same file as our built-in config
+        # This handles both relative and absolute paths to the same file
+        return path.resolve() == builtin_config.resolve()
+
+    except Exception:
+        # If path resolution fails, it's not our built-in config
+        return False
+
+
 logger = logging.getLogger(__name__)
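The resolution-based equality above is also why the test suite's `./` and `subdir/..` variants end up blocked: in the test environment the server root is a repo checkout rather than `/app`, so no `/app/...` candidate can resolve to the built-in config. A self-contained sketch of the comparison itself, with a purely hypothetical location:

```python
from pathlib import Path

# Hypothetical built-in config location, for illustration only
builtin = Path("/srv/zen/conf/custom_models.json")

def is_builtin(candidate: str) -> bool:
    # Exact equality of fully resolved paths: '.' and '..' segments normalize
    # away, but extra characters, case changes, and stray whitespace produce a
    # different path and therefore fail the check.
    try:
        return Path(candidate).resolve() == builtin.resolve()
    except OSError:
        return False

print(is_builtin("/srv/zen/conf/./custom_models.json"))  # True: normalizes to the same file
print(is_builtin("/srv/zen/conf/custom_models.jsonx"))   # False: extra character
print(is_builtin(" /srv/zen/conf/custom_models.json"))   # False: leading space is part of the name
```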
@@ -271,7 +301,8 @@ def translate_path_for_environment(path_str: str) -> str:
     tools and utilities throughout the codebase. It handles:
     1. Docker host-to-container path translation (host paths -> /workspace/...)
     2. Direct mode (no translation needed)
-    3. Security validation and error handling
+    3. Internal server files (conf/custom_models.json)
+    4. Security validation and error handling

     Docker Path Translation Logic:
     - Input: /Users/john/project/src/file.py (host path from Claude)
@@ -284,32 +315,9 @@ def translate_path_for_environment(path_str: str) -> str:
     Returns:
         Translated path appropriate for the current environment
     """
-    # Allow access to specific internal application configuration files
-    # Store as relative paths so they work in both Docker and standalone modes
-    # Use exact paths for security - no wildcards or prefix matching
-    ALLOWED_INTERNAL_PATHS = {
-        "conf/custom_models.json",
-        # Add other specific internal files here as needed
-    }
-
-    # Check for internal app paths - extract relative part if it's an /app/ path
-    relative_internal_path = None
-    if path_str.startswith("/app/"):
-        relative_internal_path = path_str[5:]  # Remove "/app/" prefix
-        if relative_internal_path.startswith("/"):
-            relative_internal_path = relative_internal_path[1:]  # Remove leading slash if present
-
-    # Check if this is an allowed internal file
-    if relative_internal_path and relative_internal_path in ALLOWED_INTERNAL_PATHS:
-        # Translate to appropriate path for current environment
-        if not WORKSPACE_ROOT or not WORKSPACE_ROOT.strip() or not CONTAINER_WORKSPACE.exists():
-            # Standalone mode: use relative path
-            return "./" + relative_internal_path
-        else:
-            # Docker mode: use absolute app path
-            return "/app/" + relative_internal_path
-
-    # Handle other /app/ paths in standalone mode (for non-whitelisted files)
+    # Handle built-in server config file - no translation needed
+    if _is_builtin_custom_models_config(path_str):
+        return path_str
     if not WORKSPACE_ROOT or not WORKSPACE_ROOT.strip() or not CONTAINER_WORKSPACE.exists():
         if path_str.startswith("/app/"):
             # Convert Docker internal paths to local relative paths for standalone mode