fix: removed use_websearch; this parameter was confusing Codex, which started using it to prompt the external model to perform searches! Web search is enabled by default in Claude, Codex, etc., and the external agent can ask Claude to search on its behalf.

2025-10-01 18:44:11 +04:00
parent 28cabe0833
commit cff6d8998f
27 changed files with 45 additions and 129 deletions
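The behaviour described in the commit message can be sketched as follows. This is a minimal illustration, not the project's code: `ChatRequestSketch`, `build_full_prompt`, and the abbreviated `WEBSEARCH_INSTRUCTION` text are hypothetical stand-ins. It assumes only what the diff shows, namely that the request model no longer carries a `use_websearch` field and that the web search guidance is always appended to the system prompt.

```python
from dataclasses import dataclass, field

# Abbreviated, hypothetical stand-in for the guidance text appended to the system prompt.
WEBSEARCH_INSTRUCTION = (
    "\n\nWEB SEARCH CAPABILITY: You can request the calling agent to perform "
    "web searches to enhance your analysis with current information!"
)


@dataclass
class ChatRequestSketch:
    """Hypothetical request model; note there is no use_websearch field."""

    prompt: str
    files: list[str] = field(default_factory=list)


def build_full_prompt(system_prompt: str, request: ChatRequestSketch) -> str:
    """Always append the web search guidance; no per-request flag is consulted."""
    return f"{system_prompt}{WEBSEARCH_INSTRUCTION}\n\n{request.prompt}"


request = ChatRequestSketch(prompt="Test prompt")
full_prompt = build_full_prompt("System prompt", request)
print(full_prompt)
```

Because there is no flag left in the schema, a calling agent has nothing to misread as an instruction to make the external model search for itself.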
--- a/docs/advanced-usage.md
+++ b/docs/advanced-usage.md
@@ -165,7 +165,7 @@ All tools that work with files support **both individual files and entire direct
 - `analysis_type`: architecture|performance|security|quality|general
 - `output_format`: summary|detailed|actionable
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for documentation and best practices - allows model to request Claude perform searches (default: true)
+- **Web search capability**: The assistant now automatically requests web searches when it needs current documentation or best practices—no parameter required
 ```
 "Analyze the src/ directory for architectural patterns" (auto mode picks best model)
@@ -198,7 +198,7 @@ All tools that work with files support **both individual files and entire direct
 - `runtime_info`: Environment details
 - `previous_attempts`: What you've tried
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for error messages and solutions - allows model to request Claude perform searches (default: true)
+- **Web search capability**: Automatically initiates searches for relevant error messages or recent fixes when needed
 ```
 "Debug this logic error with context from backend/" (auto mode picks best model)
@@ -213,7 +213,7 @@ All tools that work with files support **both individual files and entire direct
 - `focus_areas`: Specific aspects to focus on
 - `files`: Files or directories for context
 - `thinking_mode`: minimal|low|medium|high|max (default: max, Gemini only)
-- `use_websearch`: Enable web search for documentation and insights - allows model to request Claude perform searches (default: true)
+- **Web search capability**: Automatically calls for research when architecture references or external insights are required
 ```
 "Think deeper about my design with reference to src/models/" (auto mode picks best model)
@@ -444,7 +444,7 @@ Claude can then search for these specific topics and provide you with the most c
 **Web search control:**
 Web search is enabled by default, allowing models to request Claude perform searches for current documentation and solutions. If you prefer the model to work only with its training data, you can disable web search:
 ```
-"Use gemini to review this code with use_websearch false"
+"Use gemini to review this code and confirm whether any new framework changes affect the recommendation"
 ```
 ## System Prompts
@@ -467,4 +467,4 @@ Each tool has a unique system prompt that defines its role and approach:
 To modify tool behavior, you can:
 1. Edit prompts in `prompts/tool_prompts.py` for global changes
 2. Override `get_system_prompt()` in a tool class for tool-specific changes
-3. Use the `temperature` parameter to adjust response style (0.2 for focused, 0.7 for creative)
+3. Use the `temperature` parameter to adjust response style (0.2 for focused, 0.7 for creative)
--- a/docs/tools/analyze.md
+++ b/docs/tools/analyze.md
@@ -45,7 +45,7 @@ This workflow ensures methodical analysis before expert insights, resulting in d
 - **Cross-file relationship mapping**: Understand dependencies and interactions
 - **Architecture visualization**: Describe system structure and component relationships
 - **Image support**: Analyze architecture diagrams, UML charts, flowcharts: `"Analyze this system diagram with gemini to understand the data flow and identify bottlenecks"`
-- **Web search capability**: When enabled with `use_websearch` (default: true), the model can request Claude to perform web searches and share results back to enhance analysis with current documentation, design patterns, and best practices
+- **Web search capability**: Automatically requests Claude to perform web searches when fresh documentation, patterns, or best practices are needed, ensuring the analysis stays current
 ## Tool Parameters
@@ -70,7 +70,6 @@ This workflow ensures methodical analysis before expert insights, resulting in d
 - `output_format`: summary|detailed|actionable (default: detailed)
 - `temperature`: Temperature for analysis (0-1, default 0.2)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for documentation and best practices (default: true)
 - `use_assistant_model`: Whether to use expert analysis phase (default: true, set to false to use Claude only)
 - `continuation_id`: Continue previous analysis sessions
@@ -196,4 +195,4 @@ After analysis: "Recommended searches for Claude: 'FastAPI async best practices
 - **Use `codereview`** for: Finding bugs and security issues with actionable fixes
 - **Use `debug`** for: Diagnosing specific runtime errors or performance problems
 - **Use `refactor`** for: Getting specific refactoring recommendations and implementation plans
-- **Use `chat`** for: Open-ended discussions about code without structured analysis
+- **Use `chat`** for: Open-ended discussions about code without structured analysis
--- a/docs/tools/chat.md
+++ b/docs/tools/chat.md
@@ -28,7 +28,7 @@ and then debate with the other models to give me a final verdict
 - **File reference support**: `"Use gemini to explain this algorithm with context from algorithm.py"`
 - **Image support**: Include screenshots, diagrams, UI mockups for visual analysis: `"Chat with gemini about this error dialog screenshot to understand the user experience issue"`
 - **Dynamic collaboration**: Gemini can request additional files or context during the conversation if needed for a more thorough response
-- **Web search capability**: Analyzes when web searches would be helpful and recommends specific searches for Claude to perform, ensuring access to current documentation and best practices
+- **Web search awareness**: Automatically identifies when online research would help and instructs Claude to perform targeted searches using continuation IDs
 ## Tool Parameters
@@ -38,7 +38,6 @@ and then debate with the other models to give me a final verdict
 - `images`: Optional images for visual context (absolute paths)
 - `temperature`: Response creativity (0-1, default 0.5)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for documentation and insights (default: true)
 - `continuation_id`: Continue previous conversations
 ## Usage Examples
@@ -74,11 +73,11 @@ and then debate with the other models to give me a final verdict
 - **Ask for trade-offs**: Request pros/cons for better decision-making
 - **Use conversation continuation**: Build on previous discussions with `continuation_id`
 - **Leverage visual context**: Include diagrams, mockups, or screenshots when discussing UI/UX
-- **Request web searches**: Ask for current best practices or recent developments in technologies
+- **Encourage research**: When you suspect documentation has changed, explicitly ask the assistant to confirm by requesting a web search
 ## When to Use Chat vs Other Tools
 - **Use `chat`** for: Open-ended discussions, brainstorming, getting second opinions, technology comparisons
 - **Use `thinkdeep`** for: Extending specific analysis, challenging assumptions, deeper reasoning
 - **Use `analyze`** for: Understanding existing code structure and patterns
-- **Use `debug`** for: Specific error diagnosis and troubleshooting
+- **Use `debug`** for: Specific error diagnosis and troubleshooting
--- a/docs/tools/codereview.md
+++ b/docs/tools/codereview.md
@@ -87,7 +87,6 @@ The above prompt will simultaneously run two separate `codereview` tools with tw
 - `severity_filter`: critical|high|medium|low|all (default: all)
 - `temperature`: Temperature for consistency (0-1, default 0.2)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for best practices and documentation (default: true)
 - `use_assistant_model`: Whether to use expert analysis phase (default: true, set to false to use Claude only)
 - `continuation_id`: Continue previous review discussions
@@ -171,4 +170,4 @@ Reviews include:
 - **Use `codereview`** for: Finding bugs, security issues, performance problems, code quality assessment
 - **Use `analyze`** for: Understanding code structure without finding issues
 - **Use `debug`** for: Diagnosing specific runtime errors or exceptions
-- **Use `refactor`** for: Identifying structural improvements and modernization opportunities
+- **Use `refactor`** for: Identifying structural improvements and modernization opportunities
--- a/docs/tools/consensus.md
+++ b/docs/tools/consensus.md
@@ -69,7 +69,6 @@ Get a consensus from gemini supporting the idea for implementing X, grok opposin
 - `focus_areas`: Specific aspects to emphasize
 - `temperature`: Control consistency (default: 0.2 for stable consensus)
 - `thinking_mode`: Analysis depth (minimal/low/medium/high/max)
-- `use_websearch`: Enable research for enhanced analysis (default: true)
 - `continuation_id`: Continue previous consensus discussions
 ## Model Configuration Examples
@@ -142,4 +141,4 @@ The consensus tool includes built-in ethical safeguards:
 - **Use `consensus`** for: Multi-perspective analysis, structured debates, major technical decisions
 - **Use `chat`** for: Open-ended discussions and brainstorming
 - **Use `thinkdeep`** for: Extending specific analysis with deeper reasoning
-- **Use `analyze`** for: Understanding existing systems without debate
+- **Use `analyze`** for: Understanding existing systems without debate
--- a/docs/tools/debug.md
+++ b/docs/tools/debug.md
@@ -75,7 +75,6 @@ This structured approach ensures Claude performs methodical groundwork before ex
 **Model Selection:**
 - `model`: auto|pro|flash|flash-2.0|flashlite|o3|o3-mini|o4-mini|gpt4.1|gpt5|gpt5-mini|gpt5-nano (default: server default)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for documentation and solutions (default: true)
 - `use_assistant_model`: Whether to use expert analysis phase (default: true, set to false to use Claude only)
 ## Usage Examples
@@ -208,4 +207,4 @@ After analysis: "Recommended searches for Claude: 'Django 4.2 migration error sp
 **Step 3:** "Found suspicious async/await pattern in session_manager.py lines 45-67. The await might be missing exception handling. This could explain silent failures."
-**Completion:** Investigation reveals likely root cause in exception handling, ready for expert analysis with full context.
+**Completion:** Investigation reveals likely root cause in exception handling, ready for expert analysis with full context.
--- a/docs/tools/precommit.md
+++ b/docs/tools/precommit.md
@@ -149,7 +149,6 @@ Use zen and perform a thorough precommit ensuring there aren't any new regressio
 - `focus_on`: Specific aspects to focus on
 - `temperature`: Temperature for response (default: 0.2)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for best practices (default: true)
 - `use_assistant_model`: Whether to use expert validation phase (default: true, set to false to use Claude only)
 - `continuation_id`: Continue previous validation discussions
@@ -250,4 +249,4 @@ Validation results include:
 - **Use `precommit`** for: Validating changes before git commit, ensuring requirement compliance
 - **Use `codereview`** for: General code quality assessment without git context
 - **Use `debug`** for: Diagnosing specific runtime issues
-- **Use `analyze`** for: Understanding existing code without validation context
+- **Use `analyze`** for: Understanding existing code without validation context
--- a/docs/tools/secaudit.md
+++ b/docs/tools/secaudit.md
@@ -94,7 +94,6 @@ security remediation plan using planner
 - `severity_filter`: critical|high|medium|low|all (default: all)
 - `temperature`: Temperature for analytical consistency (0-1, default 0.2)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
-- `use_websearch`: Enable web search for security best practices and vulnerability databases (default: true)
 - `use_assistant_model`: Whether to use expert security analysis phase (default: true)
 - `continuation_id`: Continue previous security audit discussions
@@ -219,4 +218,4 @@ Security audits include:
 - **Use `codereview`** for: General code quality with some security considerations
 - **Use `analyze`** for: Understanding security architecture without vulnerability assessment
 - **Use `debug`** for: Investigating specific security incidents or exploit attempts
-- **Use `precommit`** for: Pre-deployment security validation and change impact assessment
+- **Use `precommit`** for: Pre-deployment security validation and change impact assessment
--- a/docs/tools/thinkdeep.md
+++ b/docs/tools/thinkdeep.md
@@ -25,7 +25,7 @@ with the best architecture for my project
 - **File reference support**: `"Use gemini to think deeper about my API design with reference to api/routes.py"`
 - **Image support**: Analyze architectural diagrams, flowcharts, design mockups: `"Think deeper about this system architecture diagram with gemini pro using max thinking mode"`
 - **Enhanced Critical Evaluation (v2.10.0)**: After Gemini's analysis, Claude is prompted to critically evaluate the suggestions, consider context and constraints, identify risks, and synthesize a final recommendation - ensuring a balanced, well-considered solution
-- **Web search capability**: When enabled (default: true), identifies areas where current documentation or community solutions would strengthen the analysis and suggests specific searches for Claude
+- **Web search capability**: Automatically identifies areas where current documentation or community solutions would strengthen the analysis and instructs Claude to perform targeted searches
 ## Tool Parameters
@@ -37,7 +37,6 @@ with the best architecture for my project
 - `images`: Optional images for visual analysis (absolute paths)
 - `temperature`: Temperature for creative thinking (0-1, default 0.7)
 - `thinking_mode`: minimal|low|medium|high|max (default: high, Gemini only)
-- `use_websearch`: Enable web search for documentation and insights (default: true)
 - `continuation_id`: Continue previous conversations
 ## Usage Examples
@@ -94,4 +93,4 @@ This ensures you get both deep reasoning and practical, context-aware advice.
 - **Use `thinkdeep`** for: Extending specific analysis, challenging assumptions, architectural decisions
 - **Use `chat`** for: Open-ended brainstorming and general discussions
 - **Use `analyze`** for: Understanding existing code without extending analysis
-- **Use `codereview`** for: Finding specific bugs and security issues
+- **Use `codereview`** for: Finding specific bugs and security issues
--- a/tests/test_chat_simple.py
+++ b/tests/test_chat_simple.py
@@ -97,7 +97,7 @@ class TestChatTool:
    @pytest.mark.asyncio
    async def test_prompt_preparation(self):
        """Test that prompt preparation works correctly"""
-        request = ChatRequest(prompt="Test prompt", files=[], use_websearch=True)
+        request = ChatRequest(prompt="Test prompt", files=[])
        # Mock the system prompt and file handling
        with patch.object(self.tool, "get_system_prompt", return_value="System prompt"):
@@ -181,7 +181,6 @@ class TestChatRequestModel:
        assert hasattr(request, "model")
        assert hasattr(request, "temperature")
        assert hasattr(request, "thinking_mode")
-        assert hasattr(request, "use_websearch")
        assert hasattr(request, "continuation_id")
        assert hasattr(request, "images")  # From base model too
--- a/tools/analyze.py
+++ b/tools/analyze.py
@@ -125,8 +125,7 @@ class AnalyzeWorkflowRequest(WorkflowRequest):
        "detailed", description=ANALYZE_WORKFLOW_FIELD_DESCRIPTIONS["output_format"]
    )
-    # Keep thinking_mode and use_websearch from original analyze tool
+    # Keep thinking_mode from original analyze tool; temperature is inherited from WorkflowRequest
-    # temperature is inherited from WorkflowRequest
    @model_validator(mode="after")
    def validate_step_one_requirements(self):
--- a/tools/chat.py
+++ b/tools/chat.py
@@ -23,9 +23,9 @@ from .simple.base import SimpleTool
 CHAT_FIELD_DESCRIPTIONS = {
    "prompt": (
        "Your question or idea for collaborative thinking. Provide detailed context, including your goal, what you've tried, and any specific challenges. "
-        "CRITICAL: To discuss code, provide file paths using the 'files' parameter instead of pasting large code blocks here."
+        "CRITICAL: To discuss code, use 'files' parameter instead of pasting code blocks here."
    ),
-    "files": "Absolute full-paths to existing files / folders for context. DO NOT SHORTEN.",
+    "files": "Always pass absolute full-paths (do NOT shorten) to existing files / folders containing code being discussed.",
    "images": (
        "Optional images for visual context (must be FULL absolute paths to real files / folders - DO NOT SHORTEN - OR these can be bas64 data)"
    ),
@@ -56,8 +56,8 @@ class ChatTool(SimpleTool):
    def get_description(self) -> str:
        return (
-            "General chat and collaborative thinking partner for brainstorming, development discussion, getting second opinions, and exploring ideas. "
+            "General chat and collaborative thinking partner for brainstorming, development discussion, "
-            "Use for bouncing ideas, validating approaches, asking questions, and getting explanations. "
+            "getting second opinions, and exploring ideas. Use for ideas, validations, questions, and thoughtful explanations."
        )
    def get_system_prompt(self) -> str:
@@ -116,11 +116,6 @@ class ChatTool(SimpleTool):
                    "enum": ["minimal", "low", "medium", "high", "max"],
                    "description": COMMON_FIELD_DESCRIPTIONS["thinking_mode"],
                },
-                "use_websearch": {
-                    "type": "boolean",
-                    "description": COMMON_FIELD_DESCRIPTIONS["use_websearch"],
-                    "default": True,
-                },
                "continuation_id": {
                    "type": "string",
                    "description": COMMON_FIELD_DESCRIPTIONS["continuation_id"],
--- a/tools/codereview.py
+++ b/tools/codereview.py
@@ -118,7 +118,6 @@ class CodeReviewRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema (except model which needs to be available)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
    @model_validator(mode="after")
    def validate_step_one_requirements(self):
--- a/tools/consensus.py
+++ b/tools/consensus.py
@@ -116,7 +116,6 @@ class ConsensusRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema
    temperature: float | None = Field(default=None, exclude=True)
    thinking_mode: str | None = Field(default=None, exclude=True)
-    use_websearch: bool | None = Field(default=None, exclude=True)
    # Not used in consensus workflow
    files_checked: list[str] | None = Field(default_factory=list, exclude=True)
@@ -290,7 +289,6 @@ of the evidence, even when it strongly points in one direction.""",
            "model",  # Consensus uses 'models' field instead
            "temperature",  # Not used in consensus workflow
            "thinking_mode",  # Not used in consensus workflow
-            "use_websearch",  # Not used in consensus workflow
        ]
        # Build schema with proper field exclusion
--- a/tools/debug.py
+++ b/tools/debug.py
@@ -104,7 +104,6 @@ class DebugInvestigationRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema (except model which needs to be available)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
 class DebugIssueTool(WorkflowTool):
--- a/tools/docgen.py
+++ b/tools/docgen.py
@@ -243,7 +243,6 @@ class DocgenTool(WorkflowTool):
            "model",  # Documentation doesn't need external model selection
            "temperature",  # Documentation doesn't need temperature control
            "thinking_mode",  # Documentation doesn't need thinking mode
-            "use_websearch",  # Documentation doesn't need web search
            "images",  # Documentation doesn't use images
        ]
--- a/tools/planner.py
+++ b/tools/planner.py
@@ -88,7 +88,6 @@ class PlannerRequest(WorkflowRequest):
    # Exclude other non-planning fields
    temperature: float | None = Field(default=None, exclude=True)
    thinking_mode: str | None = Field(default=None, exclude=True)
-    use_websearch: bool | None = Field(default=None, exclude=True)
    use_assistant_model: bool | None = Field(default=False, exclude=True, description="Planning is self-contained")
    images: list | None = Field(default=None, exclude=True, description="Planning doesn't use images")
@@ -218,7 +217,6 @@ class PlannerTool(WorkflowTool):
        excluded_common_fields = [
            "temperature",  # Planning doesn't need temperature control
            "thinking_mode",  # Planning doesn't need thinking mode
-            "use_websearch",  # Planning doesn't need web search
            "images",  # Planning doesn't use images
            "files",  # Planning doesn't use files
        ]
--- a/tools/precommit.py
+++ b/tools/precommit.py
@@ -119,7 +119,6 @@ class PrecommitRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema (except model which needs to be available)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
    @model_validator(mode="after")
    def validate_step_one_requirements(self):
--- a/tools/refactor.py
+++ b/tools/refactor.py
@@ -131,7 +131,6 @@ class RefactorRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema (except model which needs to be available)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
    @model_validator(mode="after")
    def validate_step_one_requirements(self):
--- a/tools/shared/base_models.py
+++ b/tools/shared/base_models.py
@@ -21,25 +21,17 @@ logger = logging.getLogger(__name__)
 # Shared field descriptions to avoid duplication
 COMMON_FIELD_DESCRIPTIONS = {
    "model": (
-        "Model to use. See tool's input schema for available models. "
+        "Model to use. See tool's input schema for available models if required. Use 'auto' select the best model for the task."
-        "Use 'auto' to let Claude select the best model for the task."
    ),
-    "temperature": (
-        "Lower values: focused/deterministic; higher: creative. Tool-specific defaults apply if unspecified."
-    ),
+    "temperature": ("Lower values: deterministic; higher: creative."),
    "thinking_mode": (
        "Thinking depth: minimal (0.5%), low (8%), medium (33%), high (67%), "
        "max (100% of model max). Higher modes: deeper reasoning but slower."
    ),
-    "use_websearch": (
-        "Enable web search for docs and current info. Model can request Claude to perform web-search for "
-        "best practices, framework docs, solution research, latest API information."
-    ),
    "continuation_id": (
-        "Unique thread continuation ID for multi-turn conversations. Reuse last continuation_id "
+        "Unique thread continuation ID for multi-turn conversations. Works across different tools. "
-        "when continuing discussion (unless user provides different ID) using exact unique identifer. "
+        "ALWAYS reuse last continuation_id you were provided as-is when re-communicating with Zen MCP, "
-        "Embeds complete conversation history. Build upon history without repeating. "
+        "unless user provides different ID. When supplied, your complete conversation history is available, so focus on new insights."
-        "Focus on new insights. Works across different tools."
    ),
    "images": (
        "Optional images for visual context. MUST be absolute paths or base64. "
@@ -88,9 +80,6 @@ class ToolRequest(BaseModel):
    temperature: Optional[float] = Field(None, ge=0.0, le=1.0, description=COMMON_FIELD_DESCRIPTIONS["temperature"])
    thinking_mode: Optional[str] = Field(None, description=COMMON_FIELD_DESCRIPTIONS["thinking_mode"])
-    # Features
-    use_websearch: Optional[bool] = Field(True, description=COMMON_FIELD_DESCRIPTIONS["use_websearch"])
    # Conversation support
    continuation_id: Optional[str] = Field(None, description=COMMON_FIELD_DESCRIPTIONS["continuation_id"])
--- a/tools/shared/base_tool.py
+++ b/tools/shared/base_tool.py
@@ -205,10 +205,10 @@ class BaseTool(ABC):
    def _should_require_model_selection(self, model_name: str) -> bool:
        """
-        Check if we should require Claude to select a model at runtime.
+        Check if we should require the CLI to select a model at runtime.
        This is called during request execution to determine if we need
-        to return an error asking Claude to provide a model parameter.
+        to return an error asking the CLI to provide a model parameter.
        Args:
            model_name: The model name from the request or DEFAULT_MODEL
@@ -237,7 +237,7 @@ class BaseTool(ABC):
        Only returns models from providers that have valid API keys configured.
        This fixes the namespace collision bug where models from disabled providers
-        were shown to Claude, causing routing conflicts.
+        were shown to the CLI, causing routing conflicts.
        Returns:
            List of model names from enabled providers only
@@ -405,7 +405,7 @@ class BaseTool(ABC):
                    if model_configs:
                        model_desc_parts.append("\nOpenRouter models (use these aliases):")
-                        for alias, config in model_configs:  # Show ALL models so Claude can choose
+                        for alias, config in model_configs:  # Show ALL models so the CLI can choose
                            # Format context window in human-readable form
                            context_tokens = config.context_window
                            if context_tokens >= 1_000_000:
@@ -445,7 +445,7 @@ class BaseTool(ABC):
        else:
            # Normal mode - model is optional with default
            available_models = self._get_available_models()
-            models_str = ", ".join(f"'{m}'" for m in available_models)  # Show ALL models so Claude can choose
+            models_str = ", ".join(f"'{m}'" for m in available_models)  # Show ALL models so the CLI can choose
            description = f"Model to use. Native models: {models_str}."
            if has_openrouter:
@@ -456,7 +456,7 @@ class BaseTool(ABC):
                    # Show ALL aliases from the configuration
                    if aliases:
-                        # Show all aliases so Claude knows every option available
+                        # Show all aliases so the CLI knows every option available
                        all_aliases = sorted(aliases)
                        alias_list = ", ".join(f"'{a}'" for a in all_aliases)
                        description += f" OpenRouter aliases: {alias_list}."
@@ -763,7 +763,7 @@ class BaseTool(ABC):
        This file is treated specially as the main prompt, not as an embedded file.
        This mechanism allows us to work around MCP's ~25K token limit by having
-        Claude save large prompts to a file, effectively using the file transfer
+        the CLI save large prompts to a file, effectively using the file transfer
        mechanism to bypass token constraints while preserving response capacity.
        Args:
@@ -839,7 +839,7 @@ class BaseTool(ABC):
        Check if USER INPUT text is too large for MCP transport boundary.
        IMPORTANT: This method should ONLY be used to validate user input that crosses
-        the Claude CLI ↔ MCP Server transport boundary. It should NOT be used to limit
+        the CLI ↔ MCP Server transport boundary. It should NOT be used to limit
        internal MCP Server operations.
        Args:
@@ -1051,9 +1051,9 @@ class BaseTool(ABC):
        base_instruction = """
-WEB SEARCH CAPABILITY: You can request Claude to perform web searches to enhance your analysis with current information!
+WEB SEARCH CAPABILITY: You can request the calling agent to perform web searches to enhance your analysis with current information!
-IMPORTANT: When you identify areas where web searches would significantly improve your response (such as checking current documentation, finding recent solutions, verifying best practices, or gathering community insights), you MUST explicitly instruct Claude to perform specific web searches and then respond back using the continuation_id from this response to continue the analysis.
+IMPORTANT: When you identify areas where web searches would significantly improve your response (such as checking current documentation, finding recent solutions, verifying best practices, or gathering community insights), you MUST explicitly instruct the agent to perform specific web searches and then respond back using the continuation_id from this response to continue the analysis.
 Use clear, direct language based on the value of the search:
@@ -1083,7 +1083,7 @@ Consider requesting searches for:
 - Security advisories and patches
 - Performance benchmarks and optimizations
-When recommending searches, be specific about what information you need and why it would improve your analysis. Always remember to instruct Claude to use the continuation_id from this response when providing search results."""
+When recommending searches, be specific about what information you need and why it would improve your analysis. Always remember to instruct agent to use the continuation_id from this response when providing search results."""
    def get_language_instruction(self) -> str:
        """
@@ -1158,10 +1158,10 @@ When recommending searches, be specific about what information you need and why
    def _should_require_model_selection(self, model_name: str) -> bool:
        """
-        Check if we should require Claude to select a model at runtime.
+        Check if we should require the CLI to select a model at runtime.
        This is called during request execution to determine if we need
-        to return an error asking Claude to provide a model parameter.
+        to return an error asking the CLI to provide a model parameter.
        Args:
            model_name: The model name from the request or DEFAULT_MODEL
@@ -1189,7 +1189,7 @@ When recommending searches, be specific about what information you need and why
        Only returns models from providers that have valid API keys configured.
        This fixes the namespace collision bug where models from disabled providers
-        were shown to Claude, causing routing conflicts.
+        were shown to the CLI, causing routing conflicts.
        Returns:
            List of model names from enabled providers only
--- a/tools/shared/schema_builders.py
+++ b/tools/shared/schema_builders.py
@@ -32,11 +32,6 @@ class SchemaBuilder:
            "enum": ["minimal", "low", "medium", "high", "max"],
            "description": COMMON_FIELD_DESCRIPTIONS["thinking_mode"],
        },
-        "use_websearch": {
-            "type": "boolean",
-            "description": COMMON_FIELD_DESCRIPTIONS["use_websearch"],
-            "default": True,
-        },
        "continuation_id": {
            "type": "string",
            "description": COMMON_FIELD_DESCRIPTIONS["continuation_id"],
--- a/tools/simple/base.py
+++ b/tools/simple/base.py
@@ -234,13 +234,6 @@ class SimpleTool(BaseTool):
        except AttributeError:
            return []
-    def get_request_use_websearch(self, request) -> bool:
-        """Get use_websearch from request. Override for custom websearch handling."""
-        try:
-            return request.use_websearch if request.use_websearch is not None else True
-        except AttributeError:
-            return True
    def get_request_as_dict(self, request) -> dict:
        """Convert request to dictionary. Override for custom serialization."""
        try:
@@ -787,11 +780,8 @@ class SimpleTool(BaseTool):
        content_to_validate = self.get_prompt_content_for_size_validation(user_content)
        self._validate_token_limit(content_to_validate, "Content")
-        # Add web search instruction if enabled
-        websearch_instruction = ""
-        use_websearch = self.get_request_use_websearch(request)
-        if use_websearch:
-            websearch_instruction = self.get_websearch_instruction(use_websearch, self.get_websearch_guidance())
+        # Add standardized web search guidance
+        websearch_instruction = self.get_websearch_instruction(True, self.get_websearch_guidance())
        # Combine system prompt with user content
        full_prompt = f"""{system_prompt}{websearch_instruction}
--- a/tools/testgen.py
+++ b/tools/testgen.py
@@ -115,7 +115,6 @@ class TestGenRequest(WorkflowRequest):
    # Override inherited fields to exclude them from schema (except model which needs to be available)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
    @model_validator(mode="after")
    def validate_step_one_requirements(self):
--- a/tools/thinkdeep.py
+++ b/tools/thinkdeep.py
@@ -90,11 +90,6 @@ class ThinkDeepWorkflowRequest(WorkflowRequest):
        default=None,
        description="Depth: minimal/low/medium/high/max. Default 'high'.",
    )
-    use_websearch: Optional[bool] = Field(
-        default=None,
-        description="Enable web search for docs, brainstorming, architecture, solutions.",
-    )
    # Context files and investigation scope
    problem_context: Optional[str] = Field(
        default=None,
@@ -200,11 +195,6 @@ class ThinkDeepTool(WorkflowTool):
        except AttributeError:
            self.stored_request_params["thinking_mode"] = None
-        try:
-            self.stored_request_params["use_websearch"] = request.use_websearch
-        except AttributeError:
-            self.stored_request_params["use_websearch"] = None
        # Add thinking-specific context to response
        response_data.update(
            {
@@ -349,16 +339,6 @@ but also acknowledge strong insights and valid conclusions.
            pass
        return super().get_request_thinking_mode(request)
-    def get_request_use_websearch(self, request) -> bool:
-        """Use stored use_websearch from initial request."""
-        try:
-            stored_params = self.stored_request_params
-            if stored_params and stored_params.get("use_websearch") is not None:
-                return stored_params["use_websearch"]
-        except AttributeError:
-            pass
-        return super().get_request_use_websearch(request)
    def _get_problem_context(self, request) -> str:
        """Get problem context from request. Override for custom context handling."""
        try:
--- a/tools/tracer.py
+++ b/tools/tracer.py
@@ -122,7 +122,6 @@ class TracerRequest(WorkflowRequest):
    # Exclude other non-tracing fields
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
-    use_websearch: Optional[bool] = Field(default=None, exclude=True)
    use_assistant_model: Optional[bool] = Field(default=False, exclude=True, description="Tracing is self-contained")
    @field_validator("step_number")
@@ -228,7 +227,6 @@ class TracerTool(WorkflowTool):
        excluded_common_fields = [
            "temperature",  # Tracing doesn't need temperature control
            "thinking_mode",  # Tracing doesn't need thinking mode
-            "use_websearch",  # Tracing doesn't need web search
            "files",  # Tracing uses relevant_files instead
        ]
--- a/tools/workflow/workflow_mixin.py
+++ b/tools/workflow/workflow_mixin.py
@@ -267,13 +267,6 @@ class BaseWorkflowMixin(ABC):
        except AttributeError:
            return self.get_expert_thinking_mode()
-    def get_request_use_websearch(self, request) -> bool:
-        """Get use_websearch from request. Override for custom websearch handling."""
-        try:
-            return request.use_websearch if request.use_websearch is not None else True
-        except AttributeError:
-            return True
    def get_expert_analysis_instruction(self) -> str:
        """
        Get the instruction to append after the expert context.
@@ -590,10 +583,7 @@ class BaseWorkflowMixin(ABC):
        # Create a simple reference note
        file_names = [os.path.basename(f) for f in request_files]
-        reference_note = (
-            f"Files referenced in this step: {', '.join(file_names)}\n"
-            f"(File content available via conversation history or can be discovered by Claude)"
-        )
+        reference_note = f"Files referenced in this step: {', '.join(file_names)}\n"
        self._file_reference_note = reference_note
        logger.debug(f"[WORKFLOW_FILES] {self.get_name()}: Set _file_reference_note: {self._file_reference_note}")
@@ -1514,7 +1504,6 @@ class BaseWorkflowMixin(ABC):
                system_prompt=system_prompt,
                temperature=validated_temperature,
                thinking_mode=self.get_request_thinking_mode(request),
-                use_websearch=self.get_request_use_websearch(request),
                images=list(set(self.consolidated_findings.images)) if self.consolidated_findings.images else None,
            )