Re-imagined and re-written Debug tool. Instead of prompting Claude to perform initial analysis (and hoping it did), the tool now works through the debug process as an 'investigation', encouraging Claude to gather its 'findings' and 'hypotheses', stepping back as needed, collecting the files it has gone through, and keeping track of the files relevant to the issue at hand. This structured investigation is then passed to the other model, giving it far greater insight than the original debug tool ever could.
Improved prompts; guard against overengineering and flag it as an antipattern.
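For illustration, a single investigation step might be submitted with arguments like the following. This is a hypothetical sketch: the field names match the new DebugInvestigationRequest schema in the diff below, but the values and paths are invented.

    {
        "step": "Trace why UserSession.refresh() raises KeyError on stale tokens",
        "step_number": 2,
        "total_steps": 4,
        "next_step_required": True,
        "findings": "refresh() assumes 'token' is always present in the session dict; logout clears it first.",
        "files_checked": ["/abs/path/auth/session.py", "/abs/path/auth/middleware.py"],
        "relevant_files": ["/abs/path/auth/session.py"],
        "relevant_methods": ["UserSession.refresh"],
        "hypothesis": "A race between logout and refresh leaves the session without a token",
        "confidence": "medium",
    }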
696 tools/debug.py
@@ -1,7 +1,9 @@
"""
Debug Issue tool - Root cause analysis and debugging assistance
Debug Issue tool - Root cause analysis and debugging assistance with systematic investigation
"""

import json
import logging
from typing import TYPE_CHECKING, Any, Optional

from pydantic import Field
@@ -14,155 +16,207 @@ from systemprompts import DEBUG_ISSUE_PROMPT

from .base import BaseTool, ToolRequest

# Field descriptions to avoid duplication between Pydantic and JSON schema
DEBUG_FIELD_DESCRIPTIONS = {
    "prompt": (
        "MANDATORY: You MUST first think deeply about the issue: what it is, why it might be happening, what code might be involved, "
        "is it an error stemming from the code directly or is it a side-effect of some part of the existing code? If it's an error "
        "message, could it be coming from an external resource and NOT directly from the project? What part of the code seems most likely "
        "the culprit? MUST try and ZERO IN on the issue and surrounding code. Include all the details into the prompt that you can provide: "
        "error messages, symptoms, when it occurs, steps to reproduce, environment details, "
        "recent changes, and any other relevant information. Mention any previous attempts at fixing this issue, "
        "including any past fix that was in place but has now regressed. "
        "The more context available, the better the analysis. "
        "PERFORM SYSTEMATIC INVESTIGATION: You MUST begin by thinking hard and performing a thorough investigation using a systematic approach. "
        "First understand the issue, find the code that may be causing it or code that is breaking, as well as any related code that could have caused this as a side effect. "
        "You MUST maintain detailed investigation notes in a DEBUGGING_{issue_description}.md file within the project folder, "
        "updating it as you perform step-by-step analysis of the code, trying to determine the actual root cause and understanding how a minimal, appropriate fix can be found. "
        "This file MUST contain functions, methods, and files visited OR determined to be part of the problem. You MUST update this and remove any references you find to be irrelevant during the investigation. "
        "CRITICAL: If after thorough investigation you have very high confidence that NO BUG EXISTS that correlates to the reported symptoms, "
        "you should consider the possibility that the reported issue may not actually be present, may be a misunderstanding, or may be conflated with something else entirely. "
        "In such cases, you should gather more information from the user through targeted questioning rather than continue hunting for non-existent bugs. "
        "Once complete, you MUST also pass this file into the files parameter of this tool. "
        "It is ESSENTIAL that this detailed work is performed by you before sharing all the relevant details with the development assistant. This will greatly help in zeroing in on the root cause."
logger = logging.getLogger(__name__)

# Field descriptions for the investigation steps
DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS = {
    "step": (
        "Your current investigation step. For the first step, describe the issue/error to investigate. "
        "For subsequent steps, describe what you're investigating, what code you're examining, "
        "what patterns you're looking for, or what hypothesis you're testing."
    ),
    "step_number": "Current step number in the investigation sequence (starts at 1)",
    "total_steps": "Current estimate of total investigation steps needed (can be adjusted as investigation progresses)",
    "next_step_required": "Whether another investigation step is required",
    "findings": (
        "You MUST first perform your own investigation and gather your findings and analysis. Include: steps taken to analyze the issue, "
        "code patterns discovered, initial hypotheses formed, any relevant classes/functions/methods examined, "
        "and any preliminary conclusions. If investigation yields no concrete evidence of a bug correlating to the reported symptoms, "
        "you should clearly state this finding and consider that the issue may not exist as described. "
        "This provides context for the assistant model's analysis. "
        "Current findings from this investigation step. Include code patterns discovered, "
        "potential causes identified, hypotheses formed, or evidence gathered."
    ),
    "files": (
        "Essential files for debugging - ONLY include files that are directly related to the issue, "
        "contain the problematic code, or are necessary for understanding the root cause. "
        "This can include any relevant log files, error description documents, investigation documents, "
        "your own findings as a document, or related code that may help with analysis. "
        "DO NOT include every file scanned during investigation (must be FULL absolute paths - DO NOT SHORTEN)."
    "files_checked": (
        "List of files you've examined so far in the investigation (cumulative list). "
        "Include all files you've looked at, even if they turned out to be irrelevant."
    ),
    "error_context": "Stack trace, snippet from logs, or additional error context. For very large text you MUST instead "
    "save the context as a temporary file within the project folder and share it as a FULL absolute file path - DO NOT SHORTEN - "
    "reference in the files parameter.",
    "images": "Optional images showing error screens, UI issues, log displays, or visual debugging information",
    "relevant_files": (
        "List of files that are definitely related to the issue (subset of files_checked). "
        "Only include files that contain code directly related to the problem."
    ),
    "relevant_methods": (
        "List of specific methods/functions that are involved in the issue. "
        "Format: 'ClassName.methodName' or 'functionName'"
    ),
    "hypothesis": (
        "Your current working hypothesis about the root cause. This can be updated/revised "
        "as the investigation progresses."
    ),
    "confidence": "Your confidence level in the current hypothesis: 'low', 'medium', or 'high'",
    "backtrack_from_step": "If you need to revise a previous finding, which step number to backtrack from",
    "continuation_id": "Thread continuation ID for multi-turn investigation sessions",
    "images": (
        "Optional images showing error screens, UI issues, log displays, or visual debugging information "
        "that help understand the issue (must be FULL absolute paths - DO NOT SHORTEN)"
    ),
}

# Field descriptions for the final debug request
DEBUG_FIELD_DESCRIPTIONS = {
    "initial_issue": "The original issue description that started the investigation",
    "investigation_summary": "Complete summary of the systematic investigation performed",
    "findings": "Consolidated findings from all investigation steps",
    "files": "Essential files identified during investigation (must be FULL absolute paths - DO NOT SHORTEN)",
    "error_context": "Stack trace, logs, or error context discovered during investigation",
    "relevant_methods": "List of methods/functions identified as involved in the issue",
    "hypothesis": "Final hypothesis about the root cause after investigation",
    "images": "Optional images showing error screens, UI issues, or visual debugging information",
}


class DebugIssueRequest(ToolRequest):
    """Request model for debug tool"""
class DebugInvestigationRequest(ToolRequest):
    """Request model for debug investigation steps"""

    prompt: str = Field(..., description=DEBUG_FIELD_DESCRIPTIONS["prompt"])
    findings: Optional[str] = Field(None, description=DEBUG_FIELD_DESCRIPTIONS["findings"])
    files: Optional[list[str]] = Field(None, description=DEBUG_FIELD_DESCRIPTIONS["files"])
    error_context: Optional[str] = Field(None, description=DEBUG_FIELD_DESCRIPTIONS["error_context"])
    images: Optional[list[str]] = Field(None, description=DEBUG_FIELD_DESCRIPTIONS["images"])
    # Required fields for each investigation step
    step: str = Field(..., description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["step"])
    step_number: int = Field(..., description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["step_number"])
    total_steps: int = Field(..., description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["total_steps"])
    next_step_required: bool = Field(..., description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["next_step_required"])

    # Investigation tracking fields
    findings: str = Field(..., description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["findings"])
    files_checked: list[str] = Field(
        default_factory=list, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["files_checked"]
    )
    relevant_files: list[str] = Field(
        default_factory=list, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["relevant_files"]
    )
    relevant_methods: list[str] = Field(
        default_factory=list, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["relevant_methods"]
    )
    hypothesis: Optional[str] = Field(None, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["hypothesis"])
    confidence: Optional[str] = Field("low", description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["confidence"])

    # Optional backtracking field
    backtrack_from_step: Optional[int] = Field(
        None, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["backtrack_from_step"]
    )

    # Optional continuation field
    continuation_id: Optional[str] = Field(None, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["continuation_id"])

    # Optional images for visual debugging
    images: Optional[list[str]] = Field(default=None, description=DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["images"])

    # Override inherited fields to exclude them
    model: Optional[str] = Field(default=None, exclude=True)
    temperature: Optional[float] = Field(default=None, exclude=True)
    thinking_mode: Optional[str] = Field(default=None, exclude=True)
    use_websearch: Optional[bool] = Field(default=None, exclude=True)


class DebugIssueTool(BaseTool):
    """Advanced debugging and root cause analysis tool"""
    """Advanced debugging tool with systematic self-investigation"""

    def __init__(self):
        super().__init__()
        self.investigation_history = []
        self.consolidated_findings = {
            "files_checked": set(),
            "relevant_files": set(),
            "relevant_methods": set(),
            "findings": [],
            "hypotheses": [],
            "images": [],
        }

    def get_name(self) -> str:
        return "debug"

    def get_description(self) -> str:
        return (
            "DEBUG & ROOT CAUSE ANALYSIS - Expert debugging for complex issues with systematic investigation support. "
            "Use this when you need to debug code, find out why something is failing, identify root causes, "
            "trace errors, or diagnose issues. "
            "MANDATORY: Claude, you MUST first think deeply and follow these instructions when using this tool. "
            "SYSTEMATIC INVESTIGATION WORKFLOW: "
            "You MUST begin by thinking hard and performing a thorough investigation using a systematic approach. "
            "First understand the issue, find the code that may be causing it or code that is breaking, as well as any related code that could have caused this as a side effect. "
            "You MUST maintain detailed investigation notes while you perform your analysis, "
            "updating them as you perform step-by-step analysis of the code, trying to determine the actual root cause and understanding how a minimal, appropriate fix can be found. "
            "These notes MUST contain functions, methods, and files visited OR determined to be part of the problem. You MUST update them and remove any references you find to be irrelevant during the investigation. "
            "Once complete, you MUST provide Zen's debug tool with these notes passed into the files parameter. "
            "1. INVESTIGATE SYSTEMATICALLY: You MUST think and use a methodical approach to trace through error reports, "
            "examine code, and gather evidence step by step. "
            "2. DOCUMENT FINDINGS: Maintain detailed investigation notes to "
            "keep the user informed during your initial investigation. This investigation MUST be shared with this tool for the assistant "
            "to be able to help more effectively. "
            "3. USE TRACER TOOL: For complex method calls, class references, or side effects, use Zen's tracer tool and include its output as part of the "
            "prompt or additional context. "
            "4. COLLECT EVIDENCE: Document important discoveries and validation attempts. "
            "5. PROVIDE COMPREHENSIVE FINDINGS: Pass complete findings to this tool for expert analysis. "
            "INVESTIGATION METHODOLOGY: "
            "- Start with error messages/symptoms and work backwards to root cause "
            "- Examine code flow and identify potential failure points "
            "- Use the tracer tool for complex method interactions and dependencies if and as needed, but continue with the investigation after using it "
            "- Test hypotheses against actual code and logs and confirm the idea holds "
            "- Document everything systematically "
            "- CRITICAL: If investigation yields no concrete evidence of a bug, consider that the reported issue may not exist as described and gather more information through questioning "
            "ESSENTIAL FILES ONLY: Include only files (documents, code etc.) directly related to the issue. "
            "Focus on quality over quantity for assistant model analysis. "
            "STRUCTURED OUTPUT: Assistant models return JSON responses with hypothesis "
            "ranking, evidence correlation, and actionable fixes. "
            "Choose thinking_mode based on issue complexity: 'low' for simple errors, "
            "'medium' for standard debugging (default), 'high' for complex system issues, "
            "'max' for extremely challenging bugs requiring deepest analysis. "
            "Note: If you're not currently using a top-tier model such as Opus 4 or above, these tools can provide enhanced capabilities."
            "DEBUG & ROOT CAUSE ANALYSIS - Systematic self-investigation followed by expert analysis. "
            "This tool guides you through a step-by-step investigation process where you:\n\n"
            "1. Start with step 1: describe the issue to investigate\n"
            "2. Continue with investigation steps: examine code, trace errors, test hypotheses\n"
            "3. Track findings, relevant files, and methods throughout\n"
            "4. Update hypotheses as understanding evolves\n"
            "5. Backtrack and revise findings when needed\n"
            "6. Once investigation is complete, receive expert analysis\n\n"
            "The tool enforces systematic investigation methodology:\n"
            "- Methodical code examination and evidence collection\n"
            "- Hypothesis formation and validation\n"
            "- File and method tracking for context\n"
            "- Confidence assessment and revision capabilities\n\n"
            "Perfect for: complex bugs, mysterious errors, performance issues, "
            "race conditions, memory leaks, integration problems."
        )

    def get_input_schema(self) -> dict[str, Any]:
        schema = {
            "type": "object",
            "properties": {
                "prompt": {
                # Investigation step fields
                "step": {
                    "type": "string",
                    "description": DEBUG_FIELD_DESCRIPTIONS["prompt"],
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["step"],
                },
                "step_number": {
                    "type": "integer",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["step_number"],
                    "minimum": 1,
                },
                "total_steps": {
                    "type": "integer",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["total_steps"],
                    "minimum": 1,
                },
                "next_step_required": {
                    "type": "boolean",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["next_step_required"],
                },
                "model": self.get_model_field_schema(),
                "findings": {
                    "type": "string",
                    "description": DEBUG_FIELD_DESCRIPTIONS["findings"],
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["findings"],
                },
                "files": {
                "files_checked": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": DEBUG_FIELD_DESCRIPTIONS["files"],
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["files_checked"],
                },
                "error_context": {
                "relevant_files": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["relevant_files"],
                },
                "relevant_methods": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["relevant_methods"],
                },
                "hypothesis": {
                    "type": "string",
                    "description": DEBUG_FIELD_DESCRIPTIONS["error_context"],
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["hypothesis"],
                },
                "confidence": {
                    "type": "string",
                    "enum": ["low", "medium", "high"],
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["confidence"],
                },
                "backtrack_from_step": {
                    "type": "integer",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["backtrack_from_step"],
                    "minimum": 1,
                },
                "continuation_id": {
                    "type": "string",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["continuation_id"],
                },
                "images": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": DEBUG_FIELD_DESCRIPTIONS["images"],
                },
                "temperature": {
                    "type": "number",
                    "description": "Temperature (0-1, default 0.2 for accuracy)",
                    "minimum": 0,
                    "maximum": 1,
                },
                "thinking_mode": {
                    "type": "string",
                    "enum": ["minimal", "low", "medium", "high", "max"],
                    "description": "Thinking depth: minimal (0.5% of model max), low (8%), medium (33%), high (67%), max (100% of model max)",
                },
                "use_websearch": {
                    "type": "boolean",
                    "description": "Enable web search for documentation, best practices, and current information. Particularly useful for: brainstorming sessions, architectural design discussions, exploring industry best practices, working with specific frameworks/technologies, researching solutions to complex problems, or when current documentation and community insights would enhance the analysis.",
                    "default": True,
                },
                "continuation_id": {
                    "type": "string",
                    "description": "Thread continuation ID for multi-turn conversations. Can be used to continue conversations across different tools. Only provide this if continuing a previous conversation thread.",
                    "description": DEBUG_INVESTIGATION_FIELD_DESCRIPTIONS["images"],
                },
            },
            "required": ["prompt"] + (["model"] if self.is_effective_auto_mode() else []),
            # Required fields for investigation
            "required": ["step", "step_number", "total_steps", "next_step_required", "findings"],
        }

        return schema

    def get_system_prompt(self) -> str:
@@ -171,8 +225,6 @@ class DebugIssueTool(BaseTool):
    def get_default_temperature(self) -> float:
        return TEMPERATURE_ANALYTICAL

    # Line numbers are enabled by default from base class for precise error location

    def get_model_category(self) -> "ToolModelCategory":
        """Debug requires deep analysis and reasoning"""
        from tools.models import ToolModelCategory
@@ -180,138 +232,342 @@ class DebugIssueTool(BaseTool):
        return ToolModelCategory.EXTENDED_REASONING

    def get_request_model(self):
        return DebugIssueRequest
        return DebugInvestigationRequest

    async def prepare_prompt(self, request: DebugIssueRequest) -> str:
        """Prepare the debugging prompt"""
        # Check for prompt.txt in files
        prompt_content, updated_files = self.handle_prompt_file(request.files)
    def requires_model(self) -> bool:
        """
        Debug tool manages its own model interactions.
        It doesn't need a model during investigation steps, only for final analysis.
        """
        return False

        # If prompt.txt was found, use it as prompt or error_context
        if prompt_content:
            if not request.prompt or request.prompt == "":
                request.prompt = prompt_content
    async def execute(self, arguments: dict[str, Any]) -> list:
        """
        Override execute to implement self-investigation pattern.

        Investigation Flow:
        1. Claude calls debug with investigation steps
        2. Tool tracks findings, files, methods progressively
        3. Once investigation is complete, tool calls AI model for expert analysis
        4. Returns structured response combining investigation + expert analysis
        """
        from mcp.types import TextContent

        from utils.conversation_memory import add_turn, create_thread

        try:
            # Validate request
            request = DebugInvestigationRequest(**arguments)

            # Adjust total steps if needed
            if request.step_number > request.total_steps:
                request.total_steps = request.step_number

            # Handle continuation
            continuation_id = request.continuation_id

            # Create thread for first step
            if not continuation_id and request.step_number == 1:
                continuation_id = create_thread("debug", arguments)
                # Store initial issue description
                self.initial_issue = request.step

            # Handle backtracking first if requested
            if request.backtrack_from_step:
                # Remove findings after the backtrack point
                self.investigation_history = [
                    s for s in self.investigation_history if s["step_number"] < request.backtrack_from_step
                ]
                # Reprocess consolidated findings to match truncated history
                self._reprocess_consolidated_findings()

                # Log if step number needs correction
                expected_step_number = len(self.investigation_history) + 1
                if request.step_number != expected_step_number:
                    logger.debug(
                        f"Step number adjusted from {request.step_number} to {expected_step_number} after backtracking"
                    )

            # Process investigation step
            step_data = {
                "step": request.step,
                "step_number": request.step_number,
                "findings": request.findings,
                "files_checked": request.files_checked,
                "relevant_files": request.relevant_files,
                "relevant_methods": request.relevant_methods,
                "hypothesis": request.hypothesis,
                "confidence": request.confidence,
                "images": request.images,
            }

            # Store in history
            self.investigation_history.append(step_data)

            # Update consolidated findings
            self.consolidated_findings["files_checked"].update(request.files_checked)
            self.consolidated_findings["relevant_files"].update(request.relevant_files)
            self.consolidated_findings["relevant_methods"].update(request.relevant_methods)
            self.consolidated_findings["findings"].append(f"Step {request.step_number}: {request.findings}")
            if request.hypothesis:
                self.consolidated_findings["hypotheses"].append(
                    {"step": request.step_number, "hypothesis": request.hypothesis, "confidence": request.confidence}
                )
            if request.images:
                self.consolidated_findings["images"].extend(request.images)

            # Build response
            response_data = {
                "status": "investigation_in_progress",
                "step_number": request.step_number,
                "total_steps": request.total_steps,
                "next_step_required": request.next_step_required,
                "investigation_status": {
                    "files_checked": len(self.consolidated_findings["files_checked"]),
                    "relevant_files": len(self.consolidated_findings["relevant_files"]),
                    "relevant_methods": len(self.consolidated_findings["relevant_methods"]),
                    "hypotheses_formed": len(self.consolidated_findings["hypotheses"]),
                    "images_collected": len(set(self.consolidated_findings["images"])),
                    "current_confidence": request.confidence,
                },
                "output": {
                    "instructions": "Continue systematic investigation. Present findings clearly and proceed to next step if required.",
                    "format": "systematic_investigation",
                },
            }

            if continuation_id:
                response_data["continuation_id"] = continuation_id

            # If investigation is complete, call the AI model for expert analysis
            if not request.next_step_required:
                response_data["status"] = "calling_expert_analysis"
                response_data["investigation_complete"] = True

                # Prepare consolidated investigation summary
                investigation_summary = self._prepare_investigation_summary()

                # Call the AI model with full context
                expert_analysis = await self._call_expert_analysis(
                    initial_issue=getattr(self, "initial_issue", request.step),
                    investigation_summary=investigation_summary,
                    relevant_files=list(self.consolidated_findings["relevant_files"]),
                    relevant_methods=list(self.consolidated_findings["relevant_methods"]),
                    final_hypothesis=request.hypothesis,
                    error_context=self._extract_error_context(),
                    images=list(set(self.consolidated_findings["images"])),  # Unique images
                    model_info=arguments.get("_model_context"),
                    model_override=arguments.get("model"),  # Pass model selection from final step
                )

                # Combine investigation and expert analysis
                response_data["expert_analysis"] = expert_analysis
                response_data["complete_investigation"] = {
                    "initial_issue": getattr(self, "initial_issue", request.step),
                    "steps_taken": len(self.investigation_history),
                    "files_examined": list(self.consolidated_findings["files_checked"]),
                    "relevant_files": list(self.consolidated_findings["relevant_files"]),
                    "relevant_methods": list(self.consolidated_findings["relevant_methods"]),
                    "investigation_summary": investigation_summary,
                }
                response_data["next_steps"] = (
                    "Investigation complete with expert analysis. Present the findings, hypotheses, "
                    "and recommended fixes to the user. Focus on the most likely root cause and "
                    "provide actionable implementation guidance."
                )
            else:
                request.error_context = prompt_content
                response_data["next_steps"] = (
                    f"Continue investigation with step {request.step_number + 1}. "
                    f"Focus on: examining relevant code, testing hypotheses, gathering evidence."
                )

        # Check user input sizes at MCP transport boundary (before adding internal content)
        size_check = self.check_prompt_size(request.prompt)
        if size_check:
            from tools.models import ToolOutput
            # Store in conversation memory
            if continuation_id:
                add_turn(
                    thread_id=continuation_id,
                    role="assistant",
                    content=json.dumps(response_data, indent=2),
                    tool_name="debug",
                    files=list(self.consolidated_findings["relevant_files"]),
                    images=request.images,
                )

            raise ValueError(f"MCP_SIZE_CHECK:{ToolOutput(**size_check).model_dump_json()}")
            return [TextContent(type="text", text=json.dumps(response_data, indent=2))]

        if request.error_context:
            size_check = self.check_prompt_size(request.error_context)
            if size_check:
                from tools.models import ToolOutput
        except Exception as e:
            logger.error(f"Error in debug investigation: {e}", exc_info=True)
            error_data = {
                "status": "investigation_failed",
                "error": str(e),
                "step_number": arguments.get("step_number", 0),
            }
            return [TextContent(type="text", text=json.dumps(error_data, indent=2))]

                raise ValueError(f"MCP_SIZE_CHECK:{ToolOutput(**size_check).model_dump_json()}")
    def _reprocess_consolidated_findings(self):
        """Reprocess consolidated findings after backtracking"""
        self.consolidated_findings = {
            "files_checked": set(),
            "relevant_files": set(),
            "relevant_methods": set(),
            "findings": [],
            "hypotheses": [],
            "images": [],
        }

        # Update request files list
        if updated_files is not None:
            request.files = updated_files
        for step in self.investigation_history:
            self.consolidated_findings["files_checked"].update(step.get("files_checked", []))
            self.consolidated_findings["relevant_files"].update(step.get("relevant_files", []))
            self.consolidated_findings["relevant_methods"].update(step.get("relevant_methods", []))
            self.consolidated_findings["findings"].append(f"Step {step['step_number']}: {step['findings']}")
            if step.get("hypothesis"):
                self.consolidated_findings["hypotheses"].append(
                    {
                        "step": step["step_number"],
                        "hypothesis": step["hypothesis"],
                        "confidence": step.get("confidence", "low"),
                    }
                )
            if step.get("images"):
                self.consolidated_findings["images"].extend(step["images"])

        # File size validation happens at MCP boundary in server.py
    def _prepare_investigation_summary(self) -> str:
        """Prepare a comprehensive summary of the investigation"""
        summary_parts = [
            "=== SYSTEMATIC INVESTIGATION SUMMARY ===",
            f"Total steps: {len(self.investigation_history)}",
            f"Files examined: {len(self.consolidated_findings['files_checked'])}",
            f"Relevant files identified: {len(self.consolidated_findings['relevant_files'])}",
            f"Methods/functions involved: {len(self.consolidated_findings['relevant_methods'])}",
            "",
            "=== INVESTIGATION PROGRESSION ===",
        ]

        # Build context sections
        context_parts = [f"=== ISSUE DESCRIPTION ===\n{request.prompt}\n=== END DESCRIPTION ==="]
        for finding in self.consolidated_findings["findings"]:
            summary_parts.append(finding)

        if request.findings:
            context_parts.append(f"\n=== CLAUDE'S INVESTIGATION FINDINGS ===\n{request.findings}\n=== END FINDINGS ===")

        if request.error_context:
            context_parts.append(f"\n=== ERROR CONTEXT/STACK TRACE ===\n{request.error_context}\n=== END CONTEXT ===")

        # Add relevant files if provided
        if request.files:
            # Use centralized file processing logic
            continuation_id = getattr(request, "continuation_id", None)
            file_content, processed_files = self._prepare_file_content_for_prompt(
                request.files, continuation_id, "Code"
        if self.consolidated_findings["hypotheses"]:
            summary_parts.extend(
                [
                    "",
                    "=== HYPOTHESIS EVOLUTION ===",
                ]
            )
            self._actually_processed_files = processed_files
            for hyp in self.consolidated_findings["hypotheses"]:
                summary_parts.append(f"Step {hyp['step']} ({hyp['confidence']} confidence): {hyp['hypothesis']}")

        return "\n".join(summary_parts)

    def _extract_error_context(self) -> Optional[str]:
        """Extract error context from investigation findings"""
        error_patterns = ["error", "exception", "stack trace", "traceback", "failure"]
        error_context_parts = []

        for finding in self.consolidated_findings["findings"]:
            if any(pattern in finding.lower() for pattern in error_patterns):
                error_context_parts.append(finding)

        return "\n".join(error_context_parts) if error_context_parts else None

    async def _call_expert_analysis(
        self,
        initial_issue: str,
        investigation_summary: str,
        relevant_files: list[str],
        relevant_methods: list[str],
        final_hypothesis: Optional[str],
        error_context: Optional[str],
        images: list[str],
        model_info: Optional[Any] = None,
        model_override: Optional[str] = None,
    ) -> dict:
        """Call AI model for expert analysis of the investigation"""
        # Prepare the debug prompt with all investigation context
        prompt_parts = [
            f"=== ISSUE DESCRIPTION ===\n{initial_issue}\n=== END DESCRIPTION ===",
            f"\n=== CLAUDE'S INVESTIGATION FINDINGS ===\n{investigation_summary}\n=== END FINDINGS ===",
        ]

        if error_context:
            prompt_parts.append(f"\n=== ERROR CONTEXT/STACK TRACE ===\n{error_context}\n=== END CONTEXT ===")

        if relevant_methods:
            prompt_parts.append(
                "\n=== RELEVANT METHODS/FUNCTIONS ===\n"
                + "\n".join(f"- {method}" for method in relevant_methods)
                + "\n=== END METHODS ==="
            )

        if final_hypothesis:
            prompt_parts.append(f"\n=== FINAL HYPOTHESIS ===\n{final_hypothesis}\n=== END HYPOTHESIS ===")

        if images:
            prompt_parts.append(
                "\n=== VISUAL DEBUGGING INFORMATION ===\n"
                + "\n".join(f"- {img}" for img in images)
                + "\n=== END VISUAL INFORMATION ==="
            )

        # Add file content if we have relevant files
        if relevant_files:
            file_content, _ = self._prepare_file_content_for_prompt(relevant_files, None, "Essential debugging files")
            if file_content:
                context_parts.append(
                prompt_parts.append(
                    f"\n=== ESSENTIAL FILES FOR DEBUGGING ===\n{file_content}\n=== END ESSENTIAL FILES ==="
                )

        full_context = "\n".join(context_parts)
        full_prompt = "\n".join(prompt_parts)

        # Check token limits
        self._validate_token_limit(full_context, "Context")
        # Get appropriate model and provider
        from config import DEFAULT_MODEL
        from providers.registry import ModelProviderRegistry

        # Add web search instruction if enabled
        websearch_instruction = self.get_websearch_instruction(
            request.use_websearch,
            """When debugging issues, consider if searches for these would help:
- The exact error message to find known solutions
- Framework-specific error codes and their meanings
- Similar issues in forums, GitHub issues, or Stack Overflow
- Workarounds and patches for known bugs
- Version-specific issues and compatibility problems""",
        )
        model_name = model_override or DEFAULT_MODEL  # Use override if provided
        provider = ModelProviderRegistry.get_provider_for_model(model_name)

        # Combine everything
        full_prompt = f"""{self.get_system_prompt()}{websearch_instruction}
        if not provider:
            return {"error": f"No provider available for model {model_name}", "status": "provider_error"}

{full_context}
        # Generate AI response
        try:
            full_analysis_prompt = f"{self.get_system_prompt()}\n\n{full_prompt}\n\nPlease debug this issue following the structured format in the system prompt."

Please debug this issue following the structured format in the system prompt.
Focus on finding the root cause and providing actionable solutions."""
            # Prepare generation kwargs
            generation_kwargs = {
                "prompt": full_analysis_prompt,
                "model_name": model_name,
                "system_prompt": "",  # Already included in prompt
                "temperature": self.get_default_temperature(),
                "thinking_mode": "high",  # High thinking for debug analysis
            }

        return full_prompt
            # Add images if available
            if images:
                generation_kwargs["images"] = images

    def _get_model_name(self, model_info: Optional[dict]) -> str:
        """Extract friendly model name from model info."""
        if model_info and model_info.get("model_response"):
            return model_info["model_response"].friendly_name or "the model"
        return "the model"
            model_response = provider.generate_content(**generation_kwargs)

    def _generate_systematic_next_steps(self, model_name: str) -> str:
        """Generate next steps for systematic investigation completion."""
        return f"""**Expert Analysis Complete**
            if model_response.content:
                # Try to parse as JSON
                try:
                    analysis_result = json.loads(model_response.content.strip())
                    return analysis_result
                except json.JSONDecodeError:
                    # Return as text if not valid JSON
                    return {
                        "status": "analysis_complete",
                        "raw_analysis": model_response.content,
                        "parse_error": "Response was not valid JSON",
                    }
            else:
                return {"error": "No response from model", "status": "empty_response"}

{model_name} has analyzed your systematic investigation findings.
        except Exception as e:
            logger.error(f"Error calling expert analysis: {e}", exc_info=True)
            return {"error": str(e), "status": "analysis_error"}

**Next Steps:**
1. **UPDATE INVESTIGATION DOCUMENT**: Add the expert analysis to your DEBUGGING_*.md file
2. **REVIEW HYPOTHESES**: Examine the ranked hypotheses and evidence validation
3. **IMPLEMENT FIXES**: Apply recommended minimal fixes in order of likelihood
4. **VALIDATE CHANGES**: Test each fix thoroughly to ensure no regressions
5. **DOCUMENT RESOLUTION**: Update investigation document with final resolution"""
    # Stub implementations for base class requirements
    async def prepare_prompt(self, request) -> str:
        return ""  # Not used - execute() is overridden

    def _generate_standard_analysis_steps(self, model_name: str) -> str:
        """Generate next steps for standard analysis completion."""
        return f"""**Expert Analysis Complete**

{model_name} has analyzed your investigation findings.

**Next Steps:**
1. **REVIEW HYPOTHESES**: Examine the ranked hypotheses and evidence
2. **IMPLEMENT FIXES**: Apply recommended minimal fixes in order of likelihood
3. **VALIDATE CHANGES**: Test each fix thoroughly to ensure no regressions"""

    def _generate_general_analysis_steps(self, model_name: str) -> str:
        """Generate next steps for general analysis responses."""
        return f"""**Analysis from {model_name}**

**Next Steps:** Continue your systematic investigation based on the guidance provided, then return
with comprehensive findings for expert analysis."""

    def format_response(self, response: str, request: DebugIssueRequest, model_info: Optional[dict] = None) -> str:
        """Format the debugging response for Claude to present to user"""
        # The base class automatically handles structured responses like 'files_required_to_continue'
        # and 'analysis_complete' via SPECIAL_STATUS_MODELS, so we only handle normal text responses here

        model_name = self._get_model_name(model_info)

        # For normal text responses, provide general guidance
        next_steps = self._generate_general_analysis_steps(model_name)

        return f"""{response}

---

{next_steps}"""
    def format_response(self, response: str, request, model_info: dict = None) -> str:
        return response  # Not used - execute() is overridden