🚀 Major Enhancement: Workflow-Based Tool Architecture v5.5.0 (#95)

* WIP: new workflow architecture

* WIP: further improvements and cleanup

* WIP: cleanup and docs, replace old tool with new

* WIP: new planner implementation using workflow

* WIP: precommit tool working as a workflow instead of a basic tool
Supports passing use_assistant_model=False to skip external models entirely and use Claude only

* WIP: precommit workflow version swapped with old

* WIP: codereview

* WIP: replaced codereview

* WIP: replaced refactor

* WIP: workflow for thinkdeep

* WIP: ensure files get embedded correctly

* WIP: thinkdeep replaced with workflow version

* WIP: improved messaging when an external model's response is received

* WIP: analyze tool swapped

* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only

* WIP: fixed get_completion_next_steps_message missing param

* Fixed tests
Request files consistently

* Fixed tests

* New testgen workflow tool
Updated docs

* Swap testgen workflow

* Fix CI test failures by excluding API-dependent tests

- Update GitHub Actions workflow to exclude simulation tests that require API keys
- Fix collaboration tests to properly mock workflow tool expert analysis calls
- Update test assertions to handle new workflow tool response format
- Ensure unit tests run without external API dependencies in CI

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* WIP - Update tests to match new tools

---------

Co-authored-by: Claude <noreply@anthropic.com>
Beehive Innovations
2025-06-21 00:08:11 +04:00
committed by GitHub
parent 4dae6e457e
commit 69a3121452
76 changed files with 17111 additions and 7725 deletions

@@ -1,316 +1,671 @@
"""
CodeReview Workflow tool - Systematic code review with step-by-step analysis

This tool provides a structured workflow for comprehensive code review and analysis.
It guides Claude through systematic investigation steps with forced pauses between each step
to ensure thorough code examination, issue identification, and quality assessment before proceeding.
The tool supports complex review scenarios including security analysis, performance evaluation,
and architectural assessment.

Key features:
- Step-by-step code review workflow with progress tracking
- Context-aware file embedding (references during investigation, full content for analysis)
- Automatic issue tracking with severity classification
- Expert analysis integration with external models
- Support for focused reviews (security, performance, architecture)
- Confidence-based workflow optimization
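
Example step-1 arguments (illustrative only; the exact invocation format
depends on the MCP client wired to this tool):

    {
        "step": "Review the authentication module for security issues",
        "step_number": 1,
        "total_steps": 3,
        "next_step_required": true,
        "findings": "Initial plan: examine input validation and session handling",
        "relevant_files": ["/absolute/path/to/auth.py"]
    }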
"""
import logging
from typing import TYPE_CHECKING, Any, Literal, Optional

from pydantic import Field, model_validator

if TYPE_CHECKING:
from tools.models import ToolModelCategory

from config import TEMPERATURE_ANALYTICAL
from systemprompts import CODEREVIEW_PROMPT
from tools.shared.base_models import WorkflowRequest

from .workflow.base import WorkflowTool
logger = logging.getLogger(__name__)
# Tool-specific field descriptions for code review workflow
CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS = {
"step": (
"Describe what you're currently investigating for code review by thinking deeply about the code structure, "
"patterns, and potential issues. In step 1, clearly state your review plan and begin forming a systematic "
"approach after thinking carefully about what needs to be analyzed. CRITICAL: Remember to thoroughly examine "
"code quality, security implications, performance concerns, and architectural patterns. Consider not only "
"obvious bugs and issues but also subtle concerns like over-engineering, unnecessary complexity, design "
"patterns that could be simplified, areas where architecture might not scale well, missing abstractions, "
"and ways to reduce complexity while maintaining functionality. Map out the codebase structure, understand "
"the business logic, and identify areas requiring deeper analysis. In all later steps, continue exploring "
"with precision: trace dependencies, verify assumptions, and adapt your understanding as you uncover more evidence."
),
"step_number": (
"The index of the current step in the code review sequence, beginning at 1. Each step should build upon or "
"revise the previous one."
),
"total_steps": (
"Your current estimate for how many steps will be needed to complete the code review. "
"Adjust as new findings emerge."
),
"next_step_required": (
"Set to true if you plan to continue the investigation with another step. False means you believe the "
"code review analysis is complete and ready for expert validation."
),
"findings": (
"Summarize everything discovered in this step about the code being reviewed. Include analysis of code quality, "
"security concerns, performance issues, architectural patterns, design decisions, potential bugs, code smells, "
"and maintainability considerations. Be specific and avoid vague language—document what you now know about "
"the code and how it affects your assessment. IMPORTANT: Document both positive findings (good patterns, "
"proper implementations, well-designed components) and concerns (potential issues, anti-patterns, security "
"risks, performance bottlenecks). In later steps, confirm or update past findings with additional evidence."
),
"files_checked": (
"List all files (as absolute paths, do not clip or shrink file names) examined during the code review "
"investigation so far. Include even files ruled out or found to be unrelated, as this tracks your "
"exploration path."
),
"relevant_files": (
"Subset of files_checked (as full absolute paths) that contain code directly relevant to the review or "
"contain significant issues, patterns, or examples worth highlighting. Only list those that are directly "
"tied to important findings, security concerns, performance issues, or architectural decisions. This could "
"include core implementation files, configuration files, or files with notable patterns."
),
"relevant_context": (
"List methods, functions, classes, or modules that are central to the code review findings, in the format "
"'ClassName.methodName', 'functionName', or 'module.ClassName'. Prioritize those that contain issues, "
"demonstrate patterns, show security concerns, or represent key architectural decisions."
),
"issues_found": (
"List of issues identified during the investigation. Each issue should be a dictionary with 'severity' "
"(critical, high, medium, low) and 'description' fields. Include security vulnerabilities, performance "
"bottlenecks, code quality issues, architectural concerns, maintainability problems, over-engineering, "
"unnecessary complexity, etc."
),
"confidence": (
"Indicate your current confidence in the code review assessment. Use: 'exploring' (starting analysis), 'low' "
"(early investigation), 'medium' (some evidence gathered), 'high' (strong evidence), 'certain' (only when "
"the code review is thoroughly complete and all significant issues are identified). Do NOT use 'certain' "
unless the code review is comprehensively complete; if not 100% sure, use 'high' instead. Using 'certain'
"prevents additional expert analysis."
),
"backtrack_from_step": (
"If an earlier finding or assessment needs to be revised or discarded, specify the step number from which to "
"start over. Use this to acknowledge investigative dead ends and correct the course."
),
"images": (
"Optional images of architecture diagrams, UI mockups, design documents, or visual references "
"for code review context"
"Optional list of absolute paths to architecture diagrams, UI mockups, design documents, or visual references "
"that help with code review context. Only include if they materially assist understanding or assessment."
),
"review_type": "Type of review to perform",
"focus_on": "Specific aspects to focus on, or additional context that would help understand areas of concern",
"standards": "Coding standards to enforce",
"severity_filter": "Minimum severity level to report",
"review_type": "Type of review to perform (full, security, performance, quick)",
"focus_on": "Specific aspects to focus on or additional context that would help understand areas of concern",
"standards": "Coding standards to enforce during the review",
"severity_filter": "Minimum severity level to report on the issues found",
}
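
# Example shape of one "issues_found" entry (illustrative values only):
#   {"severity": "high", "description": "SQL query built via string concatenation in UserStore.find"}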
class CodeReviewRequest(WorkflowRequest):
"""Request model for code review workflow investigation steps"""
# Required fields for each investigation step
step: str = Field(..., description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["step"])
step_number: int = Field(..., description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["step_number"])
total_steps: int = Field(..., description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["total_steps"])
next_step_required: bool = Field(..., description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["next_step_required"])
# Investigation tracking fields
findings: str = Field(..., description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["findings"])
files_checked: list[str] = Field(
default_factory=list, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["files_checked"]
)
relevant_files: list[str] = Field(
default_factory=list, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["relevant_files"]
)
relevant_context: list[str] = Field(
default_factory=list, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["relevant_context"]
)
issues_found: list[dict] = Field(
default_factory=list, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["issues_found"]
)
confidence: Optional[str] = Field("low", description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["confidence"])
# Optional backtracking field
backtrack_from_step: Optional[int] = Field(
None, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["backtrack_from_step"]
)
# Optional images for visual context
images: Optional[list[str]] = Field(default=None, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["images"])
# Code review-specific fields (only used in step 1 to initialize)
review_type: Optional[Literal["full", "security", "performance", "quick"]] = Field(
"full", description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["review_type"]
)
focus_on: Optional[str] = Field(None, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["focus_on"])
standards: Optional[str] = Field(None, description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["standards"])
severity_filter: Optional[Literal["critical", "high", "medium", "low", "all"]] = Field(
"all", description=CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["severity_filter"]
)
# Override inherited fields to exclude them from schema (except model which needs to be available)
temperature: Optional[float] = Field(default=None, exclude=True)
thinking_mode: Optional[str] = Field(default=None, exclude=True)
use_websearch: Optional[bool] = Field(default=None, exclude=True)
@model_validator(mode="after")
def validate_step_one_requirements(self):
"""Ensure step 1 has required relevant_files field."""
if self.step_number == 1 and not self.relevant_files:
raise ValueError("Step 1 requires 'relevant_files' field to specify code files or directories to review")
return self
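
# Illustration (hypothetical values): a step-1 request constructed without
# "relevant_files" fails validation, e.g.
#   CodeReviewRequest(step="plan", step_number=1, total_steps=2,
#                     next_step_required=True, findings="initial survey")
#   raises ValueError("Step 1 requires 'relevant_files' field to specify code files ...")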
class CodeReviewTool(WorkflowTool):
"""
Code Review workflow tool for step-by-step code review and expert analysis.
This tool implements a structured code review workflow that guides users through
methodical investigation steps, ensuring thorough code examination, issue identification,
and quality assessment before reaching conclusions. It supports complex review scenarios
including security audits, performance analysis, architectural review, and maintainability assessment.
"""
def __init__(self):
super().__init__()
self.initial_request = None
self.review_config = {}
def get_name(self) -> str:
return "codereview"
def get_description(self) -> str:
return (
"PROFESSIONAL CODE REVIEW - Comprehensive analysis for bugs, security, and quality. "
"Supports both individual files and entire directories/projects. "
"Use this when you need to review code, check for issues, find bugs, or perform security audits. "
"ALSO use this to validate claims about code, verify code flow and logic, confirm assertions, "
"cross-check functionality, or investigate how code actually behaves when you need to be certain. "
"I'll identify issues by severity (Critical→High→Medium→Low) with specific fixes. "
"Supports focused reviews: security, performance, or quick checks. "
"Choose thinking_mode based on review scope: 'low' for small code snippets, "
"'medium' for standard files/modules (default), 'high' for complex systems/architectures, "
"'max' for critical security audits or large codebases requiring deepest analysis. "
"Note: If you're not currently using a top-tier model such as Opus 4 or above, these tools "
"can provide enhanced capabilities."
"COMPREHENSIVE CODE REVIEW WORKFLOW - Step-by-step code review with expert analysis. "
"This tool guides you through a systematic investigation process where you:\\n\\n"
"1. Start with step 1: describe your code review investigation plan\\n"
"2. STOP and investigate code structure, patterns, and potential issues\\n"
"3. Report findings in step 2 with concrete evidence from actual code analysis\\n"
"4. Continue investigating between each step\\n"
"5. Track findings, relevant files, and issues throughout\\n"
"6. Update assessments as understanding evolves\\n"
"7. Once investigation is complete, receive expert analysis\\n\\n"
"IMPORTANT: This tool enforces investigation between steps:\\n"
"- After each call, you MUST investigate before calling again\\n"
"- Each step must include NEW evidence from code examination\\n"
"- No recursive calls without actual investigation work\\n"
"- The tool will specify which step number to use next\\n"
"- Follow the required_actions list for investigation guidance\\n\\n"
"Perfect for: comprehensive code review, security audits, performance analysis, "
"architectural assessment, code quality evaluation, anti-pattern detection."
)
def get_system_prompt(self) -> str:
return CODEREVIEW_PROMPT
def get_default_temperature(self) -> float:
return TEMPERATURE_ANALYTICAL
# Line numbers are enabled by default from base class for precise feedback
def get_model_category(self) -> "ToolModelCategory":
"""Code review requires thorough analysis and reasoning"""
from tools.models import ToolModelCategory
return ToolModelCategory.EXTENDED_REASONING
def get_workflow_request_model(self):
"""Return the code review workflow-specific request model."""
return CodeReviewRequest
def get_input_schema(self) -> dict[str, Any]:
"""Generate input schema using WorkflowSchemaBuilder with code review-specific overrides."""
from .workflow.schema_builders import WorkflowSchemaBuilder
# Code review workflow-specific field overrides
codereview_field_overrides = {
"step": {
"type": "string",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["step"],
},
"step_number": {
"type": "integer",
"minimum": 1,
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["step_number"],
},
"total_steps": {
"type": "integer",
"minimum": 1,
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["total_steps"],
},
"next_step_required": {
"type": "boolean",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["next_step_required"],
},
"findings": {
"type": "string",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["findings"],
},
"files_checked": {
"type": "array",
"items": {"type": "string"},
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["files_checked"],
},
"relevant_files": {
"type": "array",
"items": {"type": "string"},
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["relevant_files"],
},
"confidence": {
"type": "string",
"enum": ["exploring", "low", "medium", "high", "certain"],
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["confidence"],
},
"backtrack_from_step": {
"type": "integer",
"minimum": 1,
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["backtrack_from_step"],
},
"issues_found": {
"type": "array",
"items": {"type": "object"},
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["issues_found"],
},
"images": {
"type": "array",
"items": {"type": "string"},
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["images"],
},
# Code review-specific fields (for step 1)
"review_type": {
"type": "string",
"enum": ["full", "security", "performance", "quick"],
"default": "full",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["review_type"],
},
"focus_on": {
"type": "string",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["focus_on"],
},
"standards": {
"type": "string",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["standards"],
},
"severity_filter": {
"type": "string",
"enum": ["critical", "high", "medium", "low", "all"],
"default": "all",
"description": CODEREVIEW_WORKFLOW_FIELD_DESCRIPTIONS["severity_filter"],
},
}
# Use WorkflowSchemaBuilder with code review-specific tool fields
return WorkflowSchemaBuilder.build_schema(
tool_specific_fields=codereview_field_overrides,
model_field_schema=self.get_model_field_schema(),
auto_mode=self.is_effective_auto_mode(),
tool_name=self.get_name(),
)
def get_required_actions(self, step_number: int, confidence: str, findings: str, total_steps: int) -> list[str]:
"""Define required actions for each investigation phase."""
if step_number == 1:
# Initial code review investigation tasks
return [
"Read and understand the code files specified for review",
"Examine the overall structure, architecture, and design patterns used",
"Identify the main components, classes, and functions in the codebase",
"Understand the business logic and intended functionality",
"Look for obvious issues: bugs, security concerns, performance problems",
"Note any code smells, anti-patterns, or areas of concern",
]
elif confidence in ["exploring", "low"]:
# Need deeper investigation
return [
"Examine specific code sections you've identified as concerning",
"Analyze security implications: input validation, authentication, authorization",
"Check for performance issues: algorithmic complexity, resource usage, inefficiencies",
"Look for architectural problems: tight coupling, missing abstractions, scalability issues",
"Identify code quality issues: readability, maintainability, error handling",
"Search for over-engineering, unnecessary complexity, or design patterns that could be simplified",
]
elif confidence in ["medium", "high"]:
# Close to completion - need final verification
return [
"Verify all identified issues have been properly documented with severity levels",
"Check for any missed critical security vulnerabilities or performance bottlenecks",
"Confirm that architectural concerns and code quality issues are comprehensively captured",
"Ensure positive aspects and well-implemented patterns are also noted",
"Validate that your assessment aligns with the review type and focus areas specified",
"Double-check that findings are actionable and provide clear guidance for improvements",
]
else:
# General investigation needed
return [
"Continue examining the codebase for additional patterns and potential issues",
"Gather more evidence using appropriate code analysis techniques",
"Test your assumptions about code behavior and design decisions",
"Look for patterns that confirm or refute your current assessment",
"Focus on areas that haven't been thoroughly examined yet",
]
]
def should_call_expert_analysis(self, consolidated_findings, request=None) -> bool:
"""
Decide when to call external model based on investigation completeness.

Don't call expert analysis if Claude has certain confidence - trust their judgment.
"""
# Check if user requested to skip assistant model
if request and not self.get_request_use_assistant_model(request):
return False

# Check if we have meaningful investigation data
return (
len(consolidated_findings.relevant_files) > 0
or len(consolidated_findings.findings) >= 2
or len(consolidated_findings.issues_found) > 0
)
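
# Illustration of the heuristic above: any relevant file, findings from two or
# more steps, or a single recorded issue justifies the external call; a
# completely empty investigation does not.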
def prepare_expert_analysis_context(self, consolidated_findings) -> str:
"""Prepare context for external model call for final code review validation."""
context_parts = [
f"=== CODE REVIEW REQUEST ===\n{self.initial_request or 'Code review workflow initiated'}\n=== END REQUEST ==="
]

# Add investigation summary
investigation_summary = self._build_code_review_summary(consolidated_findings)
context_parts.append(
f"\n=== CLAUDE'S CODE REVIEW INVESTIGATION ===\n{investigation_summary}\n=== END INVESTIGATION ==="
)

# Add review configuration context if available
if self.review_config:
config_text = "\n".join(f"- {key}: {value}" for key, value in self.review_config.items() if value)
context_parts.append(f"\n=== REVIEW CONFIGURATION ===\n{config_text}\n=== END CONFIGURATION ===")

# Add relevant code elements if available
if consolidated_findings.relevant_context:
methods_text = "\n".join(f"- {method}" for method in consolidated_findings.relevant_context)
context_parts.append(f"\n=== RELEVANT CODE ELEMENTS ===\n{methods_text}\n=== END CODE ELEMENTS ===")

# Add issues found if available
if consolidated_findings.issues_found:
issues_text = "\n".join(
f"[{issue.get('severity', 'unknown').upper()}] {issue.get('description', 'No description')}"
for issue in consolidated_findings.issues_found
)
context_parts.append(f"\n=== ISSUES IDENTIFIED ===\n{issues_text}\n=== END ISSUES ===")

# Add assessment evolution if available
if consolidated_findings.hypotheses:
assessments_text = "\n".join(
f"Step {h['step']} ({h['confidence']} confidence): {h['hypothesis']}"
for h in consolidated_findings.hypotheses
)
context_parts.append(f"\n=== ASSESSMENT EVOLUTION ===\n{assessments_text}\n=== END ASSESSMENTS ===")

# Add images if available
if consolidated_findings.images:
images_text = "\n".join(f"- {img}" for img in consolidated_findings.images)
context_parts.append(
f"\n=== VISUAL REVIEW INFORMATION ===\n{images_text}\n=== END VISUAL INFORMATION ==="
)

return "\n".join(context_parts)
def _build_code_review_summary(self, consolidated_findings) -> str:
"""Prepare a comprehensive summary of the code review investigation."""
summary_parts = [
"=== SYSTEMATIC CODE REVIEW INVESTIGATION SUMMARY ===",
f"Total steps: {len(consolidated_findings.findings)}",
f"Files examined: {len(consolidated_findings.files_checked)}",
f"Relevant files identified: {len(consolidated_findings.relevant_files)}",
f"Code elements analyzed: {len(consolidated_findings.relevant_context)}",
f"Issues identified: {len(consolidated_findings.issues_found)}",
"",
"=== INVESTIGATION PROGRESSION ===",
]
for finding in consolidated_findings.findings:
summary_parts.append(finding)
return "\\n".join(summary_parts)
def should_include_files_in_expert_prompt(self) -> bool:
"""Include files in expert analysis for comprehensive code review."""
return True
def should_embed_system_prompt(self) -> bool:
"""Embed system prompt in expert analysis for proper context."""
return True
def get_expert_thinking_mode(self) -> str:
"""Use high thinking mode for thorough code review analysis."""
return "high"
def get_expert_analysis_instruction(self) -> str:
"""Get specific instruction for code review expert analysis."""
return (
"Please provide comprehensive code review analysis based on the investigation findings. "
"Focus on identifying any remaining issues, validating the completeness of the analysis, "
"and providing final recommendations for code improvements, following the severity-based "
"format specified in the system prompt."
)
# Hook method overrides for code review-specific behavior
def prepare_step_data(self, request) -> dict:
"""
Map code review-specific fields for internal processing.
"""
step_data = {
"step": request.step,
"step_number": request.step_number,
"findings": request.findings,
"files_checked": request.files_checked,
"relevant_files": request.relevant_files,
"relevant_context": request.relevant_context,
"issues_found": request.issues_found,
"confidence": request.confidence,
"hypothesis": request.findings, # Map findings to hypothesis for compatibility
"images": request.images or [],
}
return step_data
def should_skip_expert_analysis(self, request, consolidated_findings) -> bool:
"""
Code review workflow skips expert analysis when Claude has "certain" confidence.
"""
return request.confidence == "certain" and not request.next_step_required
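
# e.g. (illustrative): a final step submitted with confidence="certain" and
# next_step_required=False bypasses the external model entirely.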
def store_initial_issue(self, step_description: str):
"""Store initial request for expert analysis."""
self.initial_request = step_description
# Override inheritance hooks for code review-specific behavior
def get_completion_status(self) -> str:
"""Code review tools use review-specific status."""
return "code_review_complete_ready_for_implementation"
def get_completion_data_key(self) -> str:
"""Code review uses 'complete_code_review' key."""
return "complete_code_review"
def get_final_analysis_from_request(self, request):
"""Code review tools use 'findings' field."""
return request.findings
def get_confidence_level(self, request) -> str:
"""Code review tools use 'certain' for high confidence."""
return "certain"
def get_completion_message(self) -> str:
"""Code review-specific completion message."""
return (
"Code review complete with CERTAIN confidence. You have identified all significant issues "
"and provided comprehensive analysis. MANDATORY: Present the user with the complete review results "
"categorized by severity, and IMMEDIATELY proceed with implementing the highest priority fixes "
"or provide specific guidance for improvements. Focus on actionable recommendations."
)
def get_skip_reason(self) -> str:
"""Code review-specific skip reason."""
return "Claude completed comprehensive code review with full confidence"
def get_skip_expert_analysis_status(self) -> str:
"""Code review-specific expert analysis skip status."""
return "skipped_due_to_certain_review_confidence"
def prepare_work_summary(self) -> str:
"""Code review-specific work summary."""
return self._build_code_review_summary(self.consolidated_findings)
def get_completion_next_steps_message(self, expert_analysis_used: bool = False) -> str:
"""
Code review-specific completion message.
"""
base_message = (
"CODE REVIEW IS COMPLETE. You MUST now summarize and present ALL review findings organized by "
"severity (Critical → High → Medium → Low), specific code locations with line numbers, and exact "
"recommendations for improvement. Clearly prioritize the top 3 issues that need immediate attention. "
"Provide concrete, actionable guidance for each issue—make it easy for a developer to understand "
"exactly what needs to be fixed and how to implement the improvements."
)
# Add expert analysis guidance only when expert analysis was actually used
if expert_analysis_used:
expert_guidance = self.get_expert_analysis_guidance()
if expert_guidance:
return f"{base_message}\n\n{expert_guidance}"
return base_message
def get_expert_analysis_guidance(self) -> str:
"""
Provide specific guidance for handling expert analysis in code reviews.
"""
return (
"IMPORTANT: Analysis from an assistant model has been provided above. You MUST critically evaluate and validate "
"the expert findings rather than accepting them blindly. Cross-reference the expert analysis with "
"your own investigation findings, verify that suggested improvements are appropriate for this "
"codebase's context and patterns, and ensure recommendations align with the project's standards. "
"Present a synthesis that combines your systematic review with validated expert insights, clearly "
"distinguishing between findings you've independently confirmed and additional insights from expert analysis."
)
def get_step_guidance_message(self, request) -> str:
"""
Code review-specific step guidance with detailed investigation instructions.
"""
step_guidance = self.get_code_review_step_guidance(request.step_number, request.confidence, request)
return step_guidance["next_steps"]
def get_code_review_step_guidance(self, step_number: int, confidence: str, request) -> dict[str, Any]:
"""
Provide step-specific guidance for code review workflow.
"""
# Generate the next steps instruction based on required actions
required_actions = self.get_required_actions(step_number, confidence, request.findings, request.total_steps)
if step_number == 1:
next_steps = (
f"MANDATORY: DO NOT call the {self.get_name()} tool again immediately. You MUST first examine "
f"the code files thoroughly using appropriate tools. CRITICAL AWARENESS: You need to understand "
f"the code structure, identify potential issues across security, performance, and quality dimensions, "
f"and look for architectural concerns, over-engineering, unnecessary complexity, and scalability issues. "
f"Use file reading tools, code analysis, and systematic examination to gather comprehensive information. "
f"Only call {self.get_name()} again AFTER completing your investigation. When you call "
f"{self.get_name()} next time, use step_number: {step_number + 1} and report specific "
f"files examined, issues found, and code quality assessments discovered."
)
elif confidence in ["exploring", "low"]:
next_steps = (
f"STOP! Do NOT call {self.get_name()} again yet. Based on your findings, you've identified areas that need "
f"deeper analysis. MANDATORY ACTIONS before calling {self.get_name()} step {step_number + 1}:\\n"
+ "\\n".join(f"{i+1}. {action}" for i, action in enumerate(required_actions))
+ f"\\n\\nOnly call {self.get_name()} again with step_number: {step_number + 1} AFTER "
+ "completing these code review tasks."
)
elif confidence in ["medium", "high"]:
next_steps = (
f"WAIT! Your code review needs final verification. DO NOT call {self.get_name()} immediately. REQUIRED ACTIONS:\\n"
+ "\\n".join(f"{i+1}. {action}" for i, action in enumerate(required_actions))
+ f"\\n\\nREMEMBER: Ensure you have identified all significant issues across all severity levels and "
f"verified the completeness of your review. Document findings with specific file references and "
f"line numbers where applicable, then call {self.get_name()} with step_number: {step_number + 1}."
)
else:
next_steps = (
f"PAUSE REVIEW. Before calling {self.get_name()} step {step_number + 1}, you MUST examine more code thoroughly. "
+ "Required: "
+ ", ".join(required_actions[:2])
+ ". "
+ f"Your next {self.get_name()} call (step_number: {step_number + 1}) must include "
f"NEW evidence from actual code analysis, not just theories. NO recursive {self.get_name()} calls "
f"without investigation work!"
)
return {"next_steps": next_steps}
def customize_workflow_response(self, response_data: dict, request) -> dict:
"""
Customize response to match code review workflow format.
"""
# Store initial request on first step
if request.step_number == 1:
self.initial_request = request.step
# Store review configuration for expert analysis
if request.relevant_files:
self.review_config = {
"relevant_files": request.relevant_files,
"review_type": request.review_type,
"focus_on": request.focus_on,
"standards": request.standards,
"severity_filter": request.severity_filter,
}
# Convert generic status names to code review-specific ones
tool_name = self.get_name()
status_mapping = {
f"{tool_name}_in_progress": "code_review_in_progress",
f"pause_for_{tool_name}": "pause_for_code_review",
f"{tool_name}_required": "code_review_required",
f"{tool_name}_complete": "code_review_complete",
}
if response_data["status"] in status_mapping:
response_data["status"] = status_mapping[response_data["status"]]
# Rename status field to match code review workflow
if f"{tool_name}_status" in response_data:
response_data["code_review_status"] = response_data.pop(f"{tool_name}_status")
# Add code review-specific status fields
response_data["code_review_status"]["issues_by_severity"] = {}
for issue in self.consolidated_findings.issues_found:
severity = issue.get("severity", "unknown")
if severity not in response_data["code_review_status"]["issues_by_severity"]:
response_data["code_review_status"]["issues_by_severity"][severity] = 0
response_data["code_review_status"]["issues_by_severity"][severity] += 1
response_data["code_review_status"]["review_confidence"] = self.get_request_confidence(request)
# Map complete_codereview to complete_code_review
if f"complete_{tool_name}" in response_data:
response_data["complete_code_review"] = response_data.pop(f"complete_{tool_name}")
# Map the completion flag to match code review workflow
if f"{tool_name}_complete" in response_data:
response_data["code_review_complete"] = response_data.pop(f"{tool_name}_complete")
return response_data
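
# e.g. the generic "codereview_in_progress" status surfaces to clients as
# "code_review_in_progress", and "complete_codereview" becomes "complete_code_review".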
# Required abstract methods from BaseTool
def get_request_model(self):
"""Return the code review workflow-specific request model."""
return CodeReviewRequest
async def prepare_prompt(self, request) -> str:
"""Not used - workflow tools use execute_workflow()."""
return "" # Workflow tools use execute_workflow() directly