Docs added to show how a new provider is added.
Docs added to show how a new tool is created. All tools should add line numbers to code so models can reference specific lines when needed. Line numbering for code is now enabled for all tools to use. Additional tests validate that line numbering is not added to git diffs.
# Adding a New Tool

This guide explains how to add a new tool to the Zen MCP Server. Tools are the primary way Claude interacts with the AI models, providing specialized capabilities like code review, debugging, test generation, and more.

## Overview

The tool system in Zen MCP Server is designed to be extensible. Each tool:

- Inherits from the `BaseTool` class
- Implements required abstract methods
- Defines a request model for parameter validation
- Is registered in the server's tool registry
- Can leverage different AI models based on task requirements

## Architecture Overview

### Key Components

1. **BaseTool** (`tools/base.py`): Abstract base class providing common functionality
2. **Request Models**: Pydantic models for input validation
3. **System Prompts**: Specialized prompts that configure AI behavior
4. **Tool Registry**: Registration system in `server.py`

### Tool Lifecycle

1. Claude calls the tool with parameters
2. Parameters are validated using Pydantic
3. File paths are security-checked
4. Prompt is prepared with system instructions
5. AI model generates response
6. Response is formatted and returned
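To make the flow concrete, here is a simplified, hypothetical sketch of how a single tool call moves through these stages. It is not the actual `BaseTool.execute()` implementation, and `call_model` is a placeholder for the server's model-dispatch logic; only the methods described later in this guide (`get_request_model`, `prepare_prompt`, `get_system_prompt`, `format_response`) are real extension points.

```python
# Hypothetical outline only -- the real orchestration lives in BaseTool / server.py.

async def call_model(prompt: str, system_prompt: str) -> str:
    """Placeholder for the server's model-dispatch logic."""
    raise NotImplementedError("the MCP server handles this step")


async def handle_tool_call(tool, arguments: dict) -> str:
    # Steps 1-2: validate the incoming parameters with the tool's Pydantic model
    request = tool.get_request_model()(**arguments)

    # Step 3: file paths are security-checked (absolute paths only) before any file is read

    # Step 4: combine the system prompt, user request, and any file contents
    prompt = await tool.prepare_prompt(request)

    # Step 5: send the prepared prompt to the selected AI model
    raw_response = await call_model(prompt, system_prompt=tool.get_system_prompt())

    # Step 6: let the tool format the raw model output for Claude
    return tool.format_response(raw_response, request)
```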
## Step-by-Step Implementation Guide

### 1. Create the Tool File

Create a new file in the `tools/` directory (e.g., `tools/example.py`):

```python
"""
Example tool - Brief description of what your tool does

This tool provides [specific functionality] to help developers [achieve goal].
Key features:
- Feature 1
- Feature 2
- Feature 3
"""

import logging
from typing import Any, Optional

from mcp.types import TextContent
from pydantic import Field

from config import TEMPERATURE_BALANCED
from systemprompts import EXAMPLE_PROMPT  # You'll create this

from .base import BaseTool, ToolRequest
from .models import ToolOutput

logger = logging.getLogger(__name__)
```
### 2. Define the Request Model

Create a Pydantic model that inherits from `ToolRequest`:

```python
class ExampleRequest(ToolRequest):
    """Request model for the example tool."""

    # Required parameters
    prompt: str = Field(
        ...,
        description="The main input/question for the tool"
    )

    # Optional parameters with defaults
    files: Optional[list[str]] = Field(
        default=None,
        description="Files to analyze (must be absolute paths)"
    )

    focus_area: Optional[str] = Field(
        default=None,
        description="Specific aspect to focus on"
    )

    # You can add tool-specific parameters
    output_format: Optional[str] = Field(
        default="detailed",
        description="Output format: 'summary', 'detailed', or 'actionable'"
    )
```
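As a quick sanity check before wiring the model into a tool, you can exercise the validation directly. This is a hypothetical snippet; it assumes `ToolRequest` adds no further required fields, which matches the unit tests shown later in this guide.

```python
from pydantic import ValidationError

from tools.example import ExampleRequest

# Optional fields fall back to their defaults
request = ExampleRequest(prompt="Review this module")
print(request.output_format)  # -> "detailed"

# Missing required fields are rejected by Pydantic
try:
    ExampleRequest()
except ValidationError as exc:
    print(f"rejected as expected: {exc}")
```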
### 3. Implement the Tool Class

```python
class ExampleTool(BaseTool):
    """Implementation of the example tool."""

    def get_name(self) -> str:
        """Return the tool's unique identifier."""
        return "example"

    def get_description(self) -> str:
        """Return detailed description for Claude."""
        return (
            "EXAMPLE TOOL - Brief tagline describing the tool's purpose. "
            "Use this tool when you need to [specific use cases]. "
            "Perfect for: [scenario 1], [scenario 2], [scenario 3]. "
            "Supports [key features]. Choose thinking_mode based on "
            "[guidance for mode selection]. "
            "Note: If you're not currently using a top-tier model such as "
            "Opus 4 or above, these tools can provide enhanced capabilities."
        )

    def get_input_schema(self) -> dict[str, Any]:
        """Define the JSON schema for tool parameters."""
        schema = {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "The main input/question for the tool",
                },
                "files": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Files to analyze (must be absolute paths)",
                },
                "focus_area": {
                    "type": "string",
                    "description": "Specific aspect to focus on",
                },
                "output_format": {
                    "type": "string",
                    "enum": ["summary", "detailed", "actionable"],
                    "description": "Output format type",
                    "default": "detailed",
                },
                "model": self.get_model_field_schema(),
                "temperature": {
                    "type": "number",
                    "description": "Temperature (0-1, default varies by tool)",
                    "minimum": 0,
                    "maximum": 1,
                },
                "thinking_mode": {
                    "type": "string",
                    "enum": ["minimal", "low", "medium", "high", "max"],
                    "description": "Thinking depth: minimal (0.5% of model max), "
                    "low (8%), medium (33%), high (67%), max (100%)",
                },
                "continuation_id": {
                    "type": "string",
                    "description": "Thread continuation ID for multi-turn conversations",
                },
            },
            "required": ["prompt"] + (
                ["model"] if self.is_effective_auto_mode() else []
            ),
        }
        return schema

    def get_system_prompt(self) -> str:
        """Return the system prompt for this tool."""
        return EXAMPLE_PROMPT  # Defined in systemprompts/

    def get_default_temperature(self) -> float:
        """Return default temperature for this tool."""
        # Use predefined constants from config.py:
        # TEMPERATURE_CREATIVE (0.7) - For creative tasks
        # TEMPERATURE_BALANCED (0.5) - For balanced tasks
        # TEMPERATURE_ANALYTICAL (0.2) - For analytical tasks
        return TEMPERATURE_BALANCED

    def get_model_category(self):
        """Specify which type of model this tool needs."""
        from tools.models import ToolModelCategory

        # Choose based on your tool's needs:
        # FAST_RESPONSE - Quick responses, cost-efficient (chat, simple queries)
        # BALANCED - Standard analysis and generation
        # EXTENDED_REASONING - Complex analysis, deep thinking (debug, review)
        return ToolModelCategory.BALANCED

    def get_request_model(self):
        """Return the request model class."""
        return ExampleRequest

    def wants_line_numbers_by_default(self) -> bool:
        """Whether to add line numbers to code files."""
        # Return True if your tool benefits from precise line references
        # (e.g., code review, debugging, refactoring)
        # Return False for general analysis or token-sensitive operations
        return False

    async def prepare_prompt(self, request: ExampleRequest) -> str:
        """
        Prepare the complete prompt for the AI model.

        This method combines:
        - System prompt (behavior configuration)
        - User request
        - File contents (if provided)
        - Additional context
        """
        # Check for prompt.txt in files (handles large prompts)
        prompt_content, updated_files = self.handle_prompt_file(request.files)
        if prompt_content:
            request.prompt = prompt_content
        if updated_files is not None:
            request.files = updated_files

        # Build the prompt parts
        prompt_parts = []

        # Add main request
        prompt_parts.append("=== USER REQUEST ===")
        prompt_parts.append(f"Focus Area: {request.focus_area}" if request.focus_area else "")
        prompt_parts.append(f"Output Format: {request.output_format}")
        prompt_parts.append(request.prompt)
        prompt_parts.append("=== END REQUEST ===")

        # Add file contents if provided
        if request.files:
            # Use the centralized file handling (respects continuation)
            file_content = self._prepare_file_content_for_prompt(
                request.files,
                request.continuation_id,
                "Files to analyze"
            )
            if file_content:
                prompt_parts.append("\n=== FILES ===")
                prompt_parts.append(file_content)
                prompt_parts.append("=== END FILES ===")

        # Validate token limits
        full_prompt = "\n".join(filter(None, prompt_parts))
        self._validate_token_limit(full_prompt, "Prompt")

        return full_prompt

    def format_response(self, response: str, request: ExampleRequest,
                        model_info: Optional[dict] = None) -> str:
        """
        Format the AI's response for display.

        Override this to add custom formatting, headers, or structure.
        The base class handles special status parsing automatically.
        """
        # Example: Add a footer with next steps
        return f"{response}\n\n---\n\n**Next Steps:** Review the analysis above and proceed with implementation."
```
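The `wants_line_numbers_by_default()` hook controls whether file contents are embedded with line numbers so the model can cite exact locations (e.g., `file.py:42`). The exact numbering format is produced by the server's file-reading utilities, not by your tool; the snippet below is only a hypothetical illustration of the idea.

```python
# Hypothetical illustration of what line numbering gives the model -- the real
# formatting is handled by the server's file utilities, not by your tool.
def add_line_numbers(source: str) -> str:
    """Prefix each line with its 1-based line number."""
    return "\n".join(
        f"{lineno:4d}| {line}"
        for lineno, line in enumerate(source.splitlines(), start=1)
    )

print(add_line_numbers("def add(a, b):\n    return a + b"))
#    1| def add(a, b):
#    2|     return a + b
```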
### 4. Handle Large Prompts (Optional)

If your tool might receive large text inputs, override the `execute` method:

```python
async def execute(self, arguments: dict[str, Any]) -> list[TextContent]:
    """Override to check prompt size before processing."""
    # Validate request first
    request_model = self.get_request_model()
    request = request_model(**arguments)

    # Check if prompt is too large for MCP limits
    size_check = self.check_prompt_size(request.prompt)
    if size_check:
        return [TextContent(type="text", text=ToolOutput(**size_check).model_dump_json())]

    # Continue with normal execution
    return await super().execute(arguments)
```
### 5. Create the System Prompt

Create a new file in `systemprompts/` (e.g., `systemprompts/example_prompt.py`):

```python
"""System prompt for the example tool."""

EXAMPLE_PROMPT = """You are an AI assistant specialized in [tool purpose].

Your role is to [primary responsibility] by [approach/methodology].

Key principles:
1. [Principle 1]
2. [Principle 2]
3. [Principle 3]

When analyzing content:
- [Guideline 1]
- [Guideline 2]
- [Guideline 3]

Output format:
- Start with a brief summary
- Provide detailed analysis organized by [structure]
- Include specific examples and recommendations
- End with actionable next steps

Remember to:
- Be specific and reference exact locations (file:line) when discussing code
- Provide practical, implementable suggestions
- Consider the broader context and implications
- Maintain a helpful, constructive tone
"""
```

Add the import to `systemprompts/__init__.py`:

```python
from .example_prompt import EXAMPLE_PROMPT
```
### 6. Register the Tool

#### 6.1. Import in server.py

Add the import at the top of `server.py`:

```python
from tools.example import ExampleTool
```

#### 6.2. Add to TOOLS Dictionary

Find the `TOOLS` dictionary in `server.py` and add your tool:

```python
TOOLS = {
    "analyze": AnalyzeTool(),
    "chat": ChatTool(),
    "review_code": CodeReviewTool(),
    "debug": DebugTool(),
    "review_changes": PreCommitTool(),
    "generate_tests": TestGenTool(),
    "thinkdeep": ThinkDeepTool(),
    "refactor": RefactorTool(),
    "example": ExampleTool(),  # Add your tool here
}
```
### 7. Write Tests

Create unit tests in `tests/test_example.py`:

```python
"""Tests for the example tool."""

import pytest
from unittest.mock import Mock, patch

from tools.example import ExampleTool, ExampleRequest
from tools.models import ToolModelCategory


class TestExampleTool:
    """Test suite for ExampleTool."""

    def test_tool_metadata(self):
        """Test tool metadata methods."""
        tool = ExampleTool()

        assert tool.get_name() == "example"
        assert "EXAMPLE TOOL" in tool.get_description()
        assert tool.get_default_temperature() == 0.5
        assert tool.get_model_category() == ToolModelCategory.BALANCED

    def test_request_validation(self):
        """Test request model validation."""
        # Valid request
        request = ExampleRequest(prompt="Test prompt")
        assert request.prompt == "Test prompt"
        assert request.output_format == "detailed"  # default

        # Invalid request (missing required field)
        with pytest.raises(ValueError):
            ExampleRequest()

    def test_input_schema(self):
        """Test input schema generation."""
        tool = ExampleTool()
        schema = tool.get_input_schema()

        assert schema["type"] == "object"
        assert "prompt" in schema["properties"]
        assert "prompt" in schema["required"]
        assert "model" in schema["properties"]

    @pytest.mark.asyncio
    async def test_prepare_prompt(self):
        """Test prompt preparation."""
        tool = ExampleTool()
        request = ExampleRequest(
            prompt="Analyze this code",
            focus_area="performance",
            output_format="summary"
        )

        with patch.object(tool, '_validate_token_limit'):
            prompt = await tool.prepare_prompt(request)

        assert "USER REQUEST" in prompt
        assert "Analyze this code" in prompt
        assert "Focus Area: performance" in prompt
        assert "Output Format: summary" in prompt

    @pytest.mark.asyncio
    async def test_file_handling(self):
        """Test file content handling."""
        tool = ExampleTool()
        request = ExampleRequest(
            prompt="Analyze",
            files=["/path/to/file.py"]
        )

        # Mock file reading
        with patch.object(tool, '_prepare_file_content_for_prompt') as mock_prep:
            mock_prep.return_value = "file contents"
            with patch.object(tool, '_validate_token_limit'):
                prompt = await tool.prepare_prompt(request)

        assert "FILES" in prompt
        assert "file contents" in prompt
```
### 8. Add Simulator Tests (Optional)

For tools that interact with external systems, create simulator tests in `simulator_tests/test_example_basic.py`:

```python
"""Basic simulator test for example tool."""

from simulator_tests.base_test import SimulatorTest


class TestExampleBasic(SimulatorTest):
    """Test basic example tool functionality."""

    def test_example_analysis(self):
        """Test basic analysis with example tool."""
        result = self.call_tool(
            "example",
            {
                "prompt": "Analyze the architecture of this codebase",
                "model": "flash",
                "output_format": "summary"
            }
        )

        self.assert_tool_success(result)
        self.assert_content_contains(result, ["architecture", "summary"])
```
### 9. Update Documentation

Add your tool to the README.md in the tools section:

```markdown
### Available Tools

- **example** - Brief description of what the tool does
  - Use cases: [scenario 1], [scenario 2]
  - Supports: [key features]
  - Best model: `balanced` category for standard analysis
```
## Advanced Features

### Understanding Conversation Memory

The `continuation_id` feature enables multi-turn conversations using the conversation memory system (`utils/conversation_memory.py`). Here's how it works:

1. **Thread Creation**: When a tool wants to enable follow-up conversations, it creates a thread
2. **Turn Storage**: Each exchange (user/assistant) is stored as a turn with metadata
3. **Cross-Tool Continuation**: Any tool can continue a conversation started by another tool
4. **Automatic History**: When `continuation_id` is provided, the full conversation history is reconstructed

Key concepts:

- **ThreadContext**: Contains all conversation turns, files, and metadata
- **ConversationTurn**: Single exchange with role, content, timestamp, files, tool attribution
- **Thread Chains**: Conversations can have parent threads for extended discussions
- **Turn Limits**: Default 20 turns (configurable via MAX_CONVERSATION_TURNS)

Example flow:

```python
# Tool A creates thread
thread_id = create_thread("analyze", request_data)

# Tool A adds its response
add_turn(thread_id, "assistant", response, files=[...], tool_name="analyze")

# Tool B continues the same conversation
context = get_thread(thread_id)  # Gets full history
# Tool B sees all previous turns and files
```
### Supporting Special Response Types

Tools can return special status responses for complex interactions. These are defined in `tools/models.py`:

```python
# Currently supported special statuses:
SPECIAL_STATUS_MODELS = {
    "need_clarification": NeedClarificationModel,
    "focused_review_required": FocusedReviewRequiredModel,
    "more_review_required": MoreReviewRequiredModel,
    "more_testgen_required": MoreTestGenRequiredModel,
    "more_refactor_required": MoreRefactorRequiredModel,
    "resend_prompt": ResendPromptModel,
}
```

Example implementation:

```python
# In your tool's format_response or within the AI response:
if need_clarification:
    return json.dumps({
        "status": "need_clarification",
        "questions": ["What specific aspect should I focus on?"],
        "context": "I need more information to proceed"
    })

# For custom review status:
if more_analysis_needed:
    return json.dumps({
        "status": "focused_review_required",
        "files": ["/path/to/file1.py", "/path/to/file2.py"],
        "focus": "security",
        "reason": "Found potential SQL injection vulnerabilities"
    })
```

To add a new custom response type:

1. Define the model in `tools/models.py`:

```python
class CustomStatusModel(BaseModel):
    """Model for custom status responses"""
    status: Literal["custom_status"]
    custom_field: str
    details: dict[str, Any]
```

2. Register it in `SPECIAL_STATUS_MODELS`:

```python
SPECIAL_STATUS_MODELS = {
    # ... existing statuses ...
    "custom_status": CustomStatusModel,
}
```

3. The base tool will automatically handle parsing and validation.
### Token Management

For tools processing large amounts of data:

```python
# Calculate available tokens dynamically
def prepare_large_content(self, files: list[str], continuation_id: Optional[str], remaining_budget: int):
    # Reserve tokens for response
    reserve_tokens = 5000

    # Use model-specific limits
    effective_max = remaining_budget - reserve_tokens

    # Process files with budget
    content = self._prepare_file_content_for_prompt(
        files,
        continuation_id,
        "Analysis files",
        max_tokens=effective_max,
        reserve_tokens=reserve_tokens
    )
    return content
```
### Web Search Integration

Enable web search for tools that benefit from current information:

```python
# In prepare_prompt:
websearch_instruction = self.get_websearch_instruction(
    request.use_websearch,
    """Consider searching for:
- Current best practices for [topic]
- Recent updates to [technology]
- Community solutions for [problem]"""
)

full_prompt = f"{system_prompt}{websearch_instruction}\n\n{user_content}"
```
## Best Practices

1. **Clear Tool Descriptions**: Write descriptive text that helps Claude understand when to use your tool
2. **Proper Validation**: Use Pydantic models for robust input validation
3. **Security First**: Always validate that file paths are absolute
4. **Token Awareness**: Handle large inputs gracefully with the prompt.txt mechanism
5. **Model Selection**: Choose the appropriate model category for your tool's complexity
6. **Line Numbers**: Enable them for tools needing precise code references
7. **Error Handling**: Provide helpful error messages for common issues
8. **Testing**: Write comprehensive unit tests and simulator tests
9. **Documentation**: Include examples and use cases in your description
## Common Pitfalls to Avoid

1. **Don't Skip Validation**: Always validate inputs, especially file paths
2. **Don't Ignore Token Limits**: Use `_validate_token_limit` and handle large prompts
3. **Don't Hardcode Models**: Use model categories for flexibility
4. **Don't Forget Tests**: Every tool needs tests for reliability
5. **Don't Break Conventions**: Follow existing patterns from other tools
## Testing Your Tool

### Manual Testing

1. Start the server with your tool registered
2. Use Claude Desktop to call your tool
3. Test various parameter combinations (a sample call is sketched below)
4. Verify error handling
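When testing manually, it helps to have a concrete call in mind. The arguments below are a hypothetical example that matches the schema defined in step 3; the `model` key is only required when the server runs in auto mode.

```python
# Hypothetical arguments Claude might send for the example tool
arguments = {
    "prompt": "Summarize the error handling in this module",
    "files": ["/absolute/path/to/module.py"],
    "output_format": "summary",
    "model": "flash",  # required only in auto mode
}

# Roughly what the server does with them (see the execute override in step 4):
# result = await ExampleTool().execute(arguments)
```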
### Automated Testing

```bash
# Run unit tests
pytest tests/test_example.py -xvs

# Run all tests to ensure no regressions
pytest -xvs

# Run simulator tests if applicable
python communication_simulator_test.py
```
## Checklist

Before submitting your PR:

- [ ] Tool class created inheriting from `BaseTool`
- [ ] All abstract methods implemented
- [ ] Request model defined with proper validation
- [ ] System prompt created in `systemprompts/`
- [ ] Tool registered in `server.py`
- [ ] Unit tests written and passing
- [ ] Simulator tests added (if applicable)
- [ ] Documentation updated
- [ ] Code follows project style (ruff, black, isort)
- [ ] Large prompt handling implemented (if needed)
- [ ] Security validation for file paths
- [ ] Appropriate model category selected
- [ ] Tool description is clear and helpful
## Example: Complete Simple Tool

Here's a minimal but complete example tool:

```python
"""
Simple calculator tool for mathematical operations.
"""

from typing import Any

from pydantic import Field

from config import TEMPERATURE_ANALYTICAL
from .base import BaseTool, ToolRequest


class CalculateRequest(ToolRequest):
    """Request model for calculator tool."""

    expression: str = Field(
        ...,
        description="Mathematical expression to evaluate"
    )


class CalculatorTool(BaseTool):
    """Simple calculator tool."""

    def get_name(self) -> str:
        return "calculate"

    def get_description(self) -> str:
        return (
            "CALCULATOR - Evaluates mathematical expressions. "
            "Use this for calculations, conversions, and math problems."
        )

    def get_input_schema(self) -> dict[str, Any]:
        schema = {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Mathematical expression to evaluate",
                },
                "model": self.get_model_field_schema(),
            },
            "required": ["expression"] + (
                ["model"] if self.is_effective_auto_mode() else []
            ),
        }
        return schema

    def get_system_prompt(self) -> str:
        return """You are a mathematical assistant. Evaluate the expression
and explain the calculation steps clearly."""

    def get_default_temperature(self) -> float:
        return TEMPERATURE_ANALYTICAL

    def get_request_model(self):
        return CalculateRequest

    async def prepare_prompt(self, request: CalculateRequest) -> str:
        return f"Calculate: {request.expression}\n\nShow your work step by step."
```
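To try the calculator out, register it the same way as in step 6. The file name `tools/calculate.py` below is an assumption; adjust the import to wherever you saved the example.

```python
# server.py (excerpt) -- assumes the example above lives in tools/calculate.py
from tools.calculate import CalculatorTool

TOOLS = {
    # ... existing tools ...
    "calculate": CalculatorTool(),
}
```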
## Need Help?

- Look at existing tools (`chat.py`, `refactor.py`) for examples
- Check `base.py` for available helper methods
- Review test files for testing patterns
- Ask questions in GitHub issues if stuck