# Adding a New Tool
This guide explains how to add a new tool to the Zen MCP Server. Tools are the primary way Claude interacts with the AI models, providing specialized capabilities like code review, debugging, test generation, and more.
## Overview
The tool system in Zen MCP Server is designed to be extensible. Each tool:
- Inherits from the `BaseTool` class
- Implements required abstract methods
- Defines a request model for parameter validation
- Is registered in the server's tool registry
- Can leverage different AI models based on task requirements
## Architecture Overview
### Key Components
- BaseTool (`tools/base.py`): Abstract base class providing common functionality
- Request Models: Pydantic models for input validation
- System Prompts: Specialized prompts that configure AI behavior
- Tool Registry: Registration system in `server.py`
### Tool Lifecycle
- Claude calls the tool with parameters
- Parameters are validated using Pydantic
- File paths are security-checked
- Prompt is prepared with system instructions
- AI model generates response
- Response is formatted and returned
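The sketch below is a simplified, conceptual rendering of this lifecycle; the real orchestration lives in `BaseTool.execute` in `tools/base.py`, and the `fake_model_call` helper here is purely a stand-in for the provider call.

```python
from typing import Any


async def fake_model_call(prompt: str, system: str) -> str:
    """Stand-in for the provider call; the real server dispatches to the selected model."""
    return f"(model output for a {len(prompt)}-character prompt)"


async def handle_tool_call(tool: Any, arguments: dict[str, Any]) -> str:
    """Conceptual walk-through of the lifecycle above, not the actual implementation."""
    # Validate Claude's arguments against the tool's Pydantic request model
    request = tool.get_request_model()(**arguments)

    # Security check (simplified): file paths must be absolute
    for path in request.files or []:
        if not path.startswith("/"):
            raise ValueError(f"File path must be absolute: {path}")

    # Combine the system prompt and user content into the final prompt
    prompt = await tool.prepare_prompt(request)

    # The selected AI model generates a response (stubbed out here)
    raw_response = await fake_model_call(prompt, system=tool.get_system_prompt())

    # The tool formats the response before it is returned to Claude
    return tool.format_response(raw_response, request)
```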
## Step-by-Step Implementation Guide
### 1. Create the Tool File
Create a new file in the `tools/` directory (e.g., `tools/example.py`):

```python
"""
Example tool - Brief description of what your tool does
This tool provides [specific functionality] to help developers [achieve goal].
Key features:
- Feature 1
- Feature 2
- Feature 3
"""
import logging
from typing import Any, Optional
from mcp.types import TextContent
from pydantic import Field
from config import TEMPERATURE_BALANCED
from systemprompts import EXAMPLE_PROMPT # You'll create this
from .base import BaseTool, ToolRequest
from .models import ToolOutput
logger = logging.getLogger(__name__)
```

### 2. Define the Request Model
Create a Pydantic model that inherits from `ToolRequest`:

```python
class ExampleRequest(ToolRequest):
"""Request model for the example tool."""
# Required parameters
prompt: str = Field(
...,
description="The main input/question for the tool"
)
# Optional parameters with defaults
files: Optional[list[str]] = Field(
default=None,
description="Files to analyze (must be absolute paths)"
)
focus_area: Optional[str] = Field(
default=None,
description="Specific aspect to focus on"
)
# You can add tool-specific parameters
output_format: Optional[str] = Field(
default="detailed",
description="Output format: 'summary', 'detailed', or 'actionable'"
)
```

### 3. Implement the Tool Class

```python
class ExampleTool(BaseTool):
"""Implementation of the example tool."""
def get_name(self) -> str:
"""Return the tool's unique identifier."""
return "example"
def get_description(self) -> str:
"""Return detailed description for Claude."""
return (
"EXAMPLE TOOL - Brief tagline describing the tool's purpose. "
"Use this tool when you need to [specific use cases]. "
"Perfect for: [scenario 1], [scenario 2], [scenario 3]. "
"Supports [key features]. Choose thinking_mode based on "
"[guidance for mode selection]. "
"Note: If you're not currently using a top-tier model such as "
"Opus 4 or above, these tools can provide enhanced capabilities."
)
def get_input_schema(self) -> dict[str, Any]:
"""Define the JSON schema for tool parameters."""
schema = {
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "The main input/question for the tool",
},
"files": {
"type": "array",
"items": {"type": "string"},
"description": "Files to analyze (must be absolute paths)",
},
"focus_area": {
"type": "string",
"description": "Specific aspect to focus on",
},
"output_format": {
"type": "string",
"enum": ["summary", "detailed", "actionable"],
"description": "Output format type",
"default": "detailed",
},
"model": self.get_model_field_schema(),
"temperature": {
"type": "number",
"description": "Temperature (0-1, default varies by tool)",
"minimum": 0,
"maximum": 1,
},
"thinking_mode": {
"type": "string",
"enum": ["minimal", "low", "medium", "high", "max"],
"description": "Thinking depth: minimal (0.5% of model max), "
"low (8%), medium (33%), high (67%), max (100%)",
},
"continuation_id": {
"type": "string",
"description": "Thread continuation ID for multi-turn conversations",
},
},
"required": ["prompt"] + (
["model"] if self.is_effective_auto_mode() else []
),
}
return schema
def get_system_prompt(self) -> str:
"""Return the system prompt for this tool."""
return EXAMPLE_PROMPT # Defined in systemprompts/
def get_default_temperature(self) -> float:
"""Return default temperature for this tool."""
# Use predefined constants from config.py:
# TEMPERATURE_CREATIVE (0.7) - For creative tasks
# TEMPERATURE_BALANCED (0.5) - For balanced tasks
# TEMPERATURE_ANALYTICAL (0.2) - For analytical tasks
return TEMPERATURE_BALANCED
def get_model_category(self):
"""Specify which type of model this tool needs."""
from tools.models import ToolModelCategory
# Choose based on your tool's needs:
# FAST_RESPONSE - Quick responses, cost-efficient (chat, simple queries)
# BALANCED - Standard analysis and generation
# EXTENDED_REASONING - Complex analysis, deep thinking (debug, review)
return ToolModelCategory.BALANCED
def get_request_model(self):
"""Return the request model class."""
return ExampleRequest
def wants_line_numbers_by_default(self) -> bool:
"""Whether to add line numbers to code files."""
# Return True if your tool benefits from precise line references
# (e.g., code review, debugging, refactoring)
# Return False for general analysis or token-sensitive operations
return False
async def prepare_prompt(self, request: ExampleRequest) -> str:
"""
Prepare the complete prompt for the AI model.
This method combines:
- System prompt (behavior configuration)
- User request
- File contents (if provided)
- Additional context
"""
# Check for prompt.txt in files (handles large prompts)
prompt_content, updated_files = self.handle_prompt_file(request.files)
if prompt_content:
request.prompt = prompt_content
if updated_files is not None:
request.files = updated_files
# Build the prompt parts
prompt_parts = []
# Add main request
prompt_parts.append(f"=== USER REQUEST ===")
prompt_parts.append(f"Focus Area: {request.focus_area}" if request.focus_area else "")
prompt_parts.append(f"Output Format: {request.output_format}")
prompt_parts.append(request.prompt)
prompt_parts.append("=== END REQUEST ===")
# Add file contents if provided
if request.files:
# Use the centralized file handling (respects continuation)
file_content = self._prepare_file_content_for_prompt(
request.files,
request.continuation_id,
"Files to analyze"
)
if file_content:
prompt_parts.append("\n=== FILES ===")
prompt_parts.append(file_content)
prompt_parts.append("=== END FILES ===")
# Validate token limits
full_prompt = "\n".join(filter(None, prompt_parts))
self._validate_token_limit(full_prompt, "Prompt")
return full_prompt
def format_response(self, response: str, request: ExampleRequest,
model_info: Optional[dict] = None) -> str:
"""
Format the AI's response for display.
Override this to add custom formatting, headers, or structure.
The base class handles special status parsing automatically.
"""
# Example: Add a footer with next steps
return f"{response}\n\n---\n\n**Next Steps:** Review the analysis above and proceed with implementation."
```

### 4. Handle Large Prompts (Optional)
If your tool might receive large text inputs, override the `execute` method:

```python
async def execute(self, arguments: dict[str, Any]) -> list[TextContent]:
"""Override to check prompt size before processing."""
# Validate request first
request_model = self.get_request_model()
request = request_model(**arguments)
# Check if prompt is too large for MCP limits
size_check = self.check_prompt_size(request.prompt)
if size_check:
return [TextContent(type="text", text=ToolOutput(**size_check).model_dump_json())]
# Continue with normal execution
return await super().execute(arguments)
```

### 5. Create the System Prompt
Create a new file in `systemprompts/` (e.g., `systemprompts/example_prompt.py`):

```python
"""System prompt for the example tool."""
EXAMPLE_PROMPT = """You are an AI assistant specialized in [tool purpose].
Your role is to [primary responsibility] by [approach/methodology].
Key principles:
1. [Principle 1]
2. [Principle 2]
3. [Principle 3]
When analyzing content:
- [Guideline 1]
- [Guideline 2]
- [Guideline 3]
Output format:
- Start with a brief summary
- Provide detailed analysis organized by [structure]
- Include specific examples and recommendations
- End with actionable next steps
Remember to:
- Be specific and reference exact locations (file:line) when discussing code
- Provide practical, implementable suggestions
- Consider the broader context and implications
- Maintain a helpful, constructive tone
"""
```

Add the import to `systemprompts/__init__.py`:

```python
from .example_prompt import EXAMPLE_PROMPT
```

### 6. Register the Tool
#### 6.1. Import in `server.py`
Add the import at the top of `server.py`:

```python
from tools.example import ExampleTool
```

#### 6.2. Add to the `TOOLS` Dictionary
Find the `TOOLS` dictionary in `server.py` and add your tool:

```python
TOOLS = {
"analyze": AnalyzeTool(),
"chat": ChatTool(),
"review_code": CodeReviewTool(),
"debug": DebugTool(),
"review_changes": PreCommitTool(),
"generate_tests": TestGenTool(),
"thinkdeep": ThinkDeepTool(),
"refactor": RefactorTool(),
"example": ExampleTool(), # Add your tool here
}
```

### 7. Write Tests
Create unit tests in `tests/test_example.py`:

```python
"""Tests for the example tool."""
import pytest
from unittest.mock import Mock, patch
from tools.example import ExampleTool, ExampleRequest
from tools.models import ToolModelCategory
class TestExampleTool:
"""Test suite for ExampleTool."""
def test_tool_metadata(self):
"""Test tool metadata methods."""
tool = ExampleTool()
assert tool.get_name() == "example"
assert "EXAMPLE TOOL" in tool.get_description()
assert tool.get_default_temperature() == 0.5
assert tool.get_model_category() == ToolModelCategory.BALANCED
def test_request_validation(self):
"""Test request model validation."""
# Valid request
request = ExampleRequest(prompt="Test prompt")
assert request.prompt == "Test prompt"
assert request.output_format == "detailed" # default
# Invalid request (missing required field)
with pytest.raises(ValueError):
ExampleRequest()
def test_input_schema(self):
"""Test input schema generation."""
tool = ExampleTool()
schema = tool.get_input_schema()
assert schema["type"] == "object"
assert "prompt" in schema["properties"]
assert "prompt" in schema["required"]
assert "model" in schema["properties"]
@pytest.mark.asyncio
async def test_prepare_prompt(self):
"""Test prompt preparation."""
tool = ExampleTool()
request = ExampleRequest(
prompt="Analyze this code",
focus_area="performance",
output_format="summary"
)
with patch.object(tool, '_validate_token_limit'):
prompt = await tool.prepare_prompt(request)
assert "USER REQUEST" in prompt
assert "Analyze this code" in prompt
assert "Focus Area: performance" in prompt
assert "Output Format: summary" in prompt
@pytest.mark.asyncio
async def test_file_handling(self):
"""Test file content handling."""
tool = ExampleTool()
request = ExampleRequest(
prompt="Analyze",
files=["/path/to/file.py"]
)
# Mock file reading
with patch.object(tool, '_prepare_file_content_for_prompt') as mock_prep:
mock_prep.return_value = "file contents"
with patch.object(tool, '_validate_token_limit'):
prompt = await tool.prepare_prompt(request)
assert "FILES" in prompt
assert "file contents" in prompt
```

### 8. Add Simulator Tests (Optional)
For tools that interact with external systems, create simulator tests in `simulator_tests/test_example_basic.py`:

```python
"""Basic simulator test for example tool."""
from simulator_tests.base_test import SimulatorTest
class TestExampleBasic(SimulatorTest):
"""Test basic example tool functionality."""
def test_example_analysis(self):
"""Test basic analysis with example tool."""
result = self.call_tool(
"example",
{
"prompt": "Analyze the architecture of this codebase",
"model": "flash",
"output_format": "summary"
}
)
self.assert_tool_success(result)
self.assert_content_contains(result, ["architecture", "summary"])
```

### 9. Update Documentation
Add your tool to the `README.md` in the tools section:

```markdown
### Available Tools
- **example** - Brief description of what the tool does
- Use cases: [scenario 1], [scenario 2]
- Supports: [key features]
- Best model: `balanced` category for standard analysis
```

## Advanced Features
### Understanding Conversation Memory
The `continuation_id` feature enables multi-turn conversations using the conversation memory system (`utils/conversation_memory.py`). Here's how it works:
- Thread Creation: When a tool wants to enable follow-up conversations, it creates a thread
- Turn Storage: Each exchange (user/assistant) is stored as a turn with metadata
- Cross-Tool Continuation: Any tool can continue a conversation started by another tool
- Automatic History: When `continuation_id` is provided, the full conversation history is reconstructed
Key concepts:
- `ThreadContext`: Contains all conversation turns, files, and metadata
- `ConversationTurn`: Single exchange with role, content, timestamp, files, and tool attribution
- Thread Chains: Conversations can have parent threads for extended discussions
- Turn Limits: Default of 20 turns (configurable via `MAX_CONVERSATION_TURNS`)
Example flow:

```python
# Tool A creates thread
thread_id = create_thread("analyze", request_data)
# Tool A adds its response
add_turn(thread_id, "assistant", response, files=[...], tool_name="analyze")
# Tool B continues the same conversation
context = get_thread(thread_id) # Gets full history
# Tool B sees all previous turns and files
```
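If a tool needs to inspect the reconstructed history itself, the sketch below shows roughly what that looks like. The attribute names (`turns`, `role`, `content`, `tool_name`, `files`) are assumptions based on the descriptions above, so check `utils/conversation_memory.py` for the actual fields.

```python
from utils.conversation_memory import get_thread


def summarize_thread(continuation_id: str) -> None:
    """Sketch only: walk the stored turns of an existing conversation thread."""
    context = get_thread(continuation_id)  # ThreadContext with all prior turns
    for turn in context.turns:  # attribute names assumed, not verified
        print(f"[{turn.tool_name}] {turn.role}: {turn.content[:80]}")
        if turn.files:
            print(f"  referenced files: {turn.files}")
```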
### Supporting Special Response Types
Tools can return special status responses for complex interactions. These are defined in `tools/models.py`:

```python
# Currently supported special statuses:
SPECIAL_STATUS_MODELS = {
"need_clarification": NeedClarificationModel,
"focused_review_required": FocusedReviewRequiredModel,
"more_review_required": MoreReviewRequiredModel,
"more_testgen_required": MoreTestGenRequiredModel,
"more_refactor_required": MoreRefactorRequiredModel,
"resend_prompt": ResendPromptModel,
}
```

Example implementation:

```python
# In your tool's format_response or within the AI response:
if need_clarification:
return json.dumps({
"status": "need_clarification",
"questions": ["What specific aspect should I focus on?"],
"context": "I need more information to proceed"
})
# For custom review status:
if more_analysis_needed:
return json.dumps({
"status": "focused_review_required",
"files": ["/path/to/file1.py", "/path/to/file2.py"],
"focus": "security",
"reason": "Found potential SQL injection vulnerabilities"
})
```

To add a new custom response type:
First, define the model in `tools/models.py`:

```python
class CustomStatusModel(BaseModel):
"""Model for custom status responses"""
status: Literal["custom_status"]
custom_field: str
details: dict[str, Any]
```

Then register it in `SPECIAL_STATUS_MODELS`:

```python
SPECIAL_STATUS_MODELS = {
# ... existing statuses ...
"custom_status": CustomStatusModel,
}
```

The base tool will then automatically handle parsing and validation of the new status.
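As a rough illustration (field values invented for the example), a tool could then emit the newly registered status the same way the built-in statuses are emitted above:

```python
import json
from typing import Any


def custom_status_response(summary: str, details: dict[str, Any]) -> str:
    """Illustrative helper: return the new status as JSON so the base tool's
    special-status parsing can pick it up."""
    return json.dumps({
        "status": "custom_status",
        "custom_field": summary,
        "details": details,
    })
```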
### Token Management
For tools processing large amounts of data:

```python
# Calculate the available token budget dynamically
def prepare_large_content(
    self, files: list[str], continuation_id: Optional[str], remaining_budget: int
) -> str:
    # Reserve tokens for the model's response
    reserve_tokens = 5000
    # Use model-specific limits
    effective_max = remaining_budget - reserve_tokens
    # Process files within the budget
    content = self._prepare_file_content_for_prompt(
        files,
        continuation_id,
        "Analysis files",
        max_tokens=effective_max,
        reserve_tokens=reserve_tokens,
    )
    return content
```

### Web Search Integration
Enable web search for tools that benefit from current information:

```python
# In prepare_prompt:
websearch_instruction = self.get_websearch_instruction(
request.use_websearch,
"""Consider searching for:
- Current best practices for [topic]
- Recent updates to [technology]
- Community solutions for [problem]"""
)
full_prompt = f"{system_prompt}{websearch_instruction}\n\n{user_content}"
```

## Best Practices
- Clear Tool Descriptions: Write descriptive text that helps Claude understand when to use your tool
- Proper Validation: Use Pydantic models for robust input validation
- Security First: Always validate file paths are absolute
- Token Awareness: Handle large inputs gracefully with prompt.txt mechanism
- Model Selection: Choose appropriate model category for your tool's complexity
- Line Numbers: Enable for tools needing precise code references
- Error Handling: Provide helpful error messages for common issues
- Testing: Write comprehensive unit tests and simulator tests
- Documentation: Include examples and use cases in your description
## Common Pitfalls to Avoid
- Don't Skip Validation: Always validate inputs, especially file paths
- Don't Ignore Token Limits: Use `_validate_token_limit` and handle large prompts
- Don't Hardcode Models: Use model categories for flexibility
- Don't Forget Tests: Every tool needs tests for reliability
- Don't Break Conventions: Follow existing patterns from other tools
## Testing Your Tool
### Manual Testing
- Start the server with your tool registered
- Use Claude Desktop to call your tool
- Test various parameter combinations
- Verify error handling
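For a quick smoke test outside Claude Desktop, you can also drive the tool directly from a short Python script. This is only a sketch: it assumes your provider API keys and environment are already configured, and the file path and model name are placeholders.

```python
import asyncio

from tools.example import ExampleTool


async def smoke_test() -> None:
    tool = ExampleTool()
    result = await tool.execute({
        "prompt": "Summarize the purpose of this file",
        "files": ["/absolute/path/to/some_file.py"],  # placeholder path
        "output_format": "summary",
        "model": "flash",  # use any model configured in your setup
    })
    # execute() returns a list of TextContent items
    print(result[0].text)


asyncio.run(smoke_test())
```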
### Automated Testing

```bash
# Run unit tests
pytest tests/test_example.py -xvs
# Run all tests to ensure no regressions
pytest -xvs
# Run simulator tests if applicable
python communication_simulator_test.py
```

## Checklist
Before submitting your PR:
- Tool class created inheriting from `BaseTool`
- All abstract methods implemented
- Request model defined with proper validation
- System prompt created in `systemprompts/`
- Tool registered in `server.py`
- Unit tests written and passing
- Simulator tests added (if applicable)
- Documentation updated
- Code follows project style (ruff, black, isort)
- Large prompt handling implemented (if needed)
- Security validation for file paths
- Appropriate model category selected
- Tool description is clear and helpful
## Example: Complete Simple Tool
Here's a minimal but complete example tool:

```python
"""
Simple calculator tool for mathematical operations.
"""
from typing import Any, Optional
from mcp.types import TextContent
from pydantic import Field
from config import TEMPERATURE_ANALYTICAL
from .base import BaseTool, ToolRequest
from .models import ToolOutput
class CalculateRequest(ToolRequest):
"""Request model for calculator tool."""
expression: str = Field(
...,
description="Mathematical expression to evaluate"
)
class CalculatorTool(BaseTool):
"""Simple calculator tool."""
def get_name(self) -> str:
return "calculate"
def get_description(self) -> str:
return (
"CALCULATOR - Evaluates mathematical expressions. "
"Use this for calculations, conversions, and math problems."
)
def get_input_schema(self) -> dict[str, Any]:
schema = {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "Mathematical expression to evaluate",
},
"model": self.get_model_field_schema(),
},
"required": ["expression"] + (
["model"] if self.is_effective_auto_mode() else []
),
}
return schema
def get_system_prompt(self) -> str:
return """You are a mathematical assistant. Evaluate the expression
and explain the calculation steps clearly."""
def get_default_temperature(self) -> float:
return TEMPERATURE_ANALYTICAL
def get_request_model(self):
return CalculateRequest
async def prepare_prompt(self, request: CalculateRequest) -> str:
return f"Calculate: {request.expression}\n\nShow your work step by step."
```

## Need Help?
- Look at existing tools (`chat.py`, `refactor.py`) for examples
- Check `base.py` for available helper methods
- Review test files for testing patterns
- Ask questions in GitHub issues if stuck