Claude Development Guide for Zen MCP Server
This file contains essential commands and workflows for developing and maintaining the Zen MCP Server when working with Claude. Use these instructions to efficiently run quality checks, manage the server, check logs, and run tests.
Quick Reference Commands
Code Quality Checks
Before making any changes or submitting PRs, always run the comprehensive quality checks:
# Activate virtual environment first
source venv/bin/activate
# Run all quality checks (linting, formatting, tests)
./code_quality_checks.sh
This script automatically runs:
- Ruff linting with auto-fix
- Black code formatting
- Import sorting with isort
- Complete unit test suite (excluding integration tests)
- Verification that all checks pass 100%
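The steps the script performs can be approximated by hand; a sketch using subprocess (the exact flags inside ./code_quality_checks.sh may differ):

```python
import subprocess

# Approximation of ./code_quality_checks.sh; the real script may differ.
steps = [
    ["ruff", "check", ".", "--fix"],
    ["black", "."],
    ["isort", "."],
    ["python", "-m", "pytest", "tests/", "-m", "not integration"],
]

def run_checks(runner=subprocess.run):
    """Run each step in order; stop and report failure on the first non-zero exit."""
    for cmd in steps:
        if runner(cmd).returncode != 0:
            return False
    return True
```

The `runner` parameter exists only so the loop can be exercised without the real tools installed.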
Run Integration Tests (requires API keys):
# Run integration tests that make real API calls
./run_integration_tests.sh
# Run integration tests + simulator tests
./run_integration_tests.sh --with-simulator
Server Management
Setup/Update the Server
# Run setup script (handles everything)
./run-server.sh
This script will:
- Set up Python virtual environment
- Install all dependencies
- Create/update .env file
- Configure MCP with Claude
- Verify API keys
View Logs
# Follow logs in real-time
./run-server.sh -f
# Or manually view logs
tail -f logs/mcp_server.log
Log Management
View Server Logs
# View last 500 lines of server logs
tail -n 500 logs/mcp_server.log
# Follow logs in real-time
tail -f logs/mcp_server.log
# View specific number of lines
tail -n 100 logs/mcp_server.log
# Search logs for specific patterns
grep "ERROR" logs/mcp_server.log
grep "tool_name" logs/mcp_activity.log
Monitor Tool Executions Only
# View tool activity log (focused on tool calls and completions)
tail -n 100 logs/mcp_activity.log
# Follow tool activity in real-time
tail -f logs/mcp_activity.log
# Use simple tail commands to monitor logs
tail -f logs/mcp_activity.log | grep -E "(TOOL_CALL|TOOL_COMPLETED|ERROR|WARNING)"
Available Log Files
Current log files (with proper rotation):
# Main server log (all activity including debug info) - 20MB max, 10 backups
tail -f logs/mcp_server.log
# Tool activity only (TOOL_CALL, TOOL_COMPLETED, etc.) - 20MB max, 5 backups
tail -f logs/mcp_activity.log
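The rotation limits above are the kind of setup Python's RotatingFileHandler provides; a minimal sketch (the server's actual logging configuration may differ):

```python
import logging
from logging.handlers import RotatingFileHandler

# Sketch: rotate logs/mcp_server.log at 20 MB, keeping 10 backups.
# delay=True postpones opening the file until the first write.
handler = RotatingFileHandler(
    "logs/mcp_server.log",
    maxBytes=20 * 1024 * 1024,
    backupCount=10,
    delay=True,
)
logger = logging.getLogger("mcp_server")
logger.addHandler(handler)
```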
For programmatic log analysis (used by tests):
# Import the LogUtils class from simulator tests
from simulator_tests.log_utils import LogUtils
# Get recent logs
recent_logs = LogUtils.get_recent_server_logs(lines=500)
# Check for errors
errors = LogUtils.check_server_logs_for_errors()
# Search for specific patterns
matches = LogUtils.search_logs_for_pattern("TOOL_CALL.*debug")
Testing
Simulator tests exercise the MCP server in a live scenario, using your configured API keys to verify that the models respond and that the server communicates back and forth correctly.
IMPORTANT: After any code changes, restart your Claude session for the changes to take effect.
Run All Simulator Tests
# Run the complete test suite
python communication_simulator_test.py
# Run tests with verbose output
python communication_simulator_test.py --verbose
Run Individual Simulator Tests (Recommended)
# List all available tests
python communication_simulator_test.py --list-tests
# RECOMMENDED: Run tests individually for better isolation and debugging
python communication_simulator_test.py --individual basic_conversation
python communication_simulator_test.py --individual content_validation
python communication_simulator_test.py --individual cross_tool_continuation
python communication_simulator_test.py --individual memory_validation
# Run multiple specific tests
python communication_simulator_test.py --tests basic_conversation content_validation
# Run individual test with verbose output for debugging
python communication_simulator_test.py --individual memory_validation --verbose
Available simulator tests include:
- basic_conversation - Basic conversation flow with chat tool
- content_validation - Content validation and duplicate detection
- per_tool_deduplication - File deduplication for individual tools
- cross_tool_continuation - Cross-tool conversation continuation scenarios
- cross_tool_comprehensive - Comprehensive cross-tool file deduplication and continuation
- line_number_validation - Line number handling validation across tools
- memory_validation - Conversation memory validation
- model_thinking_config - Model-specific thinking configuration behavior
- o3_model_selection - O3 model selection and usage validation
- ollama_custom_url - Ollama custom URL endpoint functionality
- openrouter_fallback - OpenRouter fallback behavior when it is the only provider
- openrouter_models - OpenRouter model functionality and alias mapping
- token_allocation_validation - Token allocation and conversation history validation
- testgen_validation - TestGen tool validation with specific test function
- refactor_validation - Refactor tool validation with code smells
- conversation_chain_validation - Conversation chain and threading validation
- consensus_stance - Consensus tool validation with stance steering (for/against/neutral)
Note: All simulator tests should be run individually for optimal testing and better error isolation.
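Running a set of tests individually can be scripted; this sketch simply shells out one process per test for isolation (test names as listed above):

```python
import subprocess
import sys

def individual_test_command(name):
    """Build the command line for one isolated simulator test run."""
    return [sys.executable, "communication_simulator_test.py", "--individual", name]

def run_individually(test_names):
    """Run each simulator test in its own process; return pass/fail per test."""
    return {
        name: subprocess.run(individual_test_command(name)).returncode == 0
        for name in test_names
    }
```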
Run Unit Tests Only
# Run all unit tests (excluding integration tests that require API keys)
python -m pytest tests/ -v -m "not integration"
# Run specific test file
python -m pytest tests/test_refactor.py -v
# Run specific test function
python -m pytest tests/test_refactor.py::TestRefactorTool::test_format_response -v
# Run tests with coverage
python -m pytest tests/ --cov=. --cov-report=html -m "not integration"
Run Integration Tests (Uses Free Local Models)
Setup Requirements:
# 1. Install Ollama (if not already installed)
# Visit https://ollama.ai or use brew install ollama
# 2. Start Ollama service
ollama serve
# 3. Pull a model (e.g., llama3.2)
ollama pull llama3.2
# 4. Set environment variable for custom provider
export CUSTOM_API_URL="http://localhost:11434"
Run Integration Tests:
# Run integration tests that make real API calls to local models
python -m pytest tests/ -v -m "integration"
# Run specific integration test
python -m pytest tests/test_prompt_regression.py::TestPromptIntegration::test_chat_normal_prompt -v
# Run all tests (unit + integration)
python -m pytest tests/ -v
Note: Integration tests use the local-llama model via Ollama, which is free to run as often as needed. They require the CUSTOM_API_URL environment variable to point at your local Ollama endpoint. They can be run safely in CI/CD but are excluded from the code quality checks to keep those fast.
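The unit/integration split relies on pytest markers; a sketch of how an integration test might be tagged (the test body here is a placeholder, not the real test):

```python
import pytest

# Tests that hit a real model endpoint carry the "integration" marker,
# so `-m "not integration"` deselects them in the fast unit run.
@pytest.mark.integration
def test_chat_normal_prompt():
    # A real test would call the chat tool against the local Ollama
    # endpoint configured via CUSTOM_API_URL; this body is illustrative.
    assert True
```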
Development Workflow
Before Making Changes
- Ensure virtual environment is activated: source .zen_venv/bin/activate
- Run quality checks: ./code_quality_checks.sh
- Check logs to ensure server is healthy: tail -n 50 logs/mcp_server.log
After Making Changes
- Run quality checks again:
./code_quality_checks.sh - Run integration tests locally:
./run_integration_tests.sh - Run relevant simulator tests:
python communication_simulator_test.py --individual <test_name> - Check logs for any issues:
tail -n 100 logs/mcp_server.log - Restart Claude session to use updated code
Before Committing/PR
- Final quality check: ./code_quality_checks.sh
- Run integration tests: ./run_integration_tests.sh
- Run full simulator test suite: ./run_integration_tests.sh --with-simulator
- Verify all tests pass 100%
Common Troubleshooting
Server Issues
# Check if Python environment is set up correctly
./run-server.sh
# View recent errors
grep "ERROR" logs/mcp_server.log | tail -20
# Check virtual environment
which python
# Should show: .../zen-mcp-server/.zen_venv/bin/python
Test Failures
# Run individual failing test with verbose output
python communication_simulator_test.py --individual <test_name> --verbose
# Check server logs during test execution
tail -f logs/mcp_server.log
# Run tests with debug output
LOG_LEVEL=DEBUG python communication_simulator_test.py --individual <test_name>
Linting Issues
# Auto-fix most linting issues
ruff check . --fix
black .
isort .
# Check what would be changed without applying
ruff check .
black --check .
isort --check-only .
File Structure Context
- ./code_quality_checks.sh - Comprehensive quality check script
- ./run-server.sh - Server setup and management
- communication_simulator_test.py - End-to-end testing framework
- simulator_tests/ - Individual test modules
- tests/ - Unit test suite
- tools/ - MCP tool implementations
- providers/ - AI provider implementations
- systemprompts/ - System prompt definitions
- logs/ - Server log files
Environment Requirements
- Python 3.9+ with virtual environment
- All dependencies from requirements.txt installed
- Proper API keys configured in .env file
This guide provides everything needed to efficiently work with the Zen MCP Server codebase using Claude. Always run quality checks before and after making changes to ensure code integrity.