Testing Guide

This project includes comprehensive test coverage through unit tests and integration simulator tests.

Running Tests

Prerequisites

  • Python virtual environment activated: source venv/bin/activate
  • All dependencies installed: pip install -r requirements.txt
  • Docker containers running (for simulator tests): ./run-server.sh

Unit Tests

Run all unit tests with pytest:

# Run all tests (verbose, show output, stop on first failure)
python -m pytest -xvs

# Run specific test file
python -m pytest tests/test_providers.py -xvs

Simulator Tests

Simulator tests replicate real-world Claude CLI interactions with the MCP server running in Docker. Unlike unit tests that test isolated functions, simulator tests validate the complete end-to-end flow including:

  • Actual MCP protocol communication
  • Docker container interactions
  • Multi-turn conversations across tools
  • Log output validation
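Under the hood, each simulated tool invocation travels over the server's stdio transport as a JSON-RPC 2.0 tools/call request. The sketch below shows the general shape of such a payload; the "chat" tool exists in this project, but the prompt and request id are illustrative, not literal payloads from this test suite:

# Illustrative shape of an MCP tools/call request sent over stdio.
# The tool name "chat" is real; the prompt and id are made-up examples.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "chat",
        "arguments": {"prompt": "Summarize the design of conversation memory"},
    },
}
print(json.dumps(request))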

Important: Simulator tests require LOG_LEVEL=DEBUG in your .env file to validate detailed execution logs.

Monitoring Logs During Tests

Important: The MCP stdio protocol interferes with stderr output during tool execution. While server startup logs appear in docker compose logs, tool execution logs are only written to file-based logs inside the container. This is a known limitation of the stdio-based MCP protocol and cannot be fixed without changing the MCP implementation.

To monitor logs during test execution:

# Monitor main server logs (includes all tool execution details)
docker exec zen-mcp-server tail -f -n 500 /tmp/mcp_server.log

# Monitor MCP activity logs (tool calls and completions)  
docker exec zen-mcp-server tail -f /tmp/mcp_activity.log

# Check log file sizes (logs rotate at 20MB)
docker exec zen-mcp-server ls -lh /tmp/mcp_*.log*

Log Rotation: All log files are configured with automatic rotation at 20MB to prevent disk space issues. The server keeps:

  • 10 rotated files for mcp_server.log (200MB total)
  • 5 rotated files for mcp_activity.log (100MB total)
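If you need the same rotation policy in companion tooling, it can be expressed with Python's standard library. This is a minimal sketch of the documented limits, not the server's actual logging setup:

# Minimal sketch: rotate at 20MB, keeping 10 backups for the main log and
# 5 for the activity log, matching the limits documented above.
import logging
from logging.handlers import RotatingFileHandler

def make_logger(name, path, backups):
    handler = RotatingFileHandler(path, maxBytes=20 * 1024 * 1024, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger

server_log = make_logger("mcp_server", "/tmp/mcp_server.log", backups=10)
activity_log = make_logger("mcp_activity", "/tmp/mcp_activity.log", backups=5)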

Why logs don't appear in docker compose logs: The MCP stdio_server captures stderr during tool execution to prevent interference with the JSON-RPC protocol communication. This means that while you'll see startup logs in docker compose logs, you won't see tool execution logs there.

Running All Simulator Tests

# Run all simulator tests
python communication_simulator_test.py

# Run with verbose output for debugging
python communication_simulator_test.py --verbose

# Keep Docker logs after tests for inspection
python communication_simulator_test.py --keep-logs

Running Individual Tests

To run a single simulator test in isolation (useful for debugging or test development):

# Run a specific test by name
python communication_simulator_test.py --individual basic_conversation

# Examples of available tests:
python communication_simulator_test.py --individual content_validation
python communication_simulator_test.py --individual cross_tool_continuation
python communication_simulator_test.py --individual redis_validation

Other Options

# List all available simulator tests with descriptions
python communication_simulator_test.py --list-tests

# Run multiple specific tests (not all)
python communication_simulator_test.py --tests basic_conversation content_validation

# Force Docker environment rebuild before running tests
python communication_simulator_test.py --rebuild

Code Quality Checks

Before committing, ensure all linting passes:

# Run all linting checks
ruff check .
black --check .
isort --check-only .

# Auto-fix issues
ruff check . --fix
black .
isort .

What Each Test Suite Covers

Unit Tests

Test isolated components and functions:

  • Provider functionality: Model initialization, API interactions, capability checks
  • Tool operations: All MCP tools (chat, analyze, debug, etc.)
  • Conversation memory: Threading, continuation, history management
  • File handling: Path validation, token limits, deduplication
  • Auto mode: Model selection logic and fallback behavior
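For illustration, a unit test in this style might look like the sketch below. The ConversationMemory class here is a toy stand-in, not this project's actual API; the point is the pattern of testing both success and error cases:

# Hypothetical unit test sketch; ConversationMemory is a toy stand-in,
# not this project's actual class.
import pytest

class ConversationMemory:
    """Toy stand-in: stores turns per thread and enforces a turn limit."""
    def __init__(self, max_turns=2):
        self.max_turns = max_turns
        self.threads = {}

    def add_turn(self, thread_id, text):
        turns = self.threads.setdefault(thread_id, [])
        if len(turns) >= self.max_turns:
            raise ValueError("turn limit reached")
        turns.append(text)
        return len(turns)

def test_add_turn_appends_to_thread():
    memory = ConversationMemory()
    assert memory.add_turn("t1", "hello") == 1
    assert memory.add_turn("t1", "again") == 2

def test_add_turn_rejects_overflow():
    memory = ConversationMemory()
    memory.add_turn("t1", "one")
    memory.add_turn("t1", "two")
    with pytest.raises(ValueError):
        memory.add_turn("t1", "three")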

Simulator Tests

Validate real-world usage scenarios by simulating actual Claude prompts:

  • Basic conversations: Multi-turn chat functionality with real prompts
  • Cross-tool continuation: Context preservation across different tools
  • File deduplication: Efficient handling of repeated file references
  • Model selection: Proper routing to configured providers
  • Token allocation: Context window management in practice
  • Redis validation: Conversation persistence and retrieval
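A new simulator test might follow the rough shape below. The base class and helper names (BaseSimulatorTest, call_mcp_tool, get_server_logs) are hypothetical stand-ins for whatever scaffolding lives in simulator_tests/, shown only to convey the pattern of sending a realistic prompt and validating via the container's file-based logs:

# Hypothetical simulator test sketch; BaseSimulatorTest, call_mcp_tool, and
# get_server_logs are illustrative names, not this project's actual helpers.
from simulator_tests.base_test import BaseSimulatorTest  # hypothetical import

class TestChatBasicConversation(BaseSimulatorTest):
    def run_test(self) -> bool:
        # Send a realistic prompt through the MCP server running in Docker
        response, thread_id = self.call_mcp_tool(
            "chat", {"prompt": "What does this project do?"}
        )
        if not response or thread_id is None:
            return False
        # Validate execution details via the file-based logs in the container
        logs = self.get_server_logs()
        return "chat" in logs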

Contributing: Test Requirements

When contributing to this project:

  1. New features MUST include tests:

    • Add unit tests in tests/ for new functions or classes
    • Test both success and error cases
  2. Tool changes require simulator tests:

    • Add simulator tests in simulator_tests/ for new or modified tools
    • Use realistic prompts that demonstrate the feature
    • Validate output through Docker logs
  3. Test naming conventions:

    • Unit tests: test_<feature>_<scenario>.py
    • Simulator tests: test_<tool>_<behavior>.py
  4. Before submitting a PR, run the complete validation checklist:

    # Activate the virtual environment first, if not already active
    source venv/bin/activate
    
    # Run all linting tools (must pass 100%)
    ruff check .
    black --check .
    isort --check-only .
    
    # Auto-fix issues if needed
    ruff check . --fix
    black .
    isort .
    
    # Run complete unit test suite (must pass 100%)
    python -m pytest -xvs
    
    # Run simulator tests for tool changes
    python communication_simulator_test.py
    
  5. GitHub Actions Compliance:

    • Every single test must pass - we have zero tolerance for failing tests in CI
    • All linting must pass cleanly (ruff, black, isort)
    • Import sorting must be correct
    • Virtual environment activation is required for consistent results
    • Tests failing in GitHub Actions will result in PR rejection
  6. Contribution Standards:

    • Follow the PR template requirements exactly
    • Check every box in the template checklist before submitting
    • Include comprehensive tests for all new functionality
    • Ensure backward compatibility unless explicitly breaking

Remember: Tests are documentation. They show how features are intended to be used and help prevent regressions. Quality over speed - take the time to ensure everything passes locally before pushing.