Testing Guide
This project includes comprehensive test coverage through unit tests and integration simulator tests.
Running Tests
Prerequisites
- Python virtual environment activated: source venv/bin/activate
- All dependencies installed: pip install -r requirements.txt
- Docker containers running (for simulator tests): ./run-server.sh
Unit Tests
Run all unit tests with pytest:
# Run all tests with verbose output
python -m pytest -xvs
# Run specific test file
python -m pytest tests/test_providers.py -xvs
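When adding a new unit test, the pattern below is a minimal sketch of what pytest expects. The helper function and file name are hypothetical, used only to illustrate covering both a success and an error case:
# tests/test_example_scenario.py -- hypothetical illustration only
import pytest

def normalize_model_name(name: str) -> str:
    # Stand-in helper used only for this example
    if not name or not name.strip():
        raise ValueError("model name must not be empty")
    return name.strip().lower()

def test_normalize_model_name_success():
    # Happy path: whitespace trimmed, casing normalized
    assert normalize_model_name("  GPT-4  ") == "gpt-4"

def test_normalize_model_name_rejects_empty():
    # Error path: empty input should raise
    with pytest.raises(ValueError):
        normalize_model_name("   ")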
Simulator Tests
Simulator tests replicate real-world Claude CLI interactions with the MCP server running in Docker. Unlike unit tests that test isolated functions, simulator tests validate the complete end-to-end flow including:
- Actual MCP protocol communication
- Docker container interactions
- Multi-turn conversations across tools
- Log output validation
Important: Simulator tests require LOG_LEVEL=DEBUG in your .env file to validate detailed execution logs.
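Before starting a simulator run, it can help to confirm the DEBUG requirement is actually met. The snippet below is a minimal pre-flight check, not part of the test harness; the .env path and exact key format are assumptions:
# check_env.py -- illustrative pre-flight check (not part of the repo)
from pathlib import Path

def debug_logging_enabled(env_path: str = ".env") -> bool:
    # True if the .env file contains an uncommented LOG_LEVEL=DEBUG line
    env_file = Path(env_path)
    if not env_file.exists():
        return False
    return any(
        line.strip() == "LOG_LEVEL=DEBUG"
        for line in env_file.read_text().splitlines()
        if not line.strip().startswith("#")
    )

if __name__ == "__main__":
    print("LOG_LEVEL=DEBUG set:", debug_logging_enabled())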
Monitoring Logs During Tests
Important: The MCP stdio protocol interferes with stderr output during tool execution. While server startup logs appear in docker compose logs, tool execution logs are only written to file-based logs inside the container. This is a known limitation of the stdio-based MCP protocol and cannot be fixed without changing the MCP implementation.
To monitor logs during test execution:
# Monitor main server logs (includes all tool execution details)
docker exec zen-mcp-server tail -f -n 500 /tmp/mcp_server.log
# Monitor MCP activity logs (tool calls and completions)
docker exec zen-mcp-server tail -f /tmp/mcp_activity.log
# Check log file sizes (logs rotate at 20MB)
docker exec zen-mcp-server ls -lh /tmp/mcp_*.log*
Log Rotation: All log files are configured with automatic rotation at 20MB to prevent disk space issues. The server keeps:
- 10 rotated files for mcp_server.log (200MB total)
- 5 rotated files for mcp_activity.log (100MB total)
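For reference, the rotation behavior described above maps onto Python's standard logging.handlers.RotatingFileHandler. The snippet below is a sketch under those assumptions (20 MB per file, 10 and 5 backups); the server's actual logging setup may differ in detail:
# Sketch only: rotation settings matching the limits listed above
import logging
from logging.handlers import RotatingFileHandler

def build_rotating_logger(name: str, path: str, backups: int) -> logging.Logger:
    handler = RotatingFileHandler(path, maxBytes=20 * 1024 * 1024, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return logger

server_logger = build_rotating_logger("mcp_server", "/tmp/mcp_server.log", backups=10)
activity_logger = build_rotating_logger("mcp_activity", "/tmp/mcp_activity.log", backups=5)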
Why logs don't appear in docker compose logs: The MCP stdio_server captures stderr during tool execution to prevent interference with the JSON-RPC protocol communication. This means that while you'll see startup logs in docker compose logs, you won't see tool execution logs there.
Running All Simulator Tests
# Run all simulator tests
python communication_simulator_test.py
# Run with verbose output for debugging
python communication_simulator_test.py --verbose
# Keep Docker logs after tests for inspection
python communication_simulator_test.py --keep-logs
Running Individual Tests
To run a single simulator test in isolation (useful for debugging or test development):
# Run a specific test by name
python communication_simulator_test.py --individual basic_conversation
# Examples of available tests:
python communication_simulator_test.py --individual content_validation
python communication_simulator_test.py --individual cross_tool_continuation
python communication_simulator_test.py --individual redis_validation
Other Options
# List all available simulator tests with descriptions
python communication_simulator_test.py --list-tests
# Run multiple specific tests (not all)
python communication_simulator_test.py --tests basic_conversation content_validation
# Force Docker environment rebuild before running tests
python communication_simulator_test.py --rebuild
Code Quality Checks
Before committing, ensure all linting passes:
# Run all linting checks
ruff check .
black --check .
isort --check-only .
# Auto-fix issues
ruff check . --fix
black .
isort .
What Each Test Suite Covers
Unit Tests
Test isolated components and functions:
- Provider functionality: Model initialization, API interactions, capability checks
- Tool operations: All MCP tools (chat, analyze, debug, etc.)
- Conversation memory: Threading, continuation, history management
- File handling: Path validation, token limits, deduplication
- Auto mode: Model selection logic and fallback behavior
Simulator Tests
Validate real-world usage scenarios by simulating actual Claude prompts:
- Basic conversations: Multi-turn chat functionality with real prompts
- Cross-tool continuation: Context preservation across different tools
- File deduplication: Efficient handling of repeated file references
- Model selection: Proper routing to configured providers
- Token allocation: Context window management in practice
- Redis validation: Conversation persistence and retrieval
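Because tool execution logs only land in the file-based logs inside the container, log validation in these scenarios usually means grepping those files. The helper below wraps the docker exec command shown earlier to illustrate the idea; the marker string is hypothetical, and the real tests in simulator_tests/ use their own helpers:
# Illustrative log check -- not the project's actual helper
import subprocess

def server_log_contains(marker: str, lines: int = 500) -> bool:
    # Tail the in-container server log and look for an expected marker
    result = subprocess.run(
        ["docker", "exec", "zen-mcp-server", "tail", "-n", str(lines), "/tmp/mcp_server.log"],
        capture_output=True,
        text=True,
        check=True,
    )
    return marker in result.stdout

# Example usage after driving a conversation through a tool:
# assert server_log_contains("TOOL_COMPLETED")  # marker name is an assumption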
Contributing: Test Requirements
When contributing to this project:
- New features MUST include tests:
  - Add unit tests in tests/ for new functions or classes
  - Test both success and error cases
- Tool changes require simulator tests:
  - Add simulator tests in simulator_tests/ for new or modified tools
  - Use realistic prompts that demonstrate the feature
  - Validate output through Docker logs
- Test naming conventions:
  - Unit tests: test_<feature>_<scenario>.py
  - Simulator tests: test_<tool>_<behavior>.py
- Before submitting PR - Complete Validation Checklist:
# Activate virtual environment first as needed
source venv/bin/activate
# Run all linting tools (must pass 100%)
ruff check .
black --check .
isort --check-only .
# Auto-fix issues if needed
ruff check . --fix
black .
isort .
# Run complete unit test suite (must pass 100%)
python -m pytest -xvs
# Run simulator tests for tool changes
python communication_simulator_test.py
- GitHub Actions Compliance:
  - Every single test must pass - we have zero tolerance for failing tests in CI
  - All linting must pass cleanly (ruff, black, isort)
  - Import sorting must be correct
  - Virtual environment activation is required for consistent results
  - Tests failing in GitHub Actions will result in PR rejection
- Contribution Standards:
  - Follow the PR template requirements exactly
  - Check every box in the template checklist before submitting
  - Include comprehensive tests for all new functionality
  - Ensure backward compatibility unless explicitly breaking
Remember: Tests are documentation. They show how features are intended to be used and help prevent regressions. Quality over speed - take the time to ensure everything passes locally before pushing.