Further fixes to tests
Pass O3 simulation test when keys are not set, along with a notice
Updated docs on testing, simulation tests / contributing
Support for OpenAI o4-mini and o4-mini-high
Fahad
2025-06-14 09:28:20 +04:00
parent c5f682c7b0
commit 746380eb7f
17 changed files with 324 additions and 53 deletions


@@ -55,6 +55,8 @@ DEFAULT_MODEL=flash # Always use Flash
DEFAULT_MODEL=o3 # Always use O3
```
**Important:** After changing any configuration in `.env` (including `DEFAULT_MODEL`, API keys, or other settings), restart the server with `./run-server.sh` to apply the changes.
**Per-Request Model Override:**
Regardless of your default setting, you can specify models per request:
- "Use **pro** for deep security analysis of auth.py"

126 docs/testing.md Normal file

@@ -0,0 +1,126 @@
# Testing Guide
This project includes comprehensive test coverage through unit tests and integration simulator tests.
## Running Tests
### Prerequisites
- Python virtual environment activated: `source venv/bin/activate`
- All dependencies installed: `pip install -r requirements.txt`
- Docker containers running (for simulator tests): `./run-server.sh`
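Taken together, a typical setup session before running any tests looks like this (commands taken from the prerequisites above):
```bash
# Activate the virtual environment and install dependencies
source venv/bin/activate
pip install -r requirements.txt

# Start the Docker containers (needed for simulator tests)
./run-server.sh
```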
### Unit Tests
Run all unit tests with pytest:
```bash
# Run all tests with verbose output
python -m pytest -xvs
# Run specific test file
python -m pytest tests/test_providers.py -xvs
```
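pytest also supports selecting tests by node ID or keyword expression, which is handy while iterating on a single test (the test names below are illustrative, not actual tests in this repo):
```bash
# Run a single test function by node ID (name illustrative)
python -m pytest tests/test_providers.py::test_model_capabilities -xvs

# Run only tests whose names match a keyword expression
python -m pytest -k "provider and not slow" -xvs
```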
### Simulator Tests
Simulator tests replicate real-world Claude CLI interactions with the MCP server running in Docker. Unlike unit tests, which exercise isolated functions, simulator tests validate the complete end-to-end flow, including:
- Actual MCP protocol communication
- Docker container interactions
- Multi-turn conversations across tools
- Log output validation
**Important**: Simulator tests require `LOG_LEVEL=DEBUG` in your `.env` file to validate detailed execution logs.
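A minimal `.env` for simulator runs might therefore include the following (values illustrative; remember to restart the server with `./run-server.sh` after editing):
```bash
# .env (illustrative values)
LOG_LEVEL=DEBUG      # required so simulator tests can inspect detailed execution logs
DEFAULT_MODEL=o3     # or flash, pro, etc.
# ...plus the API keys for your configured providers
```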
#### Running All Simulator Tests
```bash
# Run all simulator tests
python communication_simulator_test.py
# Run with verbose output for debugging
python communication_simulator_test.py --verbose
# Keep Docker logs after tests for inspection
python communication_simulator_test.py --keep-logs
```
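When run with `--keep-logs`, the container output remains available for inspection afterwards; one way to view it is with Docker Compose (a sketch assuming the server was started via `./run-server.sh` with a Compose setup):
```bash
# Show the most recent server log output after a --keep-logs run
docker compose logs --tail=200
```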
#### Running Individual Tests
To run a single simulator test in isolation (useful for debugging or test development):
```bash
# Run a specific test by name
python communication_simulator_test.py --individual basic_conversation
# Examples of available tests:
python communication_simulator_test.py --individual content_validation
python communication_simulator_test.py --individual cross_tool_continuation
python communication_simulator_test.py --individual redis_validation
```
#### Other Options
```bash
# List all available simulator tests with descriptions
python communication_simulator_test.py --list-tests
# Run multiple specific tests (not all)
python communication_simulator_test.py --tests basic_conversation content_validation
# Force Docker environment rebuild before running tests
python communication_simulator_test.py --rebuild
```
### Code Quality Checks
Before committing, ensure all linting passes:
```bash
# Run all linting checks
ruff check .
black --check .
isort --check-only .
# Auto-fix issues
ruff check . --fix
black .
isort .
```
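If you prefer a single command, the same checks can be chained in a small script (a sketch; `check.sh` is not part of the repo):
```bash
#!/usr/bin/env bash
# check.sh -- run every lint check and stop at the first failure
set -euo pipefail

ruff check .
black --check .
isort --check-only .
echo "All lint checks passed"
```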
## What Each Test Suite Covers
### Unit Tests (256 tests)
Test isolated components and functions:
- **Provider functionality**: Model initialization, API interactions, capability checks
- **Tool operations**: All MCP tools (chat, analyze, debug, etc.)
- **Conversation memory**: Threading, continuation, history management
- **File handling**: Path validation, token limits, deduplication
- **Auto mode**: Model selection logic and fallback behavior
### Simulator Tests (14 tests)
Validate real-world usage scenarios by simulating actual Claude prompts:
- **Basic conversations**: Multi-turn chat functionality with real prompts
- **Cross-tool continuation**: Context preservation across different tools
- **File deduplication**: Efficient handling of repeated file references
- **Model selection**: Proper routing to configured providers
- **Token allocation**: Context window management in practice
- **Redis validation**: Conversation persistence and retrieval
## Contributing: Test Requirements
When contributing to this project:
1. **New features MUST include tests**:
- Add unit tests in `tests/` for new functions or classes
- Test both success and error cases
2. **Tool changes require simulator tests**:
- Add simulator tests in `simulator_tests/` for new or modified tools
- Use realistic prompts that demonstrate the feature
- Validate output through Docker logs
3. **Test naming conventions**:
- Unit tests: `test_<feature>_<scenario>.py`
- Simulator tests: `test_<tool>_<behavior>.py`
4. **Before submitting a PR** (see the combined example after this list):
- Run all unit tests: `python -m pytest -xvs`
- Run relevant simulator tests
- Ensure all linting passes
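For instance, a typical pre-submission pass chains the commands from this guide:
```bash
# Pre-PR checklist: unit tests, targeted simulator tests, and lint checks
python -m pytest -xvs
python communication_simulator_test.py --tests basic_conversation content_validation
ruff check . && black --check . && isort --check-only .
```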
Remember: Tests are documentation. They show how features are intended to be used and help prevent regressions.


@@ -43,7 +43,7 @@ cat .env
If you need to update your API keys, edit the `.env` file and then run:
```bash
-./setup-docker.sh
+./run-server.sh
```
This will validate your configuration and restart the services.
@@ -73,7 +73,7 @@ See [Logging Documentation](logging.md) for more details on accessing logs.
**"API key environment variable is required"**
- Add your API key to the `.env` file
-- Run: `./setup-docker.sh` to validate and restart
+- Run: `./run-server.sh` to validate and restart
**File path errors**
- Always use absolute paths: `/Users/you/project/file.py`