diff --git a/docs/README.md b/docs/README.md
index 0b40434..4cac177 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -16,6 +16,7 @@ This documentation is organized into four main categories to serve different aud
 - **[Development Workflows](contributing/workflows.md)** - Git workflows, testing, and collaboration patterns
 - **[Code Style Guide](contributing/code-style.md)** - Coding standards and best practices
 - **[Testing Strategy](contributing/testing.md)** - Testing approaches and quality assurance
+- **[Test Structure Analysis](contributing/test-structure.md)** - Detailed analysis of the existing test suite
 - **[Repository Overview](contributing/file-overview.md)** - Understanding the codebase structure

 ### 🏗️ For System Architects
diff --git a/docs/contributing/test-structure.md b/docs/contributing/test-structure.md
new file mode 100644
index 0000000..3fd2e39
--- /dev/null
+++ b/docs/contributing/test-structure.md
@@ -0,0 +1,470 @@
+# Test Structure Documentation
+
+## Overview
+
+This document provides a comprehensive analysis of the existing test structure in the Gemini MCP Server project. The test suite consists of **15 specialized test modules** (17 files in `tests/`, counting `conftest.py` and `__init__.py`) organized to validate all aspects of the system, from unit-level functionality to complex AI collaboration workflows.
+
+## Test Organization
+
+### Test Directory Structure
+
+```
+tests/
+├── __init__.py                          # Package initialization
+├── conftest.py                          # Global test configuration and fixtures
+├── test_claude_continuation.py          # Claude continuation opportunities
+├── test_collaboration.py                # AI-to-AI collaboration features
+├── test_config.py                       # Configuration validation
+├── test_conversation_history_bug.py     # Bug fix regression tests
+├── test_conversation_memory.py          # Redis-based conversation persistence
+├── test_cross_tool_continuation.py      # Cross-tool conversation threading
+├── test_docker_path_integration.py      # Docker environment path translation
+├── test_large_prompt_handling.py        # Large prompt detection and handling
+├── test_live_integration.py             # Live API testing (excluded from CI)
+├── test_precommit.py                    # Pre-commit validation and git integration
+├── test_prompt_regression.py            # Normal prompt handling regression
+├── test_server.py                       # Main server functionality
+├── test_thinking_modes.py               # Thinking mode functionality
+├── test_tools.py                        # Individual tool implementations
+└── test_utils.py                        # Utility function testing
+```
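+
+As a quick orientation, the sketch below shows the general shape of a module in this layout. The module and tool names are hypothetical; the mocking pattern is the one described under "Mocking Strategies" later in this document, and the naming rules come from `pytest.ini` (`Test*` classes, `test_*` functions, `asyncio_mode = auto`).
+
+```python
+# tests/test_example_tool.py -- hypothetical module, shown for illustration only
+from unittest.mock import Mock, patch
+
+
+class TestExampleTool:
+    """Class names follow the Test* convention so pytest discovers them."""
+
+    @patch("tools.base.BaseTool.create_model")
+    async def test_basic_execution(self, mock_create_model):
+        # Replace the Gemini model with a mock so no API key or network is needed
+        mock_model = Mock()
+        mock_model.generate_content.return_value = Mock(
+            candidates=[Mock(content=Mock(parts=[Mock(text="Mocked response")]))]
+        )
+        mock_create_model.return_value = mock_model
+        # ...instantiate the tool under test, call `await tool.execute({...})`,
+        # and assert on the mocked response...
+```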
+
+## Test Categories and Analysis
+
+### 1. Core Functionality Tests
+
+#### `test_server.py` - Main Server Functionality
+**Purpose**: Tests the core MCP server implementation and tool dispatch mechanism
+
+**Key Test Classes**:
+- **Server startup and initialization**
+- **Tool registration and availability**
+- **Request routing and handling**
+- **Error propagation and handling**
+
+**Example Coverage**:
+```python
+# Tests tool listing functionality
+def test_list_tools()
+
+# Tests tool execution pipeline
+async def test_call_tool()
+
+# Tests error handling for invalid tools
+async def test_call_invalid_tool()
+```
+
+#### `test_config.py` - Configuration Management
+**Purpose**: Validates configuration loading, environment variable handling, and settings validation
+
+**Key Areas**:
+- **Environment variable parsing**
+- **Default value handling**
+- **Configuration validation**
+- **Error handling for missing required config**
+
+#### `test_tools.py` - Tool Implementation Testing
+**Purpose**: Tests individual tool implementations with comprehensive input validation
+
+**Key Features**:
+- **Absolute path enforcement across all tools**
+- **Parameter validation for each tool**
+- **Error handling for malformed inputs**
+- **Tool-specific behavior validation**
+
+**Critical Security Testing**:
+```python
+# Tests that all tools enforce absolute paths
+async def test_tool_absolute_path_requirement()
+
+# Tests path traversal attack prevention
+async def test_tool_path_traversal_prevention()
+```
+
+#### `test_utils.py` - Utility Function Testing
+**Purpose**: Tests file utilities, token counting, and directory handling functions
+
+**Coverage Areas**:
+- **File reading and processing**
+- **Token counting and limits**
+- **Directory traversal and expansion**
+- **Path validation and security**
+
+### 2. Advanced Feature Tests
+
+#### `test_collaboration.py` - AI-to-AI Collaboration
+**Purpose**: Tests dynamic context requests and collaborative AI workflows
+
+**Key Scenarios**:
+- **Clarification request parsing**
+- **Dynamic context expansion**
+- **AI-to-AI communication protocols**
+- **Collaboration workflow validation**
+
+**Example Test**:
+```python
+async def test_clarification_request_parsing():
+    """Test parsing of AI clarification requests for additional context."""
+    # Validates that Gemini can request additional files/context
+    # and Claude can respond appropriately
+```
+
+#### `test_cross_tool_continuation.py` - Cross-Tool Threading
+**Purpose**: Tests conversation continuity across different tools
+
+**Critical Features**:
+- **Continuation ID persistence**
+- **Context preservation between tools**
+- **Thread management across tool switches**
+- **File context sharing between AI agents**
+
+#### `test_conversation_memory.py` - Memory Persistence
+**Purpose**: Tests Redis-based conversation storage and retrieval
+
+**Test Coverage**:
+- **Conversation storage and retrieval**
+- **Thread context management**
+- **TTL (time-to-live) handling**
+- **Memory cleanup and optimization**
+
+#### `test_thinking_modes.py` - Cognitive Load Management
+**Purpose**: Tests thinking mode functionality across all tools
+
+**Validation Areas**:
+- **Token budget enforcement**
+- **Mode selection and application**
+- **Performance characteristics**
+- **Quality vs. cost trade-offs**
+
+### 3. Specialized Testing
+
+#### `test_large_prompt_handling.py` - Scale Testing
+**Purpose**: Tests handling of prompts exceeding MCP token limits
+
+**Key Scenarios**:
+- **Large prompt detection (>50,000 characters)**
+- **Automatic file-based prompt handling**
+- **MCP token limit workarounds**
+- **Response capacity preservation**
+
+**Critical Flow Testing**:
+```python
+async def test_large_prompt_file_handling():
+    """Test that large prompts are automatically handled via file mechanism."""
+    # Validates the workaround for MCP's 25K token limit
+```
+
+#### `test_docker_path_integration.py` - Environment Testing
+**Purpose**: Tests Docker environment path translation and workspace mounting
+
+**Coverage**:
+- **Host-to-container path mapping**
+- **Workspace directory access**
+- **Cross-platform path handling**
+- **Security boundary enforcement**
+
+#### `test_precommit.py` - Quality Gate Testing
+**Purpose**: Tests pre-commit validation and git integration
+
+**Validation Areas**:
+- **Git repository discovery**
+- **Change detection and analysis**
+- **Multi-repository support**
+- **Security scanning of changes**
+
+### 4. Regression and Bug Fix Tests
+
+#### `test_conversation_history_bug.py` - Bug Fix Validation
+**Purpose**: Regression test for the conversation history duplication bug
+
+**Specific Coverage**:
+- **Conversation deduplication**
+- **History consistency**
+- **Memory leak prevention**
+- **Thread integrity**
+
+#### `test_prompt_regression.py` - Normal Operation Validation
+**Purpose**: Ensures normal prompt handling continues to work correctly
+
+**Test Focus**:
+- **Standard prompt processing**
+- **Backward compatibility**
+- **Feature regression prevention**
+- **Performance baseline maintenance**
+
+#### `test_claude_continuation.py` - Session Management
+**Purpose**: Tests Claude continuation opportunities and session management
+
+**Key Areas**:
+- **Session state management**
+- **Continuation opportunity detection**
+- **Context preservation**
+- **Session cleanup and termination**
+
+### 5. Live Integration Testing
+
+#### `test_live_integration.py` - Real API Testing
+**Purpose**: Tests actual Gemini API integration (excluded from regular CI)
+
+**Requirements**:
+- Valid `GEMINI_API_KEY` environment variable
+- Network connectivity to Google AI services
+- Redis server for conversation memory testing
+
+**Test Categories**:
+- **Basic API request/response validation**
+- **Tool execution with real Gemini responses**
+- **Conversation threading with actual AI**
+- **Error handling with real API responses**
+
+**Exclusion from CI**:
+```python
+@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="API key required")
+class TestLiveIntegration:
+    """Tests requiring actual Gemini API access."""
+```
+
+## Test Configuration Analysis
+
+### `conftest.py` - Global Test Setup
+
+**Key Fixtures and Configuration**:
+
+#### Environment Isolation
+```python
+# Ensures tests run in isolated sandbox environment
+os.environ["MCP_PROJECT_ROOT"] = str(temp_dir)
+```
+
+#### Dummy API Keys
+```python
+# Provides safe dummy keys for testing without real credentials
+os.environ["GEMINI_API_KEY"] = "dummy-key-for-testing"
+```
+
+#### Cross-Platform Compatibility
+```python
+# Handles Windows async event loop configuration
+if platform.system() == "Windows":
+    asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
+```
+
+#### Project Path Fixtures
+```python
+@pytest.fixture
+def project_path():
+    """Provides safe project path for file operations in tests."""
+```
+
+### `pytest.ini` - Test Runner Configuration
+
+**Key Settings**:
+```ini
+[pytest]
+testpaths = tests
+python_files = test_*.py
+python_classes = Test*
+python_functions = test_*
+asyncio_mode = auto
+addopts =
+    -v
+    --strict-markers
+    --tb=short
+```
+
+## Mocking Strategies
+
+### 1. Gemini API Mocking
+
+**Pattern Used**:
+```python
+@patch("tools.base.BaseTool.create_model")
+async def test_tool_execution(self, mock_create_model):
+    mock_model = Mock()
+    mock_model.generate_content.return_value = Mock(
+        candidates=[Mock(content=Mock(parts=[Mock(text="Mocked response")]))]
+    )
+    mock_create_model.return_value = mock_model
+```
+
+**Benefits**:
+- **No API key required** for unit and integration tests
+- **Predictable responses** for consistent testing
+- **Fast execution** without network dependencies
+- **Cost-effective** testing without API charges
+
+### 2. Redis Memory Mocking
+
+**Pattern Used**:
+```python
+@patch("utils.conversation_memory.get_redis_client")
+def test_conversation_flow(self, mock_redis):
+    mock_client = Mock()
+    mock_redis.return_value = mock_client
+    # Test conversation persistence logic
+```
+
+**Advantages**:
+- **No Redis server required** for testing
+- **Controlled state** for predictable test scenarios
+- **Error simulation** for resilience testing
+
+### 3. File System Mocking
+
+**Pattern Used**:
+```python
+@patch("builtins.open", mock_open(read_data="test file content"))
+@patch("os.path.exists", return_value=True)
+def test_file_operations():
+    # Test file reading without actual files
+    ...
+```
+
+**Security Benefits**:
+- **No file system access** during testing
+- **Path validation testing** without security risks
+- **Consistent test data** across environments
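+
+In practice, conversation-oriented tests often stack these patterns so that both the Gemini model and the Redis client are replaced within a single test. A hedged sketch (the tool under test and its invocation are placeholders; the patch targets are the ones shown above):
+
+```python
+from unittest.mock import Mock, patch
+
+
+@patch("utils.conversation_memory.get_redis_client")
+@patch("tools.base.BaseTool.create_model")
+async def test_conversation_flow_fully_mocked(mock_create_model, mock_redis):
+    # The decorator closest to the function is injected first: mock_create_model, then mock_redis
+    mock_model = Mock()
+    mock_model.generate_content.return_value = Mock(
+        candidates=[Mock(content=Mock(parts=[Mock(text="Mocked response")]))]
+    )
+    mock_create_model.return_value = mock_model
+    mock_redis.return_value = Mock()  # in-memory stand-in for the Redis client
+    # ...run the tool/conversation logic under test and assert on the mocked state...
+```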
+
+## Security Testing Focus
+
+### Path Validation Testing
+
+**Critical Security Tests**:
+1. **Absolute path enforcement** - All tools must reject relative paths
+2. **Directory traversal prevention** - Block `../` and similar patterns
+3. **Symlink attack prevention** - Detect and block symbolic link attacks
+4. **Sandbox boundary enforcement** - Restrict access to allowed directories
+
+**Example Security Test**:
+```python
+async def test_path_traversal_attack_prevention():
+    """Test that directory traversal attacks are blocked."""
+    dangerous_paths = [
+        "../../../etc/passwd",
+        "/etc/shadow",
+        "~/../../root/.ssh/id_rsa"
+    ]
+
+    for path in dangerous_paths:
+        with pytest.raises(SecurityError):
+            await tool.execute({"files": [path]})
+```
+
+### Docker Security Testing
+
+**Container Security Validation**:
+- **Workspace mounting** - Verify read-only access enforcement
+- **Path translation** - Test host-to-container path mapping
+- **Privilege boundaries** - Ensure container cannot escape sandbox
+
+## Test Execution Patterns
+
+### Parallel Test Execution
+
+**Strategy**: Tests are designed for parallel execution with proper isolation
+
+**Benefits**:
+- **Faster test suite** execution
+- **Resource efficiency** for CI/CD
+- **Scalable testing** for large codebases
+
+### Conditional Test Execution
+
+**Live Test Skipping**:
+```python
+@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="API key required")
+```
+
+**Platform-Specific Tests**:
+```python
+@pytest.mark.skipif(platform.system() == "Windows", reason="Unix-specific test")
+```
+
+## Test Quality Metrics
+
+### Coverage Analysis
+
+**Current Test Coverage by Category**:
+- ✅ **Tool Functionality**: All 7 tools comprehensively tested
+- ✅ **Server Operations**: Complete request/response cycle coverage
+- ✅ **Security Validation**: Path safety and access control testing
+- ✅ **Collaboration Features**: AI-to-AI communication patterns
+- ✅ **Memory Management**: Conversation persistence and threading
+- ✅ **Error Handling**: Graceful degradation and error recovery
+
+### Test Reliability
+
+**Design Characteristics**:
+- **Deterministic**: Tests produce consistent results
+- **Isolated**: No test dependencies or shared state
+- **Fast**: Unit tests complete in milliseconds
+- **Comprehensive**: Edge cases and error conditions covered
+
+## Integration with Development Workflow
+
+### Test-Driven Development Support
+
+**TDD Cycle Integration**:
+1. **Red**: Write failing test for new functionality
+2. **Green**: Implement minimal code to pass test
+3. **Refactor**: Improve code while maintaining test coverage
+
+### Pre-Commit Testing
+
+**Quality Gates**:
+- **Security validation** before commits
+- **Functionality regression** prevention
+- **Code quality** maintenance
+- **Performance baseline** protection
+
+### CI/CD Integration
+
+**GitHub Actions Workflow**:
+- **Multi-Python version** testing (3.10, 3.11, 3.12)
+- **Parallel test execution** for efficiency
+- **Selective live testing** when API keys available
+- **Coverage reporting** and quality gates
+
+## Best Practices Demonstrated
+
+### 1. Comprehensive Mocking
+Every external dependency is properly mocked for reliable testing
+
+### 2. Security-First Approach
+Strong emphasis on security validation and vulnerability prevention
+
+### 3. Collaboration Testing
+Extensive testing of AI-to-AI communication and workflow patterns
+
+### 4. Real-World Scenarios
+Tests cover actual usage patterns and edge cases
+
+### 5. Maintainable Structure
+Clear organization and focused test files for easy maintenance
+
+## Recommendations for Contributors
+
+### Adding New Tests
+
+1. **Follow Naming Conventions**: Use descriptive test names that explain the scenario
+2. **Maintain Isolation**: Mock all external dependencies
+3. **Test Security**: Include path validation and security checks
+4. **Cover Edge Cases**: Test error conditions and boundary cases (see the sketch after this list)
+5. **Document Purpose**: Use docstrings to explain test objectives
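+
+A minimal sketch that applies these guidelines; `summarize` is a stand-in for real project code, shown only so the example is self-contained:
+
+```python
+# tests/test_new_feature.py -- illustrative only
+import pytest
+
+
+def summarize(text: str, limit: int = 10) -> str:
+    """Stand-in function under test; real tests import project code instead."""
+    if limit <= 0:
+        raise ValueError("limit must be positive")
+    return text[:limit]
+
+
+def test_summarize_truncates_to_limit():
+    """Documents the intent: output never exceeds the configured limit."""
+    assert len(summarize("a" * 50, limit=10)) == 10
+
+
+def test_summarize_rejects_invalid_limit():
+    """Edge case: invalid configuration should fail loudly, not silently."""
+    with pytest.raises(ValueError):
+        summarize("anything", limit=0)
+```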
**Maintain Isolation**: Mock all external dependencies +3. **Test Security**: Include path validation and security checks +4. **Cover Edge Cases**: Test error conditions and boundary cases +5. **Document Purpose**: Use docstrings to explain test objectives + +### Test Quality Standards + +1. **Fast Execution**: Unit tests should complete in milliseconds +2. **Predictable Results**: Tests should be deterministic +3. **Clear Assertions**: Use descriptive assertion messages +4. **Proper Cleanup**: Ensure tests don't leave side effects + +### Testing New Features + +1. **Start with Unit Tests**: Test individual components first +2. **Add Integration Tests**: Test component interactions +3. **Include Security Tests**: Validate security measures +4. **Test Collaboration**: If relevant, test AI-to-AI workflows + +--- + +This test structure demonstrates a mature, production-ready testing approach that ensures code quality, security, and reliability while supporting the collaborative AI development patterns that make this project unique. \ No newline at end of file diff --git a/docs/contributing/testing.md b/docs/contributing/testing.md index 31571fc..fca7e68 100644 --- a/docs/contributing/testing.md +++ b/docs/contributing/testing.md @@ -679,6 +679,12 @@ jobs: file: ./coverage.xml ``` +## Detailed Test Structure Analysis + +For a comprehensive analysis of the existing test suite, including detailed breakdowns of all 17 test files, security testing patterns, and collaboration feature validation, see: + +**[Test Structure Documentation](test-structure.md)** - Complete analysis of existing test organization, mocking strategies, and quality assurance patterns + --- This comprehensive testing strategy ensures high-quality, reliable code while maintaining development velocity and supporting the collaborative patterns defined in CLAUDE.md. \ No newline at end of file