# Testing Strategy & Guidelines

## Overview

This document outlines the comprehensive testing strategy for the Gemini MCP Server project, including unit testing, integration testing, and quality assurance practices that align with CLAUDE.md collaboration patterns.

## Testing Philosophy

### Test-Driven Development (TDD)

**TDD Cycle**:

1. **Red**: Write a failing test for the new functionality
2. **Green**: Implement the minimal code needed to pass the test
3. **Refactor**: Improve the code while maintaining test coverage
4. **Repeat**: Continue the cycle for all new features

**Example TDD Flow**:

```python
# 1. Write a failing test
def test_chat_tool_should_process_simple_prompt():
    tool = ChatTool()
    result = tool.execute({"prompt": "Hello"})
    assert result.status == "success"
    assert "hello" in result.content.lower()


# 2. Implement minimal functionality
class ChatTool:
    def execute(self, request):
        return ToolOutput(content="Hello!", status="success")


# 3. Refactor and enhance
```

### Testing Pyramid

```
        /\
       /  \       E2E Tests (Few, High-Value)
      /____\      Integration Tests (Some, Key Paths)
     /______\     Unit Tests (Many, Fast, Isolated)
    /________\
```

**Distribution**:
- **70% Unit Tests**: Fast, isolated, comprehensive coverage
- **20% Integration Tests**: Component interaction validation
- **10% End-to-End Tests**: Complete workflow validation

## Test Categories

### 1. Unit Tests

**Purpose**: Test individual functions and classes in isolation

**Location**: `tests/unit/`

**Example Structure**:

```python
# tests/unit/test_file_utils.py
import pytest
from unittest.mock import Mock, patch, mock_open

from exceptions import SecurityError
from utils.file_utils import validate_file_path, read_file_with_token_limit


class TestFileUtils:
    """Unit tests for file utility functions."""

    def test_validate_file_path_with_safe_path(self):
        """Test that safe file paths pass validation."""
        safe_path = "/workspace/tools/chat.py"
        assert validate_file_path(safe_path) is True

    def test_validate_file_path_with_traversal_attack(self):
        """Test that directory traversal attempts are blocked."""
        dangerous_path = "/workspace/../../../etc/passwd"
        with pytest.raises(SecurityError):
            validate_file_path(dangerous_path)

    @patch('builtins.open', new_callable=mock_open, read_data="test content")
    def test_read_file_with_token_limit(self, mock_file):
        """Test file reading with token budget enforcement."""
        content = read_file_with_token_limit("/test/file.py", max_tokens=100)
        assert "test content" in content
        mock_file.assert_called_once_with("/test/file.py", 'r', encoding='utf-8')
```

**Unit Test Guidelines**:
- **Isolation**: Mock external dependencies (file system, network, database)
- **Fast Execution**: Each test should complete in milliseconds
- **Single Responsibility**: One test per behavior/scenario
- **Descriptive Names**: Test names should describe the scenario and expected outcome (illustrated below)
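
These guidelines combine naturally in practice. The following is a hedged sketch reusing the helpers from the example above; the assertions are illustrative, not a specification of `read_file_with_token_limit`:

```python
# Illustrative sketch: one parametrized case per budget value, isolated from the real file system.
import pytest
from unittest.mock import mock_open, patch

from utils.file_utils import read_file_with_token_limit


@pytest.mark.parametrize("max_tokens", [1, 10, 100])
def test_read_file_with_token_limit_stays_within_budget(max_tokens):
    """Descriptive name: states the behavior under test and the expected outcome."""
    with patch('builtins.open', new_callable=mock_open, read_data="test content") as mock_file:
        content = read_file_with_token_limit("/test/file.py", max_tokens=max_tokens)

    assert isinstance(content, str)
    mock_file.assert_called_once()  # the file is read exactly once, never touched for real
```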

### 2. Integration Tests

**Purpose**: Test component interactions and system integration

**Location**: `tests/integration/`

**Example Structure**:

```python
# tests/integration/test_tool_execution.py
import pytest
import asyncio
from unittest.mock import patch

from server import call_tool
from tools.chat import ChatTool
from utils.conversation_memory import ConversationMemory


class TestToolExecution:
    """Integration tests for the tool execution pipeline."""

    @pytest.fixture
    def mock_redis(self):
        """Mock Redis for conversation memory testing."""
        with patch('redis.Redis') as mock:
            yield mock

    @pytest.fixture
    def conversation_memory(self, mock_redis):
        """Create conversation memory with mocked Redis."""
        return ConversationMemory("redis://mock")

    async def test_chat_tool_execution_with_memory(self, conversation_memory):
        """Test chat tool execution with conversation memory integration."""
        # Arrange
        request = {
            "name": "chat",
            "arguments": {
                "prompt": "Hello",
                "continuation_id": "test-thread-123"
            }
        }

        # Act
        result = await call_tool(request["name"], request["arguments"])

        # Assert
        assert len(result) == 1
        assert result[0].type == "text"
        assert "hello" in result[0].text.lower()

    async def test_tool_execution_error_handling(self):
        """Test error handling in the tool execution pipeline."""
        # Test with an invalid tool name
        with pytest.raises(ToolNotFoundError):
            await call_tool("nonexistent_tool", {})
```

**Integration Test Guidelines**:
- **Real Component Interaction**: Test actual component communication
- **Mock External Services**: Mock external APIs (Gemini, Redis) for reliability
- **Error Scenarios**: Test error propagation and handling
- **Async Testing**: Use pytest-asyncio for async test functions (see the sketch below)
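
The async examples above assume pytest-asyncio is active for these tests. A minimal sketch of the two common ways to opt a module in; the marker name comes from pytest-asyncio itself, and `asyncio_mode = auto` in pytest.ini is an alternative:

```python
# Sketch: enabling pytest-asyncio for coroutine tests.
import asyncio

import pytest

# Option 1: mark every test in the module at once.
pytestmark = pytest.mark.asyncio


async def test_event_loop_is_available():
    await asyncio.sleep(0)  # any awaited call works once the marker is active


# Option 2: mark individual coroutine tests explicitly (redundant here, shown for comparison).
@pytest.mark.asyncio
async def test_single_marked_coroutine():
    assert await asyncio.sleep(0) is None
```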

### 3. Live Integration Tests

**Purpose**: Test real API integration with external services

**Location**: `tests/live/`

**Requirements**:
- Valid `GEMINI_API_KEY` environment variable
- Redis server running (for conversation memory tests)
- Network connectivity

**Example Structure**:

```python
# tests/live/test_gemini_integration.py
import pytest
import os

from tools.chat import ChatTool
from tools.models import GeminiClient


@pytest.mark.live
@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="API key required")
class TestGeminiIntegration:
    """Live tests requiring actual Gemini API access."""

    def setup_method(self):
        """Set up for live testing."""
        self.api_key = os.getenv("GEMINI_API_KEY")
        self.client = GeminiClient(self.api_key)

    async def test_basic_gemini_request(self):
        """Test a basic Gemini API request/response."""
        response = await self.client.generate_response(
            prompt="Say 'test successful'",
            thinking_mode="minimal"
        )
        assert "test successful" in response.lower()

    async def test_chat_tool_with_real_api(self):
        """Test ChatTool with real Gemini API integration."""
        tool = ChatTool()
        result = await tool.execute({
            "prompt": "What is 2+2?",
            "thinking_mode": "minimal"
        })

        assert result.status == "success"
        assert "4" in result.content
```

**Live Test Guidelines**:
- **Skip When Unavailable**: Skip tests if API keys or external services are unavailable
- **Rate Limiting**: Respect API rate limits with delays (see the sketch below)
- **Minimal Mode**: Use minimal thinking mode for speed
- **Cleanup**: Clean up any created resources
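
One hedged way to pace live requests is an autouse fixture in `tests/live/conftest.py`; the file location and the one-second delay are assumptions, to be tuned to the actual Gemini quota:

```python
# Sketch: insert a short pause after each live test to stay under API rate limits.
import time

import pytest


@pytest.fixture(autouse=True)
def pause_between_live_tests():
    """Let the test run first, then wait briefly before the next live request."""
    yield
    time.sleep(1.0)  # assumed conservative gap, not a documented quota
```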

### 4. Security Tests

**Purpose**: Validate security measures and vulnerability prevention

**Location**: `tests/security/`

**Example Structure**:

```python
# tests/security/test_path_validation.py
import pytest

from utils.file_utils import validate_file_path
from exceptions import SecurityError


class TestSecurityValidation:
    """Security-focused tests for input validation."""

    @pytest.mark.parametrize("dangerous_path", [
        "../../../etc/passwd",
        "/etc/shadow",
        "~/../../root/.ssh/id_rsa",
        "/var/log/auth.log",
        "\\..\\..\\windows\\system32\\config\\sam"
    ])
    def test_dangerous_path_rejection(self, dangerous_path):
        """Test that dangerous file paths are rejected."""
        with pytest.raises(SecurityError):
            validate_file_path(dangerous_path)

    def test_secret_sanitization_in_logs(self):
        """Test that sensitive data is sanitized in log output."""
        request_data = {
            "prompt": "Hello",
            "api_key": "sk-secret123",
            "token": "bearer-token-456"
        }

        sanitized = sanitize_for_logging(request_data)

        assert sanitized["api_key"] == "[REDACTED]"
        assert sanitized["token"] == "[REDACTED]"
        assert sanitized["prompt"] == "Hello"  # Non-sensitive data preserved
```

## Test Configuration

### pytest Configuration

**pytest.ini**:

```ini
[pytest]
testpaths = tests
python_files = test_*.py *_test.py
python_classes = Test*
python_functions = test_*
addopts =
    -v
    --strict-markers
    --disable-warnings
    --cov=tools
    --cov=utils
    --cov-report=html
    --cov-report=term-missing
    --cov-fail-under=80

markers =
    unit: Unit tests (fast, isolated)
    integration: Integration tests (component interaction)
    live: Live tests requiring API keys and external services
    security: Security-focused tests
    slow: Tests that take more than 1 second
```
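
The markers declared above are applied in test code with ordinary pytest decorators; a small sketch with placeholder test bodies:

```python
# Sketch: applying the markers registered in pytest.ini.
import pytest

# Mark every test in this module as a unit test.
pytestmark = pytest.mark.unit


def test_fast_isolated_behavior():
    assert 1 + 1 == 2


# Individual tests can stack additional markers, e.g. for the "slow" category.
@pytest.mark.slow
def test_expensive_code_path():
    assert sum(range(1_000_000)) >= 0
```

Because `--strict-markers` is enabled, any marker not listed under `markers` is rejected at collection time, which keeps the category names consistent across the suite.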

**conftest.py**:

```python
# tests/conftest.py
import pytest
import asyncio
from unittest.mock import Mock, patch


@pytest.fixture(scope="session")
def event_loop():
    """Create an instance of the default event loop for the test session."""
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()


@pytest.fixture
def mock_gemini_client():
    """Mock Gemini client for testing without API calls."""
    with patch('tools.models.GeminiClient') as mock:
        mock_instance = Mock()
        mock_instance.generate_response.return_value = "Mocked response"
        mock.return_value = mock_instance
        yield mock_instance


@pytest.fixture
def mock_redis():
    """Mock Redis client for testing without a Redis server."""
    with patch('redis.Redis') as mock:
        yield mock


@pytest.fixture
def sample_file_content():
    """Sample file content for testing file processing."""
    return """
def example_function():
    # This is a sample function
    return "hello world"


class ExampleClass:
    def method(self):
        pass
"""


@pytest.fixture
def temp_project_directory(tmp_path):
    """Create a temporary project directory structure for testing."""
    project_dir = tmp_path / "test_project"
    project_dir.mkdir()

    # Create subdirectories
    (project_dir / "tools").mkdir()
    (project_dir / "utils").mkdir()
    (project_dir / "tests").mkdir()

    # Create sample files
    (project_dir / "tools" / "sample.py").write_text("# Sample tool")
    (project_dir / "utils" / "helper.py").write_text("# Helper utility")

    return project_dir
```

## Test Data Management

### Test Fixtures

**File-based Fixtures**:

```python
# tests/fixtures/sample_code.py
PYTHON_CODE_SAMPLE = '''
import asyncio
from typing import Dict, List


async def process_data(items: List[str]) -> Dict[str, int]:
    """Process a list of items and return counts."""
    result = {}
    for item in items:
        result[item] = len(item)
    return result
'''

JAVASCRIPT_CODE_SAMPLE = '''
async function processData(items) {
    const result = {};
    for (const item of items) {
        result[item] = item.length;
    }
    return result;
}
'''

ERROR_LOGS_SAMPLE = '''
2025-01-11 23:45:12 ERROR [tool_execution] Tool 'analyze' failed: File not found
Traceback (most recent call last):
  File "/app/tools/analyze.py", line 45, in execute
    content = read_file(file_path)
  File "/app/utils/file_utils.py", line 23, in read_file
    with open(file_path, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/nonexistent/file.py'
'''
```
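
Tests import these constants directly; a brief sketch, assuming the `tests` directory is importable as a package:

```python
# Sketch: consuming a file-based fixture in a test.
from tests.fixtures.sample_code import PYTHON_CODE_SAMPLE


def test_python_sample_exposes_async_function():
    """The sample should contain the symbols the analysis tools are exercised against."""
    assert "async def process_data" in PYTHON_CODE_SAMPLE
    assert "List[str]" in PYTHON_CODE_SAMPLE
```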

### Mock Data Factories

**ToolOutput Factory**:

```python
# tests/factories.py
from typing import Dict, Any, List


def create_tool_output(
    content: str = "Default response",
    status: str = "success",
    metadata: Dict[str, Any] = None,
    files_processed: List[str] = None
) -> ToolOutput:
    """Factory for creating ToolOutput test instances."""
    return ToolOutput(
        content=content,
        metadata=metadata or {},
        files_processed=files_processed or [],
        status=status
    )


def create_thread_context(
    thread_id: str = "test-thread-123",
    files: List[str] = None
) -> ThreadContext:
    """Factory for creating ThreadContext test instances."""
    return ThreadContext(
        thread_id=thread_id,
        conversation_files=set(files or []),
        tool_history=[],
        context_tokens=0
    )
```
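
In tests, the factories keep the arrange step to a single line; a usage sketch based on the default values defined above:

```python
# Sketch: building test data through the factories instead of hand-written literals.
from tests.factories import create_thread_context, create_tool_output


def test_factories_supply_sensible_defaults():
    output = create_tool_output()
    assert output.status == "success"
    assert output.files_processed == []

    context = create_thread_context(files=["tools/chat.py"])
    assert "tools/chat.py" in context.conversation_files
```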

## Mocking Strategies

### External Service Mocking

**Gemini API Mocking**:

```python
from typing import Dict
from unittest.mock import patch


class MockGeminiClient:
    """Mock Gemini client for testing."""

    def __init__(self, responses: Dict[str, str] = None):
        self.responses = responses or {
            "default": "This is a mocked response from Gemini"
        }
        self.call_count = 0

    async def generate_response(self, prompt: str, **kwargs) -> str:
        """Mock response generation."""
        self.call_count += 1

        # Return a specific response for matching prompts
        for key, response in self.responses.items():
            if key in prompt.lower():
                return response

        return self.responses.get("default", "Mock response")


# Usage in tests
@patch('tools.models.GeminiClient', MockGeminiClient)
def test_with_mocked_gemini():
    # Test implementation
    pass
```

**File System Mocking**:

```python
from unittest.mock import mock_open, patch

from utils.file_utils import read_file


@patch('builtins.open', mock_open(read_data="file content"))
@patch('os.path.exists', return_value=True)
@patch('os.path.getsize', return_value=1024)
def test_file_operations(mock_getsize, mock_exists):
    """Test file operations with a mocked file system."""
    content = read_file("/mocked/file.py")
    assert content == "file content"
```
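
When real file-system behavior matters more than raw speed, the same helper can be exercised against pytest's built-in `tmp_path` fixture instead of `mock_open`, mirroring the `temp_project_directory` fixture in conftest.py. A brief sketch, assuming `read_file` accepts a path string as in the example above:

```python
# Sketch: testing read_file against a real temporary file rather than a mocked one.
from utils.file_utils import read_file


def test_read_file_with_real_temp_file(tmp_path):
    """tmp_path provides an isolated directory that pytest cleans up automatically."""
    sample = tmp_path / "sample.py"
    sample.write_text("# temporary sample", encoding="utf-8")

    content = read_file(str(sample))

    assert "temporary sample" in content
```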

## Performance Testing

### Load Testing

**Concurrent Tool Execution**:

```python
# tests/performance/test_load.py
import asyncio
import pytest
import time

from server import call_tool


@pytest.mark.slow
class TestPerformance:
    """Performance tests for system load handling."""

    async def test_concurrent_tool_execution(self):
        """Test system performance under concurrent load."""
        start_time = time.time()

        # Create 10 concurrent tool execution tasks
        tasks = []
        for i in range(10):
            task = asyncio.create_task(
                call_tool("chat", {"prompt": f"Request {i}"})
            )
            tasks.append(task)

        # Wait for all tasks to complete
        results = await asyncio.gather(*tasks)

        end_time = time.time()
        execution_time = end_time - start_time

        # Verify all requests succeeded
        assert len(results) == 10
        assert all(len(result) == 1 for result in results)

        # Performance assertion (adjust based on requirements)
        assert execution_time < 30.0  # All requests should complete within 30s

    async def test_memory_usage_stability(self):
        """Test that memory usage remains stable under load."""
        import psutil
        import gc

        process = psutil.Process()
        initial_memory = process.memory_info().rss

        # Execute multiple operations
        for i in range(100):
            await call_tool("chat", {"prompt": f"Memory test {i}"})

            # Force garbage collection periodically
            if i % 10 == 0:
                gc.collect()

        final_memory = process.memory_info().rss
        memory_growth = final_memory - initial_memory

        # Memory growth should be reasonable (adjust threshold as needed)
        assert memory_growth < 100 * 1024 * 1024  # Less than 100MB growth
```

## Test Execution

### Running Tests

**Basic Test Execution**:

```bash
# Run all tests
pytest

# Run specific test categories
pytest -m unit                 # Unit tests only
pytest -m integration          # Integration tests only
pytest -m "not live"           # All tests except live tests
pytest -m "live and not slow"  # Live tests that are fast

# Run with coverage
pytest --cov=tools --cov=utils --cov-report=html

# Run a specific test file
pytest tests/unit/test_file_utils.py -v

# Run a specific test method
pytest tests/unit/test_file_utils.py::TestFileUtils::test_validate_file_path_with_safe_path -v
```

**Continuous Integration**:

```bash
#!/bin/bash
# CI test script
set -e

echo "Running unit tests..."
pytest -m unit --cov=tools --cov=utils --cov-fail-under=80

echo "Running integration tests..."
pytest -m integration

echo "Running security tests..."
pytest -m security

echo "Checking code quality..."
flake8 tools/ utils/ tests/
mypy tools/ utils/

echo "All tests passed!"
```

### Test Reports

**Coverage Reports**:

```bash
# Generate an HTML coverage report
pytest --cov=tools --cov=utils --cov-report=html
open htmlcov/index.html

# Generate a terminal coverage report
pytest --cov=tools --cov=utils --cov-report=term-missing
```

**Test Results Export**:

```bash
# Export test results to JUnit XML (for CI integration)
pytest --junitxml=test-results.xml

# Report timing information for the slowest tests
pytest --durations=10  # Show the 10 slowest tests
```

## Quality Metrics

### Coverage Targets

**Minimum Coverage Requirements**:
- **Overall Coverage**: 80%
- **Critical Modules**: 90% (security, file_utils, conversation_memory)
- **Tool Modules**: 85%
- **Utility Modules**: 80%

**Coverage Enforcement**:

```bash
# Fail the build if coverage drops below the threshold
pytest --cov-fail-under=80
```

### Test Quality Metrics

**Test Suite Characteristics**:
- **Fast Execution**: The unit test suite should complete in under 30 seconds
- **Reliable**: Tests should have a flaky failure rate below 1%
- **Maintainable**: Test code should follow the same quality standards as production code
- **Comprehensive**: All critical paths and edge cases covered

## Integration with Development Workflow

### Pre-commit Testing

**Git Hook Integration**:

```bash
#!/bin/sh
# .git/hooks/pre-commit

echo "Running pre-commit tests..."

# Run fast tests before commit
pytest -m "unit and not slow" --cov-fail-under=80

if [ $? -ne 0 ]; then
    echo "Tests failed. Commit blocked."
    exit 1
fi

echo "Pre-commit tests passed."
```

### CI/CD Integration

**GitHub Actions Workflow**:

```yaml
name: Test Suite
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest pytest-cov pytest-asyncio

      - name: Run unit tests
        run: pytest -m unit --cov=tools --cov=utils --cov-report=xml --cov-fail-under=80

      - name: Run integration tests
        run: pytest -m integration

      - name: Run security tests
        run: pytest -m security

      - name: Upload coverage reports
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
```

## Detailed Test Structure Analysis

For a comprehensive analysis of the existing test suite, including detailed breakdowns of all 17 test files, security testing patterns, and collaboration feature validation, see:

**[Test Structure Documentation](test-structure.md)** - Complete analysis of existing test organization, mocking strategies, and quality assurance patterns

---

This comprehensive testing strategy ensures high-quality, reliable code while maintaining development velocity and supporting the collaborative patterns defined in CLAUDE.md.