adding detailed docs

docs/architecture/components.md (new file, 379 lines)

# System Components & Interactions

## Component Architecture

The Gemini MCP Server is built on a modular component architecture that enables sophisticated AI collaboration patterns while maintaining security and performance.

## Core Components

### 1. MCP Protocol Engine

**Location**: `server.py:45-120`
**Purpose**: Central communication hub implementing the Model Context Protocol specification

**Key Responsibilities**:
- **Protocol Compliance**: Implements the MCP v1.0 specification for Claude integration
- **Message Routing**: Dispatches requests to appropriate tool handlers
- **Error Handling**: Graceful degradation and error response formatting
- **Lifecycle Management**: Server startup, shutdown, and resource cleanup

**Implementation Details**:
```python
# server.py:67
@server.list_tools()
async def list_tools() -> list[types.Tool]:
    """Dynamic tool discovery and registration"""
    return [tool.get_schema() for tool in REGISTERED_TOOLS]


@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """Tool execution with error handling and response formatting"""
```

**Dependencies**:
- `mcp` library for protocol implementation
- `asyncio` for concurrent request processing
- Tool registry for dynamic handler discovery
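
For orientation, the lifecycle wiring around these handlers looks roughly like the following sketch. It assumes the standard `mcp` Python SDK stdio entry points; the server name string is illustrative, not taken from `server.py`.

```python
# Sketch of the server lifecycle wiring (not the project's exact code);
# assumes the standard `mcp` Python SDK stdio transport.
import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server

server = Server("gemini-mcp-server")  # hypothetical server name


async def main() -> None:
    # stdio_server() yields the read/write streams Claude connects over;
    # server.run() drives the request/response loop until shutdown.
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )


if __name__ == "__main__":
    asyncio.run(main())
```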

### 2. Tool Architecture System

**Location**: `tools/` directory
**Purpose**: Modular plugin system for specialized AI capabilities

#### BaseTool Abstract Class (`tools/base.py:25`)

**Interface Contract**:
```python
class BaseTool(ABC):
    @abstractmethod
    async def execute(self, request: dict) -> ToolOutput:
        """Core tool execution logic"""

    @abstractmethod
    def get_schema(self) -> types.Tool:
        """MCP tool schema definition"""

    def _format_response(self, content: str, metadata: dict) -> ToolOutput:
        """Standardized response formatting"""
```
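
`ToolOutput` is referenced throughout this contract but its definition is not shown here. A plausible shape, assuming a pydantic model (pydantic is listed under Tool Dependencies below); the field names are illustrative:

```python
# Hypothetical shape for ToolOutput; the real model lives in tools/models.py.
from typing import Optional

from pydantic import BaseModel


class ToolOutput(BaseModel):
    content: str                            # formatted response text
    metadata: dict = {}                     # tool-specific details (tokens used, etc.)
    continuation_id: Optional[str] = None   # thread UUID for follow-up calls
```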

#### Individual Tool Components

**Chat Tool** (`tools/chat.py:30`)
- **Purpose**: Quick questions and general collaboration
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Brainstorming, simple explanations, immediate answers

**ThinkDeep Tool** (`tools/thinkdeep.py:45`)
- **Purpose**: Complex analysis and strategic planning
- **Thinking Mode**: Default 'high' (16384 tokens)
- **Use Cases**: Architecture decisions, design exploration, comprehensive analysis

**CodeReview Tool** (`tools/codereview.py:60`)
- **Purpose**: Code quality and security analysis
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Bug detection, security audits, quality validation

**Analyze Tool** (`tools/analyze.py:75`)
- **Purpose**: Codebase exploration and understanding
- **Thinking Mode**: Variable based on scope
- **Use Cases**: Dependency analysis, pattern detection, system comprehension

**Debug Tool** (`tools/debug.py:90`)
- **Purpose**: Error investigation and root cause analysis
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Stack trace analysis, bug diagnosis, performance issues

**Precommit Tool** (`tools/precommit.py:105`)
- **Purpose**: Automated quality gates and validation
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Pre-commit validation, change analysis, quality assurance
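
For concreteness, a minimal sketch of what one of these tools looks like against the `BaseTool` contract above. This is illustrative, not the actual `tools/chat.py` implementation; `gemini_client` stands in for however the project injects its API client.

```python
# Illustrative BaseTool subclass; not the actual tools/chat.py code.
import mcp.types as types


class ChatTool(BaseTool):
    name = "chat"
    default_thinking_mode = "medium"  # 8192 thinking tokens

    def get_schema(self) -> types.Tool:
        return types.Tool(
            name=self.name,
            description="Quick questions and general collaboration",
            inputSchema={
                "type": "object",
                "properties": {
                    "prompt": {"type": "string"},
                    "thinking_mode": {"type": "string", "default": "medium"},
                },
                "required": ["prompt"],
            },
        )

    async def execute(self, request: dict) -> ToolOutput:
        mode = request.get("thinking_mode", self.default_thinking_mode)
        # gemini_client is an assumed module-level client (see section 6).
        reply = await gemini_client.generate_response(
            prompt=request["prompt"], thinking_mode=mode
        )
        return self._format_response(reply, {"tool": self.name, "mode": mode})
```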

### 3. Security Engine

**Location**: `utils/file_utils.py:45-120`
**Purpose**: Multi-layer security validation and enforcement

#### Security Components

**Path Validation System**:
```python
# utils/file_utils.py:67
def validate_file_path(file_path: str) -> bool:
    """Multi-layer path security validation"""
    # 1. Dangerous path detection
    dangerous_patterns = ['../', '~/', '/etc/', '/var/', '/usr/']
    if any(pattern in file_path for pattern in dangerous_patterns):
        return False

    # 2. Absolute path requirement
    if not os.path.isabs(file_path):
        return False

    # 3. Sandbox boundary enforcement
    return file_path.startswith(PROJECT_ROOT)
```

**Docker Path Translation**:
```python
# utils/file_utils.py:89
def translate_docker_path(host_path: str) -> str:
    """Convert host paths to container paths for Docker environment"""
    if host_path.startswith(WORKSPACE_ROOT):
        return host_path.replace(WORKSPACE_ROOT, '/workspace', 1)
    return host_path
```

**Security Layers**:
1. **Input Sanitization**: Path cleaning and normalization
2. **Pattern Matching**: Dangerous path detection and blocking
3. **Boundary Enforcement**: PROJECT_ROOT containment validation
4. **Container Translation**: Safe host-to-container path mapping
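
The four layers compose into a single checkpoint that callers can invoke before touching disk. A sketch under the same names as above; `resolve_safe_path` itself is illustrative, not a function from `file_utils.py`:

```python
# Sketch composing the layers above into one checkpoint; PROJECT_ROOT and
# WORKSPACE_ROOT come from config.py, the helpers from this module.
import os
from typing import Optional


def resolve_safe_path(raw_path: str) -> Optional[str]:
    """Return a validated container path, or None if the path is rejected."""
    # Layer 4 first: map host paths into the container namespace,
    # then normalize before the pattern and boundary checks run.
    path = os.path.normpath(translate_docker_path(raw_path))
    if not validate_file_path(path):  # layers 1-3
        return None
    return path
```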

### 4. Conversation Memory System

**Location**: `utils/conversation_memory.py:30-150`
**Purpose**: Cross-session context preservation and threading

#### Memory Components

**Thread Context Management**:
```python
# utils/conversation_memory.py:45
class ThreadContext:
    thread_id: str
    tool_history: List[ToolExecution]
    conversation_files: Set[str]
    context_tokens: int
    created_at: datetime
    last_accessed: datetime
```

**Redis Integration**:
```python
# utils/conversation_memory.py:78
class ConversationMemory:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)

    async def store_thread(self, context: ThreadContext) -> None:
        """Persist conversation thread to Redis"""

    async def retrieve_thread(self, thread_id: str) -> Optional[ThreadContext]:
        """Reconstruct conversation from storage"""

    async def cleanup_expired_threads(self) -> int:
        """Remove old conversations to manage memory"""
```

**Memory Features**:
- **Thread Persistence**: UUID-based conversation storage
- **Context Reconstruction**: Full conversation history retrieval
- **File Deduplication**: Efficient storage of repeated file references
- **Automatic Cleanup**: Time-based thread expiration
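
Putting the pieces together, a typical store/retrieve round trip might look like this. A sketch only: it assumes `ThreadContext` is a dataclass (the excerpt above shows only its fields) and that `REDIS_URL` comes from `config.py`.

```python
# Usage sketch for ConversationMemory; assumes ThreadContext is a dataclass.
import uuid
from datetime import datetime

memory = ConversationMemory(REDIS_URL)


async def demo() -> None:
    context = ThreadContext(
        thread_id=str(uuid.uuid4()),
        tool_history=[],
        conversation_files=set(),
        context_tokens=0,
        created_at=datetime.now(),
        last_accessed=datetime.now(),
    )
    await memory.store_thread(context)

    # Later, any tool can rehydrate the conversation by UUID.
    restored = await memory.retrieve_thread(context.thread_id)
```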

### 5. File Processing Pipeline

**Location**: `utils/file_utils.py:120-200`
**Purpose**: Token-aware file reading and content optimization

#### Processing Components

**Priority System**:
```python
# utils/file_utils.py:134
FILE_PRIORITIES = {
    '.py': 1,    # Python source code (highest priority)
    '.js': 1,    # JavaScript source
    '.ts': 1,    # TypeScript source
    '.md': 2,    # Documentation
    '.txt': 3,   # Text files
    '.log': 4,   # Log files (lowest priority)
}
```

**Token Management**:
```python
# utils/file_utils.py:156
def read_file_with_token_limit(file_path: str, max_tokens: int) -> str:
    """Read file content with token budget enforcement"""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        # Token estimation and truncation
        estimated_tokens = len(content) // 4  # Rough estimation
        if estimated_tokens > max_tokens:
            # Truncate with preservation of structure
            content = content[:max_tokens * 4]

        return format_file_content(content, file_path)
    except Exception as e:
        return f"Error reading {file_path}: {str(e)}"
```

**Content Formatting**:
- **Line Numbers**: Added for precise code references
- **Error Handling**: Graceful failure with informative messages
- **Structure Preservation**: Maintains code formatting and indentation
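
`format_file_content` is called by the reader above but not shown in the source. A minimal sketch consistent with the line-number formatting described here; the `=== path ===` header follows the convention used in the data-flow document:

```python
# Plausible sketch of format_file_content (referenced above, not shown in
# the source); adds line numbers for precise code references.
def format_file_content(content: str, file_path: str) -> str:
    numbered = "\n".join(
        f"{i:6d}\t{line}" for i, line in enumerate(content.split("\n"), 1)
    )
    return f"=== {file_path} ===\n{numbered}"
```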

### 6. Gemini API Integration

**Location**: `tools/models.py:25-80`
**Purpose**: Standardized interface to Google's Gemini models

#### Integration Components

**API Client**:
```python
# tools/models.py:34
class GeminiClient:
    def __init__(self, api_key: str, model: str = "gemini-2.0-flash-thinking-exp"):
        self.client = genai.GenerativeModel(model)
        self.api_key = api_key

    async def generate_response(self,
                                prompt: str,
                                thinking_mode: str = 'medium',
                                files: Optional[List[str]] = None) -> str:
        """Generate response with thinking mode and file context"""
```

**Model Configuration**:
```python
# config.py:24
GEMINI_MODEL = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
MAX_CONTEXT_TOKENS = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))
```

**Thinking Mode Management**:
```python
# tools/models.py:67
THINKING_MODE_TOKENS = {
    'minimal': 128,
    'low': 2048,
    'medium': 8192,
    'high': 16384,
    'max': 32768
}
```
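
A call site then looks roughly like the following sketch; it assumes a client configured with the `GEMINI_API_KEY` setting from `config.py` and the `GeminiClient` shape shown above.

```python
# Usage sketch; assumes GeminiClient from above with GEMINI_API_KEY set.
client = GeminiClient(api_key=GEMINI_API_KEY)


async def ask() -> str:
    return await client.generate_response(
        prompt="Summarize the module's responsibilities",
        thinking_mode="high",                    # 16384 thinking tokens
        files=["/workspace/tools/base.py"],      # validated paths only
    )
```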

## Component Interactions

### 1. Request Processing Flow

```
Claude Request
      ↓
MCP Protocol Engine (server.py:67)
      ↓ (validate & route)
Tool Selection & Loading
      ↓
Security Validation (utils/file_utils.py:67)
      ↓ (if files involved)
File Processing Pipeline (utils/file_utils.py:134)
      ↓
Conversation Context Loading (utils/conversation_memory.py:78)
      ↓ (if continuation_id provided)
Gemini API Integration (tools/models.py:34)
      ↓
Response Processing & Formatting
      ↓
Conversation Storage (utils/conversation_memory.py:78)
      ↓
MCP Response to Claude
```

### 2. Security Integration Points

**Pre-Tool Execution**:
- Path validation before any file operations
- Sandbox boundary enforcement
- Docker path translation for container environments

**During Tool Execution**:
- Token budget enforcement to prevent memory exhaustion
- File access logging and monitoring
- Error containment and graceful degradation

**Post-Tool Execution**:
- Response sanitization
- Conversation storage with access controls
- Resource cleanup and memory management
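
These integration points can be read as a wrapper around every tool call. A sketch of the ordering; the wrapper itself is illustrative, and `sanitize_request_data`/`sanitize_response_data` refer to the functions shown in the data-flow document:

```python
# Illustrative wrapper showing where the security hooks fire; not verbatim
# server code. Sanitizers are the ones defined in the data-flow document.
async def execute_tool_securely(tool: BaseTool, request: dict) -> ToolOutput:
    # Pre-execution: schema checks, path validation, Docker translation.
    request = sanitize_request_data(request)

    try:
        # During execution: token budgets and error containment apply inside.
        output = await tool.execute(request)
    except Exception as exc:
        # Error boundary: contain the failure instead of crashing the server.
        return ToolOutput(content=f"Tool failed: {exc}", metadata={"error": True})

    # Post-execution: scrub secrets before anything is stored or returned.
    output.content = sanitize_response_data(output.content)
    return output
```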

### 3. Memory System Integration

**Thread Creation**:
```python
# New conversation
thread_id = str(uuid.uuid4())
context = ThreadContext(thread_id=thread_id, ...)
await memory.store_thread(context)
```

**Thread Continuation**:
```python
# Continuing conversation
if continuation_id:
    context = await memory.retrieve_thread(continuation_id)
    # Merge new request with existing context
```

**Cross-Tool Communication**:
```python
# Tool A stores findings
await memory.add_tool_execution(thread_id, tool_execution)

# Tool B retrieves context
context = await memory.retrieve_thread(thread_id)
previous_findings = context.get_tool_outputs('analyze')
```

## Configuration & Dependencies

### Environment Configuration

**Required Settings** (`config.py`):
```python
GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')  # Required
GEMINI_MODEL = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
PROJECT_ROOT = os.getenv('PROJECT_ROOT', '/workspace')
REDIS_URL = os.getenv('REDIS_URL', 'redis://localhost:6379')
MAX_CONTEXT_TOKENS = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))
```

### Component Dependencies

**Core Dependencies**:
- `mcp`: MCP protocol implementation
- `google-generativeai`: Gemini API client
- `redis`: Conversation persistence
- `asyncio`: Concurrent processing (standard library)

**Security Dependencies**:
- `pathlib`: Path manipulation and validation
- `os`: File system operations and environment access

**Tool Dependencies**:
- `pydantic`: Data validation and serialization
- `typing`: Type hints and contract definition

## Extension Architecture

### Adding New Components

1. **Tool Components**: Inherit from BaseTool and implement the required interface
2. **Security Components**: Extend the validation chain in `file_utils.py`
3. **Memory Components**: Add new storage backends via interface abstraction
4. **Processing Components**: Extend the file pipeline with new content types

### Integration Patterns

- **Plugin Architecture**: Dynamic discovery and registration
- **Interface Segregation**: Clear contracts between components
- **Dependency Injection**: Configuration-driven component assembly
- **Error Boundaries**: Isolated failure handling per component

---

This component architecture provides a robust foundation for AI collaboration while maintaining security, performance, and extensibility requirements.

docs/architecture/data-flow.md (new file, 545 lines)

# Data Flow & Processing Patterns

## Overview

The Gemini MCP Server implements sophisticated data flow patterns that enable secure, efficient, and context-aware AI collaboration. This document traces data movement through the system with concrete examples and performance considerations.

## Primary Data Flow Patterns

### 1. Standard Tool Execution Flow

```mermaid
sequenceDiagram
    participant C as Claude
    participant M as MCP Engine
    participant S as Security Layer
    participant T as Tool Handler
    participant G as Gemini API
    participant R as Redis Memory

    C->>M: MCP Request (tool_name, params)
    M->>M: Validate Request Schema
    M->>S: Security Validation
    S->>S: Path Validation & Sanitization
    S->>T: Secure Parameters
    T->>R: Load Conversation Context
    R-->>T: Thread Context (if exists)
    T->>T: Process Files & Context
    T->>G: Formatted Prompt + Context
    G-->>T: AI Response
    T->>R: Store Execution Result
    T->>M: Formatted Tool Output
    M->>C: MCP Response
```

**Example Request Flow**:
```json
// Claude → MCP Engine
{
  "method": "tools/call",
  "params": {
    "name": "analyze",
    "arguments": {
      "files": ["/workspace/tools/analyze.py"],
      "question": "Explain the architecture pattern",
      "continuation_id": "550e8400-e29b-41d4-a716-446655440000"
    }
  }
}
```

### 2. File Processing Pipeline

#### Stage 1: Security Validation (`utils/file_utils.py:67`)

```python
# Input: ["/workspace/tools/analyze.py", "../../../etc/passwd"]
def validate_file_paths(file_paths: List[str]) -> List[str]:
    validated = []
    for path in file_paths:
        # 1. Dangerous pattern detection
        if any(danger in path for danger in ['../', '~/', '/etc/', '/var/']):
            logger.warning(f"Blocked dangerous path: {path}")
            continue

        # 2. Absolute path requirement
        if not os.path.isabs(path):
            path = os.path.abspath(path)

        # 3. Sandbox boundary check
        if not path.startswith(PROJECT_ROOT):
            logger.warning(f"Path outside sandbox: {path}")
            continue

        validated.append(path)

    return validated
# Output: ["/workspace/tools/analyze.py"]
```

#### Stage 2: Docker Path Translation (`utils/file_utils.py:89`)

```python
# Host Environment: /Users/user/project/tools/analyze.py
# Container Environment: /workspace/tools/analyze.py
def translate_paths_for_environment(paths: List[str]) -> List[str]:
    translated = []
    for path in paths:
        if WORKSPACE_ROOT and path.startswith(WORKSPACE_ROOT):
            container_path = path.replace(WORKSPACE_ROOT, '/workspace', 1)
            translated.append(container_path)
        else:
            translated.append(path)
    return translated
```

#### Stage 3: Priority-Based Processing (`utils/file_utils.py:134`)

```python
# File Priority Matrix
FILE_PRIORITIES = {
    '.py': 1,    # Source code (highest priority)
    '.js': 1, '.ts': 1, '.tsx': 1,
    '.md': 2,    # Documentation
    '.json': 2, '.yaml': 2, '.yml': 2,
    '.txt': 3,   # Text files
    '.log': 4,   # Logs (lowest priority)
}


# Token Budget Allocation
def allocate_token_budget(files: List[str], total_budget: int) -> Dict[str, int]:
    # Priority 1 files get 60% of budget
    # Priority 2 files get 30% of budget
    # Priority 3+ files get 10% of budget

    priority_groups = defaultdict(list)
    for file in files:
        ext = Path(file).suffix.lower()
        priority = FILE_PRIORITIES.get(ext, 4)
        priority_groups[priority].append(file)

    allocations = {}
    if priority_groups[1]:  # Source code files
        code_budget = int(total_budget * 0.6)
        per_file = code_budget // len(priority_groups[1])
        for file in priority_groups[1]:
            allocations[file] = per_file

    if priority_groups[2]:  # Documentation files
        doc_budget = int(total_budget * 0.3)
        per_file = doc_budget // len(priority_groups[2])
        for file in priority_groups[2]:
            allocations[file] = per_file

    # Priority 3+ files share the remaining 10%
    low_priority = [f for p, group in priority_groups.items() if p >= 3 for f in group]
    if low_priority:
        rest_budget = int(total_budget * 0.1)
        per_file = rest_budget // len(low_priority)
        for file in low_priority:
            allocations[file] = per_file

    return allocations
```

#### Stage 4: Content Processing & Formatting

```python
def process_file_content(file_path: str, token_limit: int) -> str:
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        # Token estimation (rough: 1 token ≈ 4 characters)
        estimated_tokens = len(content) // 4

        if estimated_tokens > token_limit:
            # Smart truncation preserving structure
            lines = content.split('\n')
            truncated_lines = []
            current_tokens = 0

            for line in lines:
                line_tokens = len(line) // 4
                if current_tokens + line_tokens > token_limit:
                    break
                truncated_lines.append(line)
                current_tokens += line_tokens

            content = '\n'.join(truncated_lines)
            content += f"\n\n... [Truncated at {token_limit} tokens]"

        # Format with line numbers for precise references
        lines = content.split('\n')
        formatted_lines = []
        for i, line in enumerate(lines, 1):
            formatted_lines.append(f"{i:6d}\t{line}")

        return '\n'.join(formatted_lines)

    except Exception as e:
        return f"Error reading {file_path}: {str(e)}"
```

### 3. Conversation Memory Flow

#### Context Storage Pattern (`utils/conversation_memory.py:78`)

```python
# Tool execution creates persistent context (a method on ConversationMemory)
async def store_tool_execution(self, thread_id: str, tool_execution: ToolExecution):
    context = await self.retrieve_thread(thread_id) or ThreadContext(thread_id)

    # Add new execution to history
    context.tool_history.append(tool_execution)

    # Update file set (deduplication)
    if tool_execution.files:
        context.conversation_files.update(tool_execution.files)

    # Update token tracking
    context.context_tokens += tool_execution.response_tokens
    context.last_accessed = datetime.now()

    # Persist to Redis
    await self.redis.setex(
        f"thread:{thread_id}",
        timedelta(hours=24),  # 24-hour expiration
        context.to_json()
    )
```
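
`retrieve_thread`, used above, is the inverse operation. A sketch of it as another `ConversationMemory` method, pairing with the `to_json`/`from_json` serialization shown under Data Persistence Patterns below:

```python
# Sketch of the retrieval side; pairs with ThreadContext.from_json below.
async def retrieve_thread(self, thread_id: str) -> Optional[ThreadContext]:
    raw = await self.redis.get(f"thread:{thread_id}")
    if raw is None:
        return None
    return ThreadContext.from_json(raw)
```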

#### Context Retrieval & Reconstruction

```python
async def build_conversation_context(self, thread_id: str) -> str:
    context = await self.retrieve_thread(thread_id)
    if not context:
        return ""

    # Build conversation summary
    summary_parts = []

    # Add file context (deduplicated)
    if context.conversation_files:
        summary_parts.append("## Previous Files Analyzed:")
        for file_path in sorted(context.conversation_files):
            summary_parts.append(f"- {file_path}")

    # Add tool execution history
    if context.tool_history:
        summary_parts.append("\n## Previous Analysis:")
        for execution in context.tool_history[-3:]:  # Last 3 executions
            summary_parts.append(f"**{execution.tool_name}**: {execution.summary}")

    return '\n'.join(summary_parts)
```

### 4. Thinking Mode Processing

#### Dynamic Token Allocation (`tools/models.py:67`)

```python
# Thinking mode determines computational budget
THINKING_MODE_TOKENS = {
    'minimal': 128,    # Quick answers, simple questions
    'low': 2048,       # Basic analysis, straightforward tasks
    'medium': 8192,    # Standard analysis, moderate complexity
    'high': 16384,     # Deep analysis, complex problems
    'max': 32768       # Maximum depth, critical decisions
}


def prepare_gemini_request(prompt: str, thinking_mode: str, files: List[str]) -> dict:
    # Calculate total context budget
    thinking_tokens = THINKING_MODE_TOKENS.get(thinking_mode, 8192)
    file_tokens = MAX_CONTEXT_TOKENS - thinking_tokens - 1000  # Reserve for response

    # Process files within budget
    file_content = process_files_with_budget(files, file_tokens)

    # Construct final prompt
    full_prompt = f"""
{prompt}

## Available Context ({thinking_tokens} thinking tokens allocated)

{file_content}

Please analyze using {thinking_mode} thinking mode.
"""

    return {
        'prompt': full_prompt,
        'max_tokens': thinking_tokens,
        'temperature': 0.2 if thinking_mode in ['high', 'max'] else 0.5
    }
```

## Advanced Data Flow Patterns

### 1. Cross-Tool Continuation Flow

```python
# Tool A (analyze) creates foundation
analyze_result = await analyze_tool.execute({
    'files': ['/workspace/tools/'],
    'question': 'What is the architecture pattern?'
})

# Store context with continuation capability
thread_id = str(uuid.uuid4())
await memory.store_tool_execution(thread_id, ToolExecution(
    tool_name='analyze',
    files=['/workspace/tools/'],
    summary='Identified MCP plugin architecture pattern',
    continuation_id=thread_id
))

# Tool B (thinkdeep) continues analysis
thinkdeep_result = await thinkdeep_tool.execute({
    'current_analysis': analyze_result.content,
    'focus_areas': ['scalability', 'security'],
    'continuation_id': thread_id  # Links to previous context
})
```

### 2. Error Recovery & Graceful Degradation

```python
def resilient_file_processing(files: List[str]) -> str:
    """Process files with graceful error handling"""
    results = []

    for file_path in files:
        try:
            content = read_file_safely(file_path)
            results.append(f"=== {file_path} ===\n{content}")
        except PermissionError:
            results.append(f"=== {file_path} ===\nERROR: Permission denied")
        except FileNotFoundError:
            results.append(f"=== {file_path} ===\nERROR: File not found")
        except UnicodeDecodeError:
            # Try binary file detection
            try:
                with open(file_path, 'rb') as f:
                    header = f.read(16)
                if is_binary_file(header):
                    results.append(f"=== {file_path} ===\nBinary file (skipped)")
                else:
                    results.append(f"=== {file_path} ===\nERROR: Encoding issue")
            except Exception:
                results.append(f"=== {file_path} ===\nERROR: Unreadable file")
        except Exception as e:
            results.append(f"=== {file_path} ===\nERROR: {str(e)}")

    return '\n\n'.join(results)
```
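
`is_binary_file`, used above, is not shown in the source. It can be as simple as a null-byte probe on the header; a plausible sketch:

```python
# Plausible sketch of is_binary_file (referenced above, not shown): a null
# byte in the first bytes is a strong signal of non-text content.
def is_binary_file(header: bytes) -> bool:
    return b"\x00" in header
```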

### 3. Performance Optimization Patterns

#### Concurrent File Processing

```python
async def process_files_concurrently(files: List[str], token_budget: int) -> str:
    """Process multiple files concurrently with shared budget"""

    # Allocate budget per file
    allocations = allocate_token_budget(files, token_budget)

    # Create processing tasks
    tasks = []
    for file_path in files:
        task = asyncio.create_task(
            process_single_file(file_path, allocations.get(file_path, 1000))
        )
        tasks.append(task)

    # Wait for all files to complete
    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Combine results, handling exceptions
    processed_content = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            processed_content.append(f"Error processing {files[i]}: {result}")
        else:
            processed_content.append(result)

    return '\n\n'.join(processed_content)
```

#### Intelligent Caching

```python
class FileContentCache:
    def __init__(self, max_size: int = 100):
        self.cache = {}
        self.access_times = {}
        self.max_size = max_size

    async def get_file_content(self, file_path: str, token_limit: int) -> str:
        # Create cache key including token limit
        cache_key = f"{file_path}:{token_limit}"

        # Check cache hit
        if cache_key in self.cache:
            self.access_times[cache_key] = time.time()
            return self.cache[cache_key]

        # Process file and cache result (process_file_content is synchronous)
        content = process_file_content(file_path, token_limit)

        # Evict oldest entries if cache full
        if len(self.cache) >= self.max_size:
            oldest_key = min(self.access_times.keys(),
                             key=lambda k: self.access_times[k])
            del self.cache[oldest_key]
            del self.access_times[oldest_key]

        # Store in cache
        self.cache[cache_key] = content
        self.access_times[cache_key] = time.time()

        return content
```

## Data Persistence Patterns

### 1. Redis Thread Storage

```python
# Thread context serialization
class ThreadContext:
    def to_json(self) -> str:
        return json.dumps({
            'thread_id': self.thread_id,
            'tool_history': [ex.to_dict() for ex in self.tool_history],
            'conversation_files': list(self.conversation_files),
            'context_tokens': self.context_tokens,
            'created_at': self.created_at.isoformat(),
            'last_accessed': self.last_accessed.isoformat()
        })

    @classmethod
    def from_json(cls, json_str: str) -> 'ThreadContext':
        data = json.loads(json_str)
        context = cls(data['thread_id'])
        context.tool_history = [
            ToolExecution.from_dict(ex) for ex in data['tool_history']
        ]
        context.conversation_files = set(data['conversation_files'])
        context.context_tokens = data['context_tokens']
        context.created_at = datetime.fromisoformat(data['created_at'])
        context.last_accessed = datetime.fromisoformat(data['last_accessed'])
        return context
```

### 2. Configuration State Management

```python
# Environment-based configuration with validation
class Config:
    def __init__(self):
        self.gemini_api_key = self._require_env('GEMINI_API_KEY')
        self.gemini_model = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
        self.project_root = os.getenv('PROJECT_ROOT', '/workspace')
        self.redis_url = os.getenv('REDIS_URL', 'redis://localhost:6379')
        self.max_context_tokens = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))

        # Validate critical paths
        if not os.path.exists(self.project_root):
            raise ConfigError(f"PROJECT_ROOT does not exist: {self.project_root}")

    def _require_env(self, key: str) -> str:
        value = os.getenv(key)
        if not value:
            raise ConfigError(f"Required environment variable not set: {key}")
        return value
```

## Security Data Flow

### 1. Request Sanitization Pipeline

```python
def sanitize_request_data(request: dict) -> dict:
    """Multi-layer request sanitization"""
    sanitized = {}

    # 1. Schema validation (pydantic model converted back to a dict)
    validated_data = RequestSchema.parse_obj(request).dict()

    # 2. Path sanitization
    if 'files' in validated_data:
        sanitized['files'] = [
            sanitize_file_path(path) for path in validated_data['files']
        ]

    # 3. Content filtering
    if 'prompt' in validated_data:
        sanitized['prompt'] = filter_sensitive_content(validated_data['prompt'])

    # 4. Parameter validation
    for key, value in validated_data.items():
        if key not in ['files', 'prompt']:
            sanitized[key] = validate_parameter(key, value)

    return sanitized
```
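
The helpers referenced above are not shown in the source. Plausible sketches of the two content-facing ones; the real versions live elsewhere in the codebase:

```python
# Hypothetical sketches of helpers used above, not the project's real code.
import os
import re


def sanitize_file_path(path: str) -> str:
    """Normalize and collapse '..' segments before security validation."""
    return os.path.normpath(os.path.abspath(path))


def filter_sensitive_content(prompt: str) -> str:
    """Drop obvious secrets from prompts before they reach the model."""
    return re.sub(
        r'(api[_-]?key|token|password)["\s:=]+\S+',
        r"\1=[REDACTED]",
        prompt,
        flags=re.IGNORECASE,
    )
```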

### 2. Response Sanitization

```python
def sanitize_response_data(response: str) -> str:
    """Remove sensitive information from responses"""

    # Remove potential API keys, tokens, passwords
    sensitive_patterns = [
        r'api[_-]?key["\s:=]+[a-zA-Z0-9-_]{20,}',
        r'token["\s:=]+[a-zA-Z0-9-_]{20,}',
        r'password["\s:=]+\S+',
        r'/home/[^/\s]+',  # User paths
        r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',  # Emails
    ]

    sanitized = response
    for pattern in sensitive_patterns:
        sanitized = re.sub(pattern, '[REDACTED]', sanitized, flags=re.IGNORECASE)

    return sanitized
```

## Performance Monitoring & Metrics

### 1. Request Processing Metrics

```python
class PerformanceMetrics:
    def __init__(self):
        self.request_times = []
        self.file_processing_times = []
        self.memory_usage = []
        self.error_counts = defaultdict(int)

    @asynccontextmanager
    async def track_request(self, tool_name: str, files: List[str]):
        start_time = time.time()
        start_memory = psutil.Process().memory_info().rss

        try:
            # Request processing happens inside the `async with` block...
            yield

        except Exception as e:
            self.error_counts[f"{tool_name}:{type(e).__name__}"] += 1
            raise
        finally:
            # Record metrics
            end_time = time.time()
            end_memory = psutil.Process().memory_info().rss

            self.request_times.append({
                'tool': tool_name,
                'duration': end_time - start_time,
                'file_count': len(files),
                'timestamp': datetime.now()
            })

            self.memory_usage.append({
                'memory_delta': end_memory - start_memory,
                'timestamp': datetime.now()
            })
```

This comprehensive data flow documentation provides the foundation for understanding how information moves through the Gemini MCP Server, enabling effective debugging, optimization, and extension of the system.

docs/architecture/overview.md (new file, 225 lines)

# Gemini MCP Server Architecture Overview

## System Overview

The **Gemini MCP Server** implements a sophisticated Model Context Protocol (MCP) server architecture that provides Claude with access to Google's Gemini AI models through specialized tools. This enables advanced AI-assisted development workflows combining Claude's general capabilities with Gemini's deep analytical and creative thinking abilities.

## High-Level Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                         Claude Interface                         │
│                       (Claude Desktop App)                       │
└─────────────────────┬────────────────────────────────────────────┘
                      │ MCP Protocol (stdio)
┌─────────────────────▼────────────────────────────────────────────┐
│                          MCP Core Engine                         │
│  • AsyncIO Event Loop (server.py:45)                             │
│  • Tool Discovery & Registration                                 │
│  • Request/Response Processing                                   │
└─────────────────────┬────────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────────┐
│                         Tool Architecture                        │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│  │    chat     │ │  thinkdeep  │ │   analyze   │                 │
│  │ (quick Q&A) │ │(deep think) │ │(exploration)│                 │
│  └─────────────┘ └─────────────┘ └─────────────┘                 │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│  │ codereview  │ │    debug    │ │  precommit  │                 │
│  │  (quality)  │ │(root cause) │ │(validation) │                 │
│  └─────────────┘ └─────────────┘ └─────────────┘                 │
└─────────────────────┬────────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────────┐
│                         Support Services                         │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐  │
│  │Redis Conversation│ │Security Engine   │ │Gemini API        │  │
│  │Memory & Threading│ │Multi-layer       │ │Integration       │  │
│  │                  │ │Validation        │ │                  │  │
│  └──────────────────┘ └──────────────────┘ └──────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```

## Core Components

### 1. MCP Core Engine (server.py:45)

**Purpose**: Central coordination hub managing the MCP protocol implementation
**Key Components**:
- **AsyncIO Event Loop**: Handles concurrent tool execution and request processing
- **Tool Discovery**: Dynamic loading and registration via `@server.list_tools()` decorator
- **Protocol Management**: MCP message parsing, validation, and response formatting

**Architecture Pattern**: Event-driven architecture with asyncio for non-blocking operations

### 2. Tool System Architecture

**Purpose**: Modular plugin system for specialized AI capabilities
**Key Components**:
- **BaseTool Abstract Class** (`tools/base.py:25`): Common interface for all tools
- **Plugin Architecture**: Individual tool implementations in `tools/` directory
- **Tool Selection Matrix**: CLAUDE.md defines appropriate tool usage patterns

**Data Flow**:
```
Claude Request → MCP Engine → Tool Selection → Gemini API → Response Processing → Claude
```

**Tool Categories**:
- **Quick Response**: `chat` - immediate answers and brainstorming
- **Deep Analysis**: `thinkdeep` - complex architecture and strategic planning
- **Code Quality**: `codereview` - security audits and bug detection
- **Investigation**: `debug` - root cause analysis and error investigation
- **Exploration**: `analyze` - codebase comprehension and dependency analysis
- **Validation**: `precommit` - automated quality gates

### 3. Security Architecture

**Purpose**: Multi-layer defense system protecting against malicious operations
**Key Components**:
- **Path Validation** (`utils/file_utils.py:45`): Prevents directory traversal attacks
- **Sandbox Enforcement**: PROJECT_ROOT containment for file operations
- **Docker Path Translation**: Host-to-container path mapping with WORKSPACE_ROOT
- **Absolute Path Requirement**: Eliminates relative path vulnerabilities

**Security Layers**:
1. **Input Validation**: Path sanitization and dangerous operation detection
2. **Container Isolation**: Docker environment with controlled file access
3. **Permission Boundaries**: Read-only access patterns with explicit write gates
4. **Error Recovery**: Graceful handling of unauthorized operations

### 4. Thinking Modes System

**Purpose**: Computational budget control for Gemini's analysis depth
**Implementation**:
- **Token Allocation**: `minimal (128), low (2048), medium (8192), high (16384), max (32768)`
- **Dynamic Selection**: Tools adjust thinking depth based on task complexity
- **Resource Management**: Prevents token exhaustion on complex analysis

**Usage Pattern**:
```python
# tools/thinkdeep.py:67
thinking_mode = request.get('thinking_mode', 'high')
context_tokens = THINKING_MODE_TOKENS[thinking_mode]
```

### 5. Conversation System

**Purpose**: Cross-session context preservation and threading
**Key Components**:
- **Redis Persistence** (`utils/conversation_memory.py:30`): Thread storage and retrieval
- **Thread Reconstruction**: UUID-based conversation continuity
- **Cross-Tool Continuation**: `continuation_id` parameter for context flow
- **Follow-up Management**: Structured multi-turn conversation support

**Data Structures**:
```python
# utils/conversation_memory.py:45
class ThreadContext:
    thread_id: str
    tool_history: List[ToolExecution]
    conversation_files: Set[str]
    context_tokens: int
```

## Integration Points

### Configuration Management (config.py)

**Critical Settings**:
- **`GEMINI_MODEL`** (config.py:24): Model selection for API calls
- **`MAX_CONTEXT_TOKENS`** (config.py:30): Token limits for conversation management
- **`REDIS_URL`** (config.py:60): Conversation memory backend
- **`PROJECT_ROOT`** (config.py:15): Security sandbox boundary

### Utility Services

**File Operations** (`utils/file_utils.py`):
- Token-aware reading with priority system
- Directory expansion with filtering
- Error-resistant content formatting

**Git Integration** (`utils/git_utils.py`):
- Repository state analysis for precommit validation
- Change detection for documentation updates
- Branch and commit tracking

**Token Management** (`utils/token_utils.py`):
- Context optimization and pruning
- File prioritization strategies
- Memory usage monitoring

## Data Flow Patterns

### 1. Tool Execution Flow

```
1. Claude sends MCP request with tool name and parameters
2. MCP Engine validates request and routes to appropriate tool
3. Tool loads conversation context from Redis (if continuation_id provided)
4. Tool processes request using Gemini API with thinking mode configuration
5. Tool stores results in conversation memory and returns formatted response
6. MCP Engine serializes response and sends to Claude via stdio
```
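
In code, these six steps collapse into a dispatch handler of roughly the following shape. This is a sketch, not verbatim `server.py`; `TOOL_REGISTRY` is a hypothetical name-to-tool mapping, and the helper functions follow the other architecture documents.

```python
# Sketch of the dispatch handler implied by the steps above; TOOL_REGISTRY
# and sanitize_request_data are assumptions, not verbatim server.py code.
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    tool = TOOL_REGISTRY[name]                      # step 2: route to the tool
    arguments = sanitize_request_data(arguments)    # security validation
    output = await tool.execute(arguments)          # steps 3-5 run inside the tool
    return [types.TextContent(type="text", text=output.content)]  # step 6
```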

### 2. File Processing Pipeline

```
1. File paths received and validated against security rules
2. Docker path translation (host → container mapping)
3. Token budget allocation based on file size and context limits
4. Priority-based file reading (code files > documentation > logs)
5. Content formatting with line numbers and error handling
6. Context assembly with deduplication across conversation turns
```

### 3. Security Validation Chain

```
1. Path Input → Dangerous Path Detection → Rejection/Sanitization
2. Validated Path → Absolute Path Conversion → Sandbox Boundary Check
3. Bounded Path → Docker Translation → Container Path Generation
4. Safe Path → File Operation → Error-Resistant Content Return
```

## Performance Characteristics

### Scalability Factors

- **Concurrent Tool Execution**: AsyncIO enables parallel processing of multiple tool requests
- **Memory Efficiency**: Token-aware file processing prevents memory exhaustion
- **Context Optimization**: Conversation deduplication reduces redundant processing
- **Error Resilience**: Graceful degradation maintains functionality during failures

### Resource Management

- **Token Budgeting**: 40% context reservation (30% Memory Bank + 10% Memory MCP)
- **File Prioritization**: Direct code files prioritized over supporting documentation
- **Redis Optimization**: Thread-based storage with automatic cleanup
- **Gemini API Efficiency**: Thinking mode selection optimizes computational costs

## Extension Points

### Adding New Tools

1. **Inherit from BaseTool** (`tools/base.py:25`)
2. **Implement required methods**: `execute()`, `get_schema()`
3. **Register with MCP Engine**: Add to tool discovery system
4. **Update CLAUDE.md**: Define collaboration patterns and usage guidelines

### Security Extensions

1. **Custom Validators**: Add to `utils/file_utils.py` validation chain
2. **Path Translators**: Extend Docker path mapping for new mount points
3. **Permission Gates**: Implement granular access controls for sensitive operations

### Performance Optimizations

1. **Caching Layers**: Add Redis caching for frequently accessed files
2. **Context Compression**: Implement intelligent context summarization
3. **Parallel Processing**: Extend AsyncIO patterns for I/O-bound operations

---

This architecture provides a robust, secure, and extensible foundation for AI-assisted development workflows while maintaining clear separation of concerns and comprehensive error handling.