adding detailed docs

docs/architecture/components.md (new file, 379 lines)

# System Components & Interactions

## Component Architecture

The Gemini MCP Server is built on a modular component architecture that enables sophisticated AI collaboration patterns while maintaining security and performance.

## Core Components

### 1. MCP Protocol Engine

**Location**: `server.py:45-120`
**Purpose**: Central communication hub implementing the Model Context Protocol specification

**Key Responsibilities**:
- **Protocol Compliance**: Implements the MCP v1.0 specification for Claude integration
- **Message Routing**: Dispatches requests to appropriate tool handlers
- **Error Handling**: Graceful degradation and error response formatting
- **Lifecycle Management**: Server startup, shutdown, and resource cleanup

**Implementation Details**:
```python
# server.py:67
@server.list_tools()
async def list_tools() -> list[types.Tool]:
    """Dynamic tool discovery and registration"""
    return [tool.get_schema() for tool in REGISTERED_TOOLS]


@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """Tool execution with error handling and response formatting"""
```

**Dependencies**:
- `mcp` library for protocol implementation
- `asyncio` for concurrent request processing
- Tool registry for dynamic handler discovery
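
For orientation, the lifecycle wiring around these handlers looks roughly like the following sketch. It assumes the standard `mcp` Python SDK stdio entry points; the server name string is illustrative, not taken from `server.py`.

```python
# Sketch of the server lifecycle wiring (not the project's exact code);
# assumes the standard `mcp` Python SDK stdio transport.
import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server

server = Server("gemini-mcp-server")  # hypothetical server name


async def main() -> None:
    # stdio_server() yields the read/write streams Claude connects over;
    # server.run() drives the request/response loop until shutdown.
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )


if __name__ == "__main__":
    asyncio.run(main())
```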

### 2. Tool Architecture System

**Location**: `tools/` directory
**Purpose**: Modular plugin system for specialized AI capabilities

#### BaseTool Abstract Class (`tools/base.py:25`)

**Interface Contract**:
```python
class BaseTool(ABC):
    @abstractmethod
    async def execute(self, request: dict) -> ToolOutput:
        """Core tool execution logic"""

    @abstractmethod
    def get_schema(self) -> types.Tool:
        """MCP tool schema definition"""

    def _format_response(self, content: str, metadata: dict) -> ToolOutput:
        """Standardized response formatting"""
```
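
`ToolOutput` is referenced throughout this contract but its definition is not shown here. A plausible shape, assuming a pydantic model (pydantic is listed under Tool Dependencies below); the field names are illustrative:

```python
# Hypothetical shape for ToolOutput; the real model lives in tools/models.py.
from typing import Optional

from pydantic import BaseModel


class ToolOutput(BaseModel):
    content: str                            # formatted response text
    metadata: dict = {}                     # tool-specific details (tokens used, etc.)
    continuation_id: Optional[str] = None   # thread UUID for follow-up calls
```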

#### Individual Tool Components

**Chat Tool** (`tools/chat.py:30`)
- **Purpose**: Quick questions and general collaboration
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Brainstorming, simple explanations, immediate answers

**ThinkDeep Tool** (`tools/thinkdeep.py:45`)
- **Purpose**: Complex analysis and strategic planning
- **Thinking Mode**: Default 'high' (16384 tokens)
- **Use Cases**: Architecture decisions, design exploration, comprehensive analysis

**CodeReview Tool** (`tools/codereview.py:60`)
- **Purpose**: Code quality and security analysis
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Bug detection, security audits, quality validation

**Analyze Tool** (`tools/analyze.py:75`)
- **Purpose**: Codebase exploration and understanding
- **Thinking Mode**: Variable based on scope
- **Use Cases**: Dependency analysis, pattern detection, system comprehension

**Debug Tool** (`tools/debug.py:90`)
- **Purpose**: Error investigation and root cause analysis
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Stack trace analysis, bug diagnosis, performance issues

**Precommit Tool** (`tools/precommit.py:105`)
- **Purpose**: Automated quality gates and validation
- **Thinking Mode**: Default 'medium' (8192 tokens)
- **Use Cases**: Pre-commit validation, change analysis, quality assurance
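
For concreteness, a minimal sketch of what one of these tools looks like against the `BaseTool` contract above. This is illustrative, not the actual `tools/chat.py` implementation; `gemini_client` stands in for however the project injects its API client.

```python
# Illustrative BaseTool subclass; not the actual tools/chat.py code.
import mcp.types as types


class ChatTool(BaseTool):
    name = "chat"
    default_thinking_mode = "medium"  # 8192 thinking tokens

    def get_schema(self) -> types.Tool:
        return types.Tool(
            name=self.name,
            description="Quick questions and general collaboration",
            inputSchema={
                "type": "object",
                "properties": {
                    "prompt": {"type": "string"},
                    "thinking_mode": {"type": "string", "default": "medium"},
                },
                "required": ["prompt"],
            },
        )

    async def execute(self, request: dict) -> ToolOutput:
        mode = request.get("thinking_mode", self.default_thinking_mode)
        # gemini_client is an assumed module-level client (see section 6).
        reply = await gemini_client.generate_response(
            prompt=request["prompt"], thinking_mode=mode
        )
        return self._format_response(reply, {"tool": self.name, "mode": mode})
```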

### 3. Security Engine

**Location**: `utils/file_utils.py:45-120`
**Purpose**: Multi-layer security validation and enforcement

#### Security Components

**Path Validation System**:
```python
# utils/file_utils.py:67
def validate_file_path(file_path: str) -> bool:
    """Multi-layer path security validation"""
    # 1. Dangerous path detection
    dangerous_patterns = ['../', '~/', '/etc/', '/var/', '/usr/']
    if any(pattern in file_path for pattern in dangerous_patterns):
        return False

    # 2. Absolute path requirement
    if not os.path.isabs(file_path):
        return False

    # 3. Sandbox boundary enforcement
    return file_path.startswith(PROJECT_ROOT)
```

**Docker Path Translation**:
```python
# utils/file_utils.py:89
def translate_docker_path(host_path: str) -> str:
    """Convert host paths to container paths for Docker environment"""
    if host_path.startswith(WORKSPACE_ROOT):
        return host_path.replace(WORKSPACE_ROOT, '/workspace', 1)
    return host_path
```

**Security Layers**:
1. **Input Sanitization**: Path cleaning and normalization
2. **Pattern Matching**: Dangerous path detection and blocking
3. **Boundary Enforcement**: PROJECT_ROOT containment validation
4. **Container Translation**: Safe host-to-container path mapping
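
The four layers compose into a single checkpoint that callers can invoke before touching disk. A sketch under the same names as above; `resolve_safe_path` itself is illustrative, not a function from `file_utils.py`:

```python
# Sketch composing the layers above into one checkpoint; PROJECT_ROOT and
# WORKSPACE_ROOT come from config.py, the helpers from this module.
import os
from typing import Optional


def resolve_safe_path(raw_path: str) -> Optional[str]:
    """Return a validated container path, or None if the path is rejected."""
    # Layer 4 first: map host paths into the container namespace,
    # then normalize before the pattern and boundary checks run.
    path = os.path.normpath(translate_docker_path(raw_path))
    if not validate_file_path(path):  # layers 1-3
        return None
    return path
```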

### 4. Conversation Memory System

**Location**: `utils/conversation_memory.py:30-150`
**Purpose**: Cross-session context preservation and threading

#### Memory Components

**Thread Context Management**:
```python
# utils/conversation_memory.py:45
class ThreadContext:
    thread_id: str
    tool_history: List[ToolExecution]
    conversation_files: Set[str]
    context_tokens: int
    created_at: datetime
    last_accessed: datetime
```

**Redis Integration**:
```python
# utils/conversation_memory.py:78
class ConversationMemory:
    def __init__(self, redis_url: str):
        self.redis = redis.from_url(redis_url)

    async def store_thread(self, context: ThreadContext) -> None:
        """Persist conversation thread to Redis"""

    async def retrieve_thread(self, thread_id: str) -> Optional[ThreadContext]:
        """Reconstruct conversation from storage"""

    async def cleanup_expired_threads(self) -> int:
        """Remove old conversations to manage memory"""
```

**Memory Features**:
- **Thread Persistence**: UUID-based conversation storage
- **Context Reconstruction**: Full conversation history retrieval
- **File Deduplication**: Efficient storage of repeated file references
- **Automatic Cleanup**: Time-based thread expiration
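
Putting the pieces together, a typical store/retrieve round trip might look like this. A sketch only: it assumes `ThreadContext` is a dataclass (the excerpt above shows only its fields) and that `REDIS_URL` comes from `config.py`.

```python
# Usage sketch for ConversationMemory; assumes ThreadContext is a dataclass.
import uuid
from datetime import datetime

memory = ConversationMemory(REDIS_URL)


async def demo() -> None:
    context = ThreadContext(
        thread_id=str(uuid.uuid4()),
        tool_history=[],
        conversation_files=set(),
        context_tokens=0,
        created_at=datetime.now(),
        last_accessed=datetime.now(),
    )
    await memory.store_thread(context)

    # Later, any tool can rehydrate the conversation by UUID.
    restored = await memory.retrieve_thread(context.thread_id)
```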

### 5. File Processing Pipeline

**Location**: `utils/file_utils.py:120-200`
**Purpose**: Token-aware file reading and content optimization

#### Processing Components

**Priority System**:
```python
# utils/file_utils.py:134
FILE_PRIORITIES = {
    '.py': 1,    # Python source code (highest priority)
    '.js': 1,    # JavaScript source
    '.ts': 1,    # TypeScript source
    '.md': 2,    # Documentation
    '.txt': 3,   # Text files
    '.log': 4,   # Log files (lowest priority)
}
```

**Token Management**:
```python
# utils/file_utils.py:156
def read_file_with_token_limit(file_path: str, max_tokens: int) -> str:
    """Read file content with token budget enforcement"""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        # Token estimation and truncation
        estimated_tokens = len(content) // 4  # Rough estimation
        if estimated_tokens > max_tokens:
            # Truncate with preservation of structure
            content = content[:max_tokens * 4]

        return format_file_content(content, file_path)
    except Exception as e:
        return f"Error reading {file_path}: {str(e)}"
```

**Content Formatting**:
- **Line Numbers**: Added for precise code references
- **Error Handling**: Graceful failure with informative messages
- **Structure Preservation**: Maintains code formatting and indentation
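
`format_file_content` is called by the reader above but not shown in the source. A minimal sketch consistent with the line-number formatting described here; the `=== path ===` header follows the convention used in the data-flow document:

```python
# Plausible sketch of format_file_content (referenced above, not shown in
# the source); adds line numbers for precise code references.
def format_file_content(content: str, file_path: str) -> str:
    numbered = "\n".join(
        f"{i:6d}\t{line}" for i, line in enumerate(content.split("\n"), 1)
    )
    return f"=== {file_path} ===\n{numbered}"
```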

### 6. Gemini API Integration

**Location**: `tools/models.py:25-80`
**Purpose**: Standardized interface to Google's Gemini models

#### Integration Components

**API Client**:
```python
# tools/models.py:34
class GeminiClient:
    def __init__(self, api_key: str, model: str = "gemini-2.0-flash-thinking-exp"):
        self.client = genai.GenerativeModel(model)
        self.api_key = api_key

    async def generate_response(self,
                                prompt: str,
                                thinking_mode: str = 'medium',
                                files: Optional[List[str]] = None) -> str:
        """Generate response with thinking mode and file context"""
```

**Model Configuration**:
```python
# config.py:24
GEMINI_MODEL = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
MAX_CONTEXT_TOKENS = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))
```

**Thinking Mode Management**:
```python
# tools/models.py:67
THINKING_MODE_TOKENS = {
    'minimal': 128,
    'low': 2048,
    'medium': 8192,
    'high': 16384,
    'max': 32768
}
```
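
A call site then looks roughly like the following sketch; it assumes a client configured with the `GEMINI_API_KEY` setting from `config.py` and the `GeminiClient` shape shown above.

```python
# Usage sketch; assumes GeminiClient from above with GEMINI_API_KEY set.
client = GeminiClient(api_key=GEMINI_API_KEY)


async def ask() -> str:
    return await client.generate_response(
        prompt="Summarize the module's responsibilities",
        thinking_mode="high",                    # 16384 thinking tokens
        files=["/workspace/tools/base.py"],      # validated paths only
    )
```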

## Component Interactions

### 1. Request Processing Flow

```
Claude Request
      ↓
MCP Protocol Engine (server.py:67)
      ↓ (validate & route)
Tool Selection & Loading
      ↓
Security Validation (utils/file_utils.py:67)
      ↓ (if files involved)
File Processing Pipeline (utils/file_utils.py:134)
      ↓
Conversation Context Loading (utils/conversation_memory.py:78)
      ↓ (if continuation_id provided)
Gemini API Integration (tools/models.py:34)
      ↓
Response Processing & Formatting
      ↓
Conversation Storage (utils/conversation_memory.py:78)
      ↓
MCP Response to Claude
```

### 2. Security Integration Points

**Pre-Tool Execution**:
- Path validation before any file operations
- Sandbox boundary enforcement
- Docker path translation for container environments

**During Tool Execution**:
- Token budget enforcement to prevent memory exhaustion
- File access logging and monitoring
- Error containment and graceful degradation

**Post-Tool Execution**:
- Response sanitization
- Conversation storage with access controls
- Resource cleanup and memory management
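
These integration points can be read as a wrapper around every tool call. A sketch of the ordering; the wrapper itself is illustrative, and `sanitize_request_data`/`sanitize_response_data` refer to the functions shown in the data-flow document:

```python
# Illustrative wrapper showing where the security hooks fire; not verbatim
# server code. Sanitizers are the ones defined in the data-flow document.
async def execute_tool_securely(tool: BaseTool, request: dict) -> ToolOutput:
    # Pre-execution: schema checks, path validation, Docker translation.
    request = sanitize_request_data(request)

    try:
        # During execution: token budgets and error containment apply inside.
        output = await tool.execute(request)
    except Exception as exc:
        # Error boundary: contain the failure instead of crashing the server.
        return ToolOutput(content=f"Tool failed: {exc}", metadata={"error": True})

    # Post-execution: scrub secrets before anything is stored or returned.
    output.content = sanitize_response_data(output.content)
    return output
```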

### 3. Memory System Integration

**Thread Creation**:
```python
# New conversation
thread_id = str(uuid.uuid4())
context = ThreadContext(thread_id=thread_id, ...)
await memory.store_thread(context)
```

**Thread Continuation**:
```python
# Continuing conversation
if continuation_id:
    context = await memory.retrieve_thread(continuation_id)
    # Merge new request with existing context
```

**Cross-Tool Communication**:
```python
# Tool A stores findings
await memory.add_tool_execution(thread_id, tool_execution)

# Tool B retrieves context
context = await memory.retrieve_thread(thread_id)
previous_findings = context.get_tool_outputs('analyze')
```

## Configuration & Dependencies

### Environment Configuration

**Required Settings** (`config.py`):
```python
GEMINI_API_KEY = os.getenv('GEMINI_API_KEY')  # Required
GEMINI_MODEL = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
PROJECT_ROOT = os.getenv('PROJECT_ROOT', '/workspace')
REDIS_URL = os.getenv('REDIS_URL', 'redis://localhost:6379')
MAX_CONTEXT_TOKENS = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))
```

### Component Dependencies

**Core Dependencies**:
- `mcp`: MCP protocol implementation
- `google-generativeai`: Gemini API client
- `redis`: Conversation persistence
- `asyncio`: Concurrent processing (standard library)

**Security Dependencies**:
- `pathlib`: Path manipulation and validation
- `os`: File system operations and environment access

**Tool Dependencies**:
- `pydantic`: Data validation and serialization
- `typing`: Type hints and contract definition

## Extension Architecture

### Adding New Components

1. **Tool Components**: Inherit from BaseTool and implement the required interface
2. **Security Components**: Extend the validation chain in `file_utils.py`
3. **Memory Components**: Add new storage backends via interface abstraction
4. **Processing Components**: Extend the file pipeline with new content types

### Integration Patterns

- **Plugin Architecture**: Dynamic discovery and registration
- **Interface Segregation**: Clear contracts between components
- **Dependency Injection**: Configuration-driven component assembly
- **Error Boundaries**: Isolated failure handling per component

---

This component architecture provides a robust foundation for AI collaboration while maintaining security, performance, and extensibility requirements.

docs/architecture/data-flow.md (new file, 545 lines)

# Data Flow & Processing Patterns

## Overview

The Gemini MCP Server implements sophisticated data flow patterns that enable secure, efficient, and context-aware AI collaboration. This document traces data movement through the system with concrete examples and performance considerations.

## Primary Data Flow Patterns

### 1. Standard Tool Execution Flow

```mermaid
sequenceDiagram
    participant C as Claude
    participant M as MCP Engine
    participant S as Security Layer
    participant T as Tool Handler
    participant G as Gemini API
    participant R as Redis Memory

    C->>M: MCP Request (tool_name, params)
    M->>M: Validate Request Schema
    M->>S: Security Validation
    S->>S: Path Validation & Sanitization
    S->>T: Secure Parameters
    T->>R: Load Conversation Context
    R-->>T: Thread Context (if exists)
    T->>T: Process Files & Context
    T->>G: Formatted Prompt + Context
    G-->>T: AI Response
    T->>R: Store Execution Result
    T->>M: Formatted Tool Output
    M->>C: MCP Response
```

**Example Request Flow**:
```json
// Claude → MCP Engine
{
  "method": "tools/call",
  "params": {
    "name": "analyze",
    "arguments": {
      "files": ["/workspace/tools/analyze.py"],
      "question": "Explain the architecture pattern",
      "continuation_id": "550e8400-e29b-41d4-a716-446655440000"
    }
  }
}
```

### 2. File Processing Pipeline

#### Stage 1: Security Validation (`utils/file_utils.py:67`)

```python
# Input: ["/workspace/tools/analyze.py", "../../../etc/passwd"]
def validate_file_paths(file_paths: List[str]) -> List[str]:
    validated = []
    for path in file_paths:
        # 1. Dangerous pattern detection
        if any(danger in path for danger in ['../', '~/', '/etc/', '/var/']):
            logger.warning(f"Blocked dangerous path: {path}")
            continue

        # 2. Absolute path requirement
        if not os.path.isabs(path):
            path = os.path.abspath(path)

        # 3. Sandbox boundary check
        if not path.startswith(PROJECT_ROOT):
            logger.warning(f"Path outside sandbox: {path}")
            continue

        validated.append(path)

    return validated
# Output: ["/workspace/tools/analyze.py"]
```

#### Stage 2: Docker Path Translation (`utils/file_utils.py:89`)

```python
# Host Environment: /Users/user/project/tools/analyze.py
# Container Environment: /workspace/tools/analyze.py
def translate_paths_for_environment(paths: List[str]) -> List[str]:
    translated = []
    for path in paths:
        if WORKSPACE_ROOT and path.startswith(WORKSPACE_ROOT):
            container_path = path.replace(WORKSPACE_ROOT, '/workspace', 1)
            translated.append(container_path)
        else:
            translated.append(path)
    return translated
```

#### Stage 3: Priority-Based Processing (`utils/file_utils.py:134`)

```python
# File Priority Matrix
FILE_PRIORITIES = {
    '.py': 1,    # Source code (highest priority)
    '.js': 1, '.ts': 1, '.tsx': 1,
    '.md': 2,    # Documentation
    '.json': 2, '.yaml': 2, '.yml': 2,
    '.txt': 3,   # Text files
    '.log': 4,   # Logs (lowest priority)
}


# Token Budget Allocation
def allocate_token_budget(files: List[str], total_budget: int) -> Dict[str, int]:
    # Priority 1 files get 60% of budget
    # Priority 2 files get 30% of budget
    # Priority 3+ files get 10% of budget

    priority_groups = defaultdict(list)
    for file in files:
        ext = Path(file).suffix.lower()
        priority = FILE_PRIORITIES.get(ext, 4)
        priority_groups[priority].append(file)

    allocations = {}
    if priority_groups[1]:  # Source code files
        code_budget = int(total_budget * 0.6)
        per_file = code_budget // len(priority_groups[1])
        for file in priority_groups[1]:
            allocations[file] = per_file

    if priority_groups[2]:  # Documentation files
        doc_budget = int(total_budget * 0.3)
        per_file = doc_budget // len(priority_groups[2])
        for file in priority_groups[2]:
            allocations[file] = per_file

    # Priority 3+ files share the remaining 10%
    low_priority = [f for p, group in priority_groups.items() if p >= 3 for f in group]
    if low_priority:
        rest_budget = int(total_budget * 0.1)
        per_file = rest_budget // len(low_priority)
        for file in low_priority:
            allocations[file] = per_file

    return allocations
```

#### Stage 4: Content Processing & Formatting

```python
def process_file_content(file_path: str, token_limit: int) -> str:
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        # Token estimation (rough: 1 token ≈ 4 characters)
        estimated_tokens = len(content) // 4

        if estimated_tokens > token_limit:
            # Smart truncation preserving structure
            lines = content.split('\n')
            truncated_lines = []
            current_tokens = 0

            for line in lines:
                line_tokens = len(line) // 4
                if current_tokens + line_tokens > token_limit:
                    break
                truncated_lines.append(line)
                current_tokens += line_tokens

            content = '\n'.join(truncated_lines)
            content += f"\n\n... [Truncated at {token_limit} tokens]"

        # Format with line numbers for precise references
        lines = content.split('\n')
        formatted_lines = []
        for i, line in enumerate(lines, 1):
            formatted_lines.append(f"{i:6d}\t{line}")

        return '\n'.join(formatted_lines)

    except Exception as e:
        return f"Error reading {file_path}: {str(e)}"
```

### 3. Conversation Memory Flow

#### Context Storage Pattern (`utils/conversation_memory.py:78`)

```python
# Tool execution creates persistent context (a method on ConversationMemory)
async def store_tool_execution(self, thread_id: str, tool_execution: ToolExecution):
    context = await self.retrieve_thread(thread_id) or ThreadContext(thread_id)

    # Add new execution to history
    context.tool_history.append(tool_execution)

    # Update file set (deduplication)
    if tool_execution.files:
        context.conversation_files.update(tool_execution.files)

    # Update token tracking
    context.context_tokens += tool_execution.response_tokens
    context.last_accessed = datetime.now()

    # Persist to Redis
    await self.redis.setex(
        f"thread:{thread_id}",
        timedelta(hours=24),  # 24-hour expiration
        context.to_json()
    )
```
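
`retrieve_thread`, used above, is the inverse operation. A sketch of it as another `ConversationMemory` method, pairing with the `to_json`/`from_json` serialization shown under Data Persistence Patterns below:

```python
# Sketch of the retrieval side; pairs with ThreadContext.from_json below.
async def retrieve_thread(self, thread_id: str) -> Optional[ThreadContext]:
    raw = await self.redis.get(f"thread:{thread_id}")
    if raw is None:
        return None
    return ThreadContext.from_json(raw)
```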

#### Context Retrieval & Reconstruction

```python
async def build_conversation_context(self, thread_id: str) -> str:
    context = await self.retrieve_thread(thread_id)
    if not context:
        return ""

    # Build conversation summary
    summary_parts = []

    # Add file context (deduplicated)
    if context.conversation_files:
        summary_parts.append("## Previous Files Analyzed:")
        for file_path in sorted(context.conversation_files):
            summary_parts.append(f"- {file_path}")

    # Add tool execution history
    if context.tool_history:
        summary_parts.append("\n## Previous Analysis:")
        for execution in context.tool_history[-3:]:  # Last 3 executions
            summary_parts.append(f"**{execution.tool_name}**: {execution.summary}")

    return '\n'.join(summary_parts)
```

### 4. Thinking Mode Processing

#### Dynamic Token Allocation (`tools/models.py:67`)

```python
# Thinking mode determines computational budget
THINKING_MODE_TOKENS = {
    'minimal': 128,    # Quick answers, simple questions
    'low': 2048,       # Basic analysis, straightforward tasks
    'medium': 8192,    # Standard analysis, moderate complexity
    'high': 16384,     # Deep analysis, complex problems
    'max': 32768       # Maximum depth, critical decisions
}


def prepare_gemini_request(prompt: str, thinking_mode: str, files: List[str]) -> dict:
    # Calculate total context budget
    thinking_tokens = THINKING_MODE_TOKENS.get(thinking_mode, 8192)
    file_tokens = MAX_CONTEXT_TOKENS - thinking_tokens - 1000  # Reserve for response

    # Process files within budget
    file_content = process_files_with_budget(files, file_tokens)

    # Construct final prompt
    full_prompt = f"""
{prompt}

## Available Context ({thinking_tokens} thinking tokens allocated)

{file_content}

Please analyze using {thinking_mode} thinking mode.
"""

    return {
        'prompt': full_prompt,
        'max_tokens': thinking_tokens,
        'temperature': 0.2 if thinking_mode in ['high', 'max'] else 0.5
    }
```

## Advanced Data Flow Patterns

### 1. Cross-Tool Continuation Flow

```python
# Tool A (analyze) creates foundation
analyze_result = await analyze_tool.execute({
    'files': ['/workspace/tools/'],
    'question': 'What is the architecture pattern?'
})

# Store context with continuation capability
thread_id = str(uuid.uuid4())
await memory.store_tool_execution(thread_id, ToolExecution(
    tool_name='analyze',
    files=['/workspace/tools/'],
    summary='Identified MCP plugin architecture pattern',
    continuation_id=thread_id
))

# Tool B (thinkdeep) continues analysis
thinkdeep_result = await thinkdeep_tool.execute({
    'current_analysis': analyze_result.content,
    'focus_areas': ['scalability', 'security'],
    'continuation_id': thread_id  # Links to previous context
})
```

### 2. Error Recovery & Graceful Degradation

```python
def resilient_file_processing(files: List[str]) -> str:
    """Process files with graceful error handling"""
    results = []

    for file_path in files:
        try:
            content = read_file_safely(file_path)
            results.append(f"=== {file_path} ===\n{content}")
        except PermissionError:
            results.append(f"=== {file_path} ===\nERROR: Permission denied")
        except FileNotFoundError:
            results.append(f"=== {file_path} ===\nERROR: File not found")
        except UnicodeDecodeError:
            # Try binary file detection
            try:
                with open(file_path, 'rb') as f:
                    header = f.read(16)
                if is_binary_file(header):
                    results.append(f"=== {file_path} ===\nBinary file (skipped)")
                else:
                    results.append(f"=== {file_path} ===\nERROR: Encoding issue")
            except Exception:
                results.append(f"=== {file_path} ===\nERROR: Unreadable file")
        except Exception as e:
            results.append(f"=== {file_path} ===\nERROR: {str(e)}")

    return '\n\n'.join(results)
```
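
`is_binary_file`, used above, is not shown in the source. It can be as simple as a null-byte probe on the header; a plausible sketch:

```python
# Plausible sketch of is_binary_file (referenced above, not shown): a null
# byte in the first bytes is a strong signal of non-text content.
def is_binary_file(header: bytes) -> bool:
    return b"\x00" in header
```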

### 3. Performance Optimization Patterns

#### Concurrent File Processing

```python
async def process_files_concurrently(files: List[str], token_budget: int) -> str:
    """Process multiple files concurrently with shared budget"""

    # Allocate budget per file
    allocations = allocate_token_budget(files, token_budget)

    # Create processing tasks
    tasks = []
    for file_path in files:
        task = asyncio.create_task(
            process_single_file(file_path, allocations.get(file_path, 1000))
        )
        tasks.append(task)

    # Wait for all files to complete
    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Combine results, handling exceptions
    processed_content = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            processed_content.append(f"Error processing {files[i]}: {result}")
        else:
            processed_content.append(result)

    return '\n\n'.join(processed_content)
```

#### Intelligent Caching

```python
class FileContentCache:
    def __init__(self, max_size: int = 100):
        self.cache = {}
        self.access_times = {}
        self.max_size = max_size

    async def get_file_content(self, file_path: str, token_limit: int) -> str:
        # Create cache key including token limit
        cache_key = f"{file_path}:{token_limit}"

        # Check cache hit
        if cache_key in self.cache:
            self.access_times[cache_key] = time.time()
            return self.cache[cache_key]

        # Process file and cache result (process_file_content is synchronous)
        content = process_file_content(file_path, token_limit)

        # Evict oldest entries if cache full
        if len(self.cache) >= self.max_size:
            oldest_key = min(self.access_times.keys(),
                             key=lambda k: self.access_times[k])
            del self.cache[oldest_key]
            del self.access_times[oldest_key]

        # Store in cache
        self.cache[cache_key] = content
        self.access_times[cache_key] = time.time()

        return content
```

## Data Persistence Patterns

### 1. Redis Thread Storage

```python
# Thread context serialization
class ThreadContext:
    def to_json(self) -> str:
        return json.dumps({
            'thread_id': self.thread_id,
            'tool_history': [ex.to_dict() for ex in self.tool_history],
            'conversation_files': list(self.conversation_files),
            'context_tokens': self.context_tokens,
            'created_at': self.created_at.isoformat(),
            'last_accessed': self.last_accessed.isoformat()
        })

    @classmethod
    def from_json(cls, json_str: str) -> 'ThreadContext':
        data = json.loads(json_str)
        context = cls(data['thread_id'])
        context.tool_history = [
            ToolExecution.from_dict(ex) for ex in data['tool_history']
        ]
        context.conversation_files = set(data['conversation_files'])
        context.context_tokens = data['context_tokens']
        context.created_at = datetime.fromisoformat(data['created_at'])
        context.last_accessed = datetime.fromisoformat(data['last_accessed'])
        return context
```

### 2. Configuration State Management

```python
# Environment-based configuration with validation
class Config:
    def __init__(self):
        self.gemini_api_key = self._require_env('GEMINI_API_KEY')
        self.gemini_model = os.getenv('GEMINI_MODEL', 'gemini-2.0-flash-thinking-exp')
        self.project_root = os.getenv('PROJECT_ROOT', '/workspace')
        self.redis_url = os.getenv('REDIS_URL', 'redis://localhost:6379')
        self.max_context_tokens = int(os.getenv('MAX_CONTEXT_TOKENS', '1000000'))

        # Validate critical paths
        if not os.path.exists(self.project_root):
            raise ConfigError(f"PROJECT_ROOT does not exist: {self.project_root}")

    def _require_env(self, key: str) -> str:
        value = os.getenv(key)
        if not value:
            raise ConfigError(f"Required environment variable not set: {key}")
        return value
```

## Security Data Flow

### 1. Request Sanitization Pipeline

```python
def sanitize_request_data(request: dict) -> dict:
    """Multi-layer request sanitization"""
    sanitized = {}

    # 1. Schema validation (pydantic model converted back to a dict)
    validated_data = RequestSchema.parse_obj(request).dict()

    # 2. Path sanitization
    if 'files' in validated_data:
        sanitized['files'] = [
            sanitize_file_path(path) for path in validated_data['files']
        ]

    # 3. Content filtering
    if 'prompt' in validated_data:
        sanitized['prompt'] = filter_sensitive_content(validated_data['prompt'])

    # 4. Parameter validation
    for key, value in validated_data.items():
        if key not in ['files', 'prompt']:
            sanitized[key] = validate_parameter(key, value)

    return sanitized
```
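
The helpers referenced above are not shown in the source. Plausible sketches of the two content-facing ones; the real versions live elsewhere in the codebase:

```python
# Hypothetical sketches of helpers used above, not the project's real code.
import os
import re


def sanitize_file_path(path: str) -> str:
    """Normalize and collapse '..' segments before security validation."""
    return os.path.normpath(os.path.abspath(path))


def filter_sensitive_content(prompt: str) -> str:
    """Drop obvious secrets from prompts before they reach the model."""
    return re.sub(
        r'(api[_-]?key|token|password)["\s:=]+\S+',
        r"\1=[REDACTED]",
        prompt,
        flags=re.IGNORECASE,
    )
```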

### 2. Response Sanitization

```python
def sanitize_response_data(response: str) -> str:
    """Remove sensitive information from responses"""

    # Remove potential API keys, tokens, passwords
    sensitive_patterns = [
        r'api[_-]?key["\s:=]+[a-zA-Z0-9-_]{20,}',
        r'token["\s:=]+[a-zA-Z0-9-_]{20,}',
        r'password["\s:=]+\S+',
        r'/home/[^/\s]+',  # User paths
        r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',  # Emails
    ]

    sanitized = response
    for pattern in sensitive_patterns:
        sanitized = re.sub(pattern, '[REDACTED]', sanitized, flags=re.IGNORECASE)

    return sanitized
```

## Performance Monitoring & Metrics

### 1. Request Processing Metrics

```python
class PerformanceMetrics:
    def __init__(self):
        self.request_times = []
        self.file_processing_times = []
        self.memory_usage = []
        self.error_counts = defaultdict(int)

    @asynccontextmanager
    async def track_request(self, tool_name: str, files: List[str]):
        start_time = time.time()
        start_memory = psutil.Process().memory_info().rss

        try:
            # Request processing happens inside the `async with` block...
            yield

        except Exception as e:
            self.error_counts[f"{tool_name}:{type(e).__name__}"] += 1
            raise
        finally:
            # Record metrics
            end_time = time.time()
            end_memory = psutil.Process().memory_info().rss

            self.request_times.append({
                'tool': tool_name,
                'duration': end_time - start_time,
                'file_count': len(files),
                'timestamp': datetime.now()
            })

            self.memory_usage.append({
                'memory_delta': end_memory - start_memory,
                'timestamp': datetime.now()
            })
```

This comprehensive data flow documentation provides the foundation for understanding how information moves through the Gemini MCP Server, enabling effective debugging, optimization, and extension of the system.

docs/architecture/overview.md (new file, 225 lines)

# Gemini MCP Server Architecture Overview

## System Overview

The **Gemini MCP Server** implements a sophisticated Model Context Protocol (MCP) server architecture that provides Claude with access to Google's Gemini AI models through specialized tools. This enables advanced AI-assisted development workflows combining Claude's general capabilities with Gemini's deep analytical and creative thinking abilities.

## High-Level Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                         Claude Interface                         │
│                       (Claude Desktop App)                       │
└─────────────────────┬────────────────────────────────────────────┘
                      │ MCP Protocol (stdio)
┌─────────────────────▼────────────────────────────────────────────┐
│                          MCP Core Engine                         │
│  • AsyncIO Event Loop (server.py:45)                             │
│  • Tool Discovery & Registration                                 │
│  • Request/Response Processing                                   │
└─────────────────────┬────────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────────┐
│                         Tool Architecture                        │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│  │    chat     │ │  thinkdeep  │ │   analyze   │                 │
│  │ (quick Q&A) │ │(deep think) │ │(exploration)│                 │
│  └─────────────┘ └─────────────┘ └─────────────┘                 │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                 │
│  │ codereview  │ │    debug    │ │  precommit  │                 │
│  │  (quality)  │ │(root cause) │ │(validation) │                 │
│  └─────────────┘ └─────────────┘ └─────────────┘                 │
└─────────────────────┬────────────────────────────────────────────┘
                      │
┌─────────────────────▼────────────────────────────────────────────┐
│                         Support Services                         │
│  ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐  │
│  │Redis Conversation│ │Security Engine   │ │Gemini API        │  │
│  │Memory & Threading│ │Multi-layer       │ │Integration       │  │
│  │                  │ │Validation        │ │                  │  │
│  └──────────────────┘ └──────────────────┘ └──────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```

## Core Components

### 1. MCP Core Engine (server.py:45)

**Purpose**: Central coordination hub managing the MCP protocol implementation
**Key Components**:
- **AsyncIO Event Loop**: Handles concurrent tool execution and request processing
- **Tool Discovery**: Dynamic loading and registration via `@server.list_tools()` decorator
- **Protocol Management**: MCP message parsing, validation, and response formatting

**Architecture Pattern**: Event-driven architecture with asyncio for non-blocking operations

### 2. Tool System Architecture

**Purpose**: Modular plugin system for specialized AI capabilities
**Key Components**:
- **BaseTool Abstract Class** (`tools/base.py:25`): Common interface for all tools
- **Plugin Architecture**: Individual tool implementations in `tools/` directory
- **Tool Selection Matrix**: CLAUDE.md defines appropriate tool usage patterns

**Data Flow**:
```
Claude Request → MCP Engine → Tool Selection → Gemini API → Response Processing → Claude
```

**Tool Categories**:
- **Quick Response**: `chat` - immediate answers and brainstorming
- **Deep Analysis**: `thinkdeep` - complex architecture and strategic planning
- **Code Quality**: `codereview` - security audits and bug detection
- **Investigation**: `debug` - root cause analysis and error investigation
- **Exploration**: `analyze` - codebase comprehension and dependency analysis
- **Validation**: `precommit` - automated quality gates

### 3. Security Architecture

**Purpose**: Multi-layer defense system protecting against malicious operations
**Key Components**:
- **Path Validation** (`utils/file_utils.py:45`): Prevents directory traversal attacks
- **Sandbox Enforcement**: PROJECT_ROOT containment for file operations
- **Docker Path Translation**: Host-to-container path mapping with WORKSPACE_ROOT
- **Absolute Path Requirement**: Eliminates relative path vulnerabilities

**Security Layers**:
1. **Input Validation**: Path sanitization and dangerous operation detection
2. **Container Isolation**: Docker environment with controlled file access
3. **Permission Boundaries**: Read-only access patterns with explicit write gates
4. **Error Recovery**: Graceful handling of unauthorized operations

### 4. Thinking Modes System

**Purpose**: Computational budget control for Gemini's analysis depth
**Implementation**:
- **Token Allocation**: `minimal (128), low (2048), medium (8192), high (16384), max (32768)`
- **Dynamic Selection**: Tools adjust thinking depth based on task complexity
- **Resource Management**: Prevents token exhaustion on complex analysis

**Usage Pattern**:
```python
# tools/thinkdeep.py:67
thinking_mode = request.get('thinking_mode', 'high')
context_tokens = THINKING_MODE_TOKENS[thinking_mode]
```

### 5. Conversation System

**Purpose**: Cross-session context preservation and threading
**Key Components**:
- **Redis Persistence** (`utils/conversation_memory.py:30`): Thread storage and retrieval
- **Thread Reconstruction**: UUID-based conversation continuity
- **Cross-Tool Continuation**: `continuation_id` parameter for context flow
- **Follow-up Management**: Structured multi-turn conversation support

**Data Structures**:
```python
# utils/conversation_memory.py:45
class ThreadContext:
    thread_id: str
    tool_history: List[ToolExecution]
    conversation_files: Set[str]
    context_tokens: int
```

## Integration Points

### Configuration Management (config.py)

**Critical Settings**:
- **`GEMINI_MODEL`** (config.py:24): Model selection for API calls
- **`MAX_CONTEXT_TOKENS`** (config.py:30): Token limits for conversation management
- **`REDIS_URL`** (config.py:60): Conversation memory backend
- **`PROJECT_ROOT`** (config.py:15): Security sandbox boundary

### Utility Services

**File Operations** (`utils/file_utils.py`):
- Token-aware reading with priority system
- Directory expansion with filtering
- Error-resistant content formatting

**Git Integration** (`utils/git_utils.py`):
- Repository state analysis for precommit validation
- Change detection for documentation updates
- Branch and commit tracking

**Token Management** (`utils/token_utils.py`):
- Context optimization and pruning
- File prioritization strategies
- Memory usage monitoring

## Data Flow Patterns

### 1. Tool Execution Flow

```
1. Claude sends MCP request with tool name and parameters
2. MCP Engine validates request and routes to appropriate tool
3. Tool loads conversation context from Redis (if continuation_id provided)
4. Tool processes request using Gemini API with thinking mode configuration
5. Tool stores results in conversation memory and returns formatted response
6. MCP Engine serializes response and sends to Claude via stdio
```
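
In code, these six steps collapse into a dispatch handler of roughly the following shape. This is a sketch, not verbatim `server.py`; `TOOL_REGISTRY` is a hypothetical name-to-tool mapping, and the helper functions follow the other architecture documents.

```python
# Sketch of the dispatch handler implied by the steps above; TOOL_REGISTRY
# and sanitize_request_data are assumptions, not verbatim server.py code.
@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    tool = TOOL_REGISTRY[name]                      # step 2: route to the tool
    arguments = sanitize_request_data(arguments)    # security validation
    output = await tool.execute(arguments)          # steps 3-5 run inside the tool
    return [types.TextContent(type="text", text=output.content)]  # step 6
```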

### 2. File Processing Pipeline

```
1. File paths received and validated against security rules
2. Docker path translation (host → container mapping)
3. Token budget allocation based on file size and context limits
4. Priority-based file reading (code files > documentation > logs)
5. Content formatting with line numbers and error handling
6. Context assembly with deduplication across conversation turns
```

### 3. Security Validation Chain

```
1. Path Input → Dangerous Path Detection → Rejection/Sanitization
2. Validated Path → Absolute Path Conversion → Sandbox Boundary Check
3. Bounded Path → Docker Translation → Container Path Generation
4. Safe Path → File Operation → Error-Resistant Content Return
```

## Performance Characteristics

### Scalability Factors

- **Concurrent Tool Execution**: AsyncIO enables parallel processing of multiple tool requests
- **Memory Efficiency**: Token-aware file processing prevents memory exhaustion
- **Context Optimization**: Conversation deduplication reduces redundant processing
- **Error Resilience**: Graceful degradation maintains functionality during failures

### Resource Management

- **Token Budgeting**: 40% context reservation (30% Memory Bank + 10% Memory MCP)
- **File Prioritization**: Direct code files prioritized over supporting documentation
- **Redis Optimization**: Thread-based storage with automatic cleanup
- **Gemini API Efficiency**: Thinking mode selection optimizes computational costs

## Extension Points

### Adding New Tools

1. **Inherit from BaseTool** (`tools/base.py:25`)
2. **Implement required methods**: `execute()`, `get_schema()`
3. **Register with MCP Engine**: Add to tool discovery system
4. **Update CLAUDE.md**: Define collaboration patterns and usage guidelines

### Security Extensions

1. **Custom Validators**: Add to `utils/file_utils.py` validation chain
2. **Path Translators**: Extend Docker path mapping for new mount points
3. **Permission Gates**: Implement granular access controls for sensitive operations

### Performance Optimizations

1. **Caching Layers**: Add Redis caching for frequently accessed files
2. **Context Compression**: Implement intelligent context summarization
3. **Parallel Processing**: Extend AsyncIO patterns for I/O-bound operations

---

This architecture provides a robust, secure, and extensible foundation for AI-assisted development workflows while maintaining clear separation of concerns and comprehensive error handling.