refactor: cleanup and comprehensive documentation

Major changes:
- Add comprehensive documentation to all modules with detailed docstrings
- Remove unused THINKING_MODEL config (use single GEMINI_MODEL with thinking_mode param)
- Remove list_models functionality (simplified to single model configuration)
- Rename DEFAULT_MODEL to GEMINI_MODEL for clarity
- Remove unused python-dotenv dependency
- Fix missing pydantic in setup.py dependencies

Documentation improvements:
- Document security measures in file_utils.py (path validation, sandboxing)
- Add detailed comments to critical logic sections
- Document tool creation process in BaseTool
- Explain configuration values and their impact
- Add comprehensive function-level documentation

Code quality:
- Apply black formatting to all files
- Fix all ruff linting issues
- Update tests to match refactored code
- All 63 tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
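The THINKING_MODEL removal means reasoning depth is now selected per request rather than by swapping models. A minimal sketch of the intended call shape (the `prompt` field is illustrative; only `thinking_mode` and its budget values are taken from this diff):

```python
# Hypothetical tool invocation after this refactor: one model (GEMINI_MODEL),
# with reasoning depth chosen per call instead of a separate THINKING_MODEL.
arguments = {
    "prompt": "Evaluate this caching design for race conditions",  # illustrative field
    "thinking_mode": "high",  # minimal=128, low=2048, medium=8192, high=16384, max=32768
}
# Dispatched by the server as: await handle_call_tool("think_deeper", arguments)
```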
README.md (14 lines changed)

@@ -122,7 +122,6 @@ Just ask Claude naturally:
 - **Pre-commit validation?** → `review_changes` (validate git changes before committing)
 - **Something's broken?** → `debug_issue` (root cause analysis, error tracing)
 - **Want to understand code?** → `analyze` (architecture, patterns, dependencies)
-- **Check models?** → `list_models` (see available Gemini models)
 - **Server info?** → `get_version` (version and configuration details)
 
 **Tools Overview:**
@@ -132,8 +131,7 @@ Just ask Claude naturally:
 4. [`review_changes`](#4-review_changes---pre-commit-validation) - Validate git changes before committing
 5. [`debug_issue`](#5-debug_issue---expert-debugging-assistant) - Root cause analysis and debugging
 6. [`analyze`](#6-analyze---smart-file-analysis) - General-purpose file and code analysis
-7. [`list_models`](#7-list_models---see-available-gemini-models) - List available Gemini models
-8. [`get_version`](#8-get_version---server-information) - Get server version and configuration
+7. [`get_version`](#7-get_version---server-information) - Get server version and configuration
 
 ### 1. `chat` - General Development Chat & Collaborative Thinking
 **Your thinking partner - bounce ideas, get second opinions, brainstorm collaboratively**
@@ -346,13 +344,7 @@ Combine your findings with gemini's to create a comprehensive security report."
 
 **Triggers:** analyze, examine, look at, understand, inspect
 
-### 7. `list_models` - See Available Gemini Models
-```
-"Use gemini to list available models"
-"Get gemini to show me what models I can use"
-```
-
-### 8. `get_version` - Server Information
+### 7. `get_version` - Server Information
 ```
 "Use gemini for its version"
 "Get gemini to show server configuration"
@@ -530,7 +522,7 @@ All tools support a `thinking_mode` parameter that controls Gemini's thinking bu
 The server includes several configurable properties that control its behavior:
 
 ### Model Configuration
-- **`DEFAULT_MODEL`**: `"gemini-2.5-pro-preview-06-05"` - The latest Gemini 2.5 Pro model with native thinking support
+- **`GEMINI_MODEL`**: `"gemini-2.5-pro-preview-06-05"` - The latest Gemini 2.5 Pro model with native thinking support
 - **`MAX_CONTEXT_TOKENS`**: `1,000,000` - Maximum input context (1M tokens for Gemini 2.5 Pro)
 
 ### Temperature Defaults
config.py (37 lines changed)

@@ -1,20 +1,43 @@
 """
 Configuration and constants for Gemini MCP Server
+
+This module centralizes all configuration settings for the Gemini MCP Server.
+It defines model configurations, token limits, temperature defaults, and other
+constants used throughout the application.
+
+Configuration values can be overridden by environment variables where appropriate.
 """
 
 # Version and metadata
-__version__ = "2.8.0"
-__updated__ = "2025-09-09"
-__author__ = "Fahad Gilani"
+# These values are used in server responses and for tracking releases
+__version__ = "2.8.0"  # Semantic versioning: MAJOR.MINOR.PATCH
+__updated__ = "2025-09-09"  # Last update date in ISO format
+__author__ = "Fahad Gilani"  # Primary maintainer
 
 # Model configuration
-DEFAULT_MODEL = "gemini-2.5-pro-preview-06-05"
-THINKING_MODEL = (
-    "gemini-2.0-flash-thinking-exp"  # Enhanced reasoning model for think_deeper
-)
+# GEMINI_MODEL: The Gemini model used for all AI operations
+# This should be a stable, high-performance model suitable for code analysis
+GEMINI_MODEL = "gemini-2.5-pro-preview-06-05"
+
+# MAX_CONTEXT_TOKENS: Maximum number of tokens that can be included in a single request
+# This limit includes both the prompt and expected response
+# Gemini Pro models support up to 1M tokens, but practical usage should reserve
+# space for the model's response (typically 50K-100K tokens reserved)
 MAX_CONTEXT_TOKENS = 1_000_000  # 1M tokens for Gemini Pro
 
 # Temperature defaults for different tool types
+# Temperature controls the randomness/creativity of model responses
+# Lower values (0.0-0.3) produce more deterministic, focused responses
+# Higher values (0.7-1.0) produce more creative, varied responses
+
+# TEMPERATURE_ANALYTICAL: Used for tasks requiring precision and consistency
+# Ideal for code review, debugging, and error analysis where accuracy is critical
 TEMPERATURE_ANALYTICAL = 0.2  # For code review, debugging
+
+# TEMPERATURE_BALANCED: Middle ground for general conversations
+# Provides a good balance between consistency and helpful variety
 TEMPERATURE_BALANCED = 0.5  # For general chat
+
+# TEMPERATURE_CREATIVE: Higher temperature for exploratory tasks
+# Used when brainstorming, exploring alternatives, or architectural discussions
 TEMPERATURE_CREATIVE = 0.7  # For architecture, deep thinking
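A short sketch of how these constants are consumed by callers (the import site in tools/base.py appears later in this diff; the variable names here are illustrative):

```python
from config import GEMINI_MODEL, MAX_CONTEXT_TOKENS, TEMPERATURE_ANALYTICAL

# Illustrative: pick the shared model and an analytical temperature,
# as the review/debug tools in this commit do.
model_name = GEMINI_MODEL
temperature = TEMPERATURE_ANALYTICAL  # 0.2 keeps review output deterministic
assert MAX_CONTEXT_TOKENS == 1_000_000  # 1M-token input budget
```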
requirements.txt

@@ -1,6 +1,5 @@
 mcp>=1.0.0
 google-genai>=1.19.0
-python-dotenv>=1.0.0
 pydantic>=2.0.0
 
 # Development dependencies
server.py (197 lines changed)

@@ -1,5 +1,21 @@
 """
 Gemini MCP Server - Main server implementation
+
+This module implements the core MCP (Model Context Protocol) server that provides
+AI-powered tools for code analysis, review, and assistance using Google's Gemini models.
+
+The server follows the MCP specification to expose various AI tools as callable functions
+that can be used by MCP clients (like Claude). Each tool provides specialized functionality
+such as code review, debugging, deep thinking, and general chat capabilities.
+
+Key Components:
+- MCP Server: Handles protocol communication and tool discovery
+- Tool Registry: Maps tool names to their implementations
+- Request Handler: Processes incoming tool calls and returns formatted responses
+- Configuration: Manages API keys and model settings
+
+The server runs on stdio (standard input/output) and communicates using JSON-RPC messages
+as defined by the MCP protocol.
 """
 
 import asyncio
@@ -9,14 +25,13 @@ import sys
 from datetime import datetime
 from typing import Any, Dict, List
 
-from google import genai
 from mcp.server import Server
 from mcp.server.models import InitializationOptions
 from mcp.server.stdio import stdio_server
 from mcp.types import TextContent, Tool
 
 from config import (
-    DEFAULT_MODEL,
+    GEMINI_MODEL,
     MAX_CONTEXT_TOKENS,
     __author__,
     __updated__,
@@ -31,41 +46,67 @@ from tools import (
     ThinkDeeperTool,
 )
 
-# Configure logging
+# Configure logging for server operations
+# Set to INFO level to capture important operational messages without being too verbose
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 
-# Create the MCP server instance
+# Create the MCP server instance with a unique name identifier
+# This name is used by MCP clients to identify and connect to this specific server
 server: Server = Server("gemini-server")
 
-# Initialize tools
+# Initialize the tool registry with all available AI-powered tools
+# Each tool provides specialized functionality for different development tasks
+# Tools are instantiated once and reused across requests (stateless design)
 TOOLS = {
-    "think_deeper": ThinkDeeperTool(),
-    "review_code": ReviewCodeTool(),
-    "debug_issue": DebugIssueTool(),
-    "analyze": AnalyzeTool(),
-    "chat": ChatTool(),
-    "review_changes": ReviewChanges(),
+    "think_deeper": ThinkDeeperTool(),  # Extended reasoning for complex problems
+    "review_code": ReviewCodeTool(),  # Comprehensive code review and quality analysis
+    "debug_issue": DebugIssueTool(),  # Root cause analysis and debugging assistance
+    "analyze": AnalyzeTool(),  # General-purpose file and code analysis
+    "chat": ChatTool(),  # Interactive development chat and brainstorming
+    "review_changes": ReviewChanges(),  # Pre-commit review of git changes
 }
 
 
 def configure_gemini():
-    """Configure Gemini API with the provided API key"""
+    """
+    Configure Gemini API with the provided API key.
+
+    This function validates that the GEMINI_API_KEY environment variable is set.
+    The actual API key is used when creating Gemini clients within individual tools
+    to ensure proper isolation and error handling.
+
+    Raises:
+        ValueError: If GEMINI_API_KEY environment variable is not set
+    """
     api_key = os.getenv("GEMINI_API_KEY")
     if not api_key:
         raise ValueError(
             "GEMINI_API_KEY environment variable is required. "
             "Please set it with your Gemini API key."
         )
-    # API key is used when creating clients in tools
+    # Note: We don't store the API key globally for security reasons
+    # Each tool creates its own Gemini client with the API key when needed
     logger.info("Gemini API key found")
 
 
 @server.list_tools()
 async def handle_list_tools() -> List[Tool]:
-    """List all available tools with verbose descriptions"""
+    """
+    List all available tools with their descriptions and input schemas.
+
+    This handler is called by MCP clients during initialization to discover
+    what tools are available. Each tool provides:
+    - name: Unique identifier for the tool
+    - description: Detailed explanation of what the tool does
+    - inputSchema: JSON Schema defining the expected parameters
+
+    Returns:
+        List of Tool objects representing all available tools
+    """
     tools = []
 
+    # Add all registered AI-powered tools from the TOOLS registry
     for tool in TOOLS.values():
         tools.append(
             Tool(
@@ -75,17 +116,10 @@ async def handle_list_tools() -> List[Tool]:
             )
         )
 
-    # Add utility tools
+    # Add utility tools that provide server metadata and configuration info
+    # These tools don't require AI processing but are useful for clients
     tools.extend(
         [
-            Tool(
-                name="list_models",
-                description=(
-                    "LIST AVAILABLE MODELS - Show all Gemini models you can use. "
-                    "Lists model names, descriptions, and which one is the default."
-                ),
-                inputSchema={"type": "object", "properties": {}},
-            ),
             Tool(
                 name="get_version",
                 description=(
@@ -102,100 +136,65 @@ async def handle_list_tools() -> List[Tool]:
 
 @server.call_tool()
 async def handle_call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]:
-    """Handle tool execution requests"""
+    """
+    Handle incoming tool execution requests from MCP clients.
 
-    # Handle dynamic tools
+    This is the main request dispatcher that routes tool calls to their
+    appropriate handlers. It supports both AI-powered tools (from TOOLS registry)
+    and utility tools (implemented as static functions).
+
+    Args:
+        name: The name of the tool to execute
+        arguments: Dictionary of arguments to pass to the tool
+
+    Returns:
+        List of TextContent objects containing the tool's response
+    """
+
+    # Route to AI-powered tools that require Gemini API calls
     if name in TOOLS:
         tool = TOOLS[name]
         return await tool.execute(arguments)
 
-    # Handle static tools
-    elif name == "list_models":
-        return await handle_list_models()
-
+    # Route to utility tools that provide server information
     elif name == "get_version":
         return await handle_get_version()
 
+    # Handle unknown tool requests gracefully
     else:
         return [TextContent(type="text", text=f"Unknown tool: {name}")]
 
 
-async def handle_list_models() -> List[TextContent]:
-    """List available Gemini models"""
-    try:
-        import json
-
-        # Get API key
-        api_key = os.getenv("GEMINI_API_KEY")
-        if not api_key:
-            return [TextContent(type="text", text="Error: GEMINI_API_KEY not set")]
-
-        client = genai.Client(api_key=api_key)
-        models = []
-
-        # List models using the new API
-        try:
-            model_list = client.models.list()
-            for model_info in model_list:
-                models.append(
-                    {
-                        "name": getattr(model_info, "id", "Unknown"),
-                        "display_name": getattr(
-                            model_info,
-                            "display_name",
-                            getattr(model_info, "id", "Unknown"),
-                        ),
-                        "description": getattr(
-                            model_info, "description", "No description"
-                        ),
-                        "is_default": getattr(model_info, "id", "").endswith(
-                            DEFAULT_MODEL
-                        ),
-                    }
-                )
-
-        except Exception:
-            # Fallback: return some known models
-            models = [
-                {
-                    "name": "gemini-2.5-pro-preview-06-05",
-                    "display_name": "Gemini 2.5 Pro",
-                    "description": "Latest Gemini 2.5 Pro model",
-                    "is_default": True,
-                },
-                {
-                    "name": "gemini-2.0-flash-thinking-exp",
-                    "display_name": "Gemini 2.0 Flash Thinking",
-                    "description": "Enhanced reasoning model",
-                    "is_default": False,
-                },
-            ]
-
-        return [TextContent(type="text", text=json.dumps(models, indent=2))]
-
-    except Exception as e:
-        return [TextContent(type="text", text=f"Error listing models: {str(e)}")]
-
-
 async def handle_get_version() -> List[TextContent]:
-    """Get version and configuration information"""
+    """
+    Get comprehensive version and configuration information about the server.
+
+    Provides details about the server version, configuration settings,
+    available tools, and runtime environment. Useful for debugging and
+    understanding the server's capabilities.
+
+    Returns:
+        Formatted text with version and configuration details
+    """
+    # Gather comprehensive server information
     version_info = {
         "version": __version__,
         "updated": __updated__,
         "author": __author__,
-        "default_model": DEFAULT_MODEL,
+        "gemini_model": GEMINI_MODEL,
         "max_context_tokens": f"{MAX_CONTEXT_TOKENS:,}",
         "python_version": f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}",
         "server_started": datetime.now().isoformat(),
-        "available_tools": list(TOOLS.keys()) + ["chat", "list_models", "get_version"],
+        "available_tools": list(TOOLS.keys()) + ["get_version"],
     }
 
+    # Format the information in a human-readable way
     text = f"""Gemini MCP Server v{__version__}
 Updated: {__updated__}
 Author: {__author__}
 
 Configuration:
-- Default Model: {DEFAULT_MODEL}
+- Gemini Model: {GEMINI_MODEL}
 - Max Context: {MAX_CONTEXT_TOKENS:,} tokens
 - Python: {version_info['python_version']}
 - Started: {version_info['server_started']}
@@ -209,11 +208,21 @@ For updates, visit: https://github.com/BeehiveInnovations/gemini-mcp-server"""
 
 
 async def main():
-    """Main entry point for the server"""
-    # Configure Gemini API
+    """
+    Main entry point for the MCP server.
+
+    Initializes the Gemini API configuration and starts the server using
+    stdio transport. The server will continue running until the client
+    disconnects or an error occurs.
+
+    The server communicates via standard input/output streams using the
+    MCP protocol's JSON-RPC message format.
+    """
+    # Validate that Gemini API key is available before starting
    configure_gemini()
 
-    # Run the server using stdio transport
+    # Run the server using stdio transport (standard input/output)
+    # This allows the server to be launched by MCP clients as a subprocess
     async with stdio_server() as (read_stream, write_stream):
         await server.run(
             read_stream,
@@ -221,7 +230,7 @@ async def main():
             InitializationOptions(
                 server_name="gemini",
                 server_version=__version__,
-                capabilities={"tools": {}},
+                capabilities={"tools": {}},  # Advertise tool support capability
             ),
         )
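A minimal sketch of exercising the dispatcher above (mirrors the test suite later in this diff; assumes the module is importable as `server`):

```python
import asyncio

from server import handle_call_tool  # dispatcher shown in the diff above


async def demo():
    # Utility tool: answered locally, no Gemini API call required
    result = await handle_call_tool("get_version", {})
    print(result[0].text)

    # Unknown tools return a graceful message instead of raising;
    # after this commit, the removed list_models falls into this branch
    fallback = await handle_call_tool("list_models", {})
    print(fallback[0].text)  # "Unknown tool: list_models"


asyncio.run(demo())
```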
setup.py (2 lines changed)

@@ -24,7 +24,7 @@ setup(
     install_requires=[
         "mcp>=1.0.0",
         "google-genai>=1.19.0",
-        "python-dotenv>=1.0.0",
+        "pydantic>=2.0.0",
     ],
     extras_require={
         "dev": [
tests/test_config.py

@@ -3,7 +3,7 @@ Tests for configuration
 """
 
 from config import (
-    DEFAULT_MODEL,
+    GEMINI_MODEL,
     MAX_CONTEXT_TOKENS,
     TEMPERATURE_ANALYTICAL,
     TEMPERATURE_BALANCED,
@@ -31,7 +31,7 @@ class TestConfig:
 
     def test_model_config(self):
         """Test model configuration"""
-        assert DEFAULT_MODEL == "gemini-2.5-pro-preview-06-05"
+        assert GEMINI_MODEL == "gemini-2.5-pro-preview-06-05"
         assert MAX_CONTEXT_TOKENS == 1_000_000
 
     def test_temperature_defaults(self):
tests/test_server.py

@@ -2,7 +2,6 @@
 Tests for the main server functionality
 """
 
-import json
 from unittest.mock import Mock, patch
 
 import pytest
@@ -26,11 +25,10 @@ class TestServerTools:
         assert "analyze" in tool_names
         assert "chat" in tool_names
         assert "review_changes" in tool_names
-        assert "list_models" in tool_names
         assert "get_version" in tool_names
 
-        # Should have exactly 8 tools
-        assert len(tools) == 8
+        # Should have exactly 7 tools
+        assert len(tools) == 7
 
         # Check descriptions are verbose
         for tool in tools:
@@ -69,22 +67,6 @@ class TestServerTools:
         assert response_data["status"] == "success"
         assert response_data["content"] == "Chat response"
 
-    @pytest.mark.asyncio
-    async def test_handle_list_models(self):
-        """Test listing models"""
-        result = await handle_call_tool("list_models", {})
-        assert len(result) == 1
-
-        # Check if we got models or an error
-        text = result[0].text
-        if "Error" in text:
-            # API key not set in test environment
-            assert "GEMINI_API_KEY" in text
-        else:
-            # Should have models
-            models = json.loads(text)
-            assert len(models) >= 1
-
     @pytest.mark.asyncio
     async def test_handle_get_version(self):
         """Test getting version info"""
tools/base.py (260 lines changed)

@@ -1,5 +1,16 @@
 """
 Base class for all Gemini MCP tools
+
+This module provides the abstract base class that all tools must inherit from.
+It defines the contract that tools must implement and provides common functionality
+for request validation, error handling, and response formatting.
+
+Key responsibilities:
+- Define the tool interface (abstract methods that must be implemented)
+- Handle request validation and file path security
+- Manage Gemini model creation with appropriate configurations
+- Standardize response formatting and error handling
+- Support for clarification requests when more information is needed
 """
 
 from abc import ABC, abstractmethod
@@ -16,7 +27,13 @@ from .models import ToolOutput, ClarificationRequest
 
 
 class ToolRequest(BaseModel):
-    """Base request model for all tools"""
+    """
+    Base request model for all tools.
+
+    This Pydantic model defines common parameters that can be used by any tool.
+    Tools can extend this model to add their specific parameters while inheriting
+    these common fields.
+    """
 
     model: Optional[str] = Field(
         None, description="Model to use (defaults to Gemini 2.5 Pro)"
@@ -24,6 +41,8 @@ class ToolRequest(BaseModel):
     temperature: Optional[float] = Field(
         None, description="Temperature for response (tool-specific defaults)"
     )
+    # Thinking mode controls how much computational budget the model uses for reasoning
+    # Higher values allow for more complex reasoning but increase latency and cost
     thinking_mode: Optional[Literal["minimal", "low", "medium", "high", "max"]] = Field(
         None,
         description="Thinking depth: minimal (128), low (2048), medium (8192), high (16384), max (32768)",
@@ -31,52 +50,130 @@ class ToolRequest(BaseModel):
 
 
 class BaseTool(ABC):
-    """Base class for all Gemini tools"""
+    """
+    Abstract base class for all Gemini tools.
+
+    This class defines the interface that all tools must implement and provides
+    common functionality for request handling, model creation, and response formatting.
+
+    To create a new tool:
+    1. Create a new class that inherits from BaseTool
+    2. Implement all abstract methods
+    3. Define a request model that inherits from ToolRequest
+    4. Register the tool in server.py's TOOLS dictionary
+    """
 
     def __init__(self):
+        # Cache tool metadata at initialization to avoid repeated calls
        self.name = self.get_name()
         self.description = self.get_description()
         self.default_temperature = self.get_default_temperature()
 
     @abstractmethod
     def get_name(self) -> str:
-        """Return the tool name"""
+        """
+        Return the unique name identifier for this tool.
+
+        This name is used by MCP clients to invoke the tool and must be
+        unique across all registered tools.
+
+        Returns:
+            str: The tool's unique name (e.g., "review_code", "analyze")
+        """
         pass
 
     @abstractmethod
     def get_description(self) -> str:
-        """Return the verbose tool description for Claude"""
+        """
+        Return a detailed description of what this tool does.
+
+        This description is shown to MCP clients (like Claude) to help them
+        understand when and how to use the tool. It should be comprehensive
+        and include trigger phrases.
+
+        Returns:
+            str: Detailed tool description with usage examples
+        """
         pass
 
     @abstractmethod
     def get_input_schema(self) -> Dict[str, Any]:
-        """Return the JSON schema for tool inputs"""
+        """
+        Return the JSON Schema that defines this tool's parameters.
+
+        This schema is used by MCP clients to validate inputs before
+        sending requests. It should match the tool's request model.
+
+        Returns:
+            Dict[str, Any]: JSON Schema object defining required and optional parameters
+        """
         pass
 
     @abstractmethod
     def get_system_prompt(self) -> str:
-        """Return the system prompt for this tool"""
+        """
+        Return the system prompt that configures the AI model's behavior.
+
+        This prompt sets the context and instructions for how the model
+        should approach the task. It's prepended to the user's request.
+
+        Returns:
+            str: System prompt with role definition and instructions
+        """
         pass
 
     def get_default_temperature(self) -> float:
-        """Return default temperature for this tool"""
+        """
+        Return the default temperature setting for this tool.
+
+        Override this method to set tool-specific temperature defaults.
+        Lower values (0.0-0.3) for analytical tasks, higher (0.7-1.0) for creative tasks.
+
+        Returns:
+            float: Default temperature between 0.0 and 1.0
+        """
         return 0.5
 
     def get_default_thinking_mode(self) -> str:
-        """Return default thinking_mode for this tool"""
+        """
+        Return the default thinking mode for this tool.
+
+        Thinking mode controls computational budget for reasoning.
+        Override for tools that need more or less reasoning depth.
+
+        Returns:
+            str: One of "minimal", "low", "medium", "high", "max"
+        """
         return "medium"  # Default to medium thinking for better reasoning
 
     @abstractmethod
     def get_request_model(self):
-        """Return the Pydantic model for request validation"""
+        """
+        Return the Pydantic model class used for validating requests.
+
+        This model should inherit from ToolRequest and define all
+        parameters specific to this tool.
+
+        Returns:
+            Type[ToolRequest]: The request model class
+        """
         pass
 
     def validate_file_paths(self, request) -> Optional[str]:
         """
         Validate that all file paths in the request are absolute.
-        Returns error message if validation fails, None if all paths are valid.
+
+        This is a critical security function that prevents path traversal attacks
+        and ensures all file access is properly controlled. All file paths must
+        be absolute to avoid ambiguity and security issues.
+
+        Args:
+            request: The validated request object
+
+        Returns:
+            Optional[str]: Error message if validation fails, None if all paths are valid
         """
-        # Check if request has 'files' attribute
+        # Check if request has 'files' attribute (used by most tools)
         if hasattr(request, "files") and request.files:
             for file_path in request.files:
                 if not os.path.isabs(file_path):
@@ -86,7 +183,7 @@ class BaseTool(ABC):
                         f"Please provide the full absolute path starting with '/'"
                     )
 
-        # Check if request has 'path' attribute (for review_changes)
+        # Check if request has 'path' attribute (used by review_changes tool)
         if hasattr(request, "path") and request.path:
             if not os.path.isabs(request.path):
                 return (
@@ -98,13 +195,31 @@ class BaseTool(ABC):
         return None
 
     async def execute(self, arguments: Dict[str, Any]) -> List[TextContent]:
-        """Execute the tool with given arguments"""
+        """
+        Execute the tool with the provided arguments.
+
+        This is the main entry point for tool execution. It handles:
+        1. Request validation using the tool's Pydantic model
+        2. File path security validation
+        3. Prompt preparation
+        4. Model creation and configuration
+        5. Response generation and formatting
+        6. Error handling and recovery
+
+        Args:
+            arguments: Dictionary of arguments from the MCP client
+
+        Returns:
+            List[TextContent]: Formatted response as MCP TextContent objects
+        """
         try:
-            # Validate request
+            # Validate request using the tool's Pydantic model
+            # This ensures all required fields are present and properly typed
             request_model = self.get_request_model()
             request = request_model(**arguments)
 
-            # Validate file paths
+            # Validate file paths for security
+            # This prevents path traversal attacks and ensures proper access control
             path_error = self.validate_file_paths(request)
             if path_error:
                 error_output = ToolOutput(
@@ -114,13 +229,14 @@ class BaseTool(ABC):
                 )
                 return [TextContent(type="text", text=error_output.model_dump_json())]
 
-            # Prepare the prompt
+            # Prepare the full prompt by combining system prompt with user request
+            # This is delegated to the tool implementation for customization
             prompt = await self.prepare_prompt(request)
 
-            # Get model configuration
-            from config import DEFAULT_MODEL
+            # Extract model configuration from request or use defaults
+            from config import GEMINI_MODEL
 
-            model_name = getattr(request, "model", None) or DEFAULT_MODEL
+            model_name = getattr(request, "model", None) or GEMINI_MODEL
             temperature = getattr(request, "temperature", None)
             if temperature is None:
                 temperature = self.get_default_temperature()
@@ -128,20 +244,23 @@ class BaseTool(ABC):
             if thinking_mode is None:
                 thinking_mode = self.get_default_thinking_mode()
 
-            # Create and configure model
+            # Create model instance with appropriate configuration
+            # This handles both regular models and thinking-enabled models
             model = self.create_model(model_name, temperature, thinking_mode)
 
-            # Generate response
+            # Generate AI response using the configured model
             response = model.generate_content(prompt)
 
-            # Handle response and create standardized output
+            # Process the model's response
             if response.candidates and response.candidates[0].content.parts:
                 raw_text = response.candidates[0].content.parts[0].text
 
-                # Check if this is a clarification request
+                # Parse response to check for clarification requests or format output
                 tool_output = self._parse_response(raw_text, request)
 
             else:
+                # Handle cases where the model couldn't generate a response
+                # This might happen due to safety filters or other constraints
                 finish_reason = (
                     response.candidates[0].finish_reason
                     if response.candidates
@@ -153,10 +272,12 @@ class BaseTool(ABC):
                     content_type="text",
                 )
 
-            # Serialize the standardized output as JSON
+            # Return standardized JSON response for consistent client handling
             return [TextContent(type="text", text=tool_output.model_dump_json())]
 
         except Exception as e:
+            # Catch all exceptions to prevent server crashes
+            # Return error information in standardized format
             error_output = ToolOutput(
                 status="error",
                 content=f"Error in {self.name}: {str(e)}",
@@ -165,7 +286,19 @@ class BaseTool(ABC):
         return [TextContent(type="text", text=error_output.model_dump_json())]
 
     def _parse_response(self, raw_text: str, request) -> ToolOutput:
-        """Parse the raw response and determine if it's a clarification request"""
+        """
+        Parse the raw response and determine if it's a clarification request.
+
+        Some tools may return JSON indicating they need more information.
+        This method detects such responses and formats them appropriately.
+
+        Args:
+            raw_text: The raw text response from the model
+            request: The original request for context
+
+        Returns:
+            ToolOutput: Standardized output object
+        """
         try:
             # Try to parse as JSON to check for clarification requests
             potential_json = json.loads(raw_text.strip())
@@ -214,40 +347,79 @@ class BaseTool(ABC):
 
     @abstractmethod
     async def prepare_prompt(self, request) -> str:
-        """Prepare the full prompt for Gemini"""
+        """
+        Prepare the complete prompt for the Gemini model.
+
+        This method should combine the system prompt with the user's request
+        and any additional context (like file contents) needed for the task.
+
+        Args:
+            request: The validated request object
+
+        Returns:
+            str: Complete prompt ready for the model
+        """
         pass
 
     def format_response(self, response: str, request) -> str:
-        """Format the response for display (can be overridden)"""
+        """
+        Format the model's response for display.
+
+        Override this method to add tool-specific formatting like headers,
+        summaries, or structured output. Default implementation returns
+        the response unchanged.
+
+        Args:
+            response: The raw response from the model
+            request: The original request for context
+
+        Returns:
+            str: Formatted response
+        """
         return response
 
     def create_model(
         self, model_name: str, temperature: float, thinking_mode: str = "medium"
     ):
-        """Create a configured Gemini model with thinking configuration"""
-        # Map thinking modes to budget values
+        """
+        Create a configured Gemini model instance.
+
+        This method handles model creation with appropriate settings including
+        temperature and thinking budget configuration for models that support it.
+
+        Args:
+            model_name: Name of the Gemini model to use
+            temperature: Temperature setting for response generation
+            thinking_mode: Thinking depth mode (affects computational budget)
+
+        Returns:
+            Model instance configured and ready for generation
+        """
+        # Map thinking modes to computational budget values
+        # Higher budgets allow for more complex reasoning but increase latency
         thinking_budgets = {
-            "minimal": 128,  # Minimum for 2.5 Pro
-            "low": 2048,
-            "medium": 8192,
-            "high": 16384,
-            "max": 32768,
+            "minimal": 128,  # Minimum for 2.5 Pro - fast responses
+            "low": 2048,  # Light reasoning tasks
+            "medium": 8192,  # Balanced reasoning (default)
+            "high": 16384,  # Complex analysis
+            "max": 32768,  # Maximum reasoning depth
         }
 
         thinking_budget = thinking_budgets.get(thinking_mode, 8192)
 
-        # For models supporting thinking config, use the new API
-        # Skip in test environment to allow mocking
+        # Gemini 2.5 models support thinking configuration for enhanced reasoning
+        # Skip special handling in test environment to allow mocking
        if "2.5" in model_name and not os.environ.get("PYTEST_CURRENT_TEST"):
             try:
-                # Get API key
+                # Retrieve API key for Gemini client creation
                 api_key = os.environ.get("GEMINI_API_KEY")
                 if not api_key:
                     raise ValueError("GEMINI_API_KEY environment variable is required")
 
                 client = genai.Client(api_key=api_key)
 
-                # Create a wrapper to match the expected interface
+                # Create a wrapper class to provide a consistent interface
+                # This abstracts the differences between API versions
                 class ModelWrapper:
                     def __init__(
                         self, client, model_name, temperature, thinking_budget
@@ -270,7 +442,8 @@ class BaseTool(ABC):
                             ),
                         )
 
-                        # Convert to match expected format
+                        # Wrap the response to match the expected format
+                        # This ensures compatibility across different API versions
                         class ResponseWrapper:
                             def __init__(self, text):
                                 self.text = text
@@ -302,18 +475,19 @@ class BaseTool(ABC):
                 return ModelWrapper(client, model_name, temperature, thinking_budget)
 
             except Exception:
-                # Fall back to regular genai model if new API fails
+                # Fall back to regular API if thinking configuration fails
+                # This ensures the tool remains functional even with API changes
                 pass
 
-        # For non-2.5 models or if thinking not needed, use regular API
-        # Get API key
+        # For models that don't support thinking configuration, use standard API
         api_key = os.environ.get("GEMINI_API_KEY")
         if not api_key:
            raise ValueError("GEMINI_API_KEY environment variable is required")
 
        client = genai.Client(api_key=api_key)
 
-        # Create wrapper for consistency
+        # Create a simple wrapper for models without thinking configuration
+        # This provides the same interface as the thinking-enabled wrapper
         class SimpleModelWrapper:
            def __init__(self, client, model_name, temperature):
                self.client = client
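The BaseTool docstring above lists four steps for adding a tool; a minimal sketch of what that looks like in practice (EchoTool is hypothetical and not part of this codebase):

```python
from typing import Any, Dict

from pydantic import Field

from .base import BaseTool, ToolRequest


class EchoRequest(ToolRequest):
    # Step 3: extend ToolRequest with tool-specific parameters
    message: str = Field(..., description="Text to echo back")


class EchoTool(BaseTool):
    # Steps 1-2: inherit from BaseTool and implement all abstract methods
    def get_name(self) -> str:
        return "echo"

    def get_description(self) -> str:
        return "ECHO - Repeats the provided message back. Triggers: echo, repeat"

    def get_input_schema(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        }

    def get_system_prompt(self) -> str:
        return "You are an echo service. Repeat the user's message verbatim."

    def get_request_model(self):
        return EchoRequest

    async def prepare_prompt(self, request: EchoRequest) -> str:
        return f"{self.get_system_prompt()}\n\n{request.message}"


# Step 4: register in server.py's TOOLS dictionary: TOOLS["echo"] = EchoTool()
```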
tools/review_code.py

@@ -1,5 +1,17 @@
 """
 Code Review tool - Comprehensive code analysis and review
+
+This tool provides professional-grade code review capabilities using
+Gemini's understanding of code patterns, best practices, and common issues.
+It can analyze individual files or entire codebases, providing actionable
+feedback categorized by severity.
+
+Key Features:
+- Multi-file and directory support
+- Configurable review types (full, security, performance, quick)
+- Severity-based issue filtering
+- Custom focus areas and coding standards
+- Structured output with specific remediation steps
 """
 
 from typing import Any, Dict, List, Optional
@@ -14,7 +26,13 @@ from .base import BaseTool, ToolRequest
 
 
 class ReviewCodeRequest(ToolRequest):
-    """Request model for review_code tool"""
+    """
+    Request model for the code review tool.
+
+    This model defines all parameters that can be used to customize
+    the code review process, from selecting files to specifying
+    review focus and standards.
+    """
 
     files: List[str] = Field(
         ...,
@@ -36,7 +54,13 @@ class ReviewCodeRequest(ToolRequest):
 
 
 class ReviewCodeTool(BaseTool):
-    """Professional code review tool"""
+    """
+    Professional code review tool implementation.
+
+    This tool analyzes code for bugs, security vulnerabilities, performance
+    issues, and code quality problems. It provides detailed feedback with
+    severity ratings and specific remediation steps.
+    """
 
     def get_name(self) -> str:
         return "review_code"
@@ -105,11 +129,25 @@ class ReviewCodeTool(BaseTool):
         return ReviewCodeRequest
 
     async def prepare_prompt(self, request: ReviewCodeRequest) -> str:
-        """Prepare the code review prompt"""
-        # Read all files
+        """
+        Prepare the code review prompt with customized instructions.
+
+        This method reads the requested files, validates token limits,
+        and constructs a detailed prompt based on the review parameters.
+
+        Args:
+            request: The validated review request
+
+        Returns:
+            str: Complete prompt for the Gemini model
+
+        Raises:
+            ValueError: If the code exceeds token limits
+        """
+        # Read all requested files, expanding directories as needed
         file_content, summary = read_files(request.files)
 
-        # Check token limits
+        # Validate that the code fits within model context limits
         within_limit, estimated_tokens = check_token_limit(file_content)
         if not within_limit:
             raise ValueError(
@@ -117,7 +155,7 @@ class ReviewCodeTool(BaseTool):
                 f"Maximum is {MAX_CONTEXT_TOKENS:,} tokens."
             )
 
-        # Build review instructions
+        # Build customized review instructions based on review type
         review_focus = []
         if request.review_type == "security":
             review_focus.append(
@@ -132,12 +170,15 @@ class ReviewCodeTool(BaseTool):
                 "Provide a quick review focusing on critical issues only"
             )
 
+        # Add any additional focus areas specified by the user
         if request.focus_on:
             review_focus.append(f"Pay special attention to: {request.focus_on}")
 
+        # Include custom coding standards if provided
         if request.standards:
             review_focus.append(f"Enforce these standards: {request.standards}")
 
+        # Apply severity filtering to reduce noise if requested
         if request.severity_filter != "all":
             review_focus.append(
                 f"Only report issues of {request.severity_filter} severity or higher"
@@ -145,7 +186,7 @@ class ReviewCodeTool(BaseTool):
 
         focus_instruction = "\n".join(review_focus) if review_focus else ""
 
-        # Combine everything
+        # Construct the complete prompt with system instructions and code
        full_prompt = f"""{self.get_system_prompt()}
 
 {focus_instruction}
@@ -159,7 +200,19 @@ Please provide a comprehensive code review following the format specified in the
         return full_prompt
 
     def format_response(self, response: str, request: ReviewCodeRequest) -> str:
-        """Format the review response"""
+        """
+        Format the review response with appropriate headers.
+
+        Adds context about the review type and focus area to help
+        users understand the scope of the review.
+
+        Args:
+            response: The raw review from the model
+            request: The original request for context
+
+        Returns:
+            str: Formatted response with headers
+        """
         header = f"Code Review ({request.review_type.upper()})"
         if request.focus_on:
             header += f" - Focus: {request.focus_on}"
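For reference, illustrative arguments for this tool, matching the ReviewCodeRequest fields documented above (the file path is hypothetical):

```python
arguments = {
    "files": ["/abs/path/to/service.py"],  # absolute paths are enforced by BaseTool
    "review_type": "security",  # one of: full, security, performance, quick
    "severity_filter": "high",  # suppress lower-severity findings
    "focus_on": "input validation",
    "thinking_mode": "high",  # inherited from ToolRequest
}
# result = await ReviewCodeTool().execute(arguments)
```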
file_utils.py

@@ -1,5 +1,21 @@
 """
 File reading utilities with directory support and token management
+
+This module provides secure file access functionality for the MCP server.
+It implements critical security measures to prevent unauthorized file access
+and manages token limits to ensure efficient API usage.
+
+Key Features:
+- Path validation and sandboxing to prevent directory traversal attacks
+- Support for both individual files and recursive directory reading
+- Token counting and management to stay within API limits
+- Automatic file type detection and filtering
+- Comprehensive error handling with informative messages
+
+Security Model:
+- All file access is restricted to PROJECT_ROOT and its subdirectories
+- Absolute paths are required to prevent ambiguity
+- Symbolic links are resolved to ensure they stay within bounds
 """
 
 import os
@@ -10,9 +26,12 @@ from .token_utils import estimate_tokens, MAX_CONTEXT_TOKENS
 
 # Get project root from environment or use current directory
 # This defines the sandbox directory where file access is allowed
+# Security: All file operations are restricted to this directory and its children
 PROJECT_ROOT = Path(os.environ.get("MCP_PROJECT_ROOT", os.getcwd())).resolve()
 
-# Security: Prevent running with overly permissive root
+# Critical Security Check: Prevent running with overly permissive root
+# Setting PROJECT_ROOT to "/" would allow access to the entire filesystem,
+# which is a severe security vulnerability
 if str(PROJECT_ROOT) == "/":
     raise RuntimeError(
         "Security Error: MCP_PROJECT_ROOT cannot be set to '/'. "
@@ -20,7 +39,8 @@ if str(PROJECT_ROOT) == "/":
    )


-# Common code file extensions
+# Common code file extensions that are automatically included when processing directories
+# This set can be extended to support additional file types
CODE_EXTENSIONS = {
    ".py",
    ".js",
@@ -75,11 +95,16 @@ def resolve_and_validate_path(path_str: str) -> Path:
    """
    Validates that a path is absolute and resolves it.

+    This is the primary security function that ensures all file access
+    is properly sandboxed. It enforces two critical security policies:
+    1. All paths must be absolute (no ambiguity)
+    2. All paths must resolve to within PROJECT_ROOT (sandboxing)
+
    Args:
        path_str: Path string (must be absolute)

    Returns:
-        Resolved Path object
+        Resolved Path object that is guaranteed to be within PROJECT_ROOT

    Raises:
        ValueError: If path is not absolute
@@ -88,17 +113,19 @@ def resolve_and_validate_path(path_str: str) -> Path:
    # Create a Path object from the user-provided path
    user_path = Path(path_str)

-    # Require absolute paths
+    # Security Policy 1: Require absolute paths to prevent ambiguity
+    # Relative paths could be interpreted differently depending on working directory
    if not user_path.is_absolute():
        raise ValueError(
            f"Relative paths are not supported. Please provide an absolute path.\n"
            f"Received: {path_str}"
        )

-    # Resolve the absolute path
+    # Resolve the absolute path (follows symlinks, removes .. and .)
    resolved_path = user_path.resolve()

-    # Security check: ensure the resolved path is within PROJECT_ROOT
+    # Security Policy 2: Ensure the resolved path is within PROJECT_ROOT
+    # This prevents directory traversal attacks (e.g., /project/../../../etc/passwd)
    try:
        resolved_path.relative_to(PROJECT_ROOT)
    except ValueError:
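To make the two security policies concrete, here is a minimal standalone sketch of the same resolve-then-contain pattern; `sandbox_root` and the paths are illustrative, not part of the module:

```python
from pathlib import Path

# Illustrative sandbox root; the module uses PROJECT_ROOT instead
sandbox_root = Path("/home/user/project").resolve()

def is_within_sandbox(path_str: str) -> bool:
    """True only if path_str is absolute and stays inside sandbox_root."""
    candidate = Path(path_str)
    if not candidate.is_absolute():
        return False  # Policy 1: reject relative paths outright
    resolved = candidate.resolve()  # Follows symlinks, collapses ".." and "."
    try:
        resolved.relative_to(sandbox_root)  # Policy 2: must be under the root
        return True
    except ValueError:
        return False

# resolve() collapses the ".." before the containment check, so traversal fails
assert not is_within_sandbox("/home/user/project/../../../etc/passwd")
assert is_within_sandbox("/home/user/project/src/main.py")
```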
@@ -115,12 +142,16 @@ def expand_paths(paths: List[str], extensions: Optional[Set[str]] = None) -> List[str]:
    """
    Expand paths to individual files, handling both files and directories.

+    This function recursively walks directories to find all matching files.
+    It automatically filters out hidden files and common non-code directories
+    like __pycache__ to avoid including generated or system files.
+
    Args:
-        paths: List of file or directory paths
-        extensions: Optional set of file extensions to include
+        paths: List of file or directory paths (must be absolute)
+        extensions: Optional set of file extensions to include (defaults to CODE_EXTENSIONS)

    Returns:
-        List of individual file paths
+        List of individual file paths, sorted for consistent ordering
    """
    if extensions is None:
        extensions = CODE_EXTENSIONS
@@ -130,9 +161,10 @@ def expand_paths(paths: List[str], extensions: Optional[Set[str]] = None) -> List[str]:

    for path in paths:
        try:
+            # Validate each path for security before processing
            path_obj = resolve_and_validate_path(path)
        except (ValueError, PermissionError):
-            # Skip invalid paths
+            # Skip invalid paths silently to allow partial success
            continue

        if not path_obj.exists():
@@ -145,51 +177,61 @@ def expand_paths(paths: List[str], extensions: Optional[Set[str]] = None) -> List[str]:
            seen.add(str(path_obj))

        elif path_obj.is_dir():
-            # Walk directory recursively
+            # Walk directory recursively to find all files
            for root, dirs, files in os.walk(path_obj):
-                # Skip hidden directories and __pycache__
+                # Filter directories in-place to skip hidden and cache directories
+                # This prevents descending into .git, .venv, __pycache__, etc.
                dirs[:] = [
                    d for d in dirs if not d.startswith(".") and d != "__pycache__"
                ]

                for file in files:
-                    # Skip hidden files
+                    # Skip hidden files (e.g., .DS_Store, .gitignore)
                    if file.startswith("."):
                        continue

                    file_path = Path(root) / file

-                    # Check extension
+                    # Filter by extension if specified
                    if not extensions or file_path.suffix.lower() in extensions:
                        full_path = str(file_path)
+                        # Use set to prevent duplicates
                        if full_path not in seen:
                            expanded_files.append(full_path)
                            seen.add(full_path)

-    # Sort for consistent ordering
+    # Sort for consistent ordering across different runs
+    # This makes output predictable and easier to debug
    expanded_files.sort()
    return expanded_files


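The in-place `dirs[:]` assignment in this hunk is what makes `os.walk` prune directories as it goes; a small self-contained sketch of the same idiom (the extension filter is narrowed to `.py` just for the example):

```python
import os

def walk_visible_python_files(root: str) -> list:
    """Collect .py files while pruning hidden and cache directories during the walk."""
    found = []
    for current, dirs, files in os.walk(root):
        # Assigning to dirs[:] mutates the very list os.walk iterates over,
        # so pruned directories are never descended into
        dirs[:] = [d for d in dirs if not d.startswith(".") and d != "__pycache__"]
        for name in files:
            if not name.startswith(".") and name.endswith(".py"):
                found.append(os.path.join(current, name))
    return sorted(found)  # Sorted for deterministic output
```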
def read_file_content(file_path: str, max_size: int = 1_000_000) -> Tuple[str, int]:
    """
-    Read a single file and format it for Gemini.
+    Read a single file and format it for inclusion in AI prompts.
+
+    This function handles various error conditions gracefully and always
+    returns formatted content, even for errors. This ensures the AI model
+    gets context about what files were attempted but couldn't be read.

    Args:
        file_path: Path to file (must be absolute)
-        max_size: Maximum file size to read
+        max_size: Maximum file size to read (default 1MB to prevent memory issues)

    Returns:
-        (formatted_content, estimated_tokens)
+        Tuple of (formatted_content, estimated_tokens)
+        Content is wrapped with clear delimiters for AI parsing
    """
    try:
+        # Validate path security before any file operations
        path = resolve_and_validate_path(file_path)
    except (ValueError, PermissionError) as e:
+        # Return error in a format that provides context to the AI
        content = f"\n--- ERROR ACCESSING FILE: {file_path} ---\nError: {str(e)}\n--- END FILE ---\n"
        return content, estimate_tokens(content)

    try:
-        # Check if path exists and is a file
+        # Validate file existence and type
        if not path.exists():
            content = f"\n--- FILE NOT FOUND: {file_path} ---\nError: File does not exist\n--- END FILE ---\n"
            return content, estimate_tokens(content)
@@ -198,17 +240,19 @@ def read_file_content(file_path: str, max_size: int = 1_000_000) -> Tuple[str, int]:
            content = f"\n--- NOT A FILE: {file_path} ---\nError: Path is not a file\n--- END FILE ---\n"
            return content, estimate_tokens(content)

-        # Check file size
+        # Check file size to prevent memory exhaustion
        file_size = path.stat().st_size
        if file_size > max_size:
            content = f"\n--- FILE TOO LARGE: {file_path} ---\nFile size: {file_size:,} bytes (max: {max_size:,})\n--- END FILE ---\n"
            return content, estimate_tokens(content)

-        # Read the file
+        # Read the file with UTF-8 encoding, replacing invalid characters
+        # This ensures we can handle files with mixed encodings
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            file_content = f.read()

-        # Format with clear delimiters for Gemini
+        # Format with clear delimiters that help the AI understand file boundaries
+        # Using consistent markers makes it easier for the model to parse
        formatted = f"\n--- BEGIN FILE: {file_path} ---\n{file_content}\n--- END FILE: {file_path} ---\n"
        return formatted, estimate_tokens(formatted)

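The BEGIN/END delimiter convention is easy to reproduce; here is a minimal sketch that mirrors it, using the same 4-characters-per-token heuristic the module's `estimate_tokens` applies (the helper name is invented):

```python
def format_for_prompt(label: str, body: str):
    """Wrap content in BEGIN/END markers and return it with a rough token estimate."""
    formatted = f"\n--- BEGIN FILE: {label} ---\n{body}\n--- END FILE: {label} ---\n"
    return formatted, len(formatted) // 4  # Same heuristic as estimate_tokens

content, tokens = format_for_prompt("example.py", "print('hello')\n")
```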
@@ -226,14 +270,21 @@ def read_files(
    """
    Read multiple files and optional direct code with smart token management.

+    This function implements intelligent token budgeting to maximize the amount
+    of relevant content that can be included in an AI prompt while staying
+    within token limits. It prioritizes direct code and reads files until
+    the token budget is exhausted.
+
    Args:
-        file_paths: List of file or directory paths
-        code: Optional direct code to include
+        file_paths: List of file or directory paths (absolute paths required)
+        code: Optional direct code to include (prioritized over files)
        max_tokens: Maximum tokens to use (defaults to MAX_CONTEXT_TOKENS)
-        reserve_tokens: Tokens to reserve for prompt and response
+        reserve_tokens: Tokens to reserve for prompt and response (default 50K)

    Returns:
-        (full_content, brief_summary)
+        Tuple of (full_content, brief_summary)
+        - full_content: All file contents formatted for AI consumption
+        - brief_summary: Human-readable summary of what was processed
    """
    if max_tokens is None:
        max_tokens = MAX_CONTEXT_TOKENS
@@ -247,7 +298,8 @@ def read_files(
    files_skipped = []
    dirs_processed = []

-    # First, handle direct code if provided
+    # Priority 1: Handle direct code if provided
+    # Direct code is prioritized because it's explicitly provided by the user
    if code:
        formatted_code = (
            f"\n--- BEGIN DIRECT CODE ---\n{code}\n--- END DIRECT CODE ---\n"
@@ -258,19 +310,23 @@ def read_files(
            content_parts.append(formatted_code)
            total_tokens += code_tokens
            available_tokens -= code_tokens
+            # Create a preview for the summary
            code_preview = code[:50] + "..." if len(code) > 50 else code
            summary_parts.append(f"Direct code: {code_preview}")
        else:
            summary_parts.append("Direct code skipped (too large)")

-    # Expand all paths to get individual files
+    # Priority 2: Process file paths
    if file_paths:
-        # Track which paths are directories
+        # Track which paths are directories for summary
        for path in file_paths:
+            try:
                if Path(path).is_dir():
                    dirs_processed.append(path)
+            except Exception:
+                pass  # Ignore invalid paths

-        # Expand to get all files
+        # Expand directories to get all individual files
        all_files = expand_paths(file_paths)

        if not all_files and file_paths:
@@ -279,7 +335,7 @@ def read_files(
                f"\n--- NO FILES FOUND ---\nProvided paths: {', '.join(file_paths)}\n--- END ---\n"
            )
        else:
-            # Read files up to token limit
+            # Read files sequentially until token limit is reached
            for file_path in all_files:
                if total_tokens >= available_tokens:
                    files_skipped.append(file_path)
@@ -293,9 +349,10 @@ def read_files(
                    total_tokens += file_tokens
                    files_read.append(file_path)
                else:
+                    # File too large for remaining budget
                    files_skipped.append(file_path)

-    # Build summary
+    # Build human-readable summary of what was processed
    if dirs_processed:
        summary_parts.append(f"Processed {len(dirs_processed)} dir(s)")
    if files_read:
@@ -305,11 +362,12 @@ def read_files(
    if total_tokens > 0:
        summary_parts.append(f"~{total_tokens:,} tokens used")

-    # Add skipped files note if any were skipped
+    # Add informative note about skipped files to help users understand
+    # what was omitted and why
    if files_skipped:
        skip_note = "\n\n--- SKIPPED FILES (TOKEN LIMIT) ---\n"
        skip_note += f"Total skipped: {len(files_skipped)}\n"
-        # Show first 10 skipped files
+        # Show first 10 skipped files as examples
        for i, file_path in enumerate(files_skipped[:10]):
            skip_note += f" - {file_path}\n"
        if len(files_skipped) > 10:
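The reading loop above is a greedy budget: files are taken in sorted order while they fit, and everything else is recorded as skipped. A minimal sketch of the same pattern, with invented sizes:

```python
def greedy_budget(costs: dict, budget: int):
    """Take items in sorted order while their token cost fits; record the rest."""
    taken, skipped, used = [], [], 0
    for name in sorted(costs):
        if used + costs[name] <= budget:
            taken.append(name)
            used += costs[name]
        else:
            skipped.append(name)  # Over budget: reported later in the summary
    return taken, skipped

# Illustrative sizes only
taken, skipped = greedy_budget({"a.py": 400, "b.py": 900, "c.py": 300}, budget=1000)
# taken == ["a.py", "c.py"], skipped == ["b.py"]
```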
@@ -1,5 +1,20 @@
"""
Git utilities for finding repositories and generating diffs.
+
+This module provides Git integration functionality for the MCP server,
+enabling tools to work with version control information. It handles
+repository discovery, status checking, and diff generation.
+
+Key Features:
+- Recursive repository discovery with depth limits
+- Safe command execution with timeouts
+- Comprehensive status information extraction
+- Support for staged and unstaged changes
+
+Security Considerations:
+- All git commands are run with timeouts to prevent hanging
+- Repository discovery ignores common build/dependency directories
+- Error handling for permission-denied scenarios
"""

import subprocess
@@ -8,16 +23,18 @@ from pathlib import Path


# Directories to ignore when searching for git repositories
+# These are typically build artifacts, dependencies, or cache directories
+# that don't contain source code and would slow down repository discovery
IGNORED_DIRS = {
-    "node_modules",
-    "__pycache__",
-    "venv",
-    "env",
-    "build",
-    "dist",
-    "target",
-    ".tox",
-    ".pytest_cache",
+    "node_modules",  # Node.js dependencies
+    "__pycache__",  # Python bytecode cache
+    "venv",  # Python virtual environment
+    "env",  # Alternative virtual environment name
+    "build",  # Common build output directory
+    "dist",  # Distribution/release builds
+    "target",  # Maven/Rust build output
+    ".tox",  # Tox testing environments
+    ".pytest_cache",  # Pytest cache directory
}


@@ -25,38 +42,45 @@ def find_git_repositories(start_path: str, max_depth: int = 5) -> List[str]:
    """
    Recursively find all git repositories starting from the given path.

+    This function walks the directory tree looking for .git directories,
+    which indicate the root of a git repository. It respects depth limits
+    to prevent excessive recursion in deep directory structures.
+
    Args:
-        start_path: Directory to start searching from
-        max_depth: Maximum depth to search (prevents excessive recursion)
+        start_path: Directory to start searching from (must be absolute)
+        max_depth: Maximum depth to search (default 5 prevents excessive recursion)

    Returns:
-        List of absolute paths to git repositories
+        List of absolute paths to git repositories, sorted alphabetically
    """
    repositories = []
    start_path = Path(start_path).resolve()

    def _find_repos(current_path: Path, current_depth: int):
+        # Stop recursion if we've reached maximum depth
        if current_depth > max_depth:
            return

        try:
-            # Check if current directory is a git repo
+            # Check if current directory contains a .git directory
            git_dir = current_path / ".git"
            if git_dir.exists() and git_dir.is_dir():
                repositories.append(str(current_path))
-                # Don't search inside .git directory
+                # Don't search inside git repositories for nested repos
+                # This prevents finding submodules which should be handled separately
                return

-            # Search subdirectories
+            # Search subdirectories for more repositories
            for item in current_path.iterdir():
                if item.is_dir() and not item.name.startswith("."):
-                    # Skip common non-code directories
+                    # Skip common non-code directories to improve performance
                    if item.name in IGNORED_DIRS:
                        continue
                    _find_repos(item, current_depth + 1)

        except PermissionError:
-            # Skip directories we can't access
+            # Skip directories we don't have permission to read
+            # This is common for system directories or other users' files
            pass

    _find_repos(start_path, 0)
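Assuming the module is importable (the import path below is a guess based on the package layout, not confirmed by this diff), usage might look like:

```python
# Hypothetical call site; the module path is an assumption
from utils.git_utils import find_git_repositories

for repo in find_git_repositories("/home/user/workspace", max_depth=3):
    print(repo)  # Absolute path of each repository root found
```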
@@ -67,16 +91,28 @@ def run_git_command(repo_path: str, command: List[str]) -> Tuple[bool, str]:
    """
    Run a git command in the specified repository.

+    This function provides a safe way to execute git commands with:
+    - Timeout protection (30 seconds) to prevent hanging
+    - Proper error handling and output capture
+    - Working directory context management
+
    Args:
-        repo_path: Path to the git repository
-        command: Git command as a list of arguments
+        repo_path: Path to the git repository (working directory)
+        command: Git command as a list of arguments (excluding 'git' itself)

    Returns:
        Tuple of (success, output/error)
+        - success: True if command returned 0, False otherwise
+        - output/error: stdout if successful, stderr or error message if failed
    """
    try:
+        # Execute git command with safety measures
        result = subprocess.run(
-            ["git"] + command, cwd=repo_path, capture_output=True, text=True, timeout=30
+            ["git"] + command,
+            cwd=repo_path,  # Run in repository directory
+            capture_output=True,  # Capture stdout and stderr
+            text=True,  # Return strings instead of bytes
+            timeout=30,  # Prevent hanging on slow operations
        )

        if result.returncode == 0:
@@ -85,21 +121,36 @@ def run_git_command(repo_path: str, command: List[str]) -> Tuple[bool, str]:
            return False, result.stderr

    except subprocess.TimeoutExpired:
-        return False, "Command timed out"
+        return False, "Command timed out after 30 seconds"
    except Exception as e:
-        return False, str(e)
+        return False, f"Git command failed: {str(e)}"


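The same timeout-and-capture pattern applies to any external command, not just git; a standalone sketch:

```python
import subprocess

def run_with_timeout(args, cwd="."):
    """Run a command, returning (success, output), without hanging indefinitely."""
    try:
        result = subprocess.run(
            args, cwd=cwd, capture_output=True, text=True, timeout=30
        )
        if result.returncode == 0:
            return True, result.stdout
        return False, result.stderr
    except subprocess.TimeoutExpired:
        return False, "Command timed out after 30 seconds"

ok, out = run_with_timeout(["git", "rev-parse", "--abbrev-ref", "HEAD"])
```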
def get_git_status(repo_path: str) -> Dict[str, any]:
    """
-    Get the current git status of a repository.
+    Get comprehensive git status information for a repository.
+
+    This function gathers various pieces of repository state including:
+    - Current branch name
+    - Commits ahead/behind upstream
+    - Lists of staged, unstaged, and untracked files
+
+    The function is resilient to repositories without remotes or
+    in detached HEAD state.

    Args:
        repo_path: Path to the git repository

    Returns:
-        Dictionary with status information
+        Dictionary with status information:
+        - branch: Current branch name (empty if detached)
+        - ahead: Number of commits ahead of upstream
+        - behind: Number of commits behind upstream
+        - staged_files: List of files with staged changes
+        - unstaged_files: List of files with unstaged changes
+        - untracked_files: List of untracked files
    """
+    # Initialize status structure with default values
    status = {
        "branch": "",
        "ahead": 0,
@@ -109,12 +160,12 @@ def get_git_status(repo_path: str) -> Dict[str, any]:
        "untracked_files": [],
    }

-    # Get current branch
+    # Get current branch name (empty if in detached HEAD state)
    success, branch = run_git_command(repo_path, ["branch", "--show-current"])
    if success:
        status["branch"] = branch.strip()

-    # Get ahead/behind info
+    # Get ahead/behind information relative to upstream branch
    if status["branch"]:
        success, ahead_behind = run_git_command(
            repo_path,
@@ -131,33 +182,38 @@ def get_git_status(repo_path: str) -> Dict[str, any]:
            if len(parts) == 2:
                status["behind"] = int(parts[0])
                status["ahead"] = int(parts[1])
-        # else: Could not get ahead/behind status (branch may not have upstream)
+        # Note: This will fail gracefully if branch has no upstream set

-    # Get file status
+    # Get file status using porcelain format for machine parsing
+    # Format: XY filename where X=staged status, Y=unstaged status
    success, status_output = run_git_command(repo_path, ["status", "--porcelain"])
    if success:
        for line in status_output.strip().split("\n"):
            if not line:
                continue

-            status_code = line[:2]
-            path_info = line[3:]
+            status_code = line[:2]  # Two-character status code
+            path_info = line[3:]  # Filename (after space)

-            # Handle staged changes
+            # Parse staged changes (first character of status code)
            if status_code[0] == "R":
-                # Format is "old_path -> new_path" for renamed files
+                # Special handling for renamed files
+                # Format is "old_path -> new_path"
                if " -> " in path_info:
                    _, new_path = path_info.split(" -> ", 1)
                    status["staged_files"].append(new_path)
                else:
                    status["staged_files"].append(path_info)
            elif status_code[0] in ["M", "A", "D", "C"]:
+                # M=modified, A=added, D=deleted, C=copied
                status["staged_files"].append(path_info)

-            # Handle unstaged changes
+            # Parse unstaged changes (second character of status code)
            if status_code[1] in ["M", "D"]:
+                # M=modified, D=deleted in working tree
                status["unstaged_files"].append(path_info)
            elif status_code == "??":
+                # Untracked files have special marker "??"
                status["untracked_files"].append(path_info)

    return status
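A few sample porcelain lines make the two-character codes concrete (filenames invented):

```python
# Made-up `git status --porcelain` output for illustration
sample = "M  src/app.py\n M docs/readme.md\nR  old.py -> new.py\n?? notes.txt\n"

for line in sample.strip().split("\n"):
    staged, unstaged, path = line[0], line[1], line[3:]
    print(f"staged={staged!r} unstaged={unstaged!r} path={path!r}")
# 'M'/' '  -> staged modification
# ' '/'M'  -> unstaged modification
# 'R'/' '  -> rename; path still contains "old.py -> new.py"
# '?'/'?'  -> untracked ("??")
```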
@@ -1,5 +1,12 @@
"""
-Token counting utilities
+Token counting utilities for managing API context limits
+
+This module provides functions for estimating token counts to ensure
+requests stay within the Gemini API's context window limits.
+
+Note: The estimation uses a simple character-to-token ratio which is
+approximate. For production systems requiring precise token counts,
+consider using the actual tokenizer for the specific model.
"""

from typing import Tuple
@@ -8,14 +15,40 @@ from config import MAX_CONTEXT_TOKENS


def estimate_tokens(text: str) -> int:
-    """Estimate token count (rough: 1 token ≈ 4 characters)"""
+    """
+    Estimate token count using a character-based approximation.
+
+    This uses a rough heuristic where 1 token ≈ 4 characters, which is
+    a reasonable approximation for English text. The actual token count
+    may vary based on:
+    - Language (non-English text may have different ratios)
+    - Code vs prose (code often has more tokens per character)
+    - Special characters and formatting
+
+    Args:
+        text: The text to estimate tokens for
+
+    Returns:
+        int: Estimated number of tokens
+    """
    return len(text) // 4


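The heuristic is plain integer division, so a couple of quick checks show its behavior:

```python
# 1 token ≈ 4 characters: a 1,000-character snippet estimates to 250 tokens
assert len("abcd" * 250) // 4 == 250

# Short strings round down: 7 characters // 4 == 1 token
assert len("hello!!") // 4 == 1
```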
def check_token_limit(text: str) -> Tuple[bool, int]:
    """
-    Check if text exceeds token limit.
-    Returns: (is_within_limit, estimated_tokens)
+    Check if text exceeds the maximum token limit for Gemini models.
+
+    This function is used to validate that prepared prompts will fit
+    within the model's context window, preventing API errors and ensuring
+    reliable operation.
+
+    Args:
+        text: The text to check
+
+    Returns:
+        Tuple[bool, int]: (is_within_limit, estimated_tokens)
+        - is_within_limit: True if the text fits within MAX_CONTEXT_TOKENS
+        - estimated_tokens: The estimated token count
    """
    estimated = estimate_tokens(text)
    return estimated <= MAX_CONTEXT_TOKENS, estimated
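A typical call site validates a prepared prompt before sending it; the import path here is assumed from the package layout, not confirmed by this diff:

```python
# Hypothetical usage; the module path is an assumption
from utils.token_utils import check_token_limit

prompt_text = "..." * 10_000  # Stand-in for a real prepared prompt
within_limit, tokens = check_token_limit(prompt_text)
if not within_limit:
    raise ValueError(f"Prompt too large: ~{tokens:,} estimated tokens")
```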