## 🚀 Major Improvements

### Docker Environment Simplification
- **BREAKING**: Simplified Docker configuration by auto-detecting the sandbox from `WORKSPACE_ROOT`
- Removed the redundant `MCP_PROJECT_ROOT` requirement for Docker setups
- Updated all Docker config examples and setup scripts
- Added security validation for dangerous `WORKSPACE_ROOT` paths

### Security Enhancements
- **CRITICAL**: Fixed insecure `PROJECT_ROOT` fallback to use the current directory instead of the home directory
- Enhanced path validation with proper Docker environment detection
- Removed information disclosure in error messages
- Strengthened symlink and path traversal protection

### File Handling Optimization
- **PERFORMANCE**: Optimized `read_files()` to return content only (removed the summary)
- Unified file reading across all tools using standardized `file_utils` routines
- Fixed the `review_changes` tool to use consistent file loading patterns
- Improved token management and reduced unnecessary processing

### Tool Improvements
- **UX**: Enhanced `ReviewCodeTool` to require user context for targeted reviews
- Removed the deprecated `_get_secure_container_path()` and `_sanitize_filename()` functions
- Standardized file access patterns across `analyze`, `review_changes`, and other tools
- Added contextual prompting to align reviews with user expectations

### Code Quality & Testing
- Updated all tests for the new function signatures and requirements
- Added comprehensive Docker path integration tests
- Achieved 100% test coverage (95 tests passing)
- Full compliance with ruff, black, and isort linting standards

### Configuration & Deployment
- Added `pyproject.toml` for modern Python packaging
- Streamlined Docker setup by removing redundant environment variables
- Updated setup scripts across all platforms (Windows, macOS, Linux)
- Improved error handling and validation throughout

## 🔧 Technical Changes
- **Removed**: `_get_secure_container_path()`, `_sanitize_filename()`, unused SANDBOX_MODE
- **Enhanced**: Path translation, security validation, token management
- **Standardized**: File reading patterns, error handling, Docker detection
- **Updated**: All tool prompts for better context alignment

## 🛡️ Security Notes
This release significantly improves the security posture by:
- Eliminating broad filesystem access defaults
- Adding validation for Docker environment variables (a sketch of this kind of check follows these notes)
- Removing information disclosure in error paths
- Strengthening path traversal and symlink protections

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
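To make the `WORKSPACE_ROOT` validation concrete, here is a minimal sketch of that kind of check. The helper name `validate_workspace_root` and the blocked-path list are illustrative assumptions, not the project's actual implementation:

```python
from pathlib import Path

# Illustrative deny list: roots that would make the sandbox equivalent to
# broad filesystem access. The real project may block a different set.
_DANGEROUS_ROOTS = {Path("/"), Path("/etc"), Path("/usr"), Path.home()}


def validate_workspace_root(raw: str) -> Path:
    """Resolve WORKSPACE_ROOT and reject dangerous or missing paths."""
    root = Path(raw).resolve()  # resolve() collapses symlinks and ".." segments
    if root in _DANGEROUS_ROOTS:
        raise ValueError("WORKSPACE_ROOT is too broad to use as a sandbox root")
    if not root.is_dir():
        raise ValueError("WORKSPACE_ROOT must be an existing directory")
    return root
```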
53 lines · 1.6 KiB · Python
"""
|
|
Token counting utilities for managing API context limits
|
|
|
|
This module provides functions for estimating token counts to ensure
|
|
requests stay within the Gemini API's context window limits.
|
|
|
|
Note: The estimation uses a simple character-to-token ratio which is
|
|
approximate. For production systems requiring precise token counts,
|
|
consider using the actual tokenizer for the specific model.
|
|
"""
|
|
|
|
from config import MAX_CONTEXT_TOKENS
|
|
|
|
|
|
def estimate_tokens(text: str) -> int:
|
|
"""
|
|
Estimate token count using a character-based approximation.
|
|
|
|
This uses a rough heuristic where 1 token ≈ 4 characters, which is
|
|
a reasonable approximation for English text. The actual token count
|
|
may vary based on:
|
|
- Language (non-English text may have different ratios)
|
|
- Code vs prose (code often has more tokens per character)
|
|
- Special characters and formatting
|
|
|
|
Args:
|
|
text: The text to estimate tokens for
|
|
|
|
Returns:
|
|
int: Estimated number of tokens
|
|
"""
|
|
return len(text) // 4
|
|
|
|
|
|
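Where the docstring above suggests using the model's actual tokenizer, the Gemini SDK exposes a token-counting endpoint. A sketch using the `google-generativeai` package (this performs a network call and needs an API key configured; the wrapper function is an illustration, not part of this module):

```python
import google.generativeai as genai


def count_tokens_exact(text: str, model_name: str = "gemini-1.5-flash") -> int:
    """Count tokens with the model's own tokenizer via the Gemini API."""
    model = genai.GenerativeModel(model_name)
    return model.count_tokens(text).total_tokens
```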
```python
def check_token_limit(text: str) -> tuple[bool, int]:
    """
    Check if text exceeds the maximum token limit for Gemini models.

    This function is used to validate that prepared prompts will fit
    within the model's context window, preventing API errors and ensuring
    reliable operation.

    Args:
        text: The text to check

    Returns:
        tuple[bool, int]: (is_within_limit, estimated_tokens)
        - is_within_limit: True if the text fits within MAX_CONTEXT_TOKENS
        - estimated_tokens: The estimated token count
    """
    estimated = estimate_tokens(text)
    return estimated <= MAX_CONTEXT_TOKENS, estimated
```
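A usage sketch for the limit check. The module name `token_utils` is assumed for illustration, and the prompt is a stand-in payload:

```python
from token_utils import check_token_limit  # module name assumed

prompt = "Review the following diff:\n" + "x" * 50_000  # stand-in payload

within_limit, tokens = check_token_limit(prompt)
if within_limit:
    print(f"Sending prompt (~{tokens} estimated tokens)")
else:
    raise ValueError(f"Prompt too large: ~{tokens} estimated tokens")
```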