refactor: cleanup and comprehensive documentation

Major changes:
- Add comprehensive documentation to all modules with detailed docstrings
- Remove unused THINKING_MODEL config (use single GEMINI_MODEL with thinking_mode param)
- Remove list_models functionality (simplified to single model configuration)
- Rename DEFAULT_MODEL to GEMINI_MODEL for clarity
- Remove unused python-dotenv dependency
- Fix missing pydantic in setup.py dependencies

Documentation improvements:
- Document security measures in file_utils.py (path validation, sandboxing)
- Add detailed comments to critical logic sections
- Document tool creation process in BaseTool
- Explain configuration values and their impact
- Add comprehensive function-level documentation

Code quality:
- Apply black formatting to all files
- Fix all ruff linting issues
- Update tests to match refactored code
- All 63 tests passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Fahad
2025-06-09 19:04:24 +04:00
parent fd6e2f9b64
commit 783ba73181
12 changed files with 639 additions and 260 deletions

View File

@@ -1,5 +1,12 @@
"""
Token counting utilities
Token counting utilities for managing API context limits
This module provides functions for estimating token counts to ensure
requests stay within the Gemini API's context window limits.
Note: The estimation uses a simple character-to-token ratio which is
approximate. For production systems requiring precise token counts,
consider using the actual tokenizer for the specific model.
"""
from typing import Tuple
@@ -8,14 +15,40 @@ from config import MAX_CONTEXT_TOKENS
def estimate_tokens(text: str) -> int:
"""Estimate token count (rough: 1 token ≈ 4 characters)"""
"""
Estimate token count using a character-based approximation.
This uses a rough heuristic where 1 token ≈ 4 characters, which is
a reasonable approximation for English text. The actual token count
may vary based on:
- Language (non-English text may have different ratios)
- Code vs prose (code often has more tokens per character)
- Special characters and formatting
Args:
text: The text to estimate tokens for
Returns:
int: Estimated number of tokens
"""
return len(text) // 4
def check_token_limit(text: str) -> Tuple[bool, int]:
"""
Check if text exceeds token limit.
Returns: (is_within_limit, estimated_tokens)
Check if text exceeds the maximum token limit for Gemini models.
This function is used to validate that prepared prompts will fit
within the model's context window, preventing API errors and ensuring
reliable operation.
Args:
text: The text to check
Returns:
Tuple[bool, int]: (is_within_limit, estimated_tokens)
- is_within_limit: True if the text fits within MAX_CONTEXT_TOKENS
- estimated_tokens: The estimated token count
"""
estimated = estimate_tokens(text)
return estimated <= MAX_CONTEXT_TOKENS, estimated