Migration from Docker to Standalone Python Server (#73)

* Migration from Docker to standalone server
Migration handling
Fixed tests
Use simpler in-memory storage (see the sketch below)
Support for concurrent logging to disk
Simplified direct connections to localhost
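
The in-memory replacement keeps the Redis-style `setex`/`get` surface so call sites barely change. A minimal sketch of what such a backend could look like, assuming the `InMemoryStorage` and `get_storage_backend` names that appear in the diff below (the repository's actual `storage_backend` module may differ in detail):

```python
import threading
import time
from typing import Optional


class InMemoryStorage:
    """Thread-safe in-memory key/value store with Redis-style TTL expiry."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # key -> (value, absolute expiry time from time.monotonic())
        self._store: dict[str, tuple[str, float]] = {}

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        # Mirrors redis setex: store the value and (re)set its TTL
        with self._lock:
            self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        # Return the value if present and unexpired; drop expired entries lazily
        with self._lock:
            entry = self._store.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if time.monotonic() > expires_at:
                del self._store[key]
                return None
            return value


_storage: Optional[InMemoryStorage] = None
_storage_lock = threading.Lock()


def get_storage_backend() -> InMemoryStorage:
    # Process-wide singleton so every tool call shares one conversation store
    global _storage
    with _storage_lock:
        if _storage is None:
            _storage = InMemoryStorage()
        return _storage
```

Because the store lives inside the server process, this only works while the MCP server stays resident (the scenario the module docstring below calls out); separate subprocess invocations each start with an empty store.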

* Migration from Docker/Redis to standalone script
Updated tests
Updated run script
Fixed requirements
Use dotenv for configuration (see the sketch below)
Ask once whether the user wants to install the MCP server in Claude Desktop
Updated docs
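
The dotenv change replaces Docker's environment blocks with a local `.env` file loaded at startup via `python-dotenv`. A hedged sketch of the pattern; the variable names here are illustrative, not taken from the repository:

```python
import os

from dotenv import load_dotenv

# Load key=value pairs from .env into os.environ; a no-op if the file
# is absent, so plain environment variables keep working.
load_dotenv()

# Illustrative keys only - the project's actual names may differ
gemini_api_key = os.getenv("GEMINI_API_KEY")
log_level = os.getenv("LOG_LEVEL", "INFO")
```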

* More cleanup; removed remaining references to Docker

* Cleanup

* Comments

* Fixed tests

* Fix GitHub Actions workflow for standalone Python architecture

- Install requirements-dev.txt for pytest and testing dependencies
- Remove Docker setup from simulation tests (now standalone)
- Simplify linting job to use requirements-dev.txt
- Update simulation tests to run directly without Docker

Fixes unit test failures in CI due to missing pytest dependency.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Remove simulation tests from GitHub Actions

- Remove the simulation-tests job that makes real API calls
- Keep only unit tests (mocked, no API costs) and linting
- Simulation tests should be run manually with real API keys
- Reduce CI costs and complexity

GitHub Actions now only runs:
- Unit tests (569 tests, all mocked)
- Code quality checks (ruff, black)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fixed tests

* Fixed tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
Commit 4151c3c3a5 (parent 9d72545ecd)
Author: Beehive Innovations (committed by GitHub)
Date: 2025-06-18 23:41:22 +04:00
121 changed files with 2842 additions and 3168 deletions


@@ -3,15 +3,29 @@ Conversation Memory for AI-to-AI Multi-turn Discussions
 
 This module provides conversation persistence and context reconstruction for
 stateless MCP (Model Context Protocol) environments. It enables multi-turn
-conversations between Claude and Gemini by storing conversation state in Redis
+conversations between Claude and Gemini by storing conversation state in memory
 across independent request cycles.
 
+CRITICAL ARCHITECTURAL REQUIREMENT:
+This conversation memory system is designed for PERSISTENT MCP SERVER PROCESSES.
+It uses in-memory storage that persists only within a single Python process.
+
+⚠️ IMPORTANT: This system will NOT work correctly if MCP tool calls are made
+as separate subprocess invocations (each subprocess starts with empty memory).
+
+WORKING SCENARIO: Claude Desktop with persistent MCP server process
+FAILING SCENARIO: Simulator tests calling server.py as individual subprocesses
+
+Root cause of test failures: Each subprocess call loses the conversation
+state from previous calls because memory is process-specific, not shared
+across subprocess boundaries.
+
 ARCHITECTURE OVERVIEW:
 The MCP protocol is inherently stateless - each tool request is independent
 with no memory of previous interactions. This module bridges that gap by:
 
 1. Creating persistent conversation threads with unique UUIDs
-2. Storing complete conversation context (turns, files, metadata) in Redis
+2. Storing complete conversation context (turns, files, metadata) in memory
 3. Reconstructing conversation history when tools are called with continuation_id
 4. Supporting cross-tool continuation - seamlessly switch between different tools
    while maintaining full conversation context and file references
@@ -35,9 +49,9 @@ Key Features:
   most recent file context is preserved when token limits require exclusions.
 - Automatic turn limiting (20 turns max) to prevent runaway conversations
 - Context reconstruction for stateless request continuity
-- Redis-based persistence with automatic expiration (3 hour TTL)
+- In-memory persistence with automatic expiration (3 hour TTL)
 - Thread-safe operations for concurrent access
-- Graceful degradation when Redis is unavailable
+- Graceful degradation when storage is unavailable
 
 DUAL PRIORITIZATION STRATEGY (Files & Conversations):
 The conversation memory system implements sophisticated prioritization for both files and
@@ -187,26 +201,16 @@ class ThreadContext(BaseModel):
     initial_context: dict[str, Any]  # Original request parameters
 
 
-def get_redis_client():
+def get_storage():
     """
-    Get Redis client from environment configuration
-
-    Creates a Redis client using the REDIS_URL environment variable.
-    Defaults to localhost:6379/0 if not specified.
+    Get in-memory storage backend for conversation persistence.
 
     Returns:
-        redis.Redis: Configured Redis client with decode_responses=True
-
-    Raises:
-        ValueError: If redis package is not installed
+        InMemoryStorage: Thread-safe in-memory storage backend
     """
-    try:
-        import redis
+    from .storage_backend import get_storage_backend
 
-        redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
-        return redis.from_url(redis_url, decode_responses=True)
-    except ImportError:
-        raise ValueError("redis package required. Install with: pip install redis")
+    return get_storage_backend()
 
 
 def create_thread(tool_name: str, initial_request: dict[str, Any], parent_thread_id: Optional[str] = None) -> str:
@@ -251,10 +255,10 @@ def create_thread(tool_name: str, initial_request: dict[str, Any], parent_thread
         initial_context=filtered_context,
     )
 
-    # Store in Redis with configurable TTL to prevent indefinite accumulation
-    client = get_redis_client()
+    # Store in memory with configurable TTL to prevent indefinite accumulation
+    storage = get_storage()
     key = f"thread:{thread_id}"
-    client.setex(key, CONVERSATION_TIMEOUT_SECONDS, context.model_dump_json())
+    storage.setex(key, CONVERSATION_TIMEOUT_SECONDS, context.model_dump_json())
 
     logger.debug(f"[THREAD] Created new thread {thread_id} with parent {parent_thread_id}")
@@ -263,7 +267,7 @@ def create_thread(tool_name: str, initial_request: dict[str, Any], parent_thread
 
 def get_thread(thread_id: str) -> Optional[ThreadContext]:
     """
-    Retrieve thread context from Redis
+    Retrieve thread context from in-memory storage
 
     Fetches complete conversation context for cross-tool continuation.
     This is the core function that enables tools to access conversation
@@ -278,22 +282,22 @@ def get_thread(thread_id: str) -> Optional[ThreadContext]:
 
     Security:
     - Validates UUID format to prevent injection attacks
-    - Handles Redis connection failures gracefully
+    - Handles storage connection failures gracefully
    - No error information leakage on failure
     """
     if not thread_id or not _is_valid_uuid(thread_id):
         return None
 
     try:
-        client = get_redis_client()
+        storage = get_storage()
         key = f"thread:{thread_id}"
-        data = client.get(key)
+        data = storage.get(key)
 
         if data:
             return ThreadContext.model_validate_json(data)
         return None
 
     except Exception:
-        # Silently handle errors to avoid exposing Redis details
+        # Silently handle errors to avoid exposing storage details
         return None
@@ -313,8 +317,7 @@ def add_turn(
 
     Appends a new conversation turn to an existing thread. This is the core
     function for building conversation history and enabling cross-tool
-    continuation. Each turn preserves the tool and model that generated it,
-    and tracks file reception order using atomic Redis counters.
+    continuation. Each turn preserves the tool and model that generated it.
 
     Args:
         thread_id: UUID of the conversation thread
@@ -333,7 +336,7 @@ def add_turn(
     Failure cases:
     - Thread doesn't exist or expired
     - Maximum turn limit reached
-    - Redis connection failure
+    - Storage connection failure
 
     Note:
     - Refreshes thread TTL to configured timeout on successful update
@@ -370,14 +373,14 @@ def add_turn(
     context.turns.append(turn)
     context.last_updated_at = datetime.now(timezone.utc).isoformat()
 
-    # Save back to Redis and refresh TTL
+    # Save back to storage and refresh TTL
     try:
-        client = get_redis_client()
+        storage = get_storage()
         key = f"thread:{thread_id}"
-        client.setex(key, CONVERSATION_TIMEOUT_SECONDS, context.model_dump_json())  # Refresh TTL to configured timeout
+        storage.setex(key, CONVERSATION_TIMEOUT_SECONDS, context.model_dump_json())  # Refresh TTL to configured timeout
         return True
     except Exception as e:
-        logger.debug(f"[FLOW] Failed to save turn to Redis: {type(e).__name__}")
+        logger.debug(f"[FLOW] Failed to save turn to storage: {type(e).__name__}")
         return False
@@ -591,11 +594,9 @@ def _plan_file_inclusion_by_size(all_files: list[str], max_file_tokens: int) ->
     for file_path in all_files:
         try:
-            from utils.file_utils import estimate_file_tokens, translate_path_for_environment
+            from utils.file_utils import estimate_file_tokens
 
-            translated_path = translate_path_for_environment(file_path)
-
-            if os.path.exists(translated_path) and os.path.isfile(translated_path):
+            if os.path.exists(file_path) and os.path.isfile(file_path):
                 # Use centralized token estimation for consistency
                 estimated_tokens = estimate_file_tokens(file_path)
@@ -613,7 +614,7 @@ def _plan_file_inclusion_by_size(all_files: list[str], max_file_tokens: int) ->
             else:
                 files_to_skip.append(file_path)
                 # More descriptive message for missing files
-                if not os.path.exists(translated_path):
+                if not os.path.exists(file_path):
                     logger.debug(
                         f"[FILES] Skipping {file_path} - file no longer exists (may have been moved/deleted since conversation)"
                     )
@@ -724,7 +725,7 @@ def build_conversation_history(context: ThreadContext, model_context=None, read_
     Performance Characteristics:
     - O(n) file collection with newest-first prioritization
     - Intelligent token budgeting prevents context window overflow
-    - Redis-based persistence with automatic TTL management
+    - In-memory persistence with automatic TTL management
     - Graceful degradation when files are inaccessible or too large
     """
     # Get the complete thread chain
@@ -851,10 +852,7 @@ def build_conversation_history(context: ThreadContext, model_context=None, read_
             except Exception as e:
                 # More descriptive error handling for missing files
                 try:
-                    from utils.file_utils import translate_path_for_environment
-
-                    translated_path = translate_path_for_environment(file_path)
-                    if not os.path.exists(translated_path):
+                    if not os.path.exists(file_path):
                         logger.info(
                             f"File no longer accessible for conversation history: {file_path} - file was moved/deleted since conversation (marking as excluded)"
                         )
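
Taken together, the diff leaves the module's public API unchanged apart from the storage swap. A minimal sketch of the call flow a tool handler might follow, assuming the module lives at `utils.conversation_memory` and that `add_turn` accepts `role`/`content` keyword arguments (only part of its signature is visible in the diff above):

```python
# Hypothetical round-trip through the conversation memory API. create_thread,
# get_thread, and ThreadContext.turns appear in the diff; the add_turn
# arguments beyond thread_id are assumptions for illustration.
from utils.conversation_memory import add_turn, create_thread, get_thread

# First tool call: start a new conversation thread
thread_id = create_thread("chat", {"prompt": "Review this design"})

# Record the assistant's reply; returns False if the thread expired
# or the maximum turn limit was reached
ok = add_turn(thread_id, role="assistant", content="Looks reasonable overall.")

# Later, stateless request: a continuation_id arrives and the full
# context is reconstructed from in-memory storage
context = get_thread(thread_id)
if context is not None:
    print(f"{len(context.turns)} turns so far")
```

Note that this round-trip only succeeds when all three calls happen inside the same persistent server process, which is exactly why the simulation tests that spawned `server.py` as separate subprocesses had to be reworked.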