* Fix model metadata preservation when using continuation_id

  When continuing a conversation without specifying a model, the system now correctly retrieves and uses the model from the previous assistant turn instead of defaulting to DEFAULT_MODEL. This ensures model continuity across conversation turns and fixes the metadata mismatch issue.

  The fix:
  - In reconstruct_thread_context(), check for previous assistant turns
  - If no model is specified in the continuation request, use the model from the most recent assistant turn
  - This preserves the model choice across conversation continuations

  Added comprehensive tests to verify the fix handles:
  - Single-turn conversations
  - Multiple turns with different models
  - No previous assistant turns (falls back to DEFAULT_MODEL)
  - Explicit model specification (overrides the previous turn)
  - Thread chain relationships

  Fixes the issue where continuation metadata would incorrectly report 'llama3.2' instead of the actual model used (e.g., 'deepseek-r1-8b').

* Update test to reference issue #111

* Refactor tests to call reconstruct_thread_context directly

  Address Gemini Code Assist feedback by removing duplicated implementation logic from tests. Tests now call the actual function with proper mocking instead of reimplementing the model-retrieval logic. This improves maintainability and ensures tests validate actual behavior rather than their own copy of the logic.
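As a rough illustration of the test approach described above, a continuation test can patch the thread lookup and assert that the mutated arguments carry the previous turn's model. This is a minimal sketch, not the repository's actual test: the get_thread name, the turn attributes beyond role/model_name, and any further collaborators of reconstruct_thread_context (history building, model context creation) are assumptions that may need adjusting or additional patching against the real module.

    from unittest.mock import MagicMock, patch

    import pytest

    import server  # module under test


    @pytest.mark.asyncio  # requires pytest-asyncio
    async def test_continuation_reuses_previous_assistant_model():
        # Stand-in thread context: one assistant turn recorded with the
        # model that actually answered (not DEFAULT_MODEL).
        assistant_turn = MagicMock(role="assistant", model_name="deepseek-r1-8b")
        thread = MagicMock(turns=[MagicMock(role="user"), assistant_turn])

        # "get_thread" is an assumed name for the thread-lookup helper;
        # patch whatever reconstruct_thread_context really calls here.
        with patch.object(server, "get_thread", return_value=thread):
            arguments = {"continuation_id": "abc-123"}  # note: no "model" key
            await server.reconstruct_thread_context(arguments)

        # The fix mutates `arguments`, so downstream model resolution sees
        # the model from the most recent assistant turn.
        assert arguments["model"] == "deepseek-r1-8b"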
server.py | 10 ++++++++++
@@ -856,6 +856,16 @@ async def reconstruct_thread_context(arguments: dict[str, Any]) -> dict[str, Any
     # Create model context early to use for history building
     from utils.model_context import ModelContext
+
+    # Check if we should use the model from the previous conversation turn
+    model_from_args = arguments.get("model")
+    if not model_from_args and context.turns:
+        # Find the last assistant turn to get the model used
+        for turn in reversed(context.turns):
+            if turn.role == "assistant" and turn.model_name:
+                arguments["model"] = turn.model_name
+                logger.debug(f"[CONVERSATION_DEBUG] Using model from previous turn: {turn.model_name}")
+                break

     model_context = ModelContext.from_arguments(arguments)

     # Build conversation history with model-specific limits
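Distilled from the hunk above, the effective model-resolution precedence is: an explicit model argument, then the most recent assistant turn's model, then DEFAULT_MODEL (applied later by ModelContext.from_arguments when no model is set). A self-contained sketch of that precedence, using a hypothetical Turn stand-in and an illustrative DEFAULT_MODEL value rather than the project's real types and config:

    from dataclasses import dataclass

    DEFAULT_MODEL = "llama3.2"  # stand-in; the real value comes from config


    @dataclass
    class Turn:
        # Minimal stand-in for the project's conversation-turn type.
        role: str
        model_name: str | None = None


    def resolve_model(arguments: dict, turns: list[Turn]) -> str:
        """Precedence: explicit argument > last assistant turn > DEFAULT_MODEL."""
        if arguments.get("model"):
            return arguments["model"]
        for turn in reversed(turns):
            if turn.role == "assistant" and turn.model_name:
                return turn.model_name
        return DEFAULT_MODEL


    # A continuation without an explicit model picks up the previous
    # assistant turn's model instead of silently reverting to the default.
    turns = [Turn("user"), Turn("assistant", "deepseek-r1-8b")]
    assert resolve_model({}, turns) == "deepseek-r1-8b"
    assert resolve_model({"model": "o3"}, turns) == "o3"
    assert resolve_model({}, []) == DEFAULT_MODEL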