Add DocGen tool with comprehensive documentation generation capabilities (#109)
* WIP: new workflow architecture
* WIP: further improvements and cleanup
* WIP: cleanup and docs, replace old tool with new
* WIP: new planner implementation using workflow
* WIP: precommit tool working as a workflow instead of a basic tool. Support for passing False to use_assistant_model to skip external models completely and use Claude only (see the sketch after this list)
* WIP: precommit workflow version swapped with old
* WIP: codereview replaced with workflow version
* WIP: refactor replaced with workflow version
* WIP: workflow for thinkdeep
* WIP: ensure files get embedded correctly
* WIP: thinkdeep replaced with workflow version
* WIP: improved messaging when an external model's response is received
* WIP: analyze tool swapped
* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only
* WIP: fixed missing param in get_completion_next_steps_message
* Fixed tests; request files consistently
* New testgen workflow tool; updated docs
* Swap testgen workflow
* Fix CI test failures by excluding API-dependent tests:
  - Update GitHub Actions workflow to exclude simulation tests that require API keys
  - Fix collaboration tests to properly mock workflow tool expert analysis calls
  - Update test assertions to handle the new workflow tool response format
  - Ensure unit tests run without external API dependencies in CI
  🤖 Generated with [Claude Code](https://claude.ai/code)
  Co-Authored-By: Claude <noreply@anthropic.com>
* WIP: update tests to match new tools
* Should help with https://github.com/BeehiveInnovations/zen-mcp-server/issues/97; clear Python cache when running the script (https://github.com/BeehiveInnovations/zen-mcp-server/issues/96); improved retry error logging; cleanup
* WIP: chat tool using new architecture and improved code sharing
* Removed todo
* Cleanup old name
* Tweak wordings; migrate old tests
* Support for Flash 2.0 and Flash Lite 2.0; fixed test
* Improved consensus to use the workflow base class
* Allow images
* Replaced old consensus tool
* Cleanup tests
* Tests for prompt size
* New tool: docgen. Fixes https://github.com/BeehiveInnovations/zen-mcp-server/issues/107; use available token size limits (https://github.com/BeehiveInnovations/zen-mcp-server/issues/105)
* Improved docgen prompt; exclude TestGen from pytest inclusion
* Updated errors
* Lint
* DocGen instructed not to fix bugs but to surface them and stick to documentation
* WIP
* Stop Claude from being lazy and only documenting a small handful
* More style rules

---------

Co-authored-by: Claude <noreply@anthropic.com>
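One item above notes that workflow tools can take `use_assistant_model=False` to skip external models entirely and rely on Claude alone. The snippet below is a minimal, hypothetical sketch of what such a call could look like, reusing the `ChatTool().execute(...)` call shape that appears in this commit's tests; the exact placement and handling of the flag are assumptions, not confirmed API.

import asyncio

from tools.chat import ChatTool  # tool class exercised in this commit's tests


async def main():
    tool = ChatTool()
    # Hypothetical request: assumes "use_assistant_model" is accepted alongside the
    # other tool arguments and that False means "Claude only, no external model".
    result = await tool.execute(
        {
            "prompt": "Summarize the new workflow architecture",
            "use_assistant_model": False,
        }
    )
    print(result[0].text)


asyncio.run(main())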
Committed by GitHub · parent 0655590a51 · commit c960bcb720
@@ -1,163 +1,191 @@
"""
Regression tests to ensure normal prompt handling still works after large prompt changes.
Integration tests to ensure normal prompt handling works with real API calls.

This test module verifies that all tools continue to work correctly with
normal-sized prompts after implementing the large prompt handling feature.
normal-sized prompts using real integration testing instead of mocks.

INTEGRATION TESTS:
These tests are marked with @pytest.mark.integration and make real API calls.
They use the local-llama model which is FREE and runs locally via Ollama.

Prerequisites:
- Ollama installed and running locally
- CUSTOM_API_URL environment variable set to your Ollama endpoint (e.g., http://localhost:11434)
- local-llama model available through custom provider configuration
- No API keys required - completely FREE to run unlimited times!

Running Tests:
- All tests (including integration): pytest tests/test_prompt_regression.py
- Unit tests only: pytest tests/test_prompt_regression.py -m "not integration"
- Integration tests only: pytest tests/test_prompt_regression.py -m "integration"

Note: Integration tests skip gracefully if CUSTOM_API_URL is not set.
They are excluded from CI/CD but run by default locally when Ollama is configured.
"""
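The note above says these integration tests skip gracefully when CUSTOM_API_URL is unset and are excluded from CI/CD. A minimal conftest.py sketch of how that gating could be wired is shown below; the repository's actual pytest configuration is not part of this diff, so the file name and hook choices here are assumptions.

# conftest.py (illustrative sketch, not part of this diff)
import os

import pytest


def pytest_configure(config):
    # Register the marker so `-m integration` / `-m "not integration"` filtering works cleanly.
    config.addinivalue_line("markers", "integration: tests that call a real local model")


def pytest_collection_modifyitems(config, items):
    # When no local endpoint is configured (e.g. in CI), skip integration tests.
    if os.getenv("CUSTOM_API_URL"):
        return
    skip_marker = pytest.mark.skip(reason="CUSTOM_API_URL not set; skipping integration tests")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip_marker)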
import json
from unittest.mock import MagicMock, patch
import os
import tempfile

import pytest

# Load environment variables from .env file
from dotenv import load_dotenv

from tools.analyze import AnalyzeTool
from tools.chat import ChatTool
from tools.codereview import CodeReviewTool

# from tools.debug import DebugIssueTool  # Commented out - debug tool refactored
from tools.thinkdeep import ThinkDeepTool

load_dotenv()


class TestPromptRegression:
    """Regression test suite for normal prompt handling."""

    # Check if CUSTOM_API_URL is available for local-llama
    CUSTOM_API_AVAILABLE = os.getenv("CUSTOM_API_URL") is not None

    @pytest.fixture
    def mock_model_response(self):
        """Create a mock model response."""
        from unittest.mock import Mock

        def _create_response(text="Test response"):
            # Return a Mock that acts like ModelResponse
            return Mock(
                content=text,
                usage={"input_tokens": 10, "output_tokens": 20, "total_tokens": 30},
                model_name="gemini-2.5-flash",
                metadata={"finish_reason": "STOP"},
            )


def skip_if_no_custom_api():
    """Helper to skip integration tests if CUSTOM_API_URL is not available."""
    if not CUSTOM_API_AVAILABLE:
        pytest.skip(
            "CUSTOM_API_URL not set. To run integration tests with local-llama, ensure CUSTOM_API_URL is set in .env file (e.g., http://localhost:11434/v1)"
        )

        return _create_response
class TestPromptIntegration:
    """Integration test suite for normal prompt handling with real API calls."""

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_chat_normal_prompt(self, mock_model_response):
        """Test chat tool with normal prompt."""
    async def test_chat_normal_prompt(self):
        """Test chat tool with normal prompt using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response(
                "This is a helpful response about Python."
            )
            mock_get_provider.return_value = mock_provider
        result = await tool.execute(
            {
                "prompt": "Explain Python decorators in one sentence",
                "model": "local-llama",  # Use available model for integration tests
            }
        )

            result = await tool.execute({"prompt": "Explain Python decorators"})
        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] in ["success", "continuation_available"]
        assert "content" in output
        assert len(output["content"]) > 0

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_chat_with_files(self):
        """Test chat tool with files parameter using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        # Create a temporary Python file for testing
        with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
            f.write(
                """
def hello_world():
    \"\"\"A simple hello world function.\"\"\"
    return "Hello, World!"

if __name__ == "__main__":
    print(hello_world())
"""
            )
            temp_file = f.name

        try:
            result = await tool.execute(
                {"prompt": "What does this Python code do?", "files": [temp_file], "model": "local-llama"}
            )

            assert len(result) == 1
            output = json.loads(result[0].text)
            assert output["status"] == "success"
            assert "helpful response about Python" in output["content"]

            # Verify provider was called
            mock_provider.generate_content.assert_called_once()
            assert output["status"] in ["success", "continuation_available"]
            assert "content" in output
            # Should mention the hello world function
            assert "hello" in output["content"].lower() or "function" in output["content"].lower()
        finally:
            # Clean up temp file
            os.unlink(temp_file)
    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_chat_with_files(self, mock_model_response):
        """Test chat tool with files parameter."""
        tool = ChatTool()
    async def test_thinkdeep_normal_analysis(self):
        """Test thinkdeep tool with normal analysis using real API."""
        skip_if_no_custom_api()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider

            # Mock file reading through the centralized method
            with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare_files:
                mock_prepare_files.return_value = ("File content here", ["/path/to/file.py"])

                result = await tool.execute({"prompt": "Analyze this code", "files": ["/path/to/file.py"]})

                assert len(result) == 1
                output = json.loads(result[0].text)
                assert output["status"] == "success"
                mock_prepare_files.assert_called_once_with(["/path/to/file.py"], None, "Context files")

    @pytest.mark.asyncio
    async def test_thinkdeep_normal_analysis(self, mock_model_response):
        """Test thinkdeep tool with normal analysis."""
        tool = ThinkDeepTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response(
                "Here's a deeper analysis with edge cases..."
            )
            mock_get_provider.return_value = mock_provider
        result = await tool.execute(
            {
                "step": "I think we should use a cache for performance",
                "step_number": 1,
                "total_steps": 1,
                "next_step_required": False,
                "findings": "Building a high-traffic API - considering scalability and reliability",
                "problem_context": "Building a high-traffic API",
                "focus_areas": ["scalability", "reliability"],
                "model": "local-llama",
            }
        )

        assert len(result) == 1
        output = json.loads(result[0].text)
        # ThinkDeep workflow tool should process the analysis
        assert "status" in output
        assert output["status"] in ["calling_expert_analysis", "analysis_complete", "pause_for_investigation"]

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_codereview_normal_review(self):
        """Test codereview tool with workflow inputs using real API."""
        skip_if_no_custom_api()

        tool = CodeReviewTool()

        # Create a temporary Python file for testing
        with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
            f.write(
                """
def process_user_input(user_input):
    # Potentially unsafe code for demonstration
    query = f"SELECT * FROM users WHERE name = '{user_input}'"
    return query

def main():
    user_name = input("Enter name: ")
    result = process_user_input(user_name)
    print(result)
"""
            )
            temp_file = f.name

        try:
            result = await tool.execute(
                {
                    "step": "I think we should use a cache for performance",
                    "step": "Initial code review investigation - examining security vulnerabilities",
                    "step_number": 1,
                    "total_steps": 1,
                    "next_step_required": False,
                    "findings": "Building a high-traffic API - considering scalability and reliability",
                    "problem_context": "Building a high-traffic API",
                    "focus_areas": ["scalability", "reliability"],
                    "total_steps": 2,
                    "next_step_required": True,
                    "findings": "Found security issues in code",
                    "relevant_files": [temp_file],
                    "review_type": "security",
                    "focus_on": "Look for SQL injection vulnerabilities",
                    "model": "local-llama",
                }
            )

            assert len(result) == 1
            output = json.loads(result[0].text)
            # ThinkDeep workflow tool returns calling_expert_analysis status when complete
            assert output["status"] == "calling_expert_analysis"
            # Check that expert analysis was performed and contains expected content
            if "expert_analysis" in output:
                expert_analysis = output["expert_analysis"]
                analysis_content = str(expert_analysis)
                assert (
                    "Critical Evaluation Required" in analysis_content
                    or "deeper analysis" in analysis_content
                    or "cache" in analysis_content
                )

    @pytest.mark.asyncio
    async def test_codereview_normal_review(self, mock_model_response):
        """Test codereview tool with workflow inputs."""
        tool = CodeReviewTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response(
                "Found 3 issues: 1) Missing error handling..."
            )
            mock_get_provider.return_value = mock_provider

            # Mock file reading
            with patch("tools.base.read_files") as mock_read_files:
                mock_read_files.return_value = "def main(): pass"

                result = await tool.execute(
                    {
                        "step": "Initial code review investigation - examining security vulnerabilities",
                        "step_number": 1,
                        "total_steps": 2,
                        "next_step_required": True,
                        "findings": "Found security issues in code",
                        "relevant_files": ["/path/to/code.py"],
                        "review_type": "security",
                        "focus_on": "Look for SQL injection vulnerabilities",
                    }
                )

                assert len(result) == 1
                output = json.loads(result[0].text)
                assert output["status"] == "pause_for_code_review"
                assert "status" in output
                assert output["status"] in ["pause_for_code_review", "calling_expert_analysis"]
        finally:
            # Clean up temp file
            os.unlink(temp_file)

    # NOTE: Precommit test has been removed because the precommit tool has been
    # refactored to use a workflow-based pattern instead of accepting simple prompt/path fields.
@@ -193,164 +221,196 @@ class TestPromptRegression:
    #
    # assert len(result) == 1
    # output = json.loads(result[0].text)
    # assert output["status"] == "success"
    # assert output["status"] in ["success", "continuation_available"]
    # assert "Next Steps:" in output["content"]
    # assert "Root cause" in output["content"]

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_analyze_normal_question(self, mock_model_response):
        """Test analyze tool with normal question."""
    async def test_analyze_normal_question(self):
        """Test analyze tool with normal question using real API."""
        skip_if_no_custom_api()

        tool = AnalyzeTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response(
                "The code follows MVC pattern with clear separation..."
        # Create a temporary Python file demonstrating MVC pattern
        with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
            f.write(
                """
# Model
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

# View
class UserView:
    def display_user(self, user):
        return f"User: {user.name} ({user.email})"

# Controller
class UserController:
    def __init__(self, model, view):
        self.model = model
        self.view = view

    def get_user_display(self):
        return self.view.display_user(self.model)
"""
            )
            mock_get_provider.return_value = mock_provider
            temp_file = f.name

            # Mock file reading
            with patch("tools.base.read_files") as mock_read_files:
                mock_read_files.return_value = "class UserController: ..."

                result = await tool.execute(
                    {
                        "step": "What design patterns are used in this codebase?",
                        "step_number": 1,
                        "total_steps": 1,
                        "next_step_required": False,
                        "findings": "Initial architectural analysis",
                        "relevant_files": ["/path/to/project"],
                        "analysis_type": "architecture",
                    }
                )

                assert len(result) == 1
                output = json.loads(result[0].text)
                # Workflow analyze tool returns "calling_expert_analysis" for step 1
                assert output["status"] == "calling_expert_analysis"
                assert "step_number" in output

    @pytest.mark.asyncio
    async def test_empty_optional_fields(self, mock_model_response):
        """Test tools work with empty optional fields."""
        tool = ChatTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider

            # Test with no files parameter
            result = await tool.execute({"prompt": "Hello"})
        try:
            result = await tool.execute(
                {
                    "step": "What design patterns are used in this codebase?",
                    "step_number": 1,
                    "total_steps": 1,
                    "next_step_required": False,
                    "findings": "Initial architectural analysis",
                    "relevant_files": [temp_file],
                    "analysis_type": "architecture",
                    "model": "local-llama",
                }
            )

            assert len(result) == 1
            output = json.loads(result[0].text)
            assert output["status"] == "success"
            assert "status" in output
            # Workflow analyze tool should process the analysis
            assert output["status"] in ["calling_expert_analysis", "pause_for_investigation"]
        finally:
            # Clean up temp file
            os.unlink(temp_file)
    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_thinking_modes_work(self, mock_model_response):
        """Test that thinking modes are properly passed through."""
    async def test_empty_optional_fields(self):
        """Test tools work with empty optional fields using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider
        # Test with no files parameter
        result = await tool.execute({"prompt": "Hello", "model": "local-llama"})

            result = await tool.execute({"prompt": "Test", "thinking_mode": "high", "temperature": 0.8})

        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] == "success"

            # Verify generate_content was called with correct parameters
            mock_provider.generate_content.assert_called_once()
            call_kwargs = mock_provider.generate_content.call_args[1]
            assert call_kwargs.get("temperature") == 0.8
            # thinking_mode would be passed if the provider supports it
            # In this test, we set supports_thinking_mode to False, so it won't be passed
        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] in ["success", "continuation_available"]
        assert "content" in output

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_special_characters_in_prompts(self, mock_model_response):
        """Test prompts with special characters work correctly."""
    async def test_thinking_modes_work(self):
        """Test that thinking modes are properly passed through using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider
        result = await tool.execute(
            {
                "prompt": "Explain quantum computing briefly",
                "thinking_mode": "low",
                "temperature": 0.8,
                "model": "local-llama",
            }
        )

            special_prompt = 'Test with "quotes" and\nnewlines\tand tabs'
            result = await tool.execute({"prompt": special_prompt})

        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] == "success"
        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] in ["success", "continuation_available"]
        assert "content" in output
        # Should contain some quantum-related content
        assert "quantum" in output["content"].lower() or "computing" in output["content"].lower()

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_mixed_file_paths(self, mock_model_response):
        """Test handling of various file path formats."""
    async def test_special_characters_in_prompts(self):
        """Test prompts with special characters work correctly using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        special_prompt = (
            'Test with "quotes" and\nnewlines\tand tabs. Please just respond with the number that is the answer to 1+1.'
        )
        result = await tool.execute({"prompt": special_prompt, "model": "local-llama"})

        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] in ["success", "continuation_available"]
        assert "content" in output
        # Should handle the special characters without crashing - the exact content doesn't matter as much as not failing
        assert len(output["content"]) > 0

    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_mixed_file_paths(self):
        """Test handling of various file path formats using real API."""
        skip_if_no_custom_api()

        tool = AnalyzeTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider
        # Create multiple temporary files to test different path formats
        temp_files = []
        try:
            # Create first file
            with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
                f.write("def function_one(): pass")
                temp_files.append(f.name)

            with patch("utils.file_utils.read_files") as mock_read_files:
                mock_read_files.return_value = "Content"
            # Create second file
            with tempfile.NamedTemporaryFile(mode="w", suffix=".js", delete=False) as f:
                f.write("function functionTwo() { return 'hello'; }")
                temp_files.append(f.name)

                result = await tool.execute(
                    {
                        "step": "Analyze these files",
                        "step_number": 1,
                        "total_steps": 1,
                        "next_step_required": False,
                        "findings": "Initial file analysis",
                        "relevant_files": [
                            "/absolute/path/file.py",
                            "/Users/name/project/src/",
                            "/home/user/code.js",
                        ],
                    }
                )

                assert len(result) == 1
                output = json.loads(result[0].text)
                # Analyze workflow tool returns calling_expert_analysis status when complete
                assert output["status"] == "calling_expert_analysis"
                mock_read_files.assert_called_once()

    @pytest.mark.asyncio
    async def test_unicode_content(self, mock_model_response):
        """Test handling of unicode content in prompts."""
        tool = ChatTool()

        with patch.object(tool, "get_model_provider") as mock_get_provider:
            mock_provider = MagicMock()
            mock_provider.get_provider_type.return_value = MagicMock(value="google")
            mock_provider.supports_thinking_mode.return_value = False
            mock_provider.generate_content.return_value = mock_model_response()
            mock_get_provider.return_value = mock_provider

            unicode_prompt = "Explain this: 你好世界 مرحبا بالعالم"
            result = await tool.execute({"prompt": unicode_prompt})
            result = await tool.execute(
                {
                    "step": "Analyze these files",
                    "step_number": 1,
                    "total_steps": 1,
                    "next_step_required": False,
                    "findings": "Initial file analysis",
                    "relevant_files": temp_files,
                    "model": "local-llama",
                }
            )

            assert len(result) == 1
            output = json.loads(result[0].text)
            assert output["status"] == "success"
            assert "status" in output
            # Should process the files
            assert output["status"] in [
                "calling_expert_analysis",
                "pause_for_investigation",
                "files_required_to_continue",
            ]
        finally:
            # Clean up temp files
            for temp_file in temp_files:
                if os.path.exists(temp_file):
                    os.unlink(temp_file)
    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_unicode_content(self):
        """Test handling of unicode content in prompts using real API."""
        skip_if_no_custom_api()

        tool = ChatTool()

        unicode_prompt = "Explain what these mean: 你好世界 (Chinese) and مرحبا بالعالم (Arabic)"
        result = await tool.execute({"prompt": unicode_prompt, "model": "local-llama"})

        assert len(result) == 1
        output = json.loads(result[0].text)
        assert output["status"] in ["success", "continuation_available"]
        assert "content" in output
        # Should mention hello or world or greeting in some form
        content_lower = output["content"].lower()
        assert "hello" in content_lower or "world" in content_lower or "greeting" in content_lower


if __name__ == "__main__":
    pytest.main([__file__, "-v"])
    # Run integration tests by default when called directly
    pytest.main([__file__, "-v", "-m", "integration"])