- Remove unused cassette files with incomplete recordings - Delete broken respx test files (test_o3_pro_respx_simple.py, test_o3_pro_http_recording.py) - Fix respx references in docstrings to mention HTTP transport recorder - Simplify vcr-testing.md documentation (60% reduction, more task-oriented) - Add simplified PR template with better test instructions - Fix cassette path consistency in examples - Add security note about reviewing cassettes before committing 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
4.4 KiB
PR Title Format
fix: Fix o3-pro empty response issue by using output_text field
Description
This PR fixes a critical bug where o3-pro API calls were returning empty responses. The root cause was incorrect response parsing - the code was trying to manually parse response.output.content[] array structure, but o3-pro provides a simpler output_text convenience field directly on the response object. This PR also introduces a secure HTTP recording system for testing expensive o3-pro calls.
Changes Made
- Fixed o3-pro response parsing by using the
output_textconvenience field instead of manual parsing - Added
_safe_extract_output_textmethod with proper validation to handle o3-pro's response format - Implemented custom HTTP transport recorder to replace respx for more reliable test recordings
- Added comprehensive PII sanitization to prevent accidental API key exposure in test cassettes
- Sanitized all existing test cassettes to remove any exposed secrets
- Updated documentation for the new testing infrastructure
- Added test suite to validate the fix and ensure PII sanitization works correctly
No breaking changes - The fix only affects o3-pro model parsing internally.
Dependencies added:
- None (uses existing httpx and standard library modules)
Testing
Run all linting and tests (required):
# Activate virtual environment first
source venv/bin/activate
# Run comprehensive code quality checks (recommended)
./code_quality_checks.sh
# If you made tool changes, also run simulator tests
python communication_simulator_test.py
- All linting passes (ruff, black, isort)
- All unit tests pass
- For bug fixes: Tests added to prevent regression
test_o3_pro_output_text_fix.py- Validates o3-pro response parsing works correctlytest_o3_pro_http_recording.py- Tests HTTP recording functionalitytest_pii_sanitizer.py- Ensures PII sanitization works properly
- Manual testing completed with realistic scenarios
- Verified o3-pro calls return actual content instead of empty responses
- Validated that recorded cassettes contain no exposed API keys
Related Issues
Fixes o3-pro API calls returning empty responses on master branch.
Checklist
- PR title follows the format guidelines above
- Activated venv and ran code quality checks:
source venv/bin/activate && ./code_quality_checks.sh - Self-review completed
- Tests added for ALL changes (see Testing section above)
- Documentation updated as needed
- Updated
docs/testing.mdwith new testing approach - Added
docs/vcr-testing.mdfor HTTP recording documentation
- Updated
- All unit tests passing
- Ready for review
Additional Notes
The Bug:
On master branch, o3-pro API calls were returning empty responses because the code was trying to parse the response incorrectly:
# Master branch - incorrect parsing
if hasattr(response.output, "content") and response.output.content:
for content_item in response.output.content:
if hasattr(content_item, "type") and content_item.type == "output_text":
content = content_item.text
break
The o3-pro response object actually provides an output_text convenience field directly:
# Fixed version - correct parsing
content = response.output_text
The Fix:
- Added
_safe_extract_output_textmethod that properly validates and extracts theoutput_textfield - Updated the response parsing logic in
_generate_with_responses_endpointto use this new method - Added proper error handling and validation to catch future response format issues
Additional Improvements:
- Testing Infrastructure: Implemented HTTP transport recorder to enable testing without repeated expensive API calls
- Security: Added automatic PII sanitization to prevent API keys from being accidentally committed in test recordings
Development Notes:
- During development, we encountered timeout issues with the initial respx-based approach which led to implementing the custom HTTP transport recorder
- The transport recorder solution properly handles streaming responses and gzip compression
For Reviewers:
- The core fix is in
providers/openai_compatible.pylines 307-335 and line 396 - The HTTP transport recorder is test infrastructure only and doesn't affect production code
- All test cassettes have been sanitized and verified to contain no secrets