- Fix o3-pro response parsing to use output_text convenience field
- Replace respx with custom httpx transport solution for better reliability
- Implement comprehensive PII sanitization to prevent secret exposure
- Add HTTP request/response recording with cassette format for testing
- Sanitize all existing cassettes to remove exposed API keys
- Update documentation to reflect new HTTP transport recorder
- Add test suite for PII sanitization and HTTP recording

This change:
1. Fixes timeout issues with o3-pro API calls (was 2+ minutes, now ~15-22 seconds)
2. Properly captures response content without httpx.ResponseNotRead exceptions
3. Preserves original HTTP response format including gzip compression
4. Prevents future secret exposure with automatic PII sanitization
5. Enables reliable replay testing for o3-pro interactions

Co-Authored-By: Claude <noreply@anthropic.com>
HTTP Recording/Replay Testing with HTTP Transport Recorder
This project uses a custom HTTP Transport Recorder for testing expensive API integrations (like o3-pro) with real recorded responses.
What is the HTTP Transport Recorder?
The HTTP Transport Recorder is a custom httpx transport implementation that intercepts HTTP requests/responses at the transport layer. This approach provides:
- Real API structure: Tests use actual API responses, not guessed mocks
- Cost efficiency: Only pay for API calls once during recording
- Deterministic tests: Same response every time, no API variability
- Transport-level interception: Works seamlessly with httpx and OpenAI SDK
- Full response capture: Captures complete HTTP responses including headers and gzipped content
Directory Structure
tests/
├── openai_cassettes/              # Recorded HTTP interactions
│   ├── o3_pro_basic_math.json
│   └── o3_pro_content_capture.json
├── http_transport_recorder.py     # Transport recorder implementation
├── test_content_capture.py        # Example recording test
└── test_replay.py                 # Example replay test
Key Components
RecordingTransport
- Wraps httpx's default transport
- Makes real HTTP calls and captures responses
- Handles gzip compression/decompression properly
- Saves interactions to JSON cassettes
ReplayTransport
- Serves saved responses from cassettes
- No real HTTP calls made
- Matches requests by method, path, and content hash
- Re-applies gzip compression when needed
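As a rough illustration of matching by method, path, and content hash, the sketch below builds a lookup key from those three parts. The function name `request_key`, the use of SHA-256, and the canonical JSON form are assumptions made for this example; the actual recorder may hash the body differently.

```python
import hashlib
import json


def request_key(method: str, path: str, content: dict) -> str:
    """Build a lookup key from method, path, and a hash of the JSON body.

    Hypothetical helper, not the recorder's real API.
    """
    # sort_keys plus fixed separators make the hash independent of
    # key order and whitespace in the request body.
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{method.upper()}:{path}:{digest}"


a = request_key("POST", "/v1/responses", {"model": "o3-pro", "input": "What is 2+2?"})
b = request_key("post", "/v1/responses", {"input": "What is 2+2?", "model": "o3-pro"})
c = request_key("POST", "/v1/responses", {"model": "o3-pro", "input": "What is 3+3?"})

print(a == b)  # True: same request, different key order
print(a == c)  # False: different prompt
```

Hashing a canonical form means a replay still matches when the SDK serializes the same payload with a different key order, while any change to the prompt or model produces a different key.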
TransportFactory
- Auto-selects record vs replay mode based on cassette existence
- Simplifies test setup
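The mode selection can be pictured as a single file-existence check. This is a minimal sketch, assuming the decision depends only on whether the cassette file is present; `choose_mode` is a hypothetical name, and the real `TransportFactory.create_transport` returns a transport object rather than a string.

```python
import tempfile
from pathlib import Path


def choose_mode(cassette_path: str) -> str:
    """Return "replay" when the cassette file exists, otherwise "record"."""
    return "replay" if Path(cassette_path).is_file() else "record"


with tempfile.TemporaryDirectory() as tmp:
    cassette = Path(tmp) / "my_test.json"
    mode_before = choose_mode(str(cassette))  # no file yet
    cassette.write_text('{"interactions": []}')
    mode_after = choose_mode(str(cassette))   # file now present

print(mode_before, mode_after)  # record replay
```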
Workflow
1. Use Transport Recorder in Tests
from providers import ModelProviderRegistry
from tests.http_transport_recorder import TransportFactory
# Create transport based on cassette existence
cassette_path = "tests/openai_cassettes/my_test.json"
transport = TransportFactory.create_transport(cassette_path)
# Inject into OpenAI provider
provider = ModelProviderRegistry.get_provider_for_model("o3-pro")
provider._test_transport = transport
# Make API calls - will be recorded/replayed automatically
2. Initial Recording (Expensive)
# With real API key, cassette doesn't exist -> records
python test_content_capture.py
# ⚠️ This will cost money! o3-pro launched at about $20 (input) / $80 (output) per 1M tokens; check current OpenAI pricing
# But only needs to be done once
3. Subsequent Runs (Free)
# Cassette exists -> replays
python test_replay.py
# Can even use fake API key to prove no real calls
OPENAI_API_KEY="sk-fake-key" python test_replay.py
# Fast, free, deterministic
4. Re-recording (When API Changes)
# Delete cassette to force re-recording
rm tests/openai_cassettes/my_test.json
# Run test again with real API key
python test_content_capture.py
How It Works
- Transport Injection: Custom transport injected into httpx client
- Request Interception: All HTTP requests go through custom transport
- Mode Detection: Checks if cassette exists (replay) or needs creation (record)
- Content Capture: Properly handles streaming responses and gzip encoding
- Request Matching: Uses method + path + content hash for deterministic matching
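The flow above can be sketched end to end without any HTTP library. Everything here is illustrative: `call_with_cassette`, `fake_send`, and the simplified request/response dictionaries are assumptions standing in for the real transport classes, which operate on httpx request and response objects.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def _key(request: dict) -> str:
    # Method + path + hash of the canonical JSON body.
    body = json.dumps(request["content"], sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f'{request["method"]}:{request["path"]}:{digest}'


def call_with_cassette(cassette_path: Path, request: dict, send) -> dict:
    """Replay from the cassette if it exists; otherwise call send() and record."""
    if cassette_path.is_file():
        cassette = json.loads(cassette_path.read_text())
        for interaction in cassette["interactions"]:
            if _key(interaction["request"]) == _key(request):
                return interaction["response"]
        raise LookupError("no recorded interaction matches this request")
    response = send(request)  # the only place a real call would happen
    cassette = {"interactions": [{"request": request, "response": response}]}
    cassette_path.write_text(json.dumps(cassette, indent=2))
    return response


real_calls = []


def fake_send(request: dict) -> dict:
    # Stand-in for the expensive API call.
    real_calls.append(request)
    return {"status_code": 200, "content": {"output_text": "4"}}


request = {
    "method": "POST",
    "path": "/v1/responses",
    "content": {"model": "o3-pro", "input": "What is 2+2?"},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.json"
    first = call_with_cassette(path, request, fake_send)   # records
    second = call_with_cassette(path, request, fake_send)  # replays

print(len(real_calls))  # 1
```

The second call never reaches `fake_send`, which is the property that makes replay runs free and deterministic.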
Cassette Format
{
  "interactions": [
    {
      "request": {
        "method": "POST",
        "url": "https://api.openai.com/v1/responses",
        "path": "/v1/responses",
        "headers": {
          "content-type": "application/json",
          "accept-encoding": "gzip, deflate"
        },
        "content": {
          "model": "o3-pro-2025-06-10",
          "input": [...],
          "reasoning": {"effort": "medium"}
        }
      },
      "response": {
        "status_code": 200,
        "headers": {
          "content-type": "application/json",
          "content-encoding": "gzip"
        },
        "content": {
          "data": "base64_encoded_response_body",
          "encoding": "base64",
          "size": 1413
        },
        "reason_phrase": "OK"
      }
    }
  ]
}
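To show how a gzipped body can round-trip through the `content` object above, here is a small sketch using only the standard library. The helper names `encode_body` and `decode_body` are hypothetical, and treating `size` as the byte length of the stored (still compressed) body is an assumption.

```python
import base64
import gzip
import json


def encode_body(raw: bytes) -> dict:
    """Wrap raw response bytes in a JSON-safe cassette entry."""
    return {
        "data": base64.b64encode(raw).decode("ascii"),
        "encoding": "base64",
        "size": len(raw),  # assumption: length of the bytes as stored
    }


def decode_body(entry: dict) -> bytes:
    """Recover the original bytes from a cassette entry."""
    if entry.get("encoding") != "base64":
        raise ValueError(f"unsupported encoding: {entry.get('encoding')!r}")
    return base64.b64decode(entry["data"])


payload = {"output_text": "4"}
wire_bytes = gzip.compress(json.dumps(payload).encode("utf-8"))

entry = encode_body(wire_bytes)
serialized = json.dumps(entry)  # binary body is now plain JSON text

restored = decode_body(json.loads(serialized))
decoded_payload = json.loads(gzip.decompress(restored))
```

Storing the compressed bytes unchanged keeps the `content-encoding: gzip` header truthful on replay, so the client decompresses the body exactly as it would for a live response.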
Key features:
- Complete request/response capture
- Base64 encoding for binary content
- Preserves gzip compression
- Sanitizes sensitive data (API keys removed)
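The sanitization step is not shown in this document, so the following is only a sketch of one way to strip credentials from headers before a cassette is written. The header list, the `sk-` pattern, and the placeholder values are assumptions chosen for illustration.

```python
import re

# Assumed set of headers that should never be stored verbatim.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie"}
# Assumed shape of an API key embedded in other header values.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{8,}")


def sanitize_headers(headers: dict) -> dict:
    """Return a copy of headers with credentials replaced by placeholders."""
    clean = {}
    for name, value in headers.items():
        if name.lower() in SENSITIVE_HEADERS:
            clean[name] = "REDACTED"
        else:
            # Catch keys that leak into otherwise harmless headers.
            clean[name] = KEY_PATTERN.sub("sk-REDACTED", value)
    return clean


recorded = sanitize_headers({
    "Authorization": "Bearer sk-fake0000example",
    "content-type": "application/json",
    "x-debug": "key=sk-fake0000example",
})
print(recorded)
```

Running the sanitizer at write time, rather than as a later cleanup pass, means a secret never reaches disk or version control in the first place.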
Benefits Over Previous Approaches
- Works with any httpx-based client: Not tied to OpenAI SDK specifically
- Handles compression: Properly manages gzipped responses
- Full HTTP fidelity: Captures headers, status codes, etc.
- Simpler than VCR.py: No sync/async conflicts or monkey patching
- Better than respx: No streaming response issues
Example Test
#!/usr/bin/env python3
import asyncio
from pathlib import Path

from tests.http_transport_recorder import TransportFactory
from providers import ModelProviderRegistry
from tools.chat import ChatTool


async def test_with_recording():
    cassette_path = "tests/openai_cassettes/test_example.json"

    # Setup transport
    transport = TransportFactory.create_transport(cassette_path)
    provider = ModelProviderRegistry.get_provider_for_model("o3-pro")
    provider._test_transport = transport

    # Use ChatTool normally
    chat_tool = ChatTool()
    result = await chat_tool.execute({
        "prompt": "What is 2+2?",
        "model": "o3-pro",
        "temperature": 1.0
    })

    print(f"Response: {result[0].text}")


if __name__ == "__main__":
    asyncio.run(test_with_recording())
Timeout Protection
Tests can be run under GNU timeout to prevent hanging:
# On macOS, install GNU coreutils, which provides gtimeout (Linux ships the same tool as timeout)
brew install coreutils
# Run with 30 second timeout
gtimeout 30s python test_content_capture.py
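Where an external timeout command is not available, a similar guard can be placed inside an async test. This is an alternative technique rather than what the commands above do: a minimal sketch using `asyncio.wait_for`, with `quick_call` and `stuck_call` as hypothetical stand-ins for an API call.

```python
import asyncio


async def run_with_timeout(coro, seconds: float):
    """Await coro, raising asyncio.TimeoutError if it exceeds `seconds`."""
    return await asyncio.wait_for(coro, timeout=seconds)


async def quick_call():
    await asyncio.sleep(0.01)
    return "done"


async def stuck_call():
    await asyncio.sleep(10)  # simulates a request that hangs
    return "never reached"


quick_result = asyncio.run(run_with_timeout(quick_call(), 1.0))

try:
    asyncio.run(run_with_timeout(stuck_call(), 0.05))
    timed_out = False
except asyncio.TimeoutError:
    timed_out = True

print(quick_result, timed_out)  # done True
```

Unlike the shell-level timeout, this cancels only the awaited call, so the test can still report a clean failure instead of being killed.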
CI/CD Integration
# In CI, tests use existing cassettes (no API keys needed)
- name: Run OpenAI tests
  run: |
    # Tests will use replay mode with existing cassettes
    python -m pytest tests/test_o3_pro.py
Cost Management
- One-time cost: Initial recording per test scenario
- Zero ongoing cost: Replays are free
- Controlled re-recording: Manual cassette deletion required
- CI-friendly: No accidental API calls in automation
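One way to make the "no accidental API calls" property explicit is a guard that refuses to fall back to record mode in CI. This is a sketch under assumptions: `ensure_replay_possible` is a hypothetical helper, and it relies on the `CI` environment variable, which GitHub Actions and many other CI systems set to `true`.

```python
import tempfile
from pathlib import Path
import os


def ensure_replay_possible(cassette_path: str, env=None) -> None:
    """Raise if running in CI and the cassette needed for replay is missing."""
    env = os.environ if env is None else env
    in_ci = env.get("CI", "").lower() in {"1", "true"}
    if in_ci and not Path(cassette_path).is_file():
        raise RuntimeError(
            f"Cassette {cassette_path} is missing; refusing to record in CI"
        )


with tempfile.TemporaryDirectory() as tmp:
    missing = str(Path(tmp) / "missing.json")

    # Local run: a missing cassette is fine, recording is allowed.
    ensure_replay_possible(missing, env={})

    # CI run: a missing cassette is treated as an error.
    try:
        ensure_replay_possible(missing, env={"CI": "true"})
        blocked = False
    except RuntimeError:
        blocked = True

print(blocked)  # True
```

Calling such a guard before creating the transport turns a forgotten cassette into a fast test failure rather than a billed API request.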
This HTTP transport recorder approach provides accurate API testing with cost efficiency, specifically optimized for expensive endpoints like o3-pro while being flexible enough for any HTTP-based API.