refactor: Clean up test files and simplify documentation

- Remove unused cassette files with incomplete recordings
- Delete broken respx test files (test_o3_pro_respx_simple.py, test_o3_pro_http_recording.py)
- Fix respx references in docstrings to mention HTTP transport recorder
- Simplify vcr-testing.md documentation (60% reduction, more task-oriented)
- Add simplified PR template with better test instructions
- Fix cassette path consistency in examples
- Add security note about reviewing cassettes before committing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,216 +1,87 @@
# HTTP Recording/Replay Testing with HTTP Transport Recorder
# HTTP Transport Recorder for Testing

This project uses a custom HTTP Transport Recorder for testing expensive API integrations (like o3-pro) with real recorded responses.
A custom HTTP recorder for testing expensive API calls (like o3-pro) with real responses.

## What is HTTP Transport Recorder?
## Overview

The HTTP Transport Recorder is a custom httpx transport implementation that intercepts HTTP requests/responses at the transport layer. This approach provides:
The HTTP Transport Recorder captures and replays HTTP interactions at the transport layer, enabling:
- Cost-efficient testing of expensive APIs (record once, replay forever)
- Deterministic tests with real API responses
- Seamless integration with httpx and OpenAI SDK

- **Real API structure**: Tests use actual API responses, not guessed mocks
- **Cost efficiency**: Only pay for API calls once during recording
- **Deterministic tests**: Same response every time, no API variability
- **Transport-level interception**: Works seamlessly with httpx and OpenAI SDK
- **Full response capture**: Captures complete HTTP responses including headers and gzipped content

## Directory Structure

```
tests/
├── openai_cassettes/ # Recorded HTTP interactions
│ ├── o3_pro_basic_math.json
│ └── o3_pro_content_capture.json
├── http_transport_recorder.py # Transport recorder implementation
├── test_content_capture.py # Example recording test
└── test_replay.py # Example replay test
```

## Key Components

### RecordingTransport
- Wraps httpx's default transport
- Makes real HTTP calls and captures responses
- Handles gzip compression/decompression properly
- Saves interactions to JSON cassettes

### ReplayTransport
- Serves saved responses from cassettes
- No real HTTP calls made
- Matches requests by method, path, and content hash
- Re-applies gzip compression when needed
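
The matching rule listed for ReplayTransport (method, path, and content hash) can be sketched as a small key function. This is only an illustration under stated assumptions: `request_signature` is a hypothetical helper, and the choice of SHA-256 and of canonicalizing JSON bodies with sorted keys is assumed here, not taken from `tests/http_transport_recorder.py`.

```python
import hashlib
import json
from urllib.parse import urlsplit


def request_signature(method: str, url: str, body: bytes) -> str:
    """Build a deterministic lookup key from method, path, and a body hash.

    Hypothetical helper: the real recorder may hash different fields.
    """
    # Only the path takes part in matching; host and query string are ignored.
    path = urlsplit(url).path
    try:
        # Re-serialize JSON with sorted keys so key order does not change the hash.
        canonical = json.dumps(
            json.loads(body), sort_keys=True, separators=(",", ":")
        ).encode("utf-8")
    except ValueError:
        # Non-JSON (or empty) bodies are hashed exactly as sent.
        canonical = body
    digest = hashlib.sha256(canonical).hexdigest()
    return f"{method.upper()}:{path}:{digest}"
```

With a key like this, a replay transport could store each recorded interaction under its signature and look up incoming requests the same way, which is one way to keep matching independent of header noise.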

### TransportFactory
- Auto-selects record vs replay mode based on cassette existence
- Simplifies test setup

## Workflow

### 1. Use Transport Recorder in Tests
## Quick Start

```python
from tests.http_transport_recorder import TransportFactory
from providers import ModelProviderRegistry

# Create transport based on cassette existence
# Setup transport recorder
cassette_path = "tests/openai_cassettes/my_test.json"
transport = TransportFactory.create_transport(cassette_path)

# Inject into OpenAI provider
# Inject into provider
provider = ModelProviderRegistry.get_provider_for_model("o3-pro")
provider._test_transport = transport

# Make API calls - will be recorded/replayed automatically
```

### 2. Initial Recording (Expensive)

```bash
# With real API key, cassette doesn't exist -> records
python test_content_capture.py

# ⚠️ This will cost money! o3-pro launched at $20 (input) / $80 (output) per 1M tokens
# But only needs to be done once
```

### 3. Subsequent Runs (Free)

```bash
# Cassette exists -> replays
python test_replay.py

# Can even use fake API key to prove no real calls
OPENAI_API_KEY="sk-fake-key" python test_replay.py

# Fast, free, deterministic
```

### 4. Re-recording (When API Changes)

```bash
# Delete cassette to force re-recording
rm tests/openai_cassettes/my_test.json

# Run test again with real API key
python test_content_capture.py
# Make API calls - automatically recorded/replayed
```

## How It Works

1. **Transport Injection**: Custom transport injected into httpx client
2. **Request Interception**: All HTTP requests go through custom transport
3. **Mode Detection**: Checks if cassette exists (replay) or needs creation (record)
4. **Content Capture**: Properly handles streaming responses and gzip encoding
5. **Request Matching**: Uses method + path + content hash for deterministic matching
1. **First run** (cassette doesn't exist): Records real API calls
2. **Subsequent runs** (cassette exists): Replays saved responses
3. **Re-record**: Delete cassette file and run again
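
The record/replay decision in the steps above depends only on whether the cassette file exists. A minimal sketch of that check follows; `select_mode` is a hypothetical name used for illustration, and the real `TransportFactory.create_transport` may make the decision differently.

```python
from pathlib import Path


def select_mode(cassette_path: str) -> str:
    """Return "replay" when the cassette file exists, otherwise "record".

    Hypothetical helper illustrating the mode-detection rule only.
    """
    return "replay" if Path(cassette_path).is_file() else "record"
```

Under this rule, deleting a cassette is enough to switch the next run back to recording, which matches the re-record step described above.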

## Cassette Format
## Usage in Tests

```json
{
  "interactions": [
    {
      "request": {
        "method": "POST",
        "url": "https://api.openai.com/v1/responses",
        "path": "/v1/responses",
        "headers": {
          "content-type": "application/json",
          "accept-encoding": "gzip, deflate"
        },
        "content": {
          "model": "o3-pro-2025-06-10",
          "input": [...],
          "reasoning": {"effort": "medium"}
        }
      },
      "response": {
        "status_code": 200,
        "headers": {
          "content-type": "application/json",
          "content-encoding": "gzip"
        },
        "content": {
          "data": "base64_encoded_response_body",
          "encoding": "base64",
          "size": 1413
        },
        "reason_phrase": "OK"
      }
    }
  ]
}
```

Key features:
- Complete request/response capture
- Base64 encoding for binary content
- Preserves gzip compression
- Sanitizes sensitive data (API keys removed)
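
To make the `content` object and the sanitization step concrete, here is a rough sketch of how a response body might be stored and restored. The helper names, the header list in `SENSITIVE_HEADERS`, and the reading of `size` as the raw byte length before base64 encoding are all assumptions for illustration, not the recorder's confirmed behavior.

```python
import base64
import gzip

# Assumed set of headers to drop; the real recorder may use a different list.
SENSITIVE_HEADERS = {"authorization", "api-key", "x-api-key"}


def encode_body(raw: bytes) -> dict:
    """Store bytes (possibly gzipped) in the cassette's content shape."""
    return {
        "data": base64.b64encode(raw).decode("ascii"),
        "encoding": "base64",
        "size": len(raw),  # assumption: length of the raw bytes, not the base64 text
    }


def decode_body(content: dict) -> bytes:
    """Recover the original bytes from a cassette content object."""
    return base64.b64decode(content["data"])


def sanitize_headers(headers: dict) -> dict:
    """Drop headers that could carry API keys before writing a cassette."""
    return {k: v for k, v in headers.items() if k.lower() not in SENSITIVE_HEADERS}


# Usage: a gzipped body survives the round trip byte for byte.
compressed = gzip.compress(b'{"output_text": "4"}')
restored = decode_body(encode_body(compressed))
assert gzip.decompress(restored) == b'{"output_text": "4"}'
```

Storing the still-compressed bytes as base64 keeps the cassette valid JSON while letting replay return exactly what the server sent, including the `content-encoding: gzip` header.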

## Benefits Over Previous Approaches

1. **Works with any HTTP client**: Not tied to OpenAI SDK specifically
2. **Handles compression**: Properly manages gzipped responses
3. **Full HTTP fidelity**: Captures headers, status codes, etc.
4. **Simpler than VCR.py**: No sync/async conflicts or monkey patching
5. **Better than respx**: No streaming response issues

## Example Test
See `test_o3_pro_output_text_fix.py` for a complete example:

```python
#!/usr/bin/env python3
import asyncio
from pathlib import Path
from tests.http_transport_recorder import TransportFactory
from providers import ModelProviderRegistry
from tools.chat import ChatTool

async def test_with_recording():
cassette_path = "tests/openai_cassettes/test_example.json"

# Setup transport
transport = TransportFactory.create_transport(cassette_path)
provider = ModelProviderRegistry.get_provider_for_model("o3-pro")
# Transport factory auto-detects record vs replay mode
transport = TransportFactory.create_transport("tests/openai_cassettes/my_test.json")
provider._test_transport = transport

# Use ChatTool normally
chat_tool = ChatTool()
result = await chat_tool.execute({
"prompt": "What is 2+2?",
"model": "o3-pro",
"temperature": 1.0
})

print(f"Response: {result[0].text}")

if __name__ == "__main__":
asyncio.run(test_with_recording())
# Use normally - recording happens transparently
result = await chat_tool.execute({"prompt": "2+2?", "model": "o3-pro"})
```

## Timeout Protection
## File Structure

Tests can use GNU timeout to prevent hanging:

```bash
# Install GNU coreutils if needed
brew install coreutils

# Run with 30 second timeout
gtimeout 30s python test_content_capture.py
```

## CI/CD Integration

```yaml
# In CI, tests use existing cassettes (no API keys needed)
- name: Run OpenAI tests
run: |
# Tests will use replay mode with existing cassettes
python -m pytest tests/test_o3_pro.py
tests/
├── openai_cassettes/ # Recorded API interactions
│ └── *.json # Cassette files
├── http_transport_recorder.py # Transport implementation
└── test_o3_pro_output_text_fix.py # Example usage
```

## Cost Management

- **One-time cost**: Initial recording per test scenario
- **One-time cost**: Initial recording only
- **Zero ongoing cost**: Replays are free
- **Controlled re-recording**: Manual cassette deletion required
- **CI-friendly**: No accidental API calls in automation
- **CI-friendly**: No API keys needed for replay

## Re-recording

When API changes require new recordings:

```bash
# Delete specific cassette
rm tests/openai_cassettes/my_test.json

# Run test with real API key
python -m pytest tests/test_o3_pro_output_text_fix.py
```

## Implementation Details

- **RecordingTransport**: Captures real HTTP calls
- **ReplayTransport**: Serves saved responses
- **TransportFactory**: Auto-selects mode based on cassette existence
- **PII Sanitization**: Automatically removes API keys from recordings

**Security Note**: Always review new cassette files before committing to ensure no sensitive data is included.
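
Part of the review the security note asks for could be automated with a small scan before committing. The sketch below is not part of the project: `find_secrets`, `scan_cassettes`, and the two regular expressions are hypothetical, and the patterns only catch key-like strings in the plain JSON text, not anything hidden inside base64-encoded bodies.

```python
import re
from pathlib import Path

# Assumed patterns for key-like strings; extend for the providers you use.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),
    re.compile(r"Bearer\s+[A-Za-z0-9._-]{16,}", re.IGNORECASE),
]


def find_secrets(text: str) -> list[str]:
    """Return every substring of text that looks like a credential."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits


def scan_cassettes(directory: str) -> dict[str, list[str]]:
    """Map each cassette file under directory to its suspicious strings."""
    report = {}
    for path in sorted(Path(directory).glob("*.json")):
        hits = find_secrets(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_cassettes("tests/openai_cassettes")` and failing when the result is non-empty would complement, not replace, a manual read of each new cassette.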

For implementation details, see `tests/http_transport_recorder.py`.

This HTTP transport recorder approach provides accurate API testing with cost efficiency, specifically optimized for expensive endpoints like o3-pro while being flexible enough for any HTTP-based API.