docker related
793
docker/README.md
Normal file
@@ -0,0 +1,793 @@
|
||||
# Docker TLS Security Setup
|
||||
|
||||
This directory contains scripts and configuration for securing Docker API access with TLS authentication, replacing the insecure socket mounting approach.
|
||||
|
||||
## Overview
|
||||
|
||||
Previously, the session-manager service mounted the Docker socket (`/var/run/docker.sock`) directly into containers, granting full root access to the host Docker daemon. This is a critical security vulnerability.
|
||||
|
||||
This setup replaces socket mounting with authenticated TLS API access over the network.
|
||||
|
||||
## Security Benefits
|
||||
|
||||
- ✅ **No socket mounting**: Eliminates privilege escalation risk
|
||||
- ✅ **Mutual TLS authentication**: Both client and server authenticate
|
||||
- ✅ **Encrypted communication**: All API calls are encrypted
|
||||
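- ✅ **Certificate-based access**: Only clients holding a certificate signed by the CA can reach the API (any valid client certificate still grants full API access)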
|
||||
- ✅ **Network isolation**: API access is network-bound, not filesystem-bound
|
||||
|
||||
## Docker Service Abstraction
|
||||
|
||||
The session-manager now uses a clean `DockerService` abstraction layer that separates Docker operations from business logic, enabling better testing, maintainability, and future Docker client changes.
|
||||
|
||||
### Architecture Benefits
|
||||
|
||||
- 🧪 **Testability**: `MockDockerService` enables testing without a Docker daemon
|
||||
- 🔧 **Maintainability**: Clean separation of concerns
|
||||
- 🔄 **Flexibility**: Easy to swap Docker client implementations
|
||||
- 📦 **Dependency Injection**: SessionManager receives DockerService via constructor
|
||||
- ⚡ **Performance**: Both async and sync Docker operations supported
|
||||
|
||||
### Service Interface
|
||||
|
||||
```python
from typing import List, Optional


class DockerService:
    # Signatures only; method bodies are omitted here.
    async def create_container(self, name: str, image: str, **kwargs) -> "ContainerInfo": ...
    async def start_container(self, container_id: str) -> None: ...
    async def stop_container(self, container_id: str, timeout: int = 10) -> None: ...
    async def remove_container(self, container_id: str, force: bool = False) -> None: ...
    async def get_container_info(self, container_id: str) -> Optional["ContainerInfo"]: ...
    async def list_containers(self, all: bool = False) -> List["ContainerInfo"]: ...
    async def ping(self) -> bool: ...
```
|
||||
|
||||
### Testing
|
||||
|
||||
Run the comprehensive test suite:
|
||||
|
||||
```bash
|
||||
# Test Docker service abstraction
|
||||
./docker/scripts/test-docker-service.py
|
||||
|
||||
# Results: 7/7 tests passed ✅
|
||||
# - Service Interface ✅
|
||||
# - Error Handling ✅
|
||||
# - Async vs Sync Modes ✅
|
||||
# - Container Info Operations ✅
|
||||
# - Context Management ✅
|
||||
# - Integration Patterns ✅
|
||||
# - Performance and Scaling ✅
|
||||
```
|
||||
|
||||
### Usage in SessionManager
|
||||
|
||||
```python
|
||||
# Dependency injection pattern
|
||||
session_manager = SessionManager(docker_service=DockerService(use_async=True))
|
||||
|
||||
# Or with mock for testing
|
||||
test_manager = SessionManager(docker_service=MockDockerService())
|
||||
```
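
The dependency injection pattern above can be exercised without a Docker daemon. The following is a minimal sketch, not the project's actual mock: the `ContainerInfo` fields and the in-memory behaviour of this `MockDockerService` are assumptions made for illustration, and only part of the interface is covered.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ContainerInfo:
    # Hypothetical fields; the real ContainerInfo may carry more.
    container_id: str
    name: str
    image: str
    status: str = "created"


class MockDockerService:
    """In-memory stand-in that mirrors part of the DockerService interface."""

    def __init__(self) -> None:
        self._containers: Dict[str, ContainerInfo] = {}

    async def create_container(self, name: str, image: str, **kwargs) -> ContainerInfo:
        container_id = f"mock-{len(self._containers) + 1}"
        info = ContainerInfo(container_id=container_id, name=name, image=image)
        self._containers[container_id] = info
        return info

    async def start_container(self, container_id: str) -> None:
        self._containers[container_id].status = "running"

    async def get_container_info(self, container_id: str) -> Optional[ContainerInfo]:
        return self._containers.get(container_id)

    async def list_containers(self, all: bool = False) -> List[ContainerInfo]:
        items = list(self._containers.values())
        return items if all else [c for c in items if c.status == "running"]

    async def ping(self) -> bool:
        return True


async def demo() -> List[str]:
    """Create and start one container, then list the running container names."""
    service = MockDockerService()
    info = await service.create_container("session-1", "example/image:latest")
    await service.start_container(info.container_id)
    return [c.name for c in await service.list_containers()]
```

Because the mock keeps all state in a dictionary, tests that use it run quickly and need no network access or certificates.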
|
||||
|
||||
## File Structure
|
||||
|
||||
```
docker/
├── certs/                      # Generated TLS certificates (not in git)
├── scripts/
│   ├── generate-certs.sh       # Certificate generation script
│   ├── setup-docker-tls.sh     # Docker daemon TLS configuration
│   └── test-tls-connection.py  # Connection testing script
├── daemon.json                 # Docker daemon TLS configuration
└── .env.example                # Environment configuration template
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Generate TLS Certificates
|
||||
|
||||
```bash
# Generate certificates for development
DOCKER_ENV=development ./docker/scripts/generate-certs.sh

# Or for production with custom settings
DOCKER_ENV=production \
DOCKER_HOST_IP=your-server-ip \
DOCKER_HOST_NAME=your-docker-host \
./docker/scripts/generate-certs.sh
```
|
||||
|
||||
### 2. Configure Docker Daemon
|
||||
|
||||
**For local development (Docker Desktop):**
|
||||
```bash
|
||||
# Certificates are automatically mounted in docker-compose.yml
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
**For production/server setup:**
|
||||
```bash
|
||||
# Configure system Docker daemon with TLS
|
||||
sudo ./docker/scripts/setup-docker-tls.sh
|
||||
```
|
||||
|
||||
### 3. Configure Environment
|
||||
|
||||
```bash
|
||||
# Copy and customize environment file
|
||||
cp docker/.env.example .env
|
||||
|
||||
# Edit .env with your settings
|
||||
# DOCKER_HOST_IP=host.docker.internal # for Docker Desktop
|
||||
# DOCKER_HOST_IP=your-server-ip # for production
|
||||
```
|
||||
|
||||
### 4. Test Configuration
|
||||
|
||||
```bash
|
||||
# Test TLS connection
|
||||
./docker/scripts/test-tls-connection.py
|
||||
|
||||
# Start services
|
||||
docker-compose --env-file .env up -d session-manager
|
||||
|
||||
# Check logs
|
||||
docker-compose logs session-manager
|
||||
```
|
||||
|
||||
## Configuration Options
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|----------|---------|-------------|
| `DOCKER_TLS_VERIFY` | `1` | Enable TLS verification |
| `DOCKER_CERT_PATH` | `./docker/certs` | Certificate directory path |
| `DOCKER_HOST` | `tcp://host.docker.internal:2376` | Docker daemon endpoint |
| `DOCKER_TLS_PORT` | `2376` | TLS port for Docker API |
| `DOCKER_CA_CERT` | `./docker/certs/ca.pem` | CA certificate path |
| `DOCKER_CLIENT_CERT` | `./docker/certs/client-cert.pem` | Client certificate path |
| `DOCKER_CLIENT_KEY` | `./docker/certs/client-key.pem` | Client key path |
| `DOCKER_HOST_IP` | `host.docker.internal` | Docker host IP |
|
||||
|
||||
### Certificate Generation Options
|
||||
|
||||
| Variable | Default | Description |
|----------|---------|-------------|
| `DOCKER_ENV` | `development` | Environment name for certificates |
| `DOCKER_HOST_IP` | `127.0.0.1` | IP address for server certificate |
| `DOCKER_HOST_NAME` | `localhost` | Hostname for server certificate |
| `DAYS` | `3650` | Certificate validity in days |
|
||||
|
||||
## Production Deployment
|
||||
|
||||
### Certificate Management
|
||||
|
||||
1. **Generate certificates on a secure machine**
|
||||
2. **Distribute to servers securely** (SCP, Ansible, etc.)
|
||||
3. **Set proper permissions**:
|
||||
```bash
|
||||
chmod 444 /etc/docker/certs/*.pem # certs readable by all
|
||||
chmod 400 /etc/docker/certs/*-key.pem # keys readable by root only
|
||||
```
|
||||
4. **Rotate certificates regularly** (every 6-12 months)
|
||||
5. **Revoke compromised certificates** and regenerate
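
The permission scheme in step 3 (444 for certificates, 400 for keys) can be verified with a short script. This is a sketch under the assumption that key files follow the `*-key.pem` naming used above; `find_permission_problems` is a hypothetical helper, not one of the repository's scripts.

```python
import stat
from pathlib import Path
from typing import List


def find_permission_problems(cert_dir: str) -> List[str]:
    """Return one message per .pem file whose mode differs from the scheme above."""
    problems = []
    for path in sorted(Path(cert_dir).glob("*.pem")):
        # Keys must be owner-read-only; everything else may be world-readable.
        expected = 0o400 if path.name.endswith("-key.pem") else 0o444
        actual = stat.S_IMODE(path.stat().st_mode)
        if actual != expected:
            problems.append(f"{path.name}: mode {actual:o}, expected {expected:o}")
    return problems
```

Running it against `/etc/docker/certs` after distribution gives a quick check that no private key was left group- or world-readable.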
|
||||
|
||||
### Docker Daemon Configuration
|
||||
|
||||
For production servers, use the `setup-docker-tls.sh` script or manually configure `/etc/docker/daemon.json`:
|
||||
|
||||
```json
{
  "tls": true,
  "tlsverify": true,
  "tlscacert": "/etc/docker/certs/ca.pem",
  "tlscert": "/etc/docker/certs/server-cert.pem",
  "tlskey": "/etc/docker/certs/server-key.pem",
  "hosts": ["tcp://0.0.0.0:2376"],
  "iptables": false,
  "bridge": "none",
  "live-restore": true,
  "userland-proxy": false,
  "no-new-privileges": true
}
```
|
||||
|
||||
### Security Hardening
|
||||
|
||||
- **Firewall**: Only allow TLS port (2376) from trusted networks
|
||||
- **TLS 1.3**: Ensure the daemon and all clients support a modern TLS version, preferably TLS 1.3
|
||||
- **Certificate pinning**: Consider certificate pinning in client code
|
||||
- **Monitoring**: Log and monitor Docker API access
|
||||
- **Rate limiting**: Implement API rate limiting
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**"Connection refused"**
|
||||
- Check if Docker daemon is running with TLS
|
||||
- Verify `DOCKER_HOST` points to correct endpoint
|
||||
- Ensure firewall allows port 2376
|
||||
|
||||
**"TLS handshake failed"**
|
||||
- Verify certificates exist and have correct permissions
|
||||
- Check certificate validity dates
|
||||
- Ensure CA certificate is correct
|
||||
|
||||
**"Permission denied"**
|
||||
- Check certificate file permissions (444 for certs, 400 for keys)
|
||||
- Ensure client certificate is signed by the CA
|
||||
|
||||
### Debug Commands
|
||||
|
||||
```bash
# Test TLS connection manually
docker --tlsverify \
  --tlscacert=./docker/certs/ca.pem \
  --tlscert=./docker/certs/client-cert.pem \
  --tlskey=./docker/certs/client-key.pem \
  -H tcp://host.docker.internal:2376 \
  version

# Check certificate validity
openssl x509 -in ./docker/certs/server-cert.pem -text -noout

# Test from container
docker-compose exec session-manager ./docker/scripts/test-tls-connection.py
```
|
||||
|
||||
## Migration from Socket Mounting
|
||||
|
||||
### Before (Insecure)
|
||||
```yaml
|
||||
volumes:
|
||||
- /var/run/docker.sock:/var/run/docker.sock
|
||||
```
|
||||
|
||||
### After (Secure)
|
||||
```yaml
|
||||
volumes:
|
||||
- ./docker/certs:/etc/docker/certs:ro
|
||||
environment:
|
||||
- DOCKER_TLS_VERIFY=1
|
||||
- DOCKER_HOST=tcp://host.docker.internal:2376
|
||||
```
|
||||
|
||||
### Code Changes Required
|
||||
|
||||
Update Docker client initialization:
|
||||
```python
import os

import docker

# Before: relied on the mounted Unix socket
self.docker_client = docker.from_env()

# After: connect over TCP with mutual TLS
tls_config = docker.tls.TLSConfig(
    ca_cert=os.getenv('DOCKER_CA_CERT'),
    client_cert=(os.getenv('DOCKER_CLIENT_CERT'), os.getenv('DOCKER_CLIENT_KEY')),
    verify=True
)
# Build the client directly instead of patching the one from from_env()
self.docker_client = docker.DockerClient(
    base_url=os.getenv('DOCKER_HOST'),
    tls=tls_config
)
```
|
||||
|
||||
## Dynamic Host IP Detection
|
||||
|
||||
The session-manager service now includes robust host IP detection to support proxy routing across different Docker environments:
|
||||
|
||||
### Supported Environments
|
||||
|
||||
- **Docker Desktop (Mac/Windows)**: Uses `host.docker.internal` resolution
|
||||
- **Linux Docker**: Reads gateway from `/proc/net/route`
|
||||
- **Cloud environments**: Respects `DOCKER_HOST_GATEWAY` and `GATEWAY` environment variables
|
||||
- **Custom networks**: Tests connectivity to common Docker gateway IPs
|
||||
|
||||
### Detection Methods (in priority order)
|
||||
|
||||
1. **Docker Internal**: Resolves `host.docker.internal` (Docker Desktop)
|
||||
2. **Environment Variables**: Checks `HOST_IP`, `DOCKER_HOST_GATEWAY`, `GATEWAY`
|
||||
3. **Route Table**: Parses `/proc/net/route` for default gateway
|
||||
4. **Network Connection**: Tests connectivity to determine local routing
|
||||
5. **Common Gateways**: Falls back to known Docker bridge IPs
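
Step 3 of the list above can be sketched as follows. This is an illustration rather than the service's actual code: `parse_default_gateway` and `read_default_gateway` are hypothetical names, and the decoding assumes a little-endian host (x86 or typical ARM), where `/proc/net/route` stores addresses as little-endian hex.

```python
import socket
import struct
from typing import Optional


def parse_default_gateway(route_table: str) -> Optional[str]:
    """Return the default gateway from /proc/net/route contents, or None."""
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        destination, gateway = fields[1], fields[2]
        # The default route has destination 0.0.0.0 and a non-zero gateway.
        if destination == "00000000" and gateway != "00000000":
            # Little-endian hex, e.g. 010011AC decodes to 172.17.0.1
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def read_default_gateway(path: str = "/proc/net/route") -> Optional[str]:
    """Read the route table; return None where /proc is unavailable."""
    try:
        with open(path) as handle:
            return parse_default_gateway(handle.read())
    except OSError:
        return None
```

Keeping the parser separate from the file read makes it testable with a sample route table on any platform.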
|
||||
|
||||
### Configuration
|
||||
|
||||
The detection is automatic and cached for 5 minutes. Override with:
|
||||
|
||||
```bash
|
||||
# Force specific host IP
|
||||
export HOST_IP=192.168.1.100
|
||||
|
||||
# Or in docker-compose.yml
|
||||
environment:
|
||||
- HOST_IP=your-host-ip
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Test host IP detection
|
||||
./docker/scripts/test-host-ip-detection.py
|
||||
|
||||
# Run integration test
|
||||
./docker/scripts/test-integration.sh
|
||||
```
|
||||
|
||||
### Troubleshooting
|
||||
|
||||
**"Could not detect Docker host IP"**
|
||||
- Check network configuration: `docker network inspect bridge`
|
||||
- Verify environment variables
|
||||
- Test connectivity: `ping host.docker.internal`
|
||||
- Set explicit `HOST_IP` if needed
|
||||
|
||||
**Proxy routing fails**
|
||||
- Verify detected IP is accessible from containers
|
||||
- Check firewall rules blocking container-to-host traffic
|
||||
- Ensure Docker network allows communication
|
||||
|
||||
## Structured Logging
|
||||
|
||||
The service provides structured JSON logging with request tracking and log rotation to support debugging and monitoring in production.
|
||||
|
||||
### Log Features
|
||||
|
||||
- **Structured JSON Logs**: Machine-readable logs for production analysis
|
||||
- **Request ID Tracking**: Trace requests across distributed operations
|
||||
- **Human-Readable Development**: Clear logs for local development
|
||||
- **Performance Metrics**: Built-in request timing and performance tracking
|
||||
- **Security Event Logging**: Audit trail for security-related events
|
||||
- **Log Rotation**: Automatic log rotation with size limits
|
||||
|
||||
### Configuration
|
||||
|
||||
```bash
|
||||
# Log level and format
|
||||
export LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL
|
||||
export LOG_FORMAT=auto # json, human, auto (detects environment)
|
||||
|
||||
# File logging
|
||||
export LOG_FILE=/var/log/lovdata-chat.log
|
||||
export LOG_MAX_SIZE_MB=10 # Max log file size
|
||||
export LOG_BACKUP_COUNT=5 # Number of backup files
|
||||
|
||||
# Output control
|
||||
export LOG_CONSOLE=true # Enable console logging
|
||||
export LOG_FILE_ENABLED=true # Enable file logging
|
||||
```
|
||||
|
||||
### Testing Structured Logging
|
||||
|
||||
```bash
|
||||
# Test logging functionality and formatters
|
||||
./docker/scripts/test-structured-logging.py
|
||||
```
|
||||
|
||||
### Log Analysis
|
||||
|
||||
**JSON Format (Production):**
|
||||
```json
|
||||
{
|
||||
"timestamp": "2024-01-15T10:30:45.123Z",
|
||||
"level": "INFO",
|
||||
"logger": "session_manager.main",
|
||||
"message": "Session created successfully",
|
||||
"request_id": "req-abc123",
|
||||
"session_id": "ses-xyz789",
|
||||
"operation": "create_session",
|
||||
"duration_ms": 245.67
|
||||
}
|
||||
```
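
A formatter producing output of this shape can be built on the standard `logging` module. The sketch below is an assumption about how such a formatter might look, not the project's implementation; `JsonFormatter` and its `EXTRA_FIELDS` tuple are hypothetical names.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    # Optional fields copied from the record when present (passed via `extra=`).
    EXTRA_FIELDS = ("request_id", "session_id", "operation", "duration_ms")

    def format(self, record: logging.LogRecord) -> str:
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "timestamp": timestamp.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)
```

A caller would attach it with `handler.setFormatter(JsonFormatter())` and pass context through `logger.info("...", extra={"request_id": "req-abc123"})`.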
|
||||
|
||||
**Human-Readable Format (Development):**
|
||||
```
|
||||
2024-01-15 10:30:45 [INFO ] session_manager.main:create_session:145 [req-abc123] - Session created successfully
|
||||
```
|
||||
|
||||
### Request Tracing
|
||||
|
||||
All logs include request IDs for tracing operations across the system:
|
||||
|
||||
```python
with RequestContext():
    log_session_operation(session_id, "created")
    # All subsequent logs in this context include request_id
```
|
||||
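
One way to implement a context manager like this is with `contextvars`, which keeps the ID isolated per asyncio task. The following is a sketch only; the real `RequestContext` may differ, and `current_request_id` and the `req-` ID format are assumptions.

```python
import contextvars
import uuid
from typing import Optional

# Holds the request ID for the current task or thread; None outside a request.
_request_id = contextvars.ContextVar("request_id", default=None)


class RequestContext:
    """Bind a request ID for the duration of a `with` block."""

    def __init__(self, request_id: Optional[str] = None) -> None:
        self.request_id = request_id or f"req-{uuid.uuid4().hex[:8]}"
        self._token: Optional[contextvars.Token] = None

    def __enter__(self) -> "RequestContext":
        self._token = _request_id.set(self.request_id)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Restore whatever value was active before the block.
        _request_id.reset(self._token)


def current_request_id() -> Optional[str]:
    """Return the request ID bound in the current context, if any."""
    return _request_id.get()
```

A log formatter or filter can call `current_request_id()` to stamp every record without threading the ID through function arguments.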
|
||||
## Database Persistence
|
||||
|
||||
Session data is now stored in PostgreSQL for reliability, multi-instance deployment support, and elimination of JSON file corruption vulnerabilities.
|
||||
|
||||
### Database Configuration
|
||||
|
||||
```bash
|
||||
# PostgreSQL connection settings
|
||||
export DB_HOST=localhost # Database host
|
||||
export DB_PORT=5432 # Database port
|
||||
export DB_USER=lovdata # Database user
|
||||
export DB_PASSWORD=password # Database password
|
||||
export DB_NAME=lovdata_chat # Database name
|
||||
|
||||
# Connection pool settings
|
||||
export DB_MIN_CONNECTIONS=5 # Minimum pool connections
|
||||
export DB_MAX_CONNECTIONS=20 # Maximum pool connections
|
||||
export DB_MAX_QUERIES=50000 # Max queries per connection
|
||||
export DB_MAX_INACTIVE_LIFETIME=300.0   # Seconds before an idle connection is closed
|
||||
```
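
These variables map naturally onto the keyword arguments of a pool factory. The sketch below assumes the keyword names of `asyncpg.create_pool` (`min_size`, `max_size`, `max_queries`, `max_inactive_connection_lifetime`); this README does not state which driver the service uses, and the fallback values simply mirror the example settings above.

```python
import os
from typing import Any, Dict, Mapping


def pool_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Translate the environment variables above into pool keyword arguments."""
    return {
        "host": env.get("DB_HOST", "localhost"),
        "port": int(env.get("DB_PORT", "5432")),
        "user": env.get("DB_USER", "lovdata"),
        "password": env.get("DB_PASSWORD", ""),
        "database": env.get("DB_NAME", "lovdata_chat"),
        "min_size": int(env.get("DB_MIN_CONNECTIONS", "5")),
        "max_size": int(env.get("DB_MAX_CONNECTIONS", "20")),
        "max_queries": int(env.get("DB_MAX_QUERIES", "50000")),
        "max_inactive_connection_lifetime": float(env.get("DB_MAX_INACTIVE_LIFETIME", "300.0")),
    }
```

With asyncpg installed, the result could be passed as `await asyncpg.create_pool(**pool_settings())`.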
|
||||
|
||||
### Storage Backend Selection
|
||||
|
||||
```bash
|
||||
# Enable database storage (recommended for production)
|
||||
export USE_DATABASE_STORAGE=true
|
||||
|
||||
# Or use JSON file storage (legacy/development)
|
||||
export USE_DATABASE_STORAGE=false
|
||||
```
|
||||
|
||||
### Database Schema
|
||||
|
||||
**Sessions Table:**
|
||||
- `session_id` (VARCHAR, Primary Key): Unique session identifier
|
||||
- `container_name` (VARCHAR): Docker container name
|
||||
- `container_id` (VARCHAR): Docker container ID
|
||||
- `host_dir` (VARCHAR): Host directory path
|
||||
- `port` (INTEGER): Container port
|
||||
- `auth_token` (VARCHAR): Authentication token
|
||||
- `created_at` (TIMESTAMP): Creation timestamp
|
||||
- `last_accessed` (TIMESTAMP): Last access timestamp
|
||||
- `status` (VARCHAR): Session status (creating, running, stopped, error)
|
||||
- `metadata` (JSONB): Additional session metadata
|
||||
|
||||
**Indexes:**
|
||||
- Primary key on `session_id`
|
||||
- Status index for filtering active sessions
|
||||
- Last accessed index for cleanup operations
|
||||
- Created at index for session listing
|
||||
|
||||
### Testing Database Persistence
|
||||
|
||||
```bash
|
||||
# Test database connection and operations
|
||||
./docker/scripts/test-database-persistence.py
|
||||
```
|
||||
|
||||
### Health Monitoring
|
||||
|
||||
The `/health` endpoint now includes database status:
|
||||
|
||||
```json
|
||||
{
|
||||
"storage_backend": "database",
|
||||
"database": {
|
||||
"status": "healthy",
|
||||
"total_sessions": 15,
|
||||
"active_sessions": 8,
|
||||
"database_size": "25 MB"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Migration Strategy
|
||||
|
||||
**From JSON File to Database:**
|
||||
1. **Backup existing sessions** (if any)
|
||||
2. **Set environment variables** for database connection
|
||||
3. **Enable database storage**: `USE_DATABASE_STORAGE=true`
|
||||
4. **Restart service** - automatic schema creation and migration
|
||||
5. **Verify data migration** in health endpoint
|
||||
6. **Monitor performance** and adjust connection pool settings
|
||||
|
||||
**Backward Compatibility:**
|
||||
- JSON file storage remains available for development
|
||||
- Automatic fallback if database is unavailable
|
||||
- Zero-downtime migration possible
|
||||
|
||||
## Container Health Monitoring
|
||||
|
||||
The session-manager actively monitors Docker containers and automatically detects and recovers from failures, which prevents stuck sessions and improves system reliability.
|
||||
|
||||
### Health Monitoring Features
|
||||
|
||||
- **Periodic Health Checks**: Continuous monitoring of running containers every 30 seconds
|
||||
- **Automatic Failure Detection**: Identifies unhealthy or failed containers
|
||||
- **Smart Restart Logic**: Automatic container restart with configurable limits
|
||||
- **Health History Tracking**: Maintains health check history for analysis
|
||||
- **Status Integration**: Updates session status based on container health
|
||||
|
||||
### Configuration
|
||||
|
||||
```bash
|
||||
# Health check intervals and timeouts
|
||||
CONTAINER_HEALTH_CHECK_INTERVAL=30 # Check every 30 seconds
|
||||
CONTAINER_HEALTH_TIMEOUT=10.0 # Health check timeout
|
||||
CONTAINER_MAX_RESTART_ATTEMPTS=3 # Max restart attempts
|
||||
CONTAINER_RESTART_DELAY=5 # Delay between restarts
|
||||
CONTAINER_FAILURE_THRESHOLD=3 # Failures before restart
|
||||
```
|
||||
|
||||
### Health Status Types
|
||||
|
||||
- **HEALTHY**: Container running normally with optional health checks passing
|
||||
- **UNHEALTHY**: Container running but health checks failing
|
||||
- **RESTARTING**: Container being restarted due to failures
|
||||
- **FAILED**: Container stopped or permanently failed
|
||||
- **UNKNOWN**: Unable to determine container status
|
||||
|
||||
### Testing Health Monitoring
|
||||
|
||||
```bash
|
||||
# Test health monitoring functionality
|
||||
./docker/scripts/test-container-health.py
|
||||
```
|
||||
|
||||
### Health Endpoints
|
||||
|
||||
**System Health:**
|
||||
```bash
|
||||
GET /health # Includes container health statistics
|
||||
```
|
||||
|
||||
**Detailed Container Health:**
|
||||
```bash
|
||||
GET /health/container # Overall health stats
|
||||
GET /health/container/{session_id} # Specific session health
|
||||
```
|
||||
|
||||
**Health Response:**
|
||||
```json
|
||||
{
|
||||
"container_health": {
|
||||
"monitoring_active": true,
|
||||
"check_interval": 30,
|
||||
"total_sessions_monitored": 5,
|
||||
"sessions_with_failures": 1,
|
||||
"session_ses123": {
|
||||
"total_checks": 10,
|
||||
"healthy_checks": 8,
|
||||
"failed_checks": 2,
|
||||
"average_response_time": 45.2
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Recovery Mechanisms
|
||||
|
||||
1. **Health Check Failure**: Container marked as unhealthy
|
||||
2. **Consecutive Failures**: After threshold, automatic restart initiated
|
||||
3. **Restart Attempts**: Limited to prevent infinite restart loops
|
||||
4. **Session Status Update**: Session status reflects container health
|
||||
5. **Logging & Alerts**: Comprehensive logging of health events
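
The interplay between the failure threshold and the restart limit in steps 1 to 3 can be sketched as a small state tracker. This illustrates the described behaviour under assumed semantics (the failure counter resets after a restart is triggered); `HealthTracker` is a hypothetical class, not the service's monitor.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    failure_threshold: int = 3      # CONTAINER_FAILURE_THRESHOLD
    max_restart_attempts: int = 3   # CONTAINER_MAX_RESTART_ATTEMPTS
    consecutive_failures: int = 0
    restart_attempts: int = 0

    def record(self, check_passed: bool) -> str:
        """Update counters for one health check and return the resulting status."""
        if check_passed:
            self.consecutive_failures = 0
            return "HEALTHY"
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failure_threshold:
            return "UNHEALTHY"
        if self.restart_attempts >= self.max_restart_attempts:
            return "FAILED"  # give up rather than restart forever
        self.restart_attempts += 1
        self.consecutive_failures = 0
        return "RESTARTING"
```

With the default settings, a container is restarted on its third consecutive failed check and marked `FAILED` once three restarts have not helped.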
|
||||
|
||||
### Integration Benefits
|
||||
|
||||
- **Proactive Monitoring**: Detects issues before users are affected
|
||||
- **Automatic Recovery**: Reduces manual intervention requirements
|
||||
- **Improved Reliability**: Prevents stuck sessions and system instability
|
||||
- **Operational Visibility**: Detailed health metrics and history
|
||||
- **Scalable Architecture**: Works with multiple concurrent sessions
|
||||
|
||||
## Session Authentication
|
||||
|
||||
OpenCode servers now require token-based authentication for secure individual user sessions, preventing unauthorized access and ensuring session isolation.
|
||||
|
||||
### Authentication Features
|
||||
|
||||
- **Token Generation**: Unique cryptographically secure tokens per session
|
||||
- **Automatic Expiry**: Configurable token lifetime (default 24 hours)
|
||||
- **Token Rotation**: Ability to rotate tokens for enhanced security
|
||||
- **Session Isolation**: Each user session has its own authentication credentials
|
||||
- **Proxy Integration**: Authentication headers automatically included in proxy requests
|
||||
|
||||
### Configuration
|
||||
|
||||
```bash
|
||||
# Token configuration
|
||||
export SESSION_TOKEN_LENGTH=32 # Token length in characters
|
||||
export SESSION_TOKEN_EXPIRY_HOURS=24 # Token validity period
|
||||
export SESSION_TOKEN_SECRET=auto # Token signing secret (auto-generated)
|
||||
export TOKEN_CLEANUP_INTERVAL_MINUTES=60 # Expired token cleanup interval
|
||||
```
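
Token generation and expiry checking along these lines can be done with the standard library. This is a minimal sketch: the function names are hypothetical, hex encoding is an assumption (the README does not specify the token alphabet), and signing with `SESSION_TOKEN_SECRET` is not shown.

```python
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def generate_token(length: int = 32) -> str:
    """Return a random hex token of `length` characters (length must be even)."""
    return secrets.token_hex(length // 2)


def issue_token(expiry_hours: int = 24, now: Optional[datetime] = None) -> Tuple[str, datetime]:
    """Return a new token together with its expiry time."""
    now = now or datetime.now(timezone.utc)
    return generate_token(), now + timedelta(hours=expiry_hours)


def is_valid(presented: str, stored: str, expires_at: datetime,
             now: Optional[datetime] = None) -> bool:
    """Check expiry first, then compare tokens in constant time."""
    now = now or datetime.now(timezone.utc)
    # compare_digest avoids leaking how many leading characters matched.
    return now < expires_at and hmac.compare_digest(presented.encode(), stored.encode())
```

Using `secrets` rather than `random` matters here, because `random` is not designed for security-sensitive values.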
|
||||
|
||||
### Testing Authentication
|
||||
|
||||
```bash
|
||||
# Test authentication functionality
|
||||
./docker/scripts/test-session-auth.py
|
||||
|
||||
# End-to-end authentication testing
|
||||
./docker/scripts/test-auth-end-to-end.sh
|
||||
```
|
||||
|
||||
### API Endpoints
|
||||
|
||||
**Authentication Management:**
|
||||
- `GET /sessions/{id}/auth` - Get session authentication info
|
||||
- `POST /sessions/{id}/auth/rotate` - Rotate session token
|
||||
- `GET /auth/sessions` - List authenticated sessions
|
||||
|
||||
**Health Monitoring:**
|
||||
```json
|
||||
{
|
||||
"authenticated_sessions": 3,
|
||||
"status": "healthy"
|
||||
}
|
||||
```
|
||||
|
||||
### Security Benefits
|
||||
|
||||
- **Session Isolation**: Users cannot access each other's OpenCode servers
|
||||
- **Token Expiry**: Automatic cleanup prevents token accumulation
|
||||
- **Secure Generation**: Cryptographically secure random tokens
|
||||
- **Proxy Security**: Authentication headers prevent unauthorized proxy access
|
||||
|
||||
## HTTP Connection Pooling
|
||||
|
||||
Proxy requests now share a global HTTP connection pool instead of creating a new httpx client for each request. Reusing connections avoids repeated connection setup and reduces per-request latency.
|
||||
|
||||
### Connection Pool Benefits
|
||||
|
||||
- **Eliminated Connection Overhead**: No more client creation/teardown per request
|
||||
- **Connection Reuse**: Persistent keep-alive connections reduce latency
|
||||
- **Improved Throughput**: Handle significantly more concurrent proxy requests
|
||||
- **Reduced Resource Usage**: Lower memory and CPU overhead for HTTP operations
|
||||
- **Better Scalability**: Support higher request rates with the same system resources
|
||||
|
||||
### Pool Configuration
|
||||
|
||||
The connection pool is automatically configured with optimized settings:
|
||||
|
||||
```python
# Connection pool settings
max_keepalive_connections=20  # Max idle connections kept open for reuse
max_connections=100           # Max total connections
keepalive_expiry=300.0        # Close idle keep-alive connections after 5 minutes
connect_timeout=10.0          # Connection establishment timeout (seconds)
read_timeout=30.0             # Read operation timeout (seconds)
```
|
||||
|
||||
### Performance Testing
|
||||
|
||||
```bash
|
||||
# Test HTTP connection pool functionality
|
||||
./docker/scripts/test-http-connection-pool.py
|
||||
|
||||
# Load test proxy performance improvements
|
||||
./docker/scripts/test-http-pool-load.sh
|
||||
```
|
||||
|
||||
### Health Monitoring
|
||||
|
||||
The `/health` endpoint now includes HTTP connection pool status:
|
||||
|
||||
```json
|
||||
{
|
||||
"http_connection_pool": {
|
||||
"status": "healthy",
|
||||
"config": {
|
||||
"max_keepalive_connections": 20,
|
||||
"max_connections": 100,
|
||||
"keepalive_expiry": 300.0
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Async Docker Operations
|
||||
|
||||
Docker operations now run asynchronously using aiodocker, so they no longer block FastAPI's async event loop. This significantly improves concurrency and prevents thread pool exhaustion.
|
||||
|
||||
### Async Benefits
|
||||
|
||||
- **Non-Blocking Operations**: Container creation, management, and cleanup no longer block the event loop
|
||||
- **Improved Concurrency**: Handle multiple concurrent user sessions without performance degradation
|
||||
- **Better Scalability**: Support higher throughput with the same system resources
|
||||
- **Thread Pool Preservation**: Prevent exhaustion of async thread pools
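
The effect of blocking calls on the event loop can be demonstrated without Docker. In this sketch, `time.sleep` stands in for a synchronous Docker SDK call and `asyncio.sleep` for an awaited async call; both are placeholders, not real Docker operations.

```python
import asyncio
import time


async def blocking_op() -> None:
    time.sleep(0.1)            # stand-in for a synchronous Docker call


async def non_blocking_op() -> None:
    await asyncio.sleep(0.1)   # stand-in for an awaited async Docker call


async def timed(op, count: int = 5) -> float:
    """Run `count` operations concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(op() for _ in range(count)))
    return time.perf_counter() - start


def compare() -> tuple:
    """Return (blocking_seconds, non_blocking_seconds) for five operations."""
    return asyncio.run(timed(blocking_op)), asyncio.run(timed(non_blocking_op))
```

The blocking variant runs the five operations one after another because each one holds the event loop, while the awaited variant overlaps them.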
|
||||
|
||||
### Configuration
|
||||
|
||||
```bash
|
||||
# Enable async Docker operations (recommended)
|
||||
export USE_ASYNC_DOCKER=true
|
||||
|
||||
# Or disable for sync mode (legacy)
|
||||
export USE_ASYNC_DOCKER=false
|
||||
```
|
||||
|
||||
### Testing Async Operations
|
||||
|
||||
```bash
|
||||
# Test async Docker functionality
|
||||
./docker/scripts/test-async-docker.py
|
||||
|
||||
# Load test concurrent operations
|
||||
./docker/scripts/test-async-docker-load.sh
|
||||
```
|
||||
|
||||
### Performance Impact
|
||||
|
||||
Async operations provide significant performance improvements:
|
||||
|
||||
- **Concurrent Sessions**: Handle 10+ concurrent container operations without blocking
|
||||
- **Response Times**: Faster session creation under load
|
||||
- **Resource Efficiency**: Better CPU utilization with non-blocking I/O
|
||||
- **Scalability**: Support more users per server instance
|
||||
|
||||
## Resource Limits Enforcement
|
||||
|
||||
Container resource limits are now actively enforced to prevent resource exhaustion attacks and ensure fair resource allocation across user sessions.
|
||||
|
||||
### Configurable Limits
|
||||
|
||||
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `CONTAINER_MEMORY_LIMIT` | `4g` | Memory limit per container |
| `CONTAINER_CPU_QUOTA` | `100000` | CPU quota (microseconds per period) |
| `CONTAINER_CPU_PERIOD` | `100000` | CPU period (microseconds) |
| `MAX_CONCURRENT_SESSIONS` | `3` | Maximum concurrent user sessions |
| `MEMORY_WARNING_THRESHOLD` | `0.8` | Memory usage warning threshold (80%) |
| `CPU_WARNING_THRESHOLD` | `0.9` | CPU usage warning threshold (90%) |
|
||||
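
As a worked example of these defaults: a quota of 100000 µs per 100000 µs period allows each container one full CPU, and `4g` is 4 × 1024³ bytes, so three concurrent sessions can use at most 3 CPUs and 12 GiB of memory. The helpers below are hypothetical and only illustrate the arithmetic.

```python
# Docker-style size suffixes are binary multiples.
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory_limit(value: str) -> int:
    """Convert a Docker-style size such as '4g' or '512m' to bytes."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)  # a bare number is already in bytes


def cpu_share(quota_us: int, period_us: int) -> float:
    """Number of CPUs a container may use: quota divided by period."""
    return quota_us / period_us
```

Halving `CONTAINER_CPU_QUOTA` to `50000` while keeping the period at `100000` would cap each container at half a CPU.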
|
||||
### Resource Protection Features
|
||||
|
||||
- **Memory Limits**: Prevents containers from consuming unlimited RAM
|
||||
- **CPU Quotas**: Ensures fair CPU allocation across sessions
|
||||
- **Session Throttling**: Blocks new sessions when resources are constrained
|
||||
- **System Monitoring**: Continuous resource usage tracking
|
||||
- **Graceful Degradation**: Alerts and throttling before system failure
|
||||
|
||||
### Testing Resource Limits
|
||||
|
||||
```bash
|
||||
# Test resource limit configuration and validation
|
||||
./docker/scripts/test-resource-limits.py
|
||||
|
||||
# Load testing with enforcement verification
|
||||
./docker/scripts/test-resource-limits-load.sh
|
||||
```
|
||||
|
||||
### Health Monitoring
|
||||
|
||||
The `/health` endpoint now includes comprehensive resource information:
|
||||
|
||||
```json
|
||||
{
|
||||
"resource_limits": {
|
||||
"memory_limit": "4g",
|
||||
"cpu_quota": 100000,
|
||||
"max_concurrent_sessions": 3
|
||||
},
|
||||
"system_resources": {
|
||||
"memory_percent": 0.65,
|
||||
"cpu_percent": 0.45
|
||||
},
|
||||
"resource_alerts": []
|
||||
}
|
||||
```
|
||||
|
||||
### Resource Alert Levels
|
||||
|
||||
- **Warning**: System resources approaching limits (80% memory, 90% CPU)
|
||||
- **Critical**: System resources at dangerous levels (95%+ usage)
|
||||
- **Throttling**: New sessions blocked when critical alerts active
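
The alert levels above can be expressed as a small classification function. This is a sketch of the described thresholds, with usage given as fractions between 0.0 and 1.0; `resource_alerts` and `should_throttle` are hypothetical names, and the alert dictionary shape is an assumption.

```python
from typing import Dict, List


def resource_alerts(memory: float, cpu: float,
                    memory_warning: float = 0.8, cpu_warning: float = 0.9,
                    critical: float = 0.95) -> List[Dict[str, str]]:
    """Classify usage fractions into warning or critical alerts."""
    alerts = []
    for name, usage, warning in (("memory", memory, memory_warning),
                                 ("cpu", cpu, cpu_warning)):
        if usage >= critical:
            alerts.append({"resource": name, "level": "critical"})
        elif usage >= warning:
            alerts.append({"resource": name, "level": "warning"})
    return alerts


def should_throttle(alerts: List[Dict[str, str]]) -> bool:
    """New sessions are blocked while any critical alert is active."""
    return any(alert["level"] == "critical" for alert in alerts)
```

For the sample health response shown earlier (65% memory, 45% CPU), this returns an empty list, which matches the empty `resource_alerts` array.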
|
||||
|
||||
## Security Audit Checklist
|
||||
|
||||
- [ ] TLS certificates generated with strong encryption
|
||||
- [ ] Certificate permissions set correctly (400/444)
|
||||
- [ ] No socket mounting in docker-compose.yml
|
||||
- [ ] Environment variables properly configured
|
||||
- [ ] TLS connection tested successfully
|
||||
- [ ] Host IP detection working correctly
|
||||
- [ ] Proxy routing functional across environments
|
||||
- [ ] Resource limits properly configured and enforced
|
||||
- [ ] Session throttling prevents resource exhaustion
|
||||
- [ ] System resource monitoring active
|
||||
- [ ] Certificate rotation process documented
|
||||
- [ ] Firewall rules restrict Docker API access
|
||||
- [ ] Docker daemon configured with security options
|
||||
- [ ] Monitoring and logging enabled for API access
|
||||