consolidated readme
4
.gitignore
vendored
@@ -1,2 +1,6 @@
 __pycache__
 .env
+node_modules
+cypress/screenshots
+cypress/videos
+cypress/downloads
@@ -1,66 +0,0 @@
-# Lovdata Chat Development Environment
-
-This setup creates a container-per-visitor architecture for the Norwegian legal research chat interface with socket-based Docker communication.
-
-## Quick Start
-
-1. **Set up environment variables:**
-```bash
-cp .env.example .env
-# Edit .env with your API keys and MCP server URL
-```
-
-3. **Start the services:**
-```bash
-docker-compose up --build
-```
-
-4. **Create a session:**
-```bash
-curl http://localhost/api/sessions -X POST
-```
-
-5. **Access the chat interface:**
-Open the returned URL in your browser
-
-## Architecture
-
-- **session-manager**: FastAPI service managing container lifecycles with socket-based Docker communication
-- **lovdata-mcp**: External Norwegian legal research MCP server (configured via MCP_SERVER env var)
-- **caddy**: Reverse proxy with dynamic session-based routing
-
-## Security Features
-
-- **Socket-based Docker communication**: Direct Unix socket access for container management
-- **Container isolation**: Each visitor gets dedicated container with resource limits
-- **Automatic cleanup**: Sessions expire after 60 minutes of inactivity
-- **Resource quotas**: 4GB RAM, 1 CPU core per container, max 3 concurrent sessions
-
-## Development Notes
-
-- Session data persists in ./sessions/ directory
-- Docker socket mounted from host for development
-- External MCP server configured via environment variables
-- Health checks ensure service reliability
-
-## API Endpoints
-
-- `POST /api/sessions` - Create new session
-- `GET /api/sessions` - List all sessions
-- `GET /api/sessions/{id}` - Get session info
-- `DELETE /api/sessions/{id}` - Delete session
-- `POST /api/cleanup` - Manual cleanup
-- `GET /api/health` - Health check
-- `/{path}` - Dynamic proxy routing (with X-Session-ID header)
-
-## Environment Variables
-
-```bash
-# Required
-MCP_SERVER=http://your-lovdata-mcp-server:8001
-
-# Optional LLM API keys
-OPENAI_API_KEY=your_key
-ANTHROPIC_API_KEY=your_key
-GOOGLE_API_KEY=your_key
-```
341
README.md
@@ -1,239 +1,162 @@
|
|||||||
# Lovdata Chat Interface
|
# Lovdata Chat Interface
|
||||||
|
|
||||||
A web-based chat interface that allows users to interact with Large Language Models (LLMs) equipped with Norwegian legal research tools from the Lovdata MCP server.
|
A container-per-session architecture for Norwegian legal research. Each user session gets an isolated [OpenCode](https://opencode.ai/) container connected to the external [Lovdata MCP server](https://modelcontextprotocol.io/), which provides 15+ tools for searching Norwegian laws, provisions, and cross-references.
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
This project creates a chat interface where users can:
|
|
||||||
- Choose from multiple LLM providers (OpenAI, Anthropic, Google Gemini)
|
|
||||||
- Have conversations enhanced with Norwegian legal document search capabilities
|
|
||||||
- Access laws, regulations, and legal provisions through AI-powered semantic search
|
|
||||||
- Receive properly cited legal information with cross-references
|
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
### Backend (FastAPI)
|
```
|
||||||
- **LLM Provider Layer**: Abstract interface supporting multiple LLM providers with tool calling
|
Users → Caddy (reverse proxy) → Session Manager (FastAPI)
|
||||||
- **MCP Integration**: Client connection to lovdata-ai MCP server
|
↓
|
||||||
- **Skill System**: Norwegian legal research guidance and best practices
|
Docker-in-Docker daemon
|
||||||
- **Chat Management**: Conversation history, streaming responses, session management
|
↓ ↓ ↓
|
||||||
|
[OC 1] [OC 2] [OC 3] ← OpenCode containers
|
||||||
|
↓ ↓ ↓
|
||||||
|
Lovdata MCP Server (external)
|
||||||
|
LLM APIs (OpenAI/Anthropic/Google)
|
||||||
|
```
|
||||||
|
|
||||||
### Frontend (Next.js)
|
| Component | Purpose |
|
||||||
- **Chat Interface**: Real-time messaging with streaming responses
|
|-----------|---------|
|
||||||
- **Model Selector**: Dropdown to choose LLM provider and model
|
| **Session Manager** | FastAPI service managing OpenCode container lifecycles |
|
||||||
- **Tool Visualization**: Display when legal tools are being used
|
| **OpenCode Containers** | Isolated chat environments with MCP integration |
|
||||||
- **Citation Rendering**: Properly formatted legal references and cross-references
|
| **Lovdata MCP Server** | External Norwegian legal research (laws, provisions, cross-references) |
|
||||||
|
| **Caddy** | Reverse proxy with dynamic session-based routing |
|
||||||
|
| **PostgreSQL** | Session persistence across restarts |
|
||||||
|
| **Docker-in-Docker** | TLS-secured Docker daemon for container management |
|
||||||
|
|
||||||
### External Dependencies
|
### Session Manager Components
|
||||||
- **Lovdata MCP Server**: Provides 15+ tools for Norwegian legal research
|
|
||||||
- **PostgreSQL Database**: Vector embeddings for semantic search
|
|
||||||
- **LLM APIs**: OpenAI, Anthropic, Google Gemini (with API keys)
|
|
||||||
|
|
||||||
## Supported LLM Providers
|
```
|
||||||
|
main.py → FastAPI endpoints, session lifecycle orchestration
|
||||||
|
docker_service.py → Docker abstraction layer (testable, mockable)
|
||||||
|
async_docker_client.py → Async Docker operations
|
||||||
|
database.py → PostgreSQL session persistence with asyncpg
|
||||||
|
session_auth.py → Token-based session authentication
|
||||||
|
container_health.py → Health monitoring and auto-recovery
|
||||||
|
resource_manager.py → CPU/memory limits, throttling
|
||||||
|
http_pool.py → Connection pooling for container HTTP requests
|
||||||
|
host_ip_detector.py → Docker host IP detection
|
||||||
|
logging_config.py → Structured JSON logging with context
|
||||||
|
```
|
||||||
|
|
||||||
| Provider | Models | Tool Support | Notes |
|
## Quick Start
|
||||||
|----------|--------|--------------|-------|
|
|
||||||
| OpenAI | GPT-4, GPT-4o | ✅ Native | Requires API key |
|
|
||||||
| Anthropic | Claude-3.5-Sonnet | ✅ Native | Requires API key |
|
|
||||||
| Google | Gemini-1.5-Pro | ✅ Function calling | Requires API key |
|
|
||||||
| Local | Ollama models | ⚠️ Limited | Self-hosted option |
|
|
||||||
|
|
||||||
## MCP Tools Available
|
1. **Set up environment variables:**
|
||||||
|
|
||||||
The interface integrates all tools from the lovdata-ai MCP server:
|
|
||||||
|
|
||||||
### Law Document Tools
|
|
||||||
- `get_law`: Retrieve specific laws by ID or title
|
|
||||||
- `list_laws`: Browse laws with filtering and pagination
|
|
||||||
- `get_law_content`: Get HTML content of laws
|
|
||||||
- `get_law_text`: Get plain text content
|
|
||||||
|
|
||||||
### Search Tools
|
|
||||||
- `search_laws_fulltext`: Full-text search in laws
|
|
||||||
- `search_laws_semantic`: Semantic search using vector embeddings
|
|
||||||
- `search_provisions_fulltext`: Full-text search in provisions
|
|
||||||
- `search_provisions_semantic`: Semantic search in provisions
|
|
||||||
|
|
||||||
### Provision Tools
|
|
||||||
- `get_provision`: Get individual legal provisions
|
|
||||||
- `list_provisions`: List all provisions in a law
|
|
||||||
- `get_provisions_batch`: Bulk retrieval for RAG applications
|
|
||||||
|
|
||||||
### Reference Tools
|
|
||||||
- `get_cross_references`: Find references from/to provisions
|
|
||||||
- `resolve_reference`: Parse legal reference strings (e.g., "lov/2014-06-20-42/§8")
|
|
||||||
|
|
||||||
## Skills Integration
|
|
||||||
|
|
||||||
The system loads Norwegian legal research skills that ensure:
|
|
||||||
- Proper citation standards (Lovdata URL formatting)
|
|
||||||
- Appropriate legal terminology usage
|
|
||||||
- Clear distinction between information and legal advice
|
|
||||||
- Systematic amendment tracking
|
|
||||||
- Cross-reference analysis
|
|
||||||
|
|
||||||
## Implementation Plan
|
|
||||||
|
|
||||||
### Phase 1: Core Infrastructure
|
|
||||||
1. **Project Structure Setup**
|
|
||||||
- Create backend (FastAPI) and frontend (Next.js) directories
|
|
||||||
- Set up Python virtual environment and Node.js dependencies
|
|
||||||
- Configure development tooling (linting, testing, formatting)
|
|
||||||
|
|
||||||
2. **LLM Provider Abstraction**
|
|
||||||
- Create abstract base class for LLM providers
|
|
||||||
- Implement OpenAI, Anthropic, and Google Gemini clients
|
|
||||||
- Add tool calling support and response streaming
|
|
||||||
- Implement provider switching logic
|
|
||||||
|
|
||||||
3. **MCP Server Integration**
|
|
||||||
- Build MCP client to connect to lovdata-ai server
|
|
||||||
- Create tool registry and execution pipeline
|
|
||||||
- Add error handling and retry logic
|
|
||||||
- Implement tool result formatting for LLM consumption
|
|
||||||
|
|
||||||
### Phase 2: Chat Functionality
|
|
||||||
4. **Backend API Development**
|
|
||||||
- Create chat session management endpoints
|
|
||||||
- Implement conversation history storage
|
|
||||||
- Add streaming response support
|
|
||||||
- Build health check and monitoring endpoints
|
|
||||||
|
|
||||||
5. **Skill System Implementation**
|
|
||||||
- Create skill loading and parsing system
|
|
||||||
- Implement skill application to LLM prompts
|
|
||||||
- Add skill validation and error handling
|
|
||||||
- Create skill management API endpoints
|
|
||||||
|
|
||||||
### Phase 3: Frontend Development
|
|
||||||
6. **Chat Interface**
|
|
||||||
- Build responsive chat UI with message history
|
|
||||||
- Implement real-time message streaming
|
|
||||||
- Add message formatting for legal citations
|
|
||||||
- Create conversation management (new chat, clear history)
|
|
||||||
|
|
||||||
7. **Model Selection UI**
|
|
||||||
- Create LLM provider and model selector
|
|
||||||
- Add API key management (secure storage)
|
|
||||||
- Implement model switching during conversations
|
|
||||||
- Add model capability indicators
|
|
||||||
|
|
||||||
8. **Tool Usage Visualization**
|
|
||||||
- Display when MCP tools are being used
|
|
||||||
- Show tool execution results in chat
|
|
||||||
- Add legal citation formatting
|
|
||||||
- Create expandable tool result views
|
|
||||||
|
|
||||||
### Phase 4: Deployment & Production
|
|
||||||
9. **Containerization**
|
|
||||||
- Create Dockerfiles for backend and frontend
|
|
||||||
- Set up Docker Compose for development
|
|
||||||
- Configure production Docker Compose
|
|
||||||
- Add environment variable management
|
|
||||||
|
|
||||||
10. **Deployment Configuration**
|
|
||||||
- Set up CI/CD pipeline (GitHub Actions)
|
|
||||||
- Configure cloud deployment (Railway/Render)
|
|
||||||
- Add reverse proxy configuration
|
|
||||||
- Implement SSL certificate management
|
|
||||||
|
|
||||||
11. **Monitoring & Error Handling**
|
|
||||||
- Add comprehensive logging
|
|
||||||
- Implement error tracking and reporting
|
|
||||||
- Create health check endpoints
|
|
||||||
- Add rate limiting and abuse protection
|
|
||||||
|
|
||||||
12. **Documentation**
|
|
||||||
- Create setup and deployment guides
|
|
||||||
- Document API endpoints
|
|
||||||
- Add user documentation
|
|
||||||
- Create troubleshooting guides
|
|
||||||
|
|
||||||
## Development Setup
|
|
||||||
|
|
||||||
### Prerequisites
|
|
||||||
- Python 3.12+
|
|
||||||
- Node.js 18+
|
|
||||||
- Docker and Docker Compose
|
|
||||||
- API keys for desired LLM providers
|
|
||||||
|
|
||||||
### Local Development
|
|
||||||
```bash
|
```bash
|
||||||
# Clone and setup
|
cp .env.example .env
|
||||||
git clone <repository>
|
# Edit .env with your API keys and MCP server URL
|
||||||
cd lovdata-chat
|
```
|
||||||
|
|
||||||
# Backend setup
|
2. **Start the services:**
|
||||||
cd backend
|
```bash
|
||||||
python -m venv venv
|
docker-compose up --build
|
||||||
source venv/bin/activate # On Windows: venv\Scripts\activate
|
```
|
||||||
|
|
||||||
|
3. **Create a session:**
|
||||||
|
```bash
|
||||||
|
curl http://localhost/api/sessions -X POST
|
||||||
|
```
|
||||||
|
|
||||||
|
4. **Access the chat interface** at the URL returned in step 3.
|
||||||
|
|
||||||
|
## Development
|
||||||
|
|
||||||
|
### Running the Stack
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Start all services (session-manager, docker-daemon, caddy)
|
||||||
|
docker-compose up --build
|
||||||
|
|
||||||
|
# Start in background
|
||||||
|
docker-compose up -d --build
|
||||||
|
|
||||||
|
# View logs
|
||||||
|
docker-compose logs -f session-manager
|
||||||
|
|
||||||
|
# Stop services
|
||||||
|
docker-compose down
|
||||||
|
```
|
||||||
|
|
||||||
|
### Session Management API
|
||||||
|
|
||||||
|
```bash
|
||||||
|
POST /api/sessions # Create new session
|
||||||
|
GET /api/sessions # List all sessions
|
||||||
|
GET /api/sessions/{id} # Get session info
|
||||||
|
DELETE /api/sessions/{id} # Delete session
|
||||||
|
POST /api/cleanup # Manual cleanup
|
||||||
|
GET /api/health # Health check
|
||||||
|
```
|
||||||
|
|
||||||
|
### Running Locally (without Docker)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd session-manager
|
||||||
pip install -r requirements.txt
|
pip install -r requirements.txt
|
||||||
|
uvicorn main:app --reload --host 0.0.0.0 --port 8000
|
||||||
# Frontend setup
|
|
||||||
cd ../frontend
|
|
||||||
npm install
|
|
||||||
|
|
||||||
# Start development servers
|
|
||||||
docker-compose -f docker-compose.dev.yml up
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Environment Variables
|
### Testing
|
||||||
|
|
||||||
|
Test scripts live in `docker/scripts/` and are self-contained:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Backend
|
python docker/scripts/test-docker-service.py
|
||||||
LOVDATA_MCP_URL=http://localhost:8001
|
python docker/scripts/test-async-docker.py
|
||||||
OPENAI_API_KEY=your_key_here
|
python docker/scripts/test-resource-limits.py
|
||||||
ANTHROPIC_API_KEY=your_key_here
|
python docker/scripts/test-session-auth.py
|
||||||
GOOGLE_API_KEY=your_key_here
|
python docker/scripts/test-database-persistence.py
|
||||||
|
python docker/scripts/test-container-health.py
|
||||||
# Frontend
|
python docker/scripts/test-http-connection-pool.py
|
||||||
NEXT_PUBLIC_API_URL=http://localhost:8000
|
python docker/scripts/test-host-ip-detection.py
|
||||||
|
python docker/scripts/test-structured-logging.py
|
||||||
```
|
```
|
||||||
|
|
||||||
## Deployment Options
|
### Building the OpenCode Image
|
||||||
|
|
||||||
### Cloud Deployment (Recommended)
|
```bash
|
||||||
- **Frontend**: Vercel or Netlify
|
make build MCP_SERVER=http://your-lovdata-server:8001
|
||||||
- **Backend**: Railway, Render, or Fly.io
|
make run # Run interactively
|
||||||
- **Database**: Use existing lovdata-ai PostgreSQL instance
|
make clean # Clean up
|
||||||
|
```
|
||||||
|
|
||||||
### Self-Hosted Deployment
|
## Environment Configuration
|
||||||
- **Docker Compose**: Full stack containerization
|
|
||||||
- **Reverse Proxy**: Nginx or Caddy
|
|
||||||
- **SSL**: Let's Encrypt automatic certificates
|
|
||||||
|
|
||||||
## Security Considerations
|
Required variables (see `.env.example`):
|
||||||
|
|
||||||
- API keys stored securely (environment variables, secret management)
|
```bash
|
||||||
- Rate limiting on chat endpoints
|
MCP_SERVER=http://localhost:8001 # External Lovdata MCP server URL
|
||||||
- Input validation and sanitization
|
|
||||||
- CORS configuration for frontend-backend communication
|
|
||||||
- Audit logging for legal tool usage
|
|
||||||
|
|
||||||
## Performance Optimization
|
# Docker TLS (if using TLS instead of socket)
|
||||||
|
DOCKER_TLS_VERIFY=1
|
||||||
|
DOCKER_CERT_PATH=/etc/docker/certs
|
||||||
|
DOCKER_HOST=tcp://host.docker.internal:2376
|
||||||
|
|
||||||
- Response streaming for real-time chat experience
|
# Optional LLM keys (at least one required for chat)
|
||||||
- MCP tool result caching
|
OPENAI_API_KEY=...
|
||||||
- Conversation history pagination
|
ANTHROPIC_API_KEY=...
|
||||||
- Lazy loading of legal document content
|
GOOGLE_API_KEY=...
|
||||||
- CDN for static frontend assets
|
```
|
||||||
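Note on the new Environment Configuration section: it states that `MCP_SERVER` is required and that at least one of the three LLM keys is needed for chat. A minimal Python sketch of that rule as a pre-start check follows. The function name `check_env` and the message strings are hypothetical and not part of this repository; only the variable names come from the README.

```python
import os
from typing import Mapping

# Variable names are taken from the README; everything else is illustrative.
REQUIRED = ("MCP_SERVER",)
LLM_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def check_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    # A variable counts as missing when it is absent or set to an empty string.
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in LLM_KEYS):
        problems.append(
            "no LLM API key set (need at least one of: " + ", ".join(LLM_KEYS) + ")"
        )
    return problems


if __name__ == "__main__":
    # Report every problem found in the current process environment.
    for problem in check_env(os.environ):
        print("config error:", problem)
```

Returning a list instead of raising on the first problem lets a caller report every missing variable in one pass.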
|
|
||||||
## Future Enhancements
|
## Security
|
||||||
|
|
||||||
- User authentication and conversation persistence
|
**Docker socket**: Default setup uses socket mounting (`/var/run/docker.sock`). For production, enable TLS:
|
||||||
- Advanced citation management and export
|
|
||||||
- Integration with legal research workflows
|
|
||||||
- Multi-language support beyond Norwegian
|
|
||||||
- Advanced analytics and usage tracking
|
|
||||||
|
|
||||||
## Contributing
|
```bash
|
||||||
|
cd docker && DOCKER_ENV=production ./scripts/generate-certs.sh
|
||||||
|
./scripts/setup-docker-tls.sh
|
||||||
|
```
|
||||||
|
|
||||||
1. Follow the implementation plan phases
|
**Session isolation:**
|
||||||
2. Ensure comprehensive testing for LLM integrations
|
- Each session gets a dedicated container
|
||||||
3. Document API changes and new features
|
- Resource limits: 4GB RAM, 1 CPU core per container
|
||||||
4. Maintain security best practices for API key handling
|
- Max 3 concurrent sessions (configurable via `resource_manager.py`)
|
||||||
|
- Auto-cleanup after 60 minutes inactivity
|
||||||
|
- Token-based session authentication
|
||||||
|
|
||||||
---
|
## Further Documentation
|
||||||
|
|
||||||
**Status**: Planning phase complete. Ready for implementation.
|
- [`CLAUDE.md`](CLAUDE.md) — AI assistant guidance for working with this codebase
|
||||||
|
- [`LOW_PRIORITY_IMPROVEMENTS.md`](LOW_PRIORITY_IMPROVEMENTS.md) — Backlog of non-critical improvements
|
||||||
**Next Steps**: Begin with Phase 1 - Project Structure Setup
|
- [`docs/project-analysis.md`](docs/project-analysis.md) — Detailed architectural analysis
|
||||||
|
- `docker/*.md` — Implementation docs for individual components
|
||||||
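Note on the new Quick Start: step 3 creates a session with `curl http://localhost/api/sessions -X POST`, and step 4 opens the URL returned by that call. The same call can be sketched in Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, and the response is assumed to be a JSON object, which the README does not confirm.

```python
import json
import urllib.request


def build_create_session_request(
    base_url: str = "http://localhost",
) -> urllib.request.Request:
    """Build the POST request from Quick Start step 3 (no body, as in the curl call)."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/sessions", method="POST"
    )


def create_session(base_url: str = "http://localhost") -> dict:
    """Send the request and decode the response.

    Assumption: the session manager answers with a JSON object.
    """
    request = build_create_session_request(base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the stack running, `print(create_session())` shows the returned payload. The README does not name the field that holds the session URL, so the sketch returns the whole object instead of guessing a key.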