Configuration

Complete guide to configuring GreenGovRAG

Environment Variables

All configuration is managed through environment variables in the .env file.
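To make the file format concrete, here is a minimal sketch of reading `KEY=VALUE` lines with trailing `#` comments, as used in the snippets on this page. The function name `parse_env_text` is hypothetical, and how GreenGovRAG actually loads the .env file is not stated here; this only illustrates the syntax, under the assumption that unquoted values never contain ` #`.

```python
def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore it
        # Assumption: " #" starts an inline comment (values are unquoted).
        value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings


sample = """
# Application
APP_ENV=development  # development, staging, production
LOG_LEVEL=INFO       # DEBUG, INFO, WARNING, ERROR
DATABASE_URL=postgresql://user:password@localhost:5432/greengovrag
QDRANT_API_KEY=  # Optional
"""
config = parse_env_text(sample)
```

A value that legitimately contains ` #` (for example, inside a password) would be truncated by this sketch, so such values need quoting and a parser that understands quotes.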

Core Settings

# Application
APP_ENV=development  # development, staging, production
LOG_LEVEL=INFO       # DEBUG, INFO, WARNING, ERROR

# Database
DATABASE_URL=postgresql://user:password@localhost:5432/greengovrag

Authentication

# API Access Key (required for all endpoints except /api/health)
API_ACCESS_KEY=your-secret-key-here

# Recommended: Generate a secure key
# python -c "import secrets; print(secrets.token_urlsafe(32))"
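The rule above (every endpoint except /api/health requires the key) can be sketched as follows. `is_authorized` is a hypothetical helper, not the project's actual middleware, and how the key is transmitted by clients is not covered here. The sketch uses a constant-time comparison, which is a common precaution when checking secrets.

```python
import secrets
from typing import Optional

HEALTH_PATH = "/api/health"  # the only endpoint the text exempts from the key check


def generate_api_key(num_bytes: int = 32) -> str:
    """Same call as the one-liner above: a URL-safe random token."""
    return secrets.token_urlsafe(num_bytes)


def is_authorized(path: str, presented_key: Optional[str], expected_key: str) -> bool:
    """Hypothetical check: allow the health endpoint, otherwise require a matching key."""
    if path == HEALTH_PATH:
        return True
    if not presented_key or not expected_key:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return secrets.compare_digest(
        presented_key.encode("utf-8"), expected_key.encode("utf-8")
    )
```

Rejecting an empty `expected_key` means a deployment that forgot to set API_ACCESS_KEY denies requests rather than accepting an empty key.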

LLM Configuration

OpenAI

LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o  # or gpt-4o-mini for cost savings
LLM_TEMPERATURE=0.0

Azure OpenAI

LLM_PROVIDER=azure
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
LLM_MODEL=gpt-4o
AZURE_OPENAI_API_VERSION=2024-02-15-preview
LLM_TEMPERATURE=0.0

AWS Bedrock

LLM_PROVIDER=bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
LLM_MODEL=anthropic.claude-3-sonnet-20240229-v1:0

Anthropic

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
LLM_MODEL=claude-3-5-sonnet-20241022

Vector Store Configuration

FAISS (Local Development)

VECTOR_STORE_TYPE=faiss
FAISS_INDEX_PATH=./data/vectors/faiss_index

Qdrant (Production)

VECTOR_STORE_TYPE=qdrant
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY=  # Optional; leave empty if the Qdrant instance does not require a key
QDRANT_COLLECTION_NAME=greengovrag

Embedding Configuration

EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384  # Must match the model's output size (384 for all-MiniLM-L6-v2)
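Because the dimension has to agree with the chosen model, a mismatch is worth catching before any vectors are written. The sketch below is one way to do that with a small lookup table; `KNOWN_DIMENSIONS` and `check_embedding_dimension` are hypothetical names, and the table lists only two models whose output sizes are well known.

```python
# Output sizes for two common sentence-transformers models.
KNOWN_DIMENSIONS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
}


def check_embedding_dimension(model: str, dimension: int) -> bool:
    """Return True if verified, False if the model is unknown; raise on a mismatch."""
    expected = KNOWN_DIMENSIONS.get(model)
    if expected is None:
        return False  # cannot verify; the caller decides whether to warn
    if dimension != expected:
        raise ValueError(
            f"EMBEDDING_DIMENSION={dimension} does not match {model} "
            f"(expected {expected})"
        )
    return True
```

For models outside the table, the dimension would have to be read from the loaded model itself, which this sketch does not attempt.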

Cloud Storage (Optional)

AWS S3

CLOUD_PROVIDER=aws
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_S3_BUCKET=greengovrag-documents
AWS_REGION=ap-southeast-2

Azure Blob Storage

CLOUD_PROVIDER=azure
AZURE_STORAGE_CONNECTION_STRING=DefaultEndpointsProtocol=https;...
AZURE_STORAGE_CONTAINER=greengovrag-documents

Caching Configuration

# Enable caching
ENABLE_CACHE=true
CACHE_TTL=3600  # 1 hour in seconds

# Redis (optional)
REDIS_URL=redis://localhost:6379/0

# DynamoDB (AWS production)
DYNAMODB_TABLE_NAME=greengovrag-cache

ETL Configuration

# Chunking
CHUNK_SIZE=1000          # tokens per chunk
CHUNK_OVERLAP=200        # token overlap
CHUNK_BATCH_SIZE=100     # chunks per batch

# Metadata tagging
ENABLE_AUTO_TAGGING=true
TAGGING_MODEL=gpt-4o-mini  # Cheaper model for tagging
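To show how CHUNK_SIZE, CHUNK_OVERLAP, and CHUNK_BATCH_SIZE interact, here is a minimal sliding-window sketch over an already-tokenized document. The function names are hypothetical and the project's actual chunker may differ (for example, by respecting sentence or section boundaries); the point is only that each new chunk starts `chunk_size - chunk_overlap` tokens after the previous one.

```python
from typing import Iterator, List, Sequence


def chunk_tokens(
    tokens: Sequence, chunk_size: int = 1000, chunk_overlap: int = 200
) -> List[Sequence]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


def batches(items: Sequence, batch_size: int = 100) -> Iterator[Sequence]:
    """Yield consecutive groups of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


tokens = list(range(2500))  # stand-in for real token IDs
chunks = chunk_tokens(tokens, chunk_size=1000, chunk_overlap=200)
```

With the defaults shown above, a 2,500-token document yields three chunks starting at tokens 0, 800, and 1600, and each chunk repeats the last 200 tokens of the one before it.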

API Configuration

# Rate limiting
API_RATE_LIMIT=30/minute

# CORS
CORS_ORIGINS=http://localhost:3000,https://yourdomain.com
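Both values are compact strings that need parsing before use. The sketch below shows one plausible reading: `count/period` for the rate limit and a comma-separated list for the origins. The helper names and the set of accepted period words are assumptions, not the project's documented behavior.

```python
from typing import List, Tuple

# Assumption: these are the period names accepted after the slash.
PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(value: str) -> Tuple[int, int]:
    """Turn '30/minute' into (max_requests, window_in_seconds)."""
    count_text, sep, period = value.strip().partition("/")
    if not sep or not count_text.isdigit() or period not in PERIOD_SECONDS:
        raise ValueError(f"Invalid rate limit: {value!r} (expected e.g. '30/minute')")
    return int(count_text), PERIOD_SECONDS[period]


def parse_cors_origins(value: str) -> List[str]:
    """Split a comma-separated origin list, dropping empty entries and whitespace."""
    return [origin.strip() for origin in value.split(",") if origin.strip()]


limit = parse_rate_limit("30/minute")
origins = parse_cors_origins("http://localhost:3000,https://yourdomain.com")
```

Origins are compared as exact strings by browsers, so entries should include the scheme and port and omit any trailing slash.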

Configuration Profiles

Development

APP_ENV=development
LOG_LEVEL=DEBUG
VECTOR_STORE_TYPE=faiss
CLOUD_PROVIDER=local
ENABLE_CACHE=false

Production

APP_ENV=production
LOG_LEVEL=INFO
VECTOR_STORE_TYPE=qdrant
CLOUD_PROVIDER=aws
ENABLE_CACHE=true
API_RATE_LIMIT=100/minute

Validation

All configuration is validated at startup using Pydantic. If any value is invalid, the application fails fast with a clear error message instead of starting in a misconfigured state.
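The fail-fast behavior can be illustrated without any dependencies. The sketch below is a plain-dataclass stand-in, not the project's Pydantic model: the `Settings` class, its fields, the allowed vector store values, and the defaults are assumptions drawn from the values listed on this page.

```python
from dataclasses import dataclass

ALLOWED_ENVS = {"development", "staging", "production"}
ALLOWED_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}
ALLOWED_VECTOR_STORES = {"faiss", "qdrant"}


@dataclass(frozen=True)
class Settings:
    app_env: str
    log_level: str
    vector_store_type: str

    def __post_init__(self) -> None:
        # Collect every problem so a single startup failure reports them all.
        errors = []
        if self.app_env not in ALLOWED_ENVS:
            errors.append(f"APP_ENV must be one of {sorted(ALLOWED_ENVS)}")
        if self.log_level not in ALLOWED_LOG_LEVELS:
            errors.append(f"LOG_LEVEL must be one of {sorted(ALLOWED_LOG_LEVELS)}")
        if self.vector_store_type not in ALLOWED_VECTOR_STORES:
            errors.append(
                f"VECTOR_STORE_TYPE must be one of {sorted(ALLOWED_VECTOR_STORES)}"
            )
        if errors:
            raise ValueError("Invalid configuration: " + "; ".join(errors))


def load_settings(env: dict) -> Settings:
    """Build Settings from a mapping such as os.environ (defaults are assumptions)."""
    return Settings(
        app_env=env.get("APP_ENV", "development"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        vector_store_type=env.get("VECTOR_STORE_TYPE", "faiss"),
    )
```

Calling `load_settings(os.environ)` once at process start, before any server or worker is created, gives the same effect described above: a bad value stops the application immediately with a message naming the offending variable.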
