KMS Deployment Guide
Table of Contents
- Deployment Overview
- Container Architecture
- Environment Configuration
- Docker Compose Setup
- Production Deployment
- Monitoring and Health Checks
- Security Configuration
- Backup and Recovery
- Troubleshooting
Deployment Overview
The KMS is designed as a containerized application using Docker and Docker Compose for orchestration. The system consists of multiple services working together to provide secure API key management capabilities.
Service Architecture
- kms-nginx: Reverse proxy and load balancer
- kms-api: Go backend API service
- kms-frontend: React TypeScript frontend
- kms-postgres: PostgreSQL database
- kms-redis: Redis cache (optional)
Deployment Modes
- Development: Docker Compose with hot reload
- Testing: Isolated container environment
- Staging: Production-like with monitoring
- Production: High availability with load balancing
Container Architecture
graph TB
subgraph "External Network"
Internet[Internet]
CDN[Content Delivery Network<br/>Static Assets]
DNS[DNS<br/>Load Balancing]
end
subgraph "Load Balancer Tier"
LB1[Load Balancer 1<br/>HAProxy/Nginx]
LB2[Load Balancer 2<br/>HAProxy/Nginx]
VIP[Virtual IP<br/>Failover]
end
subgraph "Container Orchestration - Docker Compose"
subgraph "Reverse Proxy"
Nginx1[Nginx Container<br/>kms-nginx<br/>Port 8081:80]
end
subgraph "API Tier"
API1[API Service Container<br/>kms-api-service<br/>Port 8080:8080]
Metrics1[Metrics Endpoint<br/>Port 9090:9090<br/>Prometheus]
end
subgraph "Frontend Tier"
Frontend1[React SPA Container<br/>kms-frontend<br/>Port 3000:80]
StaticAssets[Static Assets<br/>Nginx served]
end
subgraph "Network"
InternalNet[kms-network<br/>Bridge Network<br/>Internal Communication]
end
end
subgraph "Data Tier"
subgraph "Primary Database"
PostgreSQL1[PostgreSQL 15<br/>kms-postgres<br/>Port 5432:5432]
DBData[Persistent Volume<br/>postgres_data]
end
subgraph "Cache Layer"
Redis1[Redis Cache<br/>Optional<br/>Session Store]
end
subgraph "Migration System"
Migrations[Database Migrations<br/>Auto-run on startup<br/>Volume mounted]
end
end
subgraph "Monitoring & Observability"
subgraph "Metrics Collection"
Prometheus[Prometheus<br/>Metrics Scraping]
Grafana[Grafana<br/>Dashboards]
end
subgraph "Logging"
LogAggregator[Log Aggregation<br/>ELK Stack / Loki]
StructuredLogs[Structured Logging<br/>JSON format]
end
subgraph "Health Monitoring"
HealthCheck[Health Checks<br/>/health endpoint]
Alerting[Alerting<br/>PagerDuty/Slack]
end
end
subgraph "Security & Compliance"
subgraph "Certificate Management"
TLS[TLS Certificates<br/>Let's Encrypt/Manual]
CertRotation[Certificate Rotation<br/>Automated]
end
subgraph "Secrets Management"
EnvVars[Environment Variables<br/>Container secrets]
VaultIntegration[Vault Integration<br/>Secret rotation]
end
subgraph "Backup & Recovery"
DBBackup[Database Backups<br/>Automated daily]
DisasterRecovery[Disaster Recovery<br/>Multi-region]
end
end
subgraph "Development Environment"
LocalDev[Local Development<br/>docker-compose.yml]
TestEnv[Test Environment<br/>Isolated containers]
CICDPipeline[CI/CD Pipeline<br/>GitHub Actions]
end
%% External Connections
Internet --> DNS
DNS --> VIP
CDN --> Frontend1
%% Load Balancer Configuration
VIP --> LB1
VIP --> LB2
LB1 --> Nginx1
LB2 --> Nginx1
%% Container Communication
Nginx1 --> API1
Nginx1 --> Frontend1
API1 --> PostgreSQL1
API1 --> Redis1
API1 --> Metrics1
%% Data Flow
PostgreSQL1 --> DBData
API1 --> Migrations
Migrations --> PostgreSQL1
%% Network Isolation
InternalNet --> API1
InternalNet --> Frontend1
InternalNet --> PostgreSQL1
InternalNet --> Redis1
InternalNet --> Nginx1
%% Monitoring Connections
API1 --> Prometheus
Prometheus --> Grafana
API1 --> LogAggregator
LogAggregator --> StructuredLogs
HealthCheck --> API1
HealthCheck --> PostgreSQL1
HealthCheck --> Alerting
%% Security Connections
TLS --> Nginx1
CertRotation --> TLS
EnvVars --> API1
VaultIntegration --> EnvVars
DBBackup --> PostgreSQL1
DisasterRecovery --> DBBackup
%% Development Flow
CICDPipeline --> LocalDev
LocalDev --> TestEnv
TestEnv --> API1
%% Styling
classDef external fill:#ffecb3,stroke:#f57c00,stroke-width:2px
classDef loadbalancer fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px
classDef container fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
classDef data fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef monitoring fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
classDef security fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px
classDef development fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
class Internet,CDN,DNS external
class LB1,LB2,VIP loadbalancer
class Nginx1,API1,Frontend1,Metrics1,InternalNet,StaticAssets container
class PostgreSQL1,Redis1,DBData,Migrations data
class Prometheus,Grafana,LogAggregator,StructuredLogs,HealthCheck,Alerting monitoring
class TLS,CertRotation,EnvVars,VaultIntegration,DBBackup,DisasterRecovery security
class LocalDev,TestEnv,CICDPipeline development
Environment Configuration
Environment Variables
Database Configuration
# PostgreSQL Database
DB_HOST=postgres
DB_PORT=5432
DB_NAME=kms
DB_USER=postgres
DB_PASSWORD=secure_password_here
DB_MAX_OPEN_CONNECTIONS=25
DB_MAX_IDLE_CONNECTIONS=5
DB_CONNECTION_MAX_LIFETIME=300s
DB_CONNECTION_MAX_IDLE_TIME=60s
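The pool variables above map naturally onto Go's `database/sql` pool settings. The following is a minimal sketch of that mapping, not the service's actual loader: the names `PoolConfig`, `loadPoolConfig`, and `applyPool` are hypothetical, and the defaults simply repeat the values listed in this guide.

```go
package main

import (
	"database/sql"
	"fmt"
	"strconv"
	"time"
)

// PoolConfig mirrors the four DB_* pool variables listed above.
// (Hypothetical type, introduced only for this sketch.)
type PoolConfig struct {
	MaxOpen     int
	MaxIdle     int
	MaxLifetime time.Duration
	MaxIdleTime time.Duration
}

// loadPoolConfig reads the pool settings through getenv (os.Getenv in a real
// process). Unset variables fall back to the values shown in this guide.
func loadPoolConfig(getenv func(string) string) (PoolConfig, error) {
	get := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	var cfg PoolConfig
	var err error
	if cfg.MaxOpen, err = strconv.Atoi(get("DB_MAX_OPEN_CONNECTIONS", "25")); err != nil {
		return cfg, fmt.Errorf("DB_MAX_OPEN_CONNECTIONS: %w", err)
	}
	if cfg.MaxIdle, err = strconv.Atoi(get("DB_MAX_IDLE_CONNECTIONS", "5")); err != nil {
		return cfg, fmt.Errorf("DB_MAX_IDLE_CONNECTIONS: %w", err)
	}
	// Values such as "300s" and "60s" are valid Go duration strings.
	if cfg.MaxLifetime, err = time.ParseDuration(get("DB_CONNECTION_MAX_LIFETIME", "300s")); err != nil {
		return cfg, fmt.Errorf("DB_CONNECTION_MAX_LIFETIME: %w", err)
	}
	if cfg.MaxIdleTime, err = time.ParseDuration(get("DB_CONNECTION_MAX_IDLE_TIME", "60s")); err != nil {
		return cfg, fmt.Errorf("DB_CONNECTION_MAX_IDLE_TIME: %w", err)
	}
	return cfg, nil
}

// applyPool copies the settings onto an already opened *sql.DB.
func applyPool(db *sql.DB, cfg PoolConfig) {
	db.SetMaxOpenConns(cfg.MaxOpen)
	db.SetMaxIdleConns(cfg.MaxIdle)
	db.SetConnMaxLifetime(cfg.MaxLifetime)
	db.SetConnMaxIdleTime(cfg.MaxIdleTime)
}
```

In a service this would typically be called as `loadPoolConfig(os.Getenv)` right after `sql.Open`, so that a malformed duration fails at startup rather than at the first query.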
Server Configuration
# API Server
SERVER_HOST=0.0.0.0
SERVER_PORT=8080
SERVER_READ_TIMEOUT=30s
SERVER_WRITE_TIMEOUT=30s
SERVER_IDLE_TIMEOUT=120s
# Environment
ENVIRONMENT=production
LOG_LEVEL=info
DEBUG_MODE=false
Authentication Configuration
# Authentication Provider
AUTH_PROVIDER=header
AUTH_HEADER_USER_EMAIL=X-User-Email
AUTH_SIGNING_KEY=your_hmac_signing_key_here
# JWT Configuration
JWT_SECRET=your_jwt_secret_here
JWT_ISSUER=kms-api-service
JWT_EXPIRY=1h
# OAuth2 Configuration (optional)
OAUTH2_CLIENT_ID=your_oauth2_client_id
OAUTH2_CLIENT_SECRET=your_oauth2_client_secret
OAUTH2_REDIRECT_URL=https://your-domain.com/api/oauth2/callback
OAUTH2_PROVIDER_URL=https://oauth-provider.com
# SAML Configuration (optional)
SAML_IDP_URL=https://saml-provider.com/sso
SAML_CERT_PATH=/etc/ssl/certs/saml.crt
SAML_KEY_PATH=/etc/ssl/private/saml.key
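This guide does not spell out how `AUTH_SIGNING_KEY` is used. Assuming it keys an HMAC-SHA256 over some request payload, signing and verification could look like the sketch below. The function names and the choice of hex encoding are assumptions for illustration, not the KMS wire format.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// signRequest returns the hex-encoded HMAC-SHA256 of message under key.
// (Hypothetical helper; the real payload layout is not specified in this guide.)
func signRequest(key, message []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(message)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyRequest recomputes the MAC and compares it with the supplied
// signature in constant time, so timing does not leak how many bytes matched.
func verifyRequest(key, message []byte, signatureHex string) bool {
	got, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(message)
	return hmac.Equal(got, mac.Sum(nil))
}
```

Using `hmac.Equal` rather than `==` on the encoded strings is the important design choice here; the rest is plumbing.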
Security Configuration
# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_RPS=100
RATE_LIMIT_BURST=200
AUTH_RATE_LIMIT_RPS=5
AUTH_RATE_LIMIT_BURST=10
# Security Settings
CSRF_TOKEN_MAX_AGE=1h
MAX_AUTH_FAILURES=5
IP_BLOCK_DURATION=1h
AUTH_FAILURE_WINDOW=15m
REQUEST_MAX_AGE=5m
# CORS Settings
CORS_ALLOWED_ORIGINS=https://your-frontend-domain.com,https://admin.your-domain.com
CORS_ALLOWED_METHODS=GET,POST,PUT,DELETE,OPTIONS
CORS_ALLOWED_HEADERS=Origin,Content-Type,Accept,Authorization,X-User-Email,X-CSRF-Token
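The `*_RPS` and `*_BURST` pairs above read like token-bucket parameters: a refill rate and a bucket capacity. The guide does not say which algorithm the API uses, so the following is a sketch under that assumption, with hypothetical names, kept to the standard library and without locking (a real limiter would need a mutex and one bucket per client).

```go
package main

import "time"

// tokenBucket is a single-client limiter. rate corresponds to *_RPS and
// burst to *_BURST in the configuration above (assumed semantics).
type tokenBucket struct {
	rate   float64   // tokens added per second
	burst  float64   // maximum number of stored tokens
	tokens float64   // tokens currently available
	last   time.Time // time of the last refill
}

// newTokenBucket starts with a full bucket so an initial burst is allowed.
func newTokenBucket(rps, burst int, now time.Time) *tokenBucket {
	return &tokenBucket{
		rate:   float64(rps),
		burst:  float64(burst),
		tokens: float64(burst),
		last:   now,
	}
}

// allow refills the bucket for the time elapsed since the last call and
// then spends one token if one is available.
func (b *tokenBucket) allow(now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

With `AUTH_RATE_LIMIT_RPS=5` and `AUTH_RATE_LIMIT_BURST=10`, a client could send ten requests at once and would then be held to five per second until the bucket refills.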
Monitoring Configuration
# Metrics
METRICS_ENABLED=true
METRICS_PORT=9090
METRICS_PATH=/metrics
# Health Checks
HEALTH_CHECK_ENABLED=true
HEALTH_CHECK_INTERVAL=30s
Configuration Management
Environment Files
# Development environment
.env.development
# Testing environment
.env.test
# Staging environment
.env.staging
# Production environment
.env.production
Docker Environment
# docker-compose.override.yml for local development
version: '3.8'
services:
  kms-api:
    environment:
      - LOG_LEVEL=debug
      - DEBUG_MODE=true
    volumes:
      - ./:/app
    command: air -c .air.toml # Hot reload
Docker Compose Setup
Development Configuration
docker-compose.yml
version: '3.8'
services:
  # PostgreSQL Database
  kms-postgres:
    image: postgres:15-alpine
    container_name: kms-postgres
    environment:
      POSTGRES_DB: ${DB_NAME:-kms}
      POSTGRES_USER: ${DB_USER:-postgres}
      POSTGRES_PASSWORD: ${DB_PASSWORD:-postgres}
      POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./migrations:/docker-entrypoint-initdb.d:ro
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-postgres}"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
    networks:
      - kms-network
    restart: unless-stopped
  # Redis Cache (Optional)
  kms-redis:
    image: redis:7-alpine
    container_name: kms-redis
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD:-redis_password}
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    healthcheck:
      # requirepass is set, so the probe must authenticate; an unauthenticated ping gets a NOAUTH error
      test: ["CMD-SHELL", "redis-cli --no-auth-warning -a '${REDIS_PASSWORD:-redis_password}' ping | grep -q PONG"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - kms-network
    restart: unless-stopped
  # API Service
  kms-api:
    build:
      context: .
      dockerfile: Dockerfile
      target: production
    container_name: kms-api
    environment:
      - DB_HOST=kms-postgres
      - DB_PORT=5432
      - DB_NAME=${DB_NAME:-kms}
      - DB_USER=${DB_USER:-postgres}
      - DB_PASSWORD=${DB_PASSWORD:-postgres}
      - REDIS_HOST=kms-redis
      - REDIS_PORT=6379
      - REDIS_PASSWORD=${REDIS_PASSWORD:-redis_password}
      - SERVER_PORT=8080
      - JWT_SECRET=${JWT_SECRET}
      - AUTH_SIGNING_KEY=${AUTH_SIGNING_KEY}
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - RATE_LIMIT_ENABLED=true
    ports:
      - "8080:8080"
      - "9090:9090" # Metrics port
    depends_on:
      kms-postgres:
        condition: service_healthy
      kms-redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    networks:
      - kms-network
    restart: unless-stopped
    volumes:
      - ./logs:/app/logs
  # Frontend Service
  kms-frontend:
    build:
      context: ./kms-frontend
      dockerfile: Dockerfile
      target: production
    container_name: kms-frontend
    environment:
      - REACT_APP_API_BASE_URL=http://localhost:8081/api
    ports:
      - "3000:80"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - kms-network
    restart: unless-stopped
  # Nginx Reverse Proxy
  kms-nginx:
    image: nginx:alpine
    container_name: kms-nginx
    ports:
      - "8081:80"
      - "8443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf:ro
      - ./ssl:/etc/ssl/certs:ro
    depends_on:
      - kms-api
      - kms-frontend
    healthcheck:
      test: ["CMD", "nginx", "-t"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - kms-network
    restart: unless-stopped
networks:
  kms-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
volumes:
  postgres_data:
    driver: local
  redis_data:
    driver: local
Dockerfile (Multi-stage)
# Build stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
# Install dependencies
RUN apk add --no-cache git ca-certificates tzdata
# Copy go mod files
COPY go.mod go.sum ./
RUN go mod download
# Copy source code
COPY . .
# Build the binary
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main ./cmd/server
# Production stage
FROM alpine:latest AS production
RUN apk --no-cache add ca-certificates curl
# Create non-root user
RUN addgroup -g 1001 appgroup && \
    adduser -D -u 1001 -G appgroup appuser
# Use a directory the non-root user can read; /root is not accessible to appuser
WORKDIR /app
# Copy the binary from builder stage, owned by the non-root user
COPY --from=builder --chown=appuser:appgroup /app/main .
COPY --from=builder --chown=appuser:appgroup /app/migrations ./migrations
USER appuser
EXPOSE 8080 9090
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1
CMD ["./main"]
Production Configuration
docker-compose.prod.yml
version: '3.8'
services:
  kms-postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./postgres/postgresql.conf:/etc/postgresql/postgresql.conf
      - ./postgres/pg_hba.conf:/etc/postgresql/pg_hba.conf
    # The official image has no POSTGRES_MAX_CONNECTIONS variable; server settings are passed with -c.
    # hba_file points the server at the mounted pg_hba.conf, which is otherwise ignored.
    command: postgres -c config_file=/etc/postgresql/postgresql.conf -c hba_file=/etc/postgresql/pg_hba.conf -c max_connections=100
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '0.5'
        reservations:
          memory: 512M
          cpus: '0.25'
    restart: always
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
  kms-api:
    image: kms-api:latest
    environment:
      - DB_HOST=kms-postgres
      - DB_NAME=${DB_NAME}
      - DB_USER=${DB_USER}
      - DB_PASSWORD=${DB_PASSWORD}
      - JWT_SECRET=${JWT_SECRET}
      - AUTH_SIGNING_KEY=${AUTH_SIGNING_KEY}
      - LOG_LEVEL=info
      - ENVIRONMENT=production
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "5"
  kms-nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.prod.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/ssl/certs:ro
    deploy:
      resources:
        limits:
          memory: 256M
          cpus: '0.25'
    restart: always
Production Deployment
Load Balancer Configuration
HAProxy Configuration
# /etc/haproxy/haproxy.cfg
global
daemon
maxconn 4096
log stdout local0
defaults
mode http
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
option httplog
frontend kms_frontend
bind *:80
bind *:443 ssl crt /etc/ssl/certs/kms.pem
redirect scheme https if !{ ssl_fc }
# Rate limiting
stick-table type ip size 100k expire 30s store http_req_rate(10s)
http-request track-sc0 src
http-request reject if { sc_http_req_rate(0) gt 20 }
default_backend kms_api_servers
backend kms_api_servers
balance roundrobin
option httpchk GET /health
server api1 kms-api-1:8080 check
server api2 kms-api-2:8080 check
server api3 kms-api-3:8080 check
Nginx Load Balancer
# /etc/nginx/nginx.conf
upstream kms_api_backend {
least_conn;
server kms-api-1:8080 weight=3 max_fails=3 fail_timeout=30s;
server kms-api-2:8080 weight=3 max_fails=3 fail_timeout=30s;
server kms-api-3:8080 weight=3 max_fails=3 fail_timeout=30s;
}
upstream kms_frontend_backend {
server kms-frontend-1:80;
server kms-frontend-2:80;
}
server {
listen 80;
server_name kms.yourdomain.com;
return 301 https://$server_name$request_uri;
}
# Rate Limiting zones (limit_req_zone is only valid in the http context, so it sits outside the server block)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/s;
server {
listen 443 ssl http2;
server_name kms.yourdomain.com;
# SSL Configuration
ssl_certificate /etc/ssl/certs/kms.crt;
ssl_certificate_key /etc/ssl/private/kms.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
# Security Headers
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
add_header Referrer-Policy "strict-origin-when-cross-origin";
# API Routes
location /api/ {
limit_req zone=api burst=20 nodelay;
proxy_pass http://kms_api_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Authentication Routes (stricter rate limiting)
location /api/login {
limit_req zone=auth burst=10 nodelay;
proxy_pass http://kms_api_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
# Frontend Routes
location / {
proxy_pass http://kms_frontend_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
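# SPA routing support (try_files ... /index.html) belongs in the frontend container's own Nginx config:
# try_files checks this proxy's local filesystem, not the upstream, so it must not be combined with proxy_pass here.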
}
# Health Check
location /health {
access_log off;
proxy_pass http://kms_api_backend;
}
# Metrics (restricted access)
location /metrics {
allow 10.0.0.0/8;
allow 172.16.0.0/12;
allow 192.168.0.0/16;
deny all;
# An upstream group name cannot carry a port, so address one API instance's metrics listener directly
proxy_pass http://kms-api-1:9090;
}
}
Database High Availability
PostgreSQL Primary-Replica Setup
# docker-compose.ha.yml
version: '3.8'
services:
  postgres-primary:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: kms
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_REPLICATION_USER: replicator
      POSTGRES_REPLICATION_PASSWORD: ${REPLICATION_PASSWORD}
    volumes:
      - postgres_primary_data:/var/lib/postgresql/data
      - ./postgres/primary.conf:/etc/postgresql/postgresql.conf
      - ./postgres/setup-replication.sh:/docker-entrypoint-initdb.d/setup-replication.sh
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    ports:
      - "5432:5432"
  postgres-replica:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_PRIMARY_HOST: postgres-primary
      POSTGRES_REPLICATION_USER: replicator
      POSTGRES_REPLICATION_PASSWORD: ${REPLICATION_PASSWORD}
    volumes:
      - postgres_replica_data:/var/lib/postgresql/data
      - ./postgres/replica.conf:/etc/postgresql/postgresql.conf
      - ./postgres/setup-replica.sh:/docker-entrypoint-initdb.d/setup-replica.sh
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    ports:
      - "5433:5432"
    depends_on:
      - postgres-primary
volumes:
  postgres_primary_data:
  postgres_replica_data:
Monitoring and Health Checks
Health Check Configuration
Application Health Checks
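// File: internal/handlers/health.go
// HealthHandler (holding db, redis, and version) is assumed to be defined elsewhere in this package.
package handlers
import (
"net/http"
"time"
"github.com/gin-gonic/gin"
)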
type HealthResponse struct {
Status string `json:"status"`
Timestamp time.Time `json:"timestamp"`
Services map[string]string `json:"services"`
Version string `json:"version"`
}
func (h *HealthHandler) GetHealth(c *gin.Context) {
response := &HealthResponse{
Status: "healthy",
Timestamp: time.Now(),
Services: make(map[string]string),
Version: h.version,
}
// Check database connectivity
if err := h.db.Ping(); err != nil {
response.Status = "unhealthy"
response.Services["database"] = "unhealthy"
} else {
response.Services["database"] = "healthy"
}
// Check Redis connectivity
if err := h.redis.Ping(); err != nil {
response.Services["cache"] = "unhealthy"
} else {
response.Services["cache"] = "healthy"
}
statusCode := http.StatusOK
if response.Status == "unhealthy" {
statusCode = http.StatusServiceUnavailable
}
c.JSON(statusCode, response)
}
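On the consuming side, a monitoring script or smoke test can decode this payload and list which dependencies are failing. Note that in the handler above only a database failure flips the overall status, so a client that cares about the cache has to inspect `services` itself. The sketch below assumes the JSON shape shown in `HealthResponse`; `decodeHealth` and `unhealthyServices` are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"sort"
	"time"
)

// HealthResponse repeats the response shape returned by the /health handler.
type HealthResponse struct {
	Status    string            `json:"status"`
	Timestamp time.Time         `json:"timestamp"`
	Services  map[string]string `json:"services"`
	Version   string            `json:"version"`
}

// decodeHealth parses a /health response body.
func decodeHealth(body []byte) (HealthResponse, error) {
	var r HealthResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

// unhealthyServices returns the names of all services not reported as
// "healthy", sorted so the output is stable across calls.
func unhealthyServices(r HealthResponse) []string {
	var names []string
	for name, state := range r.Services {
		if state != "healthy" {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}
```

A deployment check could treat a non-empty result as "degraded" even when the HTTP status is 200.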
Docker Health Checks
# API Service Health Check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1
# Database Health Check
HEALTHCHECK --interval=30s --timeout=10s --retries=5 \
CMD pg_isready -U postgres || exit 1
# Frontend Health Check
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
CMD curl -f http://localhost:80 || exit 1
Prometheus Metrics
Metrics Configuration
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'kms-api'
    static_configs:
      - targets: ['kms-api:9090']
    metrics_path: /metrics
    scrape_interval: 15s
  - job_name: 'nginx'
    static_configs:
      - targets: ['kms-nginx:9113']
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']
Grafana Dashboards
{
"dashboard": {
"title": "KMS System Metrics",
"panels": [
{
"title": "Request Rate",
"type": "graph",
"targets": [
{
"expr": "rate(http_requests_total[5m])",
"legendFormat": "{{method}} {{status}}"
}
]
},
{
"title": "Response Time",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
"legendFormat": "95th percentile"
}
]
},
{
"title": "Database Connections",
"type": "graph",
"targets": [
{
"expr": "postgres_stat_database_numbackends",
"legendFormat": "Active connections"
}
]
},
{
"title": "Authentication Success Rate",
"type": "stat",
"targets": [
{
"expr": "rate(auth_attempts_success_total[5m]) / rate(auth_attempts_total[5m])",
"legendFormat": "Success rate"
}
]
}
]
}
}
Security Configuration
SSL/TLS Configuration
Certificate Management
# Let's Encrypt certificate generation
docker run --rm -it \
-v /etc/letsencrypt:/etc/letsencrypt \
-v /var/lib/letsencrypt:/var/lib/letsencrypt \
certbot/certbot certonly \
--standalone \
--email admin@yourdomain.com \
--agree-tos \
--no-eff-email \
-d kms.yourdomain.com
# Certificate renewal cron job (a crontab entry must fit on one line; -it is omitted because cron provides no TTY)
0 2 * * * docker run --rm -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt certbot/certbot renew --quiet
SSL Configuration
# Strong SSL configuration
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305;
ssl_prefer_server_ciphers off;
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;
# HSTS
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
Secrets Management
Docker Secrets
version: '3.8'
secrets:
  db_password:
    file: ./secrets/db_password.txt
  jwt_secret:
    file: ./secrets/jwt_secret.txt
  auth_signing_key:
    file: ./secrets/auth_signing_key.txt
services:
  kms-api:
    image: kms-api:latest
    secrets:
      - db_password
      - jwt_secret
      - auth_signing_key
    environment:
      - DB_PASSWORD_FILE=/run/secrets/db_password
      - JWT_SECRET_FILE=/run/secrets/jwt_secret
      - AUTH_SIGNING_KEY_FILE=/run/secrets/auth_signing_key
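The `*_FILE` variables only help if the application reads them. The guide does not show that code, so the following is a sketch of the usual convention, with hypothetical names: prefer `NAME_FILE` when set, fall back to plain `NAME`, and trim the trailing newline that secret files often carry.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveSecret looks up a secret by its base name, for example "DB_PASSWORD".
// If NAME_FILE is set, the secret is read from that path; otherwise the plain
// NAME variable is used. getenv and readFile are injected for testability.
func resolveSecret(
	name string,
	getenv func(string) string,
	readFile func(string) ([]byte, error),
) (string, error) {
	if path := getenv(name + "_FILE"); path != "" {
		data, err := readFile(path)
		if err != nil {
			return "", fmt.Errorf("read %s_FILE: %w", name, err)
		}
		// Secret files commonly end with a newline that is not part of the value.
		return strings.TrimRight(string(data), "\r\n"), nil
	}
	if v := getenv(name); v != "" {
		return v, nil
	}
	return "", fmt.Errorf("neither %s_FILE nor %s is set", name, name)
}

// secretFromEnv is the production wiring: real environment, real filesystem.
func secretFromEnv(name string) (string, error) {
	return resolveSecret(name, os.Getenv, os.ReadFile)
}
```

With the Compose file above, `secretFromEnv("DB_PASSWORD")` would read `/run/secrets/db_password`, while a local run without Docker secrets could still export `DB_PASSWORD` directly.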
HashiCorp Vault Integration
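// Vault secret retrieval
// Assumes the official HashiCorp client, imported under the alias "vault".
import (
"fmt"
"os"
vault "github.com/hashicorp/vault/api"
)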
func getSecretFromVault(path string) (string, error) {
client, err := vault.NewClient(&vault.Config{
Address: os.Getenv("VAULT_ADDR"),
})
if err != nil {
return "", err
}
client.SetToken(os.Getenv("VAULT_TOKEN"))
secret, err := client.Logical().Read(path)
if err != nil {
return "", err
}
if secret == nil || secret.Data == nil {
return "", fmt.Errorf("secret not found")
}
value, ok := secret.Data["value"].(string)
if !ok {
return "", fmt.Errorf("invalid secret format")
}
return value, nil
}
Backup and Recovery
Database Backup Strategy
Automated Backup Script
#!/bin/bash
# backup.sh
set -e
# Configuration
BACKUP_DIR="/backups"
POSTGRES_CONTAINER="kms-postgres"
S3_BUCKET="kms-backups"
RETENTION_DAYS=30
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Generate backup filename with timestamp
BACKUP_NAME="kms_backup_$(date +%Y%m%d_%H%M%S).sql"
BACKUP_PATH="$BACKUP_DIR/$BACKUP_NAME"
# Create database dump
echo "Creating database backup..."
docker exec "$POSTGRES_CONTAINER" pg_dump -U postgres kms > "$BACKUP_PATH"
# Compress backup
gzip "$BACKUP_PATH"
# Upload to S3
echo "Uploading backup to S3..."
aws s3 cp "$BACKUP_PATH.gz" "s3://$S3_BUCKET/$(date +%Y/%m/%d)/$BACKUP_NAME.gz"
# Clean up local files older than retention period
find $BACKUP_DIR -name "*.gz" -mtime +$RETENTION_DAYS -delete
# Clean up S3 files older than retention period
aws s3 ls "s3://$S3_BUCKET/" --recursive | while read -r line; do
createDate=$(echo $line | awk '{print $1" "$2}')
createDate=$(date -d "$createDate" +%s)
olderThan=$(date -d "$RETENTION_DAYS days ago" +%s)
if [[ $createDate -lt $olderThan ]]; then
fileName=$(echo $line | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//')
aws s3 rm "s3://$S3_BUCKET/$fileName"
fi
done
echo "Backup completed successfully"
Backup Cron Job
# Daily backup at 2 AM
0 2 * * * /opt/kms/scripts/backup.sh >> /var/log/kms-backup.log 2>&1
# Weekly full backup at 1 AM on Sundays
0 1 * * 0 /opt/kms/scripts/full-backup.sh >> /var/log/kms-backup.log 2>&1
Disaster Recovery
Recovery Procedure
#!/bin/bash
# restore.sh
set -e
BACKUP_FILE=$1
POSTGRES_CONTAINER="kms-postgres"
if [ -z "$BACKUP_FILE" ]; then
echo "Usage: $0 <backup_file>"
exit 1
fi
# Download backup from S3 if needed
if [[ $BACKUP_FILE == s3://* ]]; then
LOCAL_FILE="/tmp/$(basename $BACKUP_FILE)"
aws s3 cp "$BACKUP_FILE" "$LOCAL_FILE"
BACKUP_FILE=$LOCAL_FILE
fi
# Extract if compressed
if [[ $BACKUP_FILE == *.gz ]]; then
gunzip -c "$BACKUP_FILE" > "/tmp/backup.sql"
BACKUP_FILE="/tmp/backup.sql"
fi
# Stop API services
echo "Stopping API services..."
docker-compose stop kms-api kms-frontend
# Drop and recreate database
echo "Recreating database..."
docker exec $POSTGRES_CONTAINER psql -U postgres -c "DROP DATABASE IF EXISTS kms;"
docker exec $POSTGRES_CONTAINER psql -U postgres -c "CREATE DATABASE kms;"
# Restore database
echo "Restoring database..."
docker exec -i $POSTGRES_CONTAINER psql -U postgres kms < "$BACKUP_FILE"
# Start services
echo "Starting services..."
docker-compose up -d
# Verify restoration
echo "Verifying restoration..."
sleep 30
curl -f http://localhost:8081/health || {
echo "Health check failed after restoration"
exit 1
}
echo "Database restoration completed successfully"
Troubleshooting
Common Issues
Database Connection Issues
# Check database container logs
docker logs kms-postgres
# Test database connectivity
docker exec kms-postgres pg_isready -U postgres
# Check connection pool status
curl http://localhost:8080/health | jq '.services.database'
Authentication Problems
# Check JWT secret configuration
docker exec kms-api env | grep JWT_SECRET
# Verify HMAC key configuration
docker exec kms-api env | grep AUTH_SIGNING_KEY
# Test authentication endpoint
curl -X POST http://localhost:8081/api/login \
-H "Content-Type: application/json" \
-d '{"app_id": "test-app", "user_id": "test@example.com"}'
Performance Issues
# Check API response times (inline format string, so no separate curl-format.txt file is needed)
curl -w "time_total: %{time_total}s\n" -s -o /dev/null http://localhost:8081/health
# Monitor database connections
docker exec kms-postgres psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;"
# Check memory usage
docker stats kms-api
SSL/TLS Issues
# Test SSL certificate
openssl s_client -connect kms.yourdomain.com:443 -servername kms.yourdomain.com
# Check certificate expiration
curl -vI https://kms.yourdomain.com 2>&1 | grep -i expire
# Verify certificate chain
curl -I https://kms.yourdomain.com
Debugging Commands
Container Debugging
# View container logs
docker logs -f kms-api
# Execute shell in container
docker exec -it kms-api /bin/sh
# Inspect container configuration
docker inspect kms-api
# Check resource usage
docker stats
Network Debugging
# Test inter-container connectivity
docker exec kms-api ping kms-postgres
# Check port binding
netstat -tlnp | grep :8080
# Inspect Docker network
docker network inspect kms_kms-network
This deployment guide provides comprehensive instructions for deploying, configuring, and maintaining the KMS in various environments, from development to production-scale deployments.