diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md new file mode 100644 index 0000000..de04c88 --- /dev/null +++ b/docs/ARCHITECTURE.md @@ -0,0 +1,677 @@ +# API Key Management Service (KMS) - System Architecture + +## Table of Contents +1. [System Overview](#system-overview) +2. [Architecture Principles](#architecture-principles) +3. [Component Architecture](#component-architecture) +4. [System Architecture Diagram](#system-architecture-diagram) +5. [Request Flow Pipeline](#request-flow-pipeline) +6. [Authentication Flow](#authentication-flow) +7. [API Design](#api-design) +8. [Technology Stack](#technology-stack) + +--- + +## System Overview + +The API Key Management Service (KMS) is a secure, scalable platform for managing API authentication tokens across applications. Built with Go backend and React TypeScript frontend, it provides centralized token lifecycle management with enterprise-grade security features. + +### Key Capabilities +- **Multi-Provider Authentication**: Header, JWT, OAuth2, SAML support +- **Dual Token System**: Static HMAC tokens and renewable JWT user tokens +- **Hierarchical Permissions**: Role-based access control with inheritance +- **Enterprise Security**: Rate limiting, brute force protection, audit logging +- **High Availability**: Containerized deployment with load balancing + +### Core Features +- **Token Lifecycle Management**: Create, verify, renew, and revoke tokens +- **Application Management**: Multi-tenant application configuration +- **User Session Tracking**: Comprehensive session management +- **Audit Logging**: Complete audit trail of all operations +- **Health Monitoring**: Built-in health checks and metrics + +--- + +## Architecture Principles + +### Clean Architecture +The system follows clean architecture principles with clear separation of concerns: + +``` +┌─────────────────┐ +│ Handlers │ ← HTTP request handling +├─────────────────┤ +│ Services │ ← Business logic +├─────────────────┤ +│ Repositories │ ← Data 
access +├─────────────────┤ +│ Database │ ← Data persistence +└─────────────────┘ +``` + +### Design Principles +- **Dependency Injection**: All services receive dependencies through constructors +- **Interface Segregation**: Repository interfaces enable testing and flexibility +- **Single Responsibility**: Each component has one clear purpose +- **Fail-Safe Defaults**: Security-first configuration with safe fallbacks +- **Immutable Operations**: Database transactions and audit logging +- **Defense in Depth**: Multiple security layers throughout the stack + +--- + +## Component Architecture + +### Backend Components (`internal/`) + +#### **Handlers Layer** (`internal/handlers/`) +HTTP request processors implementing REST API endpoints: + +- **`application.go`**: Application CRUD operations + - Create, read, update, delete applications + - HMAC key management + - Ownership validation + +- **`auth.go`**: Authentication workflows + - User login and logout + - Token renewal and validation + - Multi-provider authentication + +- **`token.go`**: Token operations + - Static token creation + - Token verification + - Token revocation + +- **`health.go`**: System health checks + - Database connectivity + - Cache availability + - Service status + +- **`oauth2.go`, `saml.go`**: External authentication providers + - OAuth2 authorization code flow + - SAML assertion validation + - Provider callback handling + +#### **Services Layer** (`internal/services/`) +Business logic implementation with transaction management: + +- **`auth_service.go`**: Authentication provider orchestration + - Multi-provider authentication + - Session management + - User context creation + +- **`token_service.go`**: Token lifecycle management + - Static token generation and validation + - JWT user token management + - Permission assignment + +- **`application_service.go`**: Application configuration management + - Application CRUD operations + - Configuration validation + - HMAC key rotation + +- 
**`session_service.go`**: User session tracking + - Session creation and validation + - Session timeout handling + - Cross-provider session management + +#### **Repository Layer** (`internal/repository/postgres/`) +Data access with ACID transaction support: + +- **`application_repository.go`**: Application persistence + - Secure dynamic query building + - Parameterized queries + - Ownership validation + +- **`token_repository.go`**: Static token management + - BCrypt token hashing + - Token lookup and validation + - Permission relationship management + +- **`permission_repository.go`**: Permission catalog + - Hierarchical permission structure + - Permission validation + - Bulk permission operations + +- **`session_repository.go`**: User session storage + - Session persistence + - Expiration management + - Provider metadata storage + +#### **Authentication Providers** (`internal/auth/`) +Pluggable authentication system: + +- **`header_validator.go`**: HMAC signature validation + - Timestamp-based replay protection + - Constant-time signature comparison + - Email format validation + +- **`jwt.go`**: JWT token management + - Token generation with secure JTI + - Signature validation + - Revocation list management + +- **`oauth2.go`**: OAuth2 authorization code flow + - State management + - Token exchange + - Provider integration + +- **`saml.go`**: SAML assertion validation + - XML signature validation + - Attribute extraction + - Provider configuration + +- **`permissions.go`**: Hierarchical permission evaluation + - Role-based access control + - Permission inheritance + - Bulk permission evaluation + +--- + +## System Architecture Diagram + +```mermaid +graph TB + %% External Components + Client[Client Applications] + Browser[Web Browser] + AuthProvider[OAuth2/SAML Provider] + + %% Load Balancer & Proxy + subgraph "Load Balancer Layer" + Nginx[Nginx Proxy
<br/>:80, :8081] + end + + %% Frontend Layer + subgraph "Frontend Layer" + React[React TypeScript SPA<br/>Ant Design UI<br/>:3000] + AuthContext[Authentication Context] + APIService[API Service Client] + end + + %% API Gateway & Middleware + subgraph "API Layer" + API[Go API Server<br/>:8080] + + subgraph "Middleware Chain" + Logger[Request Logger] + Security[Security Headers<br/>CORS, CSRF] + RateLimit[Rate Limiter<br/>100 RPS] + Auth[Authentication<br/>Header/JWT/OAuth2/SAML] + Validation[Request Validator] + end + end + + %% Business Logic Layer + subgraph "Service Layer" + AuthService[Authentication Service] + TokenService[Token Service] + AppService[Application Service] + SessionService[Session Service] + PermService[Permission Service] + end + + %% Data Access Layer + subgraph "Repository Layer" + AppRepo[Application Repository] + TokenRepo[Token Repository] + PermRepo[Permission Repository] + SessionRepo[Session Repository] + end + + %% Infrastructure Layer + subgraph "Infrastructure" + PostgreSQL[(PostgreSQL 15<br/>:5432)] + Redis[(Redis Cache<br/>Optional)] + Metrics[Prometheus Metrics<br/>:9090] + end + + %% External Security + subgraph "Security & Crypto" + HMAC[HMAC Signature<br/>Validation] + BCrypt[BCrypt Hashing<br/>Cost 14] + JWT[JWT Token<br/>
Generation] + end + + %% Flow Connections + Client -->|API Requests| Nginx + Browser -->|HTTPS| Nginx + AuthProvider -->|OAuth2/SAML| API + + Nginx -->|Proxy| React + Nginx -->|API Proxy| API + + React --> AuthContext + React --> APIService + APIService -->|REST API| API + + API --> Logger + Logger --> Security + Security --> RateLimit + RateLimit --> Auth + Auth --> Validation + Validation --> AuthService + Validation --> TokenService + Validation --> AppService + Validation --> SessionService + Validation --> PermService + + AuthService --> AppRepo + TokenService --> TokenRepo + AppService --> AppRepo + SessionService --> SessionRepo + PermService --> PermRepo + + AppRepo --> PostgreSQL + TokenRepo --> PostgreSQL + PermRepo --> PostgreSQL + SessionRepo --> PostgreSQL + + TokenService --> HMAC + TokenService --> BCrypt + TokenService --> JWT + AuthService --> Redis + + API --> Metrics + + %% Styling + classDef frontend fill:#e1f5fe,stroke:#0277bd,stroke-width:2px + classDef api fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px + classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px + classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px + classDef security fill:#ffebee,stroke:#c62828,stroke-width:2px + + class React,AuthContext,APIService frontend + class API,Logger,Security,RateLimit,Auth,Validation api + class AuthService,TokenService,AppService,SessionService,PermService service + class PostgreSQL,Redis,Metrics,AppRepo,TokenRepo,PermRepo,SessionRepo data + class HMAC,BCrypt,JWT security +``` + +--- + +## Request Flow Pipeline + +```mermaid +flowchart TD + Start([HTTP Request]) --> Nginx{Nginx Proxy
<br/>Load Balancer} + + Nginx -->|Static Assets| Frontend[React SPA<br/>Port 3000] + Nginx -->|API Routes| API[Go API Server<br/>Port 8080] + + API --> Logger[Request Logger<br/>Structured Logging] + Logger --> Security[Security Middleware<br/>Headers, CORS, CSRF] + Security --> RateLimit{Rate Limiter<br/>100 RPS, 200 Burst} + + RateLimit -->|Exceeded| RateResponse[429 Too Many Requests] + RateLimit -->|Within Limits| Auth[Authentication<br/>Middleware] + + Auth --> AuthHeader{Auth Provider} + AuthHeader -->|header| HeaderAuth[Header Validator<br/>X-User-Email] + AuthHeader -->|jwt| JWTAuth[JWT Validator<br/>Signature + Claims] + AuthHeader -->|oauth2| OAuth2Auth[OAuth2 Flow<br/>Authorization Code] + AuthHeader -->|saml| SAMLAuth[SAML Assertion<br/>XML Validation] + + HeaderAuth --> AuthCache{Check Cache<br/>Redis 5min TTL} + JWTAuth --> JWTValidation[Signature Validation<br/>Expiry Check] + OAuth2Auth --> OAuth2Exchange[Token Exchange<br/>User Info Retrieval] + SAMLAuth --> SAMLValidation[Assertion Validation<br/>Signature Check] + + AuthCache -->|Hit| AuthContext[Create AuthContext] + AuthCache -->|Miss| DBAuth[Database Lookup<br/>User Permissions] + JWTValidation --> RevocationCheck[Check Revocation List<br/>Redis Cache] + OAuth2Exchange --> SessionStore[Store User Session<br/>PostgreSQL] + SAMLValidation --> SessionStore + + DBAuth --> CacheStore[Store in Cache<br/>5min TTL] + RevocationCheck --> AuthContext + SessionStore --> AuthContext + CacheStore --> AuthContext + + AuthContext --> Validation[Request Validator<br/>JSON Schema] + Validation -->|Invalid| ValidationError[400 Bad Request] + Validation -->|Valid| Router{Route Handler} + + Router -->|/health| HealthHandler[Health Check<br/>DB + Cache Status] + Router -->|/api/applications| AppHandler[Application CRUD<br/>HMAC Key Management] + Router -->|/api/tokens| TokenHandler[Token Operations<br/>Create, Verify, Revoke] + Router -->|/api/login| AuthHandler[Authentication<br/>Login, Renewal] + Router -->|/api/oauth2| OAuth2Handler[OAuth2 Callbacks<br/>State Management] + Router -->|/api/saml| SAMLHandler[SAML Callbacks<br/>Assertion Processing] + + HealthHandler --> Service[Service Layer] + AppHandler --> Service + TokenHandler --> Service + AuthHandler --> Service + OAuth2Handler --> Service + SAMLHandler --> Service + + Service --> Repository[Repository Layer<br/>Database Operations] + Repository --> PostgreSQL[(PostgreSQL<br/>ACID Transactions)] + + Service --> CryptoOps[Cryptographic Operations] + CryptoOps --> HMAC[HMAC Signature<br/>Timestamp Validation] + CryptoOps --> BCrypt[BCrypt Hashing<br/>Cost 14] + CryptoOps --> JWT[JWT Generation<br/>RS256 Signing] + + Repository --> AuditLog[Audit Logging<br/>All Operations] + AuditLog --> AuditTable[(audit_logs table)] + + Service --> Response[HTTP Response] + Response --> Metrics[Prometheus Metrics<br/>
Port 9090] + Response --> End([Response Sent]) + + %% Error Paths + RateResponse --> End + ValidationError --> End + + %% Styling + classDef middleware fill:#e3f2fd,stroke:#1976d2,stroke-width:2px + classDef auth fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px + classDef handler fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px + classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px + classDef crypto fill:#ffebee,stroke:#c62828,stroke-width:2px + classDef error fill:#fce4ec,stroke:#ad1457,stroke-width:2px + + class Logger,Security,RateLimit,Validation middleware + class Auth,HeaderAuth,JWTAuth,OAuth2Auth,SAMLAuth,AuthContext auth + class HealthHandler,AppHandler,TokenHandler,AuthHandler,OAuth2Handler,SAMLHandler,Service handler + class Repository,PostgreSQL,AuditTable,AuthCache data + class CryptoOps,HMAC,BCrypt,JWT crypto + class RateResponse,ValidationError error +``` + +### Request Processing Pipeline + +1. **Load Balancer**: Nginx receives and routes requests +2. **Static Assets**: React SPA served directly by Nginx +3. **API Gateway**: Go server handles API requests +4. **Middleware Chain**: Security, rate limiting, authentication +5. **Route Handler**: Business logic processing +6. **Service Layer**: Transaction management and orchestration +7. **Repository Layer**: Database operations with audit logging +8. 
**Response**: JSON response with metrics collection + +--- + +## Authentication Flow + +```mermaid +sequenceDiagram + participant Client as Client App + participant API as API Gateway + participant Auth as Auth Service + participant DB as PostgreSQL + participant Provider as OAuth2/SAML + participant Cache as Redis Cache + + %% Header-based Authentication + rect rgb(240, 248, 255) + Note over Client, Cache: Header-based Authentication Flow + Client->>API: Request with X-User-Email header + API->>Auth: Validate header auth + Auth->>DB: Check user permissions + DB-->>Auth: Return user context + Auth->>Cache: Cache auth result (5min TTL) + Auth-->>API: AuthContext{UserID, Permissions} + API-->>Client: Authenticated response + end + + %% JWT Authentication Flow + rect rgb(245, 255, 245) + Note over Client, Cache: JWT Authentication Flow + Client->>API: Login request {app_id, permissions} + API->>Auth: Generate JWT token + Auth->>DB: Validate app_id and permissions + DB-->>Auth: Application config + Auth->>Auth: Create JWT with claims
<br/>{user_id, permissions, exp, iat} + Auth-->>API: JWT token + expires_at + API-->>Client: LoginResponse{token, expires_at} + + Note over Client, API: Subsequent requests with JWT + Client->>API: Request with Bearer JWT + API->>Auth: Verify JWT signature + Auth->>Auth: Check expiration & claims + Auth->>Cache: Check revocation list + Cache-->>Auth: Token status + Auth-->>API: Valid AuthContext + API-->>Client: Authorized response + end + + %% OAuth2/SAML Flow + rect rgb(255, 248, 240) + Note over Client, Provider: OAuth2/SAML Authentication Flow + Client->>API: POST /api/login {app_id, redirect_uri} + API->>Auth: Generate OAuth2 state + Auth->>DB: Store state + app context + Auth-->>API: Redirect URL + state + API-->>Client: {redirect_url, state} + + Client->>Provider: Redirect to OAuth2 provider + Provider-->>Client: Authorization code + state + + Client->>API: GET /api/oauth2/callback?code=xxx&state=yyy + API->>Auth: Validate state and exchange code + Auth->>Provider: Exchange code for tokens + Provider-->>Auth: Access token + ID token + Auth->>Provider: Get user info + Provider-->>Auth: User profile + Auth->>DB: Create/update user session + Auth->>Auth: Generate internal JWT + Auth-->>API: JWT token + user context + API-->>Client: Set-Cookie with JWT + redirect + end + + %% Token Renewal Flow + rect rgb(248, 245, 255) + Note over Client, DB: Token Renewal Flow + Client->>API: POST /api/renew {app_id, user_id, token} + API->>Auth: Validate current token + Auth->>Auth: Check token expiration<br/>and max_valid_at + Auth->>DB: Get application config + DB-->>Auth: TokenRenewalDuration, MaxTokenDuration + Auth->>Auth: Generate new JWT<br/>
with extended expiry + Auth->>Cache: Invalidate old token + Auth-->>API: New token + expires_at + API-->>Client: RenewResponse{token, expires_at} + end +``` + +### Authentication Methods + +#### **Header-based Authentication** +- **Use Case**: Service-to-service authentication +- **Security**: HMAC-SHA256 signatures with timestamp validation +- **Replay Protection**: 5-minute timestamp window +- **Caching**: 5-minute Redis cache for performance + +#### **JWT Authentication** +- **Use Case**: User authentication with session management +- **Security**: RSA signatures with revocation checking +- **Token Lifecycle**: Configurable expiration with renewal +- **Claims**: User ID, permissions, application scope + +#### **OAuth2/SAML Authentication** +- **Use Case**: External identity provider integration +- **Security**: Authorization code flow with state validation +- **Session Management**: Database-backed session storage +- **Provider Support**: Configurable provider endpoints + +--- + +## API Design + +### RESTful Endpoints + +#### **Authentication Endpoints** +``` +POST /api/login - Authenticate user, issue JWT +POST /api/renew - Renew JWT token +POST /api/logout - Revoke JWT token +GET /api/verify - Verify token and permissions +``` + +#### **Application Management** +``` +GET /api/applications - List applications (paginated) +POST /api/applications - Create application +GET /api/applications/:id - Get application details +PUT /api/applications/:id - Update application +DELETE /api/applications/:id - Delete application +``` + +#### **Token Management** +``` +GET /api/applications/:id/tokens - List application tokens +POST /api/applications/:id/tokens - Create new token +DELETE /api/tokens/:id - Revoke token +``` + +#### **OAuth2/SAML Integration** +``` +POST /api/oauth2/login - Initiate OAuth2 flow +GET /api/oauth2/callback - OAuth2 callback handler +POST /api/saml/login - Initiate SAML flow +POST /api/saml/callback - SAML assertion handler +``` + +#### **System 
Endpoints** +``` +GET /health - System health check +GET /ready - Readiness probe +GET /metrics - Prometheus metrics +``` + +### Request/Response Patterns + +#### **Authentication Headers** +```http +X-User-Email: user@example.com +X-Auth-Timestamp: 2024-01-15T10:30:00Z +X-Auth-Signature: sha256=abc123... +Authorization: Bearer eyJhbGciOiJSUzI1NiIs... +Content-Type: application/json +``` + +#### **Error Response Format** +```json +{ + "error": "validation_failed", + "message": "Request validation failed", + "details": [ + { + "field": "permissions", + "message": "Invalid permission format", + "value": "invalid.perm" + } + ] +} +``` + +#### **Success Response Format** +```json +{ + "data": { + "id": "550e8400-e29b-41d4-a716-446655440000", + "token": "ABC123_xyz789...", + "permissions": ["app.read", "token.create"], + "created_at": "2024-01-15T10:30:00Z" + } +} +``` + +--- + +## Technology Stack + +### Backend Technologies +- **Language**: Go 1.21+ with modules +- **Web Framework**: Gin HTTP framework +- **Database**: PostgreSQL 15 with connection pooling +- **Authentication**: JWT-Go library with RSA signing +- **Cryptography**: Go standard crypto libraries +- **Caching**: Redis for session and revocation storage +- **Logging**: Zap structured logging +- **Metrics**: Prometheus metrics collection + +### Frontend Technologies +- **Framework**: React 18 with TypeScript +- **UI Library**: Ant Design components +- **State Management**: React Context API +- **HTTP Client**: Axios with interceptors +- **Routing**: React Router with protected routes +- **Build Tool**: Create React App with TypeScript + +### Infrastructure +- **Containerization**: Docker with multi-stage builds +- **Orchestration**: Docker Compose for local development +- **Reverse Proxy**: Nginx with load balancing +- **Database Migrations**: Custom Go migration system +- **Health Monitoring**: Built-in health check endpoints + +### Security Stack +- **TLS**: TLS 1.3 for all communications +- **Hashing**: 
BCrypt with cost 14 for production +- **Signatures**: HMAC-SHA256 and RSA signatures +- **Rate Limiting**: Token bucket algorithm +- **CSRF**: Double-submit cookie pattern +- **Headers**: Comprehensive security headers + +--- + +## Deployment Considerations + +### Container Configuration +```yaml +services: + kms-api: + image: kms-api:latest + ports: + - "8080:8080" + environment: + - DB_HOST=postgres + - DB_PORT=5432 + - JWT_SECRET=${JWT_SECRET} + - HMAC_KEY=${HMAC_KEY} + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 +``` + +### Environment Variables +```bash +# Database Configuration +DB_HOST=localhost +DB_PORT=5432 +DB_NAME=kms +DB_USER=postgres +DB_PASSWORD=postgres + +# Server Configuration +SERVER_HOST=0.0.0.0 +SERVER_PORT=8080 + +# Authentication +AUTH_PROVIDER=header +JWT_SECRET=your-jwt-secret +HMAC_KEY=your-hmac-key + +# Security +RATE_LIMIT_ENABLED=true +RATE_LIMIT_RPS=100 +RATE_LIMIT_BURST=200 +``` + +### Monitoring Setup +```yaml +prometheus: + scrape_configs: + - job_name: 'kms-api' + static_configs: + - targets: ['kms-api:9090'] + scrape_interval: 15s + metrics_path: /metrics +``` + +This architecture documentation provides a comprehensive technical overview of the KMS system, suitable for development teams, system architects, and operations personnel who need to understand, deploy, or maintain the system. \ No newline at end of file diff --git a/docs/DATABASE_SCHEMA.md b/docs/DATABASE_SCHEMA.md new file mode 100644 index 0000000..00ae1cc --- /dev/null +++ b/docs/DATABASE_SCHEMA.md @@ -0,0 +1,852 @@ +# KMS Database Schema Documentation + +## Table of Contents +1. [Schema Overview](#schema-overview) +2. [Entity Relationship Diagram](#entity-relationship-diagram) +3. [Table Definitions](#table-definitions) +4. [Relationships and Constraints](#relationships-and-constraints) +5. [Indexes and Performance](#indexes-and-performance) +6. 
[Security Considerations](#security-considerations) +7. [Migration Strategy](#migration-strategy) +8. [Query Patterns](#query-patterns) + +--- + +## Schema Overview + +The KMS database schema is designed around core entities that manage applications, tokens, permissions, and user sessions. The schema follows PostgreSQL best practices with proper normalization, constraints, and indexing strategies. + +### Core Entities +- **Applications**: Central configuration for API key management +- **Static Tokens**: Long-lived API tokens with HMAC signatures +- **User Sessions**: JWT token tracking with metadata +- **Available Permissions**: Hierarchical permission catalog +- **Granted Permissions**: Token-permission relationships +- **Audit Logs**: Complete audit trail of all operations + +### Design Principles +- **Normalized Design**: Reduces data redundancy +- **Referential Integrity**: Foreign key constraints ensure consistency +- **Audit Trail**: Complete history of all operations +- **Performance Optimized**: Strategic indexing for common queries +- **Security First**: Sensitive data protection and access controls + +--- + +## Entity Relationship Diagram + +```mermaid +erDiagram + %% Core Application Entity + applications { + string app_id PK "Application identifier" + string app_link "Application URL" + text[] type "static|user application types" + string callback_url "OAuth2 callback URL" + string hmac_key "HMAC signing key" + string token_prefix "Custom token prefix" + bigint token_renewal_duration "Token renewal window (ns)" + bigint max_token_duration "Max token lifetime (ns)" + string owner_type "individual|team" + string owner_name "Owner display name" + string owner_owner "Owner identifier" + timestamp created_at + timestamp updated_at + } + + %% Static Token Management + static_tokens { + uuid id PK + string app_id FK "References applications.app_id" + string owner_type "individual|team" + string owner_name "Token owner name" + string owner_owner "Token owner 
identifier" + string key_hash "BCrypt hashed token (cost 14)" + string type "Always 'hmac'" + timestamp created_at + timestamp updated_at + } + + %% User Session Tracking + user_sessions { + uuid id PK + string user_id "User identifier" + string app_id FK "References applications.app_id" + string session_token "Hashed session identifier" + text permissions "JSON array of permissions" + timestamp expires_at "Session expiration" + timestamp max_valid_at "Maximum validity window" + string provider "header|jwt|oauth2|saml" + json metadata "Provider-specific data" + boolean active "Session status" + timestamp created_at + timestamp updated_at + timestamp last_used_at + } + + %% Permission Catalog + available_permissions { + uuid id PK + string scope UK "Unique permission scope" + string name "Human-readable name" + text description "Permission description" + string category "Permission category" + string parent_scope FK "References available_permissions.scope" + boolean is_system "System permission flag" + timestamp created_at + string created_by "Creator identifier" + timestamp updated_at + string updated_by "Last updater" + } + + %% Token-Permission Relationships + granted_permissions { + uuid id PK + string token_type "static|user" + uuid token_id FK "References static_tokens.id" + uuid permission_id FK "References available_permissions.id" + string scope "Denormalized permission scope" + timestamp created_at + string created_by "Grant creator" + boolean revoked "Permission revocation status" + timestamp revoked_at "Revocation timestamp" + string revoked_by "Revoker identifier" + } + + %% Audit Trail + audit_logs { + uuid id PK + timestamp timestamp "Event timestamp" + string user_id "Acting user" + string action "Action performed" + string resource_type "Resource type affected" + string resource_id "Resource identifier" + json old_values "Previous values" + json new_values "New values" + string ip_address "Client IP" + string user_agent "Client user agent" + json 
metadata "Additional context" + } + + %% Relationships + applications ||--o{ static_tokens : "app_id" + applications ||--o{ user_sessions : "app_id" + available_permissions ||--o{ available_permissions : "parent_scope" + available_permissions ||--o{ granted_permissions : "permission_id" + static_tokens ||--o{ granted_permissions : "token_id" + + %% Indexes and Constraints + applications { + index idx_applications_owner_type "owner_type" + index idx_applications_created_at "created_at" + check owner_type_valid "owner_type IN ('individual', 'team')" + check type_not_empty "array_length(type, 1) > 0" + } + + static_tokens { + index idx_static_tokens_app_id "app_id" + index idx_static_tokens_key_hash "key_hash" + unique key_hash_unique "key_hash" + check type_hmac "type = 'hmac'" + } + + user_sessions { + index idx_user_sessions_user_id "user_id" + index idx_user_sessions_app_id "app_id" + index idx_user_sessions_token "session_token" + index idx_user_sessions_expires_at "expires_at" + index idx_user_sessions_active "active" + } + + available_permissions { + index idx_available_permissions_scope "scope" + index idx_available_permissions_category "category" + index idx_available_permissions_parent_scope "parent_scope" + index idx_available_permissions_is_system "is_system" + } + + granted_permissions { + index idx_granted_permissions_token "token_type, token_id" + index idx_granted_permissions_permission_id "permission_id" + index idx_granted_permissions_scope "scope" + index idx_granted_permissions_revoked "revoked" + unique token_permission_unique "token_type, token_id, permission_id" + } +``` + +--- + +## Table Definitions + +### Applications Table + +The central configuration table for all applications in the system. 
+ +```sql +CREATE TABLE applications ( + app_id VARCHAR(100) PRIMARY KEY, + app_link TEXT NOT NULL, + type TEXT[] NOT NULL DEFAULT '{}', + callback_url TEXT, + hmac_key TEXT NOT NULL, + token_prefix VARCHAR(10), + token_renewal_duration BIGINT NOT NULL DEFAULT 3600000000000, -- 1 hour in nanoseconds + max_token_duration BIGINT NOT NULL DEFAULT 86400000000000, -- 24 hours in nanoseconds + owner_type VARCHAR(20) NOT NULL CHECK (owner_type IN ('individual', 'team')), + owner_name VARCHAR(255) NOT NULL, + owner_owner VARCHAR(255) NOT NULL, + created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + + -- Constraints + CONSTRAINT app_id_format CHECK (app_id ~ '^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$'), + CONSTRAINT type_not_empty CHECK (array_length(type, 1) > 0), + CONSTRAINT token_prefix_format CHECK (token_prefix IS NULL OR token_prefix ~ '^[A-Z]{2,4}$'), + CONSTRAINT valid_durations CHECK ( + token_renewal_duration > 0 AND + max_token_duration > 0 AND + max_token_duration > token_renewal_duration + ) +); +``` + +#### Field Descriptions +- **`app_id`**: Unique application identifier, must follow naming conventions +- **`app_link`**: URL to the application for reference +- **`type`**: Array of supported token types (`static`, `user`) +- **`callback_url`**: OAuth2/SAML callback URL for authentication flows +- **`hmac_key`**: HMAC signing key for static token validation +- **`token_prefix`**: Custom prefix for generated tokens (2-4 uppercase letters) +- **`token_renewal_duration`**: How long tokens can be renewed (nanoseconds) +- **`max_token_duration`**: Maximum token lifetime (nanoseconds) +- **`owner_type`**: Individual or team ownership +- **`owner_name`**: Display name of the owner +- **`owner_owner`**: Identifier of the owner (email for individual, team ID for team) + +### Static Tokens Table + +Long-lived API tokens with HMAC-based authentication. 
+ +```sql +CREATE TABLE static_tokens ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + app_id VARCHAR(100) NOT NULL REFERENCES applications(app_id) ON DELETE CASCADE, + owner_type VARCHAR(20) NOT NULL CHECK (owner_type IN ('individual', 'team')), + owner_name VARCHAR(255) NOT NULL, + owner_owner VARCHAR(255) NOT NULL, + key_hash TEXT NOT NULL UNIQUE, + type VARCHAR(10) NOT NULL DEFAULT 'hmac' CHECK (type = 'hmac'), + created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW() +); +``` + +#### Security Features +- **`key_hash`**: BCrypt hash of the actual token (cost 14) +- **Unique constraint**: Prevents token duplication +- **Cascade deletion**: Tokens deleted when application is removed +- **Owner tracking**: Links tokens to their creators + +### User Sessions Table + +JWT token session tracking with comprehensive metadata. + +```sql +CREATE TABLE user_sessions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + user_id VARCHAR(255) NOT NULL, + app_id VARCHAR(100) NOT NULL REFERENCES applications(app_id) ON DELETE CASCADE, + session_token VARCHAR(255) NOT NULL, + permissions TEXT NOT NULL DEFAULT '[]', -- JSON array + expires_at TIMESTAMP WITH TIME ZONE NOT NULL, + max_valid_at TIMESTAMP WITH TIME ZONE NOT NULL, + provider VARCHAR(20) NOT NULL CHECK (provider IN ('header', 'jwt', 'oauth2', 'saml')), + metadata JSONB DEFAULT '{}', + active BOOLEAN NOT NULL DEFAULT true, + created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + last_used_at TIMESTAMP WITH TIME ZONE, + + -- Constraints + CONSTRAINT valid_session_times CHECK (max_valid_at >= expires_at), + CONSTRAINT valid_permissions CHECK (permissions::json IS NOT NULL) +); +``` + +#### Session Management Features +- **Session tracking**: Complete session lifecycle management +- **Multi-provider support**: Tracks authentication method +- **Permission storage**: JSON array of granted 
permissions +- **Activity tracking**: Last used timestamp for session cleanup +- **Flexible metadata**: Provider-specific data storage + +### Available Permissions Table + +Hierarchical permission catalog with system and custom permissions. + +```sql +CREATE TABLE available_permissions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + scope VARCHAR(255) NOT NULL UNIQUE, + name VARCHAR(255) NOT NULL, + description TEXT, + category VARCHAR(100) NOT NULL DEFAULT 'custom', + parent_scope VARCHAR(255) REFERENCES available_permissions(scope) ON DELETE SET NULL, + is_system BOOLEAN NOT NULL DEFAULT false, + created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + created_by VARCHAR(255) NOT NULL, + updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + updated_by VARCHAR(255) NOT NULL, + + -- Constraints + CONSTRAINT scope_format CHECK (scope ~ '^[a-zA-Z][a-zA-Z0-9._]*[a-zA-Z0-9]$'), + CONSTRAINT no_self_reference CHECK (scope != parent_scope) +); +``` + +#### Permission Hierarchy Features +- **Hierarchical structure**: Parent-child permission relationships +- **System permissions**: Built-in permissions that cannot be deleted +- **Flexible scoping**: Dot-notation permission scopes +- **Audit tracking**: Creation and modification history + +### Granted Permissions Table + +Token-permission relationship management with revocation support. 
+ +```sql +CREATE TABLE granted_permissions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + token_type VARCHAR(10) NOT NULL CHECK (token_type IN ('static', 'user')), + token_id UUID NOT NULL, + permission_id UUID NOT NULL REFERENCES available_permissions(id) ON DELETE CASCADE, + scope VARCHAR(255) NOT NULL, -- Denormalized for performance + created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + created_by VARCHAR(255) NOT NULL, + revoked BOOLEAN NOT NULL DEFAULT false, + revoked_at TIMESTAMP WITH TIME ZONE, + revoked_by VARCHAR(255), + + -- Constraints + CONSTRAINT unique_token_permission UNIQUE (token_type, token_id, permission_id), + CONSTRAINT valid_revocation CHECK ( + (revoked = false AND revoked_at IS NULL AND revoked_by IS NULL) OR + (revoked = true AND revoked_at IS NOT NULL AND revoked_by IS NOT NULL) + ) +); +``` + +#### Permission Grant Features +- **Multi-token support**: Works with both static and user tokens +- **Denormalized scope**: Performance optimization for common queries +- **Revocation tracking**: Complete audit trail of permission changes +- **Referential integrity**: Maintains consistency with permission catalog + +### Audit Logs Table + +Complete audit trail of all system operations. 
+ +```sql +CREATE TABLE audit_logs ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(), + user_id VARCHAR(255) NOT NULL, + action VARCHAR(100) NOT NULL, + resource_type VARCHAR(50) NOT NULL, + resource_id VARCHAR(255), + old_values JSONB, + new_values JSONB, + ip_address INET, + user_agent TEXT, + metadata JSONB DEFAULT '{}', + + -- Constraints + CONSTRAINT valid_action CHECK (action ~ '^[a-z][a-z_]*[a-z]$'), + CONSTRAINT valid_resource_type CHECK (resource_type ~ '^[a-z][a-z_]*[a-z]$') +); +``` + +#### Audit Features +- **Complete coverage**: All operations logged +- **Before/after values**: Full change tracking +- **Client context**: IP address and user agent +- **Flexible metadata**: Additional context storage +- **Time-series data**: Ordered by timestamp for analysis + +--- + +## Relationships and Constraints + +### Primary Relationships + +#### **Application → Static Tokens (1:N)** +```sql +ALTER TABLE static_tokens +ADD CONSTRAINT fk_static_tokens_app_id +FOREIGN KEY (app_id) REFERENCES applications(app_id) ON DELETE CASCADE; +``` + +#### **Application → User Sessions (1:N)** +```sql +ALTER TABLE user_sessions +ADD CONSTRAINT fk_user_sessions_app_id +FOREIGN KEY (app_id) REFERENCES applications(app_id) ON DELETE CASCADE; +``` + +#### **Available Permissions → Self (Hierarchy)** +```sql +ALTER TABLE available_permissions +ADD CONSTRAINT fk_available_permissions_parent +FOREIGN KEY (parent_scope) REFERENCES available_permissions(scope) ON DELETE SET NULL; +``` + +#### **Granted Permissions → Available Permissions (N:1)** +```sql +ALTER TABLE granted_permissions +ADD CONSTRAINT fk_granted_permissions_permission +FOREIGN KEY (permission_id) REFERENCES available_permissions(id) ON DELETE CASCADE; +``` + +### Data Integrity Constraints + +#### **Check Constraints** +```sql +-- Application ID format validation +ALTER TABLE applications +ADD CONSTRAINT app_id_format +CHECK (app_id ~ 
'^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$'); + +-- Owner type validation +ALTER TABLE applications +ADD CONSTRAINT owner_type_valid +CHECK (owner_type IN ('individual', 'team')); + +-- Token type validation +ALTER TABLE static_tokens +ADD CONSTRAINT type_hmac +CHECK (type = 'hmac'); + +-- Permission scope format +ALTER TABLE available_permissions +ADD CONSTRAINT scope_format +CHECK (scope ~ '^[a-zA-Z][a-zA-Z0-9._]*[a-zA-Z0-9]$'); + +-- Session time validation +ALTER TABLE user_sessions +ADD CONSTRAINT valid_session_times +CHECK (max_valid_at >= expires_at); +``` + +#### **Unique Constraints** +```sql +-- Unique token hashes +ALTER TABLE static_tokens +ADD CONSTRAINT key_hash_unique UNIQUE (key_hash); + +-- Unique permission scopes +ALTER TABLE available_permissions +ADD CONSTRAINT scope_unique UNIQUE (scope); + +-- Unique token-permission relationships +ALTER TABLE granted_permissions +ADD CONSTRAINT unique_token_permission +UNIQUE (token_type, token_id, permission_id); +``` + +--- + +## Indexes and Performance + +### Primary Indexes + +#### **Applications Table** +```sql +-- Primary key index (automatic) +CREATE INDEX idx_applications_owner_type ON applications(owner_type); +CREATE INDEX idx_applications_created_at ON applications(created_at); +CREATE INDEX idx_applications_owner_owner ON applications(owner_owner); +``` + +#### **Static Tokens Table** +```sql +-- Foreign key and lookup indexes +CREATE INDEX idx_static_tokens_app_id ON static_tokens(app_id); +CREATE INDEX idx_static_tokens_key_hash ON static_tokens(key_hash); -- For token verification +CREATE INDEX idx_static_tokens_created_at ON static_tokens(created_at); +CREATE INDEX idx_static_tokens_owner_owner ON static_tokens(owner_owner); +``` + +#### **User Sessions Table** +```sql +-- Session lookup indexes +CREATE INDEX idx_user_sessions_user_id ON user_sessions(user_id); +CREATE INDEX idx_user_sessions_app_id ON user_sessions(app_id); +CREATE INDEX idx_user_sessions_token ON user_sessions(session_token); 
+CREATE INDEX idx_user_sessions_expires_at ON user_sessions(expires_at); +CREATE INDEX idx_user_sessions_active ON user_sessions(active); + +-- Composite index for active session lookup +CREATE INDEX idx_user_sessions_active_lookup +ON user_sessions(user_id, app_id, active) WHERE active = true; + +-- Cleanup index for expired sessions +CREATE INDEX idx_user_sessions_cleanup +ON user_sessions(expires_at) WHERE active = true; +``` + +#### **Available Permissions Table** +```sql +-- Permission lookup indexes +CREATE INDEX idx_available_permissions_scope ON available_permissions(scope); +CREATE INDEX idx_available_permissions_category ON available_permissions(category); +CREATE INDEX idx_available_permissions_parent_scope ON available_permissions(parent_scope); +CREATE INDEX idx_available_permissions_is_system ON available_permissions(is_system); + +-- Hierarchy traversal index +CREATE INDEX idx_available_permissions_hierarchy +ON available_permissions(parent_scope, scope); +``` + +#### **Granted Permissions Table** +```sql +-- Token permission lookup +CREATE INDEX idx_granted_permissions_token ON granted_permissions(token_type, token_id); +CREATE INDEX idx_granted_permissions_permission_id ON granted_permissions(permission_id); +CREATE INDEX idx_granted_permissions_scope ON granted_permissions(scope); +CREATE INDEX idx_granted_permissions_revoked ON granted_permissions(revoked); + +-- Active permissions index +CREATE INDEX idx_granted_permissions_active +ON granted_permissions(token_type, token_id, scope) WHERE revoked = false; + +-- Permission cleanup index +CREATE INDEX idx_granted_permissions_revoked_at ON granted_permissions(revoked_at); +``` + +#### **Audit Logs Table** +```sql +-- Audit query indexes +CREATE INDEX idx_audit_logs_timestamp ON audit_logs(timestamp); +CREATE INDEX idx_audit_logs_user_id ON audit_logs(user_id); +CREATE INDEX idx_audit_logs_action ON audit_logs(action); +CREATE INDEX idx_audit_logs_resource ON audit_logs(resource_type, resource_id); 
+ +-- Time-series partitioning preparation +-- date_trunc() on TIMESTAMP WITH TIME ZONE is not IMMUTABLE and cannot be indexed directly; normalize to UTC first +CREATE INDEX idx_audit_logs_monthly +ON audit_logs(date_trunc('month', "timestamp" AT TIME ZONE 'UTC'), "timestamp"); +``` + +### Performance Optimization + +#### **Partial Indexes** +```sql +-- Index only active sessions +CREATE INDEX idx_active_sessions +ON user_sessions(user_id, app_id) WHERE active = true; + +-- Index only non-revoked permissions +CREATE INDEX idx_active_permissions +ON granted_permissions(token_id, scope) WHERE revoked = false; + +-- Index only system permissions +CREATE INDEX idx_system_permissions +ON available_permissions(scope, parent_scope) WHERE is_system = true; +``` + +#### **Composite Indexes** +```sql +-- Application ownership queries +CREATE INDEX idx_applications_ownership +ON applications(owner_type, owner_owner, created_at); + +-- Token verification queries +CREATE INDEX idx_token_verification +ON static_tokens(app_id, key_hash); + +-- Permission evaluation queries +CREATE INDEX idx_permission_evaluation +ON granted_permissions(token_type, token_id, permission_id, revoked); +``` + +--- + +## Security Considerations + +### Data Protection + +#### **Sensitive Data Handling** +```sql +-- HMAC keys should be encrypted at application level +-- Token hashes use BCrypt with cost 14 +-- Audit logs contain no sensitive data in plaintext + +-- Row-level security (RLS) for multi-tenant isolation +ALTER TABLE applications ENABLE ROW LEVEL SECURITY; +ALTER TABLE static_tokens ENABLE ROW LEVEL SECURITY; +ALTER TABLE user_sessions ENABLE ROW LEVEL SECURITY; + +-- Example RLS policy for application isolation +-- (kms_api_service is the application role created under Access Control; +-- passing true to current_setting() yields NULL, and therefore no visible rows, when app.user_id is unset) +CREATE POLICY app_owner_policy ON applications +FOR ALL TO kms_api_service +USING (owner_owner = current_setting('app.user_id', true)); +``` + +#### **Access Control** +```sql +-- Database roles for different access patterns +CREATE ROLE kms_api_service; +CREATE ROLE kms_readonly; +CREATE ROLE kms_admin; + +-- Grant appropriate permissions +GRANT SELECT, INSERT, UPDATE, DELETE ON applications TO kms_api_service; +GRANT SELECT,
INSERT, UPDATE, DELETE ON static_tokens TO kms_api_service; +GRANT SELECT, INSERT, UPDATE, DELETE ON user_sessions TO kms_api_service; +GRANT SELECT, INSERT, UPDATE, DELETE ON granted_permissions TO kms_api_service; +GRANT SELECT, INSERT ON audit_logs TO kms_api_service; + +-- Read-only access for reporting +GRANT SELECT ON ALL TABLES IN SCHEMA public TO kms_readonly; + +-- Admin access for maintenance +GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO kms_admin; +``` + +### Audit and Compliance + +#### **Audit Triggers** +```sql +-- Automatic audit logging trigger +CREATE OR REPLACE FUNCTION audit_trigger_function() +RETURNS TRIGGER AS $$ +BEGIN + INSERT INTO audit_logs ( + user_id, action, resource_type, resource_id, + old_values, new_values, ip_address + ) VALUES ( + -- user_id is NOT NULL: fall back to the database role when app.user_id is unset + COALESCE(current_setting('app.user_id', true), session_user::text), + -- TG_OP is upper case; the valid_action check only accepts lower case + lower(TG_OP), + TG_TABLE_NAME, + COALESCE(NEW.id, OLD.id)::text, + -- Record the previous row for UPDATE and DELETE, the new row for INSERT and UPDATE + CASE WHEN TG_OP IN ('UPDATE', 'DELETE') THEN to_jsonb(OLD) ELSE NULL END, + CASE WHEN TG_OP IN ('INSERT', 'UPDATE') THEN to_jsonb(NEW) ELSE NULL END, + inet_client_addr() + ); + RETURN COALESCE(NEW, OLD); +END; +$$ LANGUAGE plpgsql SECURITY DEFINER; + +-- Apply audit trigger to sensitive tables +CREATE TRIGGER audit_applications +AFTER INSERT OR UPDATE OR DELETE ON applications +FOR EACH ROW EXECUTE FUNCTION audit_trigger_function(); + +CREATE TRIGGER audit_static_tokens +AFTER INSERT OR UPDATE OR DELETE ON static_tokens +FOR EACH ROW EXECUTE FUNCTION audit_trigger_function(); + +CREATE TRIGGER audit_granted_permissions +AFTER INSERT OR UPDATE OR DELETE ON granted_permissions +FOR EACH ROW EXECUTE FUNCTION audit_trigger_function(); +``` + +--- + +## Migration Strategy + +### Migration Framework + +The KMS uses a custom Go migration system with the following structure: + +``` +migrations/ +├── 001_initial_schema.up.sql +├── 001_initial_schema.down.sql +├── 002_user_sessions.up.sql +├── 002_user_sessions.down.sql +├── 003_add_token_prefix.up.sql +└──
003_add_token_prefix.down.sql +``` + +#### **Migration Management** +```go +// File: internal/database/migrations.go +type Migration struct { + Version int + Name string + UpScript string + DownScript string + AppliedAt time.Time +} + +func RunMigrations(db *sql.DB, migrationsPath string) error { + // Create migration tracking table + createMigrationTable(db) + + // Get applied migrations + applied, err := getAppliedMigrations(db) + if err != nil { + return err + } + + // Run pending migrations + return runPendingMigrations(db, migrationsPath, applied) +} +``` + +### Schema Versioning + +#### **Version Control** +- **Sequential numbering**: 001, 002, 003... +- **Descriptive names**: Clear migration purpose +- **Rollback support**: Down scripts for every migration +- **Atomic operations**: Each migration in a transaction + +#### **Migration Best Practices** +```sql +-- Always start with BEGIN; and end with COMMIT; +BEGIN; + +-- Add new columns with default values +ALTER TABLE applications +ADD COLUMN token_prefix VARCHAR(10) DEFAULT NULL; + +-- Add constraints after data migration +ALTER TABLE applications +ADD CONSTRAINT token_prefix_format +CHECK (token_prefix IS NULL OR token_prefix ~ '^[A-Z]{2,4}$'); + +-- Create indexes concurrently (outside transaction) +COMMIT; + +CREATE INDEX CONCURRENTLY idx_applications_token_prefix +ON applications(token_prefix) WHERE token_prefix IS NOT NULL; +``` + +--- + +## Query Patterns + +### Common Query Patterns + +#### **Application Queries** +```sql +-- Get application with ownership check +SELECT a.* FROM applications a +WHERE a.app_id = $1 + AND (a.owner_owner = $2 OR $2 = 'admin@example.com'); + +-- List applications for user with pagination +SELECT a.app_id, a.app_link, a.owner_name, a.created_at +FROM applications a +WHERE a.owner_owner = $1 +ORDER BY a.created_at DESC +LIMIT $2 OFFSET $3; +``` + +#### **Token Verification Queries** +```sql +-- Verify static token +SELECT st.id, st.app_id, st.key_hash +FROM static_tokens 
st +WHERE st.app_id = $1; +-- key_hash is a BCrypt hash, so the application compares the presented token against each returned candidate + +-- Get token permissions +SELECT ap.scope +FROM granted_permissions gp +JOIN available_permissions ap ON gp.permission_id = ap.id +WHERE gp.token_type = 'static' + AND gp.token_id = $1 + AND gp.revoked = false; +``` + +#### **Session Management Queries** +```sql +-- Get active user session +SELECT us.* FROM user_sessions us +WHERE us.user_id = $1 + AND us.app_id = $2 + AND us.active = true + AND us.expires_at > NOW() +ORDER BY us.created_at DESC +LIMIT 1; + +-- Clean up expired sessions +UPDATE user_sessions +SET active = false, updated_at = NOW() +WHERE active = true + AND expires_at < NOW(); +``` + +#### **Permission Evaluation Queries** +```sql +-- Check user permission +WITH user_permissions AS ( + SELECT DISTINCT ap.scope + FROM user_sessions us + -- Restrict grants to this session; without the token_id match every 'user' grant would be joined + JOIN granted_permissions gp ON gp.token_type = 'user' AND gp.token_id = us.id + JOIN available_permissions ap ON gp.permission_id = ap.id + WHERE us.user_id = $1 + AND us.app_id = $2 + AND us.active = true + AND us.expires_at > NOW() + AND gp.revoked = false +) +SELECT EXISTS( + SELECT 1 FROM user_permissions + WHERE scope = $3 OR scope = split_part($3, '.', 1) +); + +-- Get permission hierarchy +WITH RECURSIVE permission_tree AS ( + -- Base case: root permissions + SELECT id, scope, name, parent_scope, 0 as level + FROM available_permissions + WHERE parent_scope IS NULL + + UNION ALL + + -- Recursive case: child permissions + SELECT ap.id, ap.scope, ap.name, ap.parent_scope, pt.level + 1 + FROM available_permissions ap + JOIN permission_tree pt ON ap.parent_scope = pt.scope +) +SELECT * FROM permission_tree ORDER BY level, scope; +``` + +### Performance Tuning + +#### **Query Optimization** +```sql +-- Use EXPLAIN ANALYZE for query planning +EXPLAIN (ANALYZE, BUFFERS) +SELECT st.id FROM static_tokens st +WHERE st.app_id = 'test-app' AND st.key_hash = 'hash123'; + +-- Optimize with covering indexes +CREATE INDEX idx_static_tokens_covering +ON static_tokens(app_id, key_hash) +INCLUDE (id, created_at); + +-- Use
partial indexes for frequent filters +CREATE INDEX idx_active_sessions_partial +ON user_sessions(user_id, app_id, expires_at) +WHERE active = true; +``` + +#### **Connection Pooling Configuration** +```yaml +database: + host: postgres + port: 5432 + name: kms + user: kms_api_service + max_open_connections: 25 + max_idle_connections: 5 + connection_max_lifetime: 300s + connection_max_idle_time: 60s +``` + +This database schema documentation provides comprehensive coverage of the KMS data model, suitable for developers, database administrators, and system architects who need to understand, maintain, or extend the database layer. \ No newline at end of file diff --git a/docs/DEPLOYMENT_GUIDE.md b/docs/DEPLOYMENT_GUIDE.md new file mode 100644 index 0000000..ef23f82 --- /dev/null +++ b/docs/DEPLOYMENT_GUIDE.md @@ -0,0 +1,1233 @@ +# KMS Deployment Guide + +## Table of Contents +1. [Deployment Overview](#deployment-overview) +2. [Container Architecture](#container-architecture) +3. [Environment Configuration](#environment-configuration) +4. [Docker Compose Setup](#docker-compose-setup) +5. [Production Deployment](#production-deployment) +6. [Monitoring and Health Checks](#monitoring-and-health-checks) +7. [Security Configuration](#security-configuration) +8. [Backup and Recovery](#backup-and-recovery) +9. [Troubleshooting](#troubleshooting) + +--- + +## Deployment Overview + +The KMS is designed as a containerized application using Docker and Docker Compose for orchestration. The system consists of multiple services working together to provide secure API key management capabilities. 
+ +### Service Architecture +- **kms-nginx**: Reverse proxy and load balancer +- **kms-api**: Go backend API service +- **kms-frontend**: React TypeScript frontend +- **kms-postgres**: PostgreSQL database +- **kms-redis**: Redis cache (optional) + +### Deployment Modes +- **Development**: Docker Compose with hot reload +- **Testing**: Isolated container environment +- **Staging**: Production-like with monitoring +- **Production**: High availability with load balancing + +--- + +## Container Architecture + +```mermaid +graph TB + subgraph "External Network" + Internet[Internet] + CDN[Content Delivery Network<br/>Static Assets] + DNS[DNS<br/>Load Balancing] + end + + subgraph "Load Balancer Tier" + LB1[Load Balancer 1<br/>HAProxy/Nginx] + LB2[Load Balancer 2<br/>HAProxy/Nginx] + VIP[Virtual IP<br/>Failover] + end + + subgraph "Container Orchestration - Docker Compose" + subgraph "Reverse Proxy" + Nginx1[Nginx Container<br/>kms-nginx<br/>Port 8081:80] + end + + subgraph "API Tier" + API1[API Service Container<br/>kms-api-service<br/>Port 8080:8080] + Metrics1[Metrics Endpoint<br/>Port 9090:9090<br/>Prometheus] + end + + subgraph "Frontend Tier" + Frontend1[React SPA Container<br/>kms-frontend<br/>Port 3000:80] + StaticAssets[Static Assets<br/>Nginx served] + end + + subgraph "Network" + InternalNet[kms-network<br/>Bridge Network<br/>Internal Communication] + end + end + + subgraph "Data Tier" + subgraph "Primary Database" + PostgreSQL1[PostgreSQL 15<br/>kms-postgres<br/>Port 5432:5432] + DBData[Persistent Volume<br/>postgres_data] + end + + subgraph "Cache Layer" + Redis1[Redis Cache<br/>Optional<br/>Session Store] + end + + subgraph "Migration System" + Migrations[Database Migrations<br/>Auto-run on startup<br/>Volume mounted] + end + end + + subgraph "Monitoring & Observability" + subgraph "Metrics Collection" + Prometheus[Prometheus<br/>Metrics Scraping] + Grafana[Grafana<br/>Dashboards] + end + + subgraph "Logging" + LogAggregator[Log Aggregation<br/>ELK Stack / Loki] + StructuredLogs[Structured Logging<br/>JSON format] + end + + subgraph "Health Monitoring" + HealthCheck[Health Checks<br/>/health endpoint] + Alerting[Alerting<br/>PagerDuty/Slack] + end + end + + subgraph "Security & Compliance" + subgraph "Certificate Management" + TLS[TLS Certificates<br/>Let's Encrypt/Manual] + CertRotation[Certificate Rotation<br/>Automated] + end + + subgraph "Secrets Management" + EnvVars[Environment Variables<br/>Container secrets] + VaultIntegration[Vault Integration<br/>Secret rotation] + end + + subgraph "Backup & Recovery" + DBBackup[Database Backups<br/>Automated daily] + DisasterRecovery[Disaster Recovery<br/>Multi-region] + end + end + + subgraph "Development Environment" + LocalDev[Local Development<br/>docker-compose.yml] + TestEnv[Test Environment<br/>Isolated containers] + CICDPipeline[CI/CD Pipeline<br/>GitHub Actions] + end + + %% External Connections + Internet --> DNS + DNS --> VIP + CDN --> Frontend1 + + %% Load Balancer Configuration + VIP --> LB1 + VIP --> LB2 + LB1 --> Nginx1 + LB2 --> Nginx1 + + %% Container Communication + Nginx1 --> API1 + Nginx1 --> Frontend1 + API1 --> PostgreSQL1 + API1 --> Redis1 + API1 --> Metrics1 + + %% Data Flow + PostgreSQL1 --> DBData + API1 --> Migrations + Migrations --> PostgreSQL1 + + %% Network Isolation + InternalNet --> API1 + InternalNet --> Frontend1 + InternalNet --> PostgreSQL1 + InternalNet --> Redis1 + InternalNet --> Nginx1 + + %% Monitoring Connections + API1 --> Prometheus + Prometheus --> Grafana + API1 --> LogAggregator + LogAggregator --> StructuredLogs + HealthCheck --> API1 + HealthCheck --> PostgreSQL1 + HealthCheck --> Alerting + + %% Security Connections + TLS --> Nginx1 + CertRotation --> TLS + EnvVars --> API1 + VaultIntegration --> EnvVars + DBBackup --> PostgreSQL1 + DisasterRecovery --> DBBackup + + %% Development Flow + CICDPipeline --> LocalDev + LocalDev --> TestEnv + TestEnv --> API1 + + %% Styling + classDef external fill:#ffecb3,stroke:#f57c00,stroke-width:2px + classDef loadbalancer fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px + classDef container fill:#c8e6c9,stroke:#388e3c,stroke-width:2px + classDef data fill:#e3f2fd,stroke:#1976d2,stroke-width:2px + classDef monitoring fill:#fff3e0,stroke:#ef6c00,stroke-width:2px + classDef security fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px + classDef development fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px + + class Internet,CDN,DNS external + class LB1,LB2,VIP loadbalancer + class Nginx1,API1,Frontend1,Metrics1,InternalNet,StaticAssets container + class PostgreSQL1,Redis1,DBData,Migrations data + class Prometheus,Grafana,LogAggregator,StructuredLogs,HealthCheck,Alerting monitoring + class TLS,CertRotation,EnvVars,VaultIntegration,DBBackup,DisasterRecovery security + class LocalDev,TestEnv,CICDPipeline development +``` + +--- + +## Environment
Configuration + +### Environment Variables + +#### **Database Configuration** +```bash +# PostgreSQL Database +DB_HOST=postgres +DB_PORT=5432 +DB_NAME=kms +DB_USER=postgres +DB_PASSWORD=secure_password_here +DB_MAX_OPEN_CONNECTIONS=25 +DB_MAX_IDLE_CONNECTIONS=5 +DB_CONNECTION_MAX_LIFETIME=300s +DB_CONNECTION_MAX_IDLE_TIME=60s +``` + +#### **Server Configuration** +```bash +# API Server +SERVER_HOST=0.0.0.0 +SERVER_PORT=8080 +SERVER_READ_TIMEOUT=30s +SERVER_WRITE_TIMEOUT=30s +SERVER_IDLE_TIMEOUT=120s + +# Environment +ENVIRONMENT=production +LOG_LEVEL=info +DEBUG_MODE=false +``` + +#### **Authentication Configuration** +```bash +# Authentication Provider +AUTH_PROVIDER=header +AUTH_HEADER_USER_EMAIL=X-User-Email +AUTH_SIGNING_KEY=your_hmac_signing_key_here + +# JWT Configuration +JWT_SECRET=your_jwt_secret_here +JWT_ISSUER=kms-api-service +JWT_EXPIRY=1h + +# OAuth2 Configuration (optional) +OAUTH2_CLIENT_ID=your_oauth2_client_id +OAUTH2_CLIENT_SECRET=your_oauth2_client_secret +OAUTH2_REDIRECT_URL=https://your-domain.com/api/oauth2/callback +OAUTH2_PROVIDER_URL=https://oauth-provider.com + +# SAML Configuration (optional) +SAML_IDP_URL=https://saml-provider.com/sso +SAML_CERT_PATH=/etc/ssl/certs/saml.crt +SAML_KEY_PATH=/etc/ssl/private/saml.key +``` + +#### **Security Configuration** +```bash +# Rate Limiting +RATE_LIMIT_ENABLED=true +RATE_LIMIT_RPS=100 +RATE_LIMIT_BURST=200 +AUTH_RATE_LIMIT_RPS=5 +AUTH_RATE_LIMIT_BURST=10 + +# Security Settings +CSRF_TOKEN_MAX_AGE=1h +MAX_AUTH_FAILURES=5 +IP_BLOCK_DURATION=1h +AUTH_FAILURE_WINDOW=15m +REQUEST_MAX_AGE=5m + +# CORS Settings +CORS_ALLOWED_ORIGINS=https://your-frontend-domain.com,https://admin.your-domain.com +CORS_ALLOWED_METHODS=GET,POST,PUT,DELETE,OPTIONS +CORS_ALLOWED_HEADERS=Origin,Content-Type,Accept,Authorization,X-User-Email,X-CSRF-Token +``` + +#### **Monitoring Configuration** +```bash +# Metrics +METRICS_ENABLED=true +METRICS_PORT=9090 +METRICS_PATH=/metrics + +# Health Checks +HEALTH_CHECK_ENABLED=true 
+HEALTH_CHECK_INTERVAL=30s +``` + +### Configuration Management + +#### **Environment Files** +```bash +# Development environment +.env.development + +# Testing environment +.env.test + +# Staging environment +.env.staging + +# Production environment +.env.production +``` + +#### **Docker Environment** +```yaml +# docker-compose.override.yml for local development +version: '3.8' +services: + kms-api: + environment: + - LOG_LEVEL=debug + - DEBUG_MODE=true + volumes: + - ./:/app + command: air -c .air.toml # Hot reload +``` + +--- + +## Docker Compose Setup + +### Development Configuration + +#### **docker-compose.yml** +```yaml +version: '3.8' + +services: + # PostgreSQL Database + kms-postgres: + image: postgres:15-alpine + container_name: kms-postgres + environment: + POSTGRES_DB: ${DB_NAME:-kms} + POSTGRES_USER: ${DB_USER:-postgres} + POSTGRES_PASSWORD: ${DB_PASSWORD:-postgres} + POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C" + volumes: + - postgres_data:/var/lib/postgresql/data + # ./migrations is deliberately not mounted into /docker-entrypoint-initdb.d: + # the entrypoint runs every *.sql file there in name order, *.down.sql scripts included. + # Migrations are applied by the API service on startup instead. + ports: + - "5432:5432" + healthcheck: + test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-postgres}"] + interval: 30s + timeout: 10s + retries: 5 + start_period: 30s + networks: + - kms-network + restart: unless-stopped + + # Redis Cache (Optional) + kms-redis: + image: redis:7-alpine + container_name: kms-redis + command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD:-redis_password} + volumes: + - redis_data:/data + ports: + - "6379:6379" + healthcheck: + test: ["CMD", "redis-cli", "ping"] + interval: 30s + timeout: 10s + retries: 3 + networks: + - kms-network + restart: unless-stopped + + # API Service + kms-api: + build: + context: .
+ dockerfile: Dockerfile + target: production + container_name: kms-api + environment: + - DB_HOST=kms-postgres + - DB_PORT=5432 + - DB_NAME=${DB_NAME:-kms} + - DB_USER=${DB_USER:-postgres} + - DB_PASSWORD=${DB_PASSWORD:-postgres} + - REDIS_HOST=kms-redis + - REDIS_PORT=6379 + - REDIS_PASSWORD=${REDIS_PASSWORD:-redis_password} + - SERVER_PORT=8080 + - JWT_SECRET=${JWT_SECRET} + - AUTH_SIGNING_KEY=${AUTH_SIGNING_KEY} + - LOG_LEVEL=${LOG_LEVEL:-info} + - RATE_LIMIT_ENABLED=true + ports: + - "8080:8080" + - "9090:9090" # Metrics port + depends_on: + kms-postgres: + condition: service_healthy + kms-redis: + condition: service_healthy + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 60s + networks: + - kms-network + restart: unless-stopped + volumes: + - ./logs:/app/logs + + # Frontend Service + kms-frontend: + build: + context: ./kms-frontend + dockerfile: Dockerfile + target: production + container_name: kms-frontend + environment: + - REACT_APP_API_BASE_URL=http://localhost:8081/api + ports: + - "3000:80" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:80"] + interval: 30s + timeout: 10s + retries: 3 + networks: + - kms-network + restart: unless-stopped + + # Nginx Reverse Proxy + kms-nginx: + image: nginx:alpine + container_name: kms-nginx + ports: + - "8081:80" + - "8443:443" + volumes: + - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro + - ./nginx/default.conf:/etc/nginx/conf.d/default.conf:ro + - ./ssl:/etc/ssl/certs:ro + depends_on: + - kms-api + - kms-frontend + healthcheck: + test: ["CMD", "nginx", "-t"] + interval: 30s + timeout: 10s + retries: 3 + networks: + - kms-network + restart: unless-stopped + +networks: + kms-network: + driver: bridge + ipam: + config: + - subnet: 172.20.0.0/16 + +volumes: + postgres_data: + driver: local + redis_data: + driver: local +``` + +#### **Dockerfile (Multi-stage)** +```dockerfile +# Build stage +FROM golang:1.21-alpine AS 
builder + +WORKDIR /app + +# Install dependencies +RUN apk add --no-cache git ca-certificates tzdata + +# Copy go mod files +COPY go.mod go.sum ./ +RUN go mod download + +# Copy source code +COPY . . + +# Build the binary +RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main ./cmd/server + +# Production stage +FROM alpine:latest AS production + +RUN apk --no-cache add ca-certificates curl + +# Use a directory the non-root user can enter (/root is not accessible to appuser) +WORKDIR /app + +# Copy the binary from builder stage +COPY --from=builder /app/main . +COPY --from=builder /app/migrations ./migrations + +# Create non-root user +RUN addgroup -g 1001 appgroup && \ + adduser -D -u 1001 -G appgroup appuser + +USER appuser + +EXPOSE 8080 9090 + +HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \ + CMD curl -f http://localhost:8080/health || exit 1 + +CMD ["./main"] +``` + +### Production Configuration + +#### **docker-compose.prod.yml** +```yaml +version: '3.8' + +services: + kms-postgres: + image: postgres:15-alpine + environment: + POSTGRES_DB: ${DB_NAME} + POSTGRES_USER: ${DB_USER} + POSTGRES_PASSWORD: ${DB_PASSWORD} + volumes: + - postgres_data:/var/lib/postgresql/data + - ./postgres/postgresql.conf:/etc/postgresql/postgresql.conf + - ./postgres/pg_hba.conf:/etc/postgresql/pg_hba.conf + # max_connections is passed as a server option; the official image has no POSTGRES_MAX_CONNECTIONS variable + command: postgres -c config_file=/etc/postgresql/postgresql.conf -c max_connections=100 + deploy: + resources: + limits: + memory: 1G + cpus: '0.5' + reservations: + memory: 512M + cpus: '0.25' + restart: always + logging: + driver: "json-file" + options: + max-size: "10m" + max-file: "3" + + kms-api: + image: kms-api:latest + environment: + - DB_HOST=kms-postgres + - DB_NAME=${DB_NAME} + - DB_USER=${DB_USER} + - DB_PASSWORD=${DB_PASSWORD} + - JWT_SECRET=${JWT_SECRET} + - AUTH_SIGNING_KEY=${AUTH_SIGNING_KEY} + - LOG_LEVEL=info + - ENVIRONMENT=production + deploy: + replicas: 3 + resources: + limits: + memory: 512M + cpus: '0.5' + reservations: + memory: 256M + cpus: '0.25' + restart_policy: + condition:
on-failure + delay: 5s + max_attempts: 3 + window: 120s + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + logging: + driver: "json-file" + options: + max-size: "50m" + max-file: "5" + + kms-nginx: + image: nginx:alpine + ports: + - "80:80" + - "443:443" + volumes: + - ./nginx/nginx.prod.conf:/etc/nginx/nginx.conf:ro + - ./ssl:/etc/ssl/certs:ro + deploy: + resources: + limits: + memory: 256M + cpus: '0.25' + restart: always +``` + +--- + +## Production Deployment + +### Load Balancer Configuration + +#### **HAProxy Configuration** +```bash +# /etc/haproxy/haproxy.cfg +global + daemon + maxconn 4096 + log stdout local0 + +defaults + mode http + timeout connect 5000ms + timeout client 50000ms + timeout server 50000ms + option httplog + +frontend kms_frontend + bind *:80 + bind *:443 ssl crt /etc/ssl/certs/kms.pem + redirect scheme https if !{ ssl_fc } + + # Rate limiting + stick-table type ip size 100k expire 30s store http_req_rate(10s) + http-request track-sc0 src + http-request reject if { sc_http_req_rate(0) gt 20 } + + default_backend kms_api_servers + +backend kms_api_servers + balance roundrobin + option httpchk GET /health + + server api1 kms-api-1:8080 check + server api2 kms-api-2:8080 check + server api3 kms-api-3:8080 check +``` + +#### **Nginx Load Balancer** +```nginx +# /etc/nginx/nginx.conf +upstream kms_api_backend { + least_conn; + server kms-api-1:8080 weight=3 max_fails=3 fail_timeout=30s; + server kms-api-2:8080 weight=3 max_fails=3 fail_timeout=30s; + server kms-api-3:8080 weight=3 max_fails=3 fail_timeout=30s; +} + +upstream kms_frontend_backend { + server kms-frontend-1:80; + server kms-frontend-2:80; +} + +server { + listen 80; + server_name kms.yourdomain.com; + return 301 https://$server_name$request_uri; +} + +server { + listen 443 ssl http2; + server_name kms.yourdomain.com; + + # SSL Configuration + ssl_certificate /etc/ssl/certs/kms.crt; + ssl_certificate_key 
/etc/ssl/private/kms.key; + ssl_protocols TLSv1.2 TLSv1.3; + ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384; + ssl_prefer_server_ciphers off; + + # Security Headers + add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"; + add_header X-Frame-Options DENY; + add_header X-Content-Type-Options nosniff; + add_header X-XSS-Protection "1; mode=block"; + add_header Referrer-Policy "strict-origin-when-cross-origin"; + + # Rate Limiting + # limit_req_zone is only valid in the http context, so declare these zones + # next to the upstream blocks rather than inside this server block: + # limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s; + # limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/s; + + # API Routes + location /api/ { + limit_req zone=api burst=20 nodelay; + proxy_pass http://kms_api_backend; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + } + + # Authentication Routes (stricter rate limiting) + location /api/login { + limit_req zone=auth burst=10 nodelay; + proxy_pass http://kms_api_backend; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + } + + # Frontend Routes + location / { + proxy_pass http://kms_frontend_backend; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + + # SPA routing support (try_files ... /index.html) belongs in the kms-frontend + # container's own Nginx config; combined with proxy_pass here it would loop on /index.html + } + + # Health Check + location /health { + access_log off; + proxy_pass http://kms_api_backend; + } + + # Metrics (restricted access) + location /metrics { + allow 10.0.0.0/8; + allow 172.16.0.0/12; + allow 192.168.0.0/16; + deny all; + # A port cannot be appended to an upstream group name; target one instance's metrics port directly + proxy_pass http://kms-api-1:9090; + } +} +``` + +### Database High Availability + +#### **PostgreSQL Primary-Replica Setup** +```yaml +# docker-compose.ha.yml +version: '3.8' +
+services: + postgres-primary: + image: postgres:15-alpine + environment: + POSTGRES_DB: kms + POSTGRES_USER: postgres + POSTGRES_PASSWORD: ${DB_PASSWORD} + POSTGRES_REPLICATION_USER: replicator + POSTGRES_REPLICATION_PASSWORD: ${REPLICATION_PASSWORD} + volumes: + - postgres_primary_data:/var/lib/postgresql/data + - ./postgres/primary.conf:/etc/postgresql/postgresql.conf + - ./postgres/setup-replication.sh:/docker-entrypoint-initdb.d/setup-replication.sh + command: postgres -c config_file=/etc/postgresql/postgresql.conf + ports: + - "5432:5432" + + postgres-replica: + image: postgres:15-alpine + environment: + POSTGRES_USER: postgres + POSTGRES_PASSWORD: ${DB_PASSWORD} + POSTGRES_PRIMARY_HOST: postgres-primary + POSTGRES_REPLICATION_USER: replicator + POSTGRES_REPLICATION_PASSWORD: ${REPLICATION_PASSWORD} + volumes: + - postgres_replica_data:/var/lib/postgresql/data + - ./postgres/replica.conf:/etc/postgresql/postgresql.conf + - ./postgres/setup-replica.sh:/docker-entrypoint-initdb.d/setup-replica.sh + command: postgres -c config_file=/etc/postgresql/postgresql.conf + ports: + - "5433:5432" + depends_on: + - postgres-primary + +volumes: + postgres_primary_data: + postgres_replica_data: +``` + +--- + +## Monitoring and Health Checks + +### Health Check Configuration + +#### **Application Health Checks** +```go +// File: internal/handlers/health.go +type HealthResponse struct { + Status string `json:"status"` + Timestamp time.Time `json:"timestamp"` + Services map[string]string `json:"services"` + Version string `json:"version"` +} + +func (h *HealthHandler) GetHealth(c *gin.Context) { + response := &HealthResponse{ + Status: "healthy", + Timestamp: time.Now(), + Services: make(map[string]string), + Version: h.version, + } + + // Check database connectivity + if err := h.db.Ping(); err != nil { + response.Status = "unhealthy" + response.Services["database"] = "unhealthy" + } else { + response.Services["database"] = "healthy" + } + + // Check Redis connectivity + if 
err := h.redis.Ping(); err != nil { + response.Services["cache"] = "unhealthy" + } else { + response.Services["cache"] = "healthy" + } + + statusCode := http.StatusOK + if response.Status == "unhealthy" { + statusCode = http.StatusServiceUnavailable + } + + c.JSON(statusCode, response) +} +``` + +#### **Docker Health Checks** +```dockerfile +# API Service Health Check +HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \ + CMD curl -f http://localhost:8080/health || exit 1 + +# Database Health Check +HEALTHCHECK --interval=30s --timeout=10s --retries=5 \ + CMD pg_isready -U postgres || exit 1 + +# Frontend Health Check +HEALTHCHECK --interval=30s --timeout=10s --retries=3 \ + CMD curl -f http://localhost:80 || exit 1 +``` + +### Prometheus Metrics + +#### **Metrics Configuration** +```yaml +# prometheus.yml +global: + scrape_interval: 15s + evaluation_interval: 15s + +scrape_configs: + - job_name: 'kms-api' + static_configs: + - targets: ['kms-api:9090'] + metrics_path: /metrics + scrape_interval: 15s + + - job_name: 'nginx' + static_configs: + - targets: ['kms-nginx:9113'] + + - job_name: 'postgres' + static_configs: + - targets: ['postgres-exporter:9187'] + + - job_name: 'redis' + static_configs: + - targets: ['redis-exporter:9121'] +``` + +#### **Grafana Dashboards** +```json +{ + "dashboard": { + "title": "KMS System Metrics", + "panels": [ + { + "title": "Request Rate", + "type": "graph", + "targets": [ + { + "expr": "rate(http_requests_total[5m])", + "legendFormat": "{{method}} {{status}}" + } + ] + }, + { + "title": "Response Time", + "type": "graph", + "targets": [ + { + "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))", + "legendFormat": "95th percentile" + } + ] + }, + { + "title": "Database Connections", + "type": "graph", + "targets": [ + { + "expr": "postgres_stat_database_numbackends", + "legendFormat": "Active connections" + } + ] + }, + { + "title": "Authentication Success Rate", + "type": 
"stat", + "targets": [ + { + "expr": "rate(auth_attempts_success_total[5m]) / rate(auth_attempts_total[5m])", + "legendFormat": "Success rate" + } + ] + } + ] + } +} +``` + +--- + +## Security Configuration + +### SSL/TLS Configuration + +#### **Certificate Management** +```bash +# Let's Encrypt certificate generation +docker run --rm -it \ + -v /etc/letsencrypt:/etc/letsencrypt \ + -v /var/lib/letsencrypt:/var/lib/letsencrypt \ + certbot/certbot certonly \ + --standalone \ + --email admin@yourdomain.com \ + --agree-tos \ + --no-eff-email \ + -d kms.yourdomain.com + +# Certificate renewal cron job +# (crontab entries must fit on one line, and cron provides no TTY, so -it is omitted) +0 2 * * * docker run --rm -v /etc/letsencrypt:/etc/letsencrypt -v /var/lib/letsencrypt:/var/lib/letsencrypt certbot/certbot renew --quiet +``` + +#### **SSL Configuration** +```nginx +# Strong SSL configuration +ssl_protocols TLSv1.2 TLSv1.3; +ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305; +ssl_prefer_server_ciphers off; +ssl_session_cache shared:SSL:10m; +ssl_session_timeout 1d; +ssl_session_tickets off; +ssl_stapling on; +ssl_stapling_verify on; + +# HSTS +add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"; +``` + +### Secrets Management + +#### **Docker Secrets** +```yaml +version: '3.8' + +secrets: + db_password: + file: ./secrets/db_password.txt + jwt_secret: + file: ./secrets/jwt_secret.txt + auth_signing_key: + file: ./secrets/auth_signing_key.txt + +services: + kms-api: + image: kms-api:latest + secrets: + - db_password + - jwt_secret + - auth_signing_key + environment: + - DB_PASSWORD_FILE=/run/secrets/db_password + - JWT_SECRET_FILE=/run/secrets/jwt_secret + - AUTH_SIGNING_KEY_FILE=/run/secrets/auth_signing_key +``` + +#### **HashiCorp Vault Integration** +```go +// Vault secret retrieval +func getSecretFromVault(path string) (string, error) { + client, err := vault.NewClient(&vault.Config{ + Address: os.Getenv("VAULT_ADDR"), + }) + if err != nil { + return "",
err + } + + client.SetToken(os.Getenv("VAULT_TOKEN")) + + secret, err := client.Logical().Read(path) + if err != nil { + return "", err + } + + if secret == nil || secret.Data == nil { + return "", fmt.Errorf("secret not found") + } + + value, ok := secret.Data["value"].(string) + if !ok { + return "", fmt.Errorf("invalid secret format") + } + + return value, nil +} +``` + +--- + +## Backup and Recovery + +### Database Backup Strategy + +#### **Automated Backup Script** +```bash +#!/bin/bash +# backup.sh + +set -e + +# Configuration +BACKUP_DIR="/backups" +POSTGRES_CONTAINER="kms-postgres" +S3_BUCKET="kms-backups" +RETENTION_DAYS=30 + +# Create backup directory +mkdir -p $BACKUP_DIR + +# Generate backup filename with timestamp +BACKUP_NAME="kms_backup_$(date +%Y%m%d_%H%M%S).sql" +BACKUP_PATH="$BACKUP_DIR/$BACKUP_NAME" + +# Create database dump +echo "Creating database backup..." +docker exec $POSTGRES_CONTAINER pg_dump -U postgres kms > $BACKUP_PATH + +# Compress backup +gzip $BACKUP_PATH + +# Upload to S3 +echo "Uploading backup to S3..." 
+aws s3 cp "$BACKUP_PATH.gz" "s3://$S3_BUCKET/$(date +%Y/%m/%d)/$BACKUP_NAME.gz" + +# Clean up local files older than retention period +find "$BACKUP_DIR" -name "*.gz" -mtime +$RETENTION_DAYS -delete + +# Clean up S3 files older than retention period +aws s3 ls "s3://$S3_BUCKET/" --recursive | while read -r line; do + createDate=$(echo $line | awk '{print $1" "$2}') + createDate=$(date -d "$createDate" +%s) + olderThan=$(date -d "$RETENTION_DAYS days ago" +%s) + + if [[ $createDate -lt $olderThan ]]; then + fileName=$(echo $line | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//') + aws s3 rm "s3://$S3_BUCKET/$fileName" + fi +done + +echo "Backup completed successfully" +``` + +#### **Backup Cron Job** +```bash +# Daily backup at 2 AM +0 2 * * * /opt/kms/scripts/backup.sh >> /var/log/kms-backup.log 2>&1 + +# Weekly full backup at 1 AM on Sundays +0 1 * * 0 /opt/kms/scripts/full-backup.sh >> /var/log/kms-backup.log 2>&1 +``` + +### Disaster Recovery + +#### **Recovery Procedure** +```bash +#!/bin/bash +# restore.sh + +set -e + +BACKUP_FILE="$1" +POSTGRES_CONTAINER="kms-postgres" + +if [ -z "$BACKUP_FILE" ]; then + echo "Usage: $0 BACKUP_FILE (local path or s3:// URL, optionally .gz)" + exit 1 +fi + +# Download backup from S3 if needed +if [[ $BACKUP_FILE == s3://* ]]; then + LOCAL_FILE="/tmp/$(basename "$BACKUP_FILE")" + aws s3 cp "$BACKUP_FILE" "$LOCAL_FILE" + BACKUP_FILE="$LOCAL_FILE" +fi + +# Extract if compressed +if [[ $BACKUP_FILE == *.gz ]]; then + gunzip -c "$BACKUP_FILE" > "/tmp/backup.sql" + BACKUP_FILE="/tmp/backup.sql" +fi + +# Stop API services +echo "Stopping API services..." +docker-compose stop kms-api kms-frontend + +# Drop and recreate database +echo "Recreating database..." +docker exec $POSTGRES_CONTAINER psql -U postgres -c "DROP DATABASE IF EXISTS kms;" +docker exec $POSTGRES_CONTAINER psql -U postgres -c "CREATE DATABASE kms;" + +# Restore database +echo "Restoring database..." +docker exec -i $POSTGRES_CONTAINER psql -U postgres kms < "$BACKUP_FILE" + +# Start services +echo "Starting services..."
+docker-compose up -d + +# Verify restoration +echo "Verifying restoration..." +sleep 30 +curl -f http://localhost:8081/health || { + echo "Health check failed after restoration" + exit 1 +} + +echo "Database restoration completed successfully" +``` + +--- + +## Troubleshooting + +### Common Issues + +#### **Database Connection Issues** +```bash +# Check database container logs +docker logs kms-postgres + +# Test database connectivity +docker exec kms-postgres pg_isready -U postgres + +# Check connection pool status +curl http://localhost:8080/health | jq '.services.database' +``` + +#### **Authentication Problems** +```bash +# Check JWT secret configuration +docker exec kms-api env | grep JWT_SECRET + +# Verify HMAC key configuration +docker exec kms-api env | grep AUTH_SIGNING_KEY + +# Test authentication endpoint +curl -X POST http://localhost:8081/api/login \ + -H "Content-Type: application/json" \ + -d '{"app_id": "test-app", "user_id": "test@example.com"}' +``` + +#### **Performance Issues** +```bash +# Check API response times +curl -w "@curl-format.txt" -s -o /dev/null http://localhost:8081/api/health + +# Monitor database connections +docker exec kms-postgres psql -U postgres -c "SELECT count(*) FROM pg_stat_activity;" + +# Check memory usage +docker stats kms-api +``` + +#### **SSL/TLS Issues** +```bash +# Test SSL certificate +openssl s_client -connect kms.yourdomain.com:443 -servername kms.yourdomain.com + +# Check certificate expiration +curl -vI https://kms.yourdomain.com 2>&1 | grep -i expire + +# Verify certificate chain +curl -I https://kms.yourdomain.com +``` + +### Debugging Commands + +#### **Container Debugging** +```bash +# View container logs +docker logs -f kms-api + +# Execute shell in container +docker exec -it kms-api /bin/sh + +# Inspect container configuration +docker inspect kms-api + +# Check resource usage +docker stats +``` + +#### **Network Debugging** +```bash +# Test inter-container connectivity +docker exec kms-api ping 
kms-postgres + +# Check port binding +netstat -tlnp | grep :8080 + +# Inspect Docker network +docker network inspect kms_kms-network +``` + +This deployment guide provides comprehensive instructions for deploying, configuring, and maintaining the KMS in various environments, from development to production-scale deployments. \ No newline at end of file diff --git a/docs/SECURITY_ARCHITECTURE.md b/docs/SECURITY_ARCHITECTURE.md new file mode 100644 index 0000000..997e5b1 --- /dev/null +++ b/docs/SECURITY_ARCHITECTURE.md @@ -0,0 +1,828 @@ +# KMS Security Architecture + +## Table of Contents +1. [Security Overview](#security-overview) +2. [Multi-Layered Security Model](#multi-layered-security-model) +3. [Authentication Security](#authentication-security) +4. [Token Security Models](#token-security-models) +5. [Authorization Framework](#authorization-framework) +6. [Data Protection](#data-protection) +7. [Security Middleware Chain](#security-middleware-chain) +8. [Threat Model](#threat-model) +9. [Security Controls](#security-controls) +10. [Compliance Framework](#compliance-framework) + +--- + +## Security Overview + +The KMS implements a comprehensive security architecture based on defense-in-depth principles. Every layer of the system includes security controls designed to protect against both external threats and insider risks. 
+ +### Security Objectives +- **Confidentiality**: Protect sensitive data and credentials +- **Integrity**: Ensure data accuracy and prevent tampering +- **Availability**: Maintain service availability under attack +- **Accountability**: Complete audit trail of all operations +- **Non-repudiation**: Cryptographic proof of operations + +### Security Principles +- **Zero Trust**: Verify every request regardless of source +- **Fail Secure**: Safe defaults when systems fail +- **Defense in Depth**: Multiple security layers +- **Least Privilege**: Minimal permission grants +- **Security by Design**: Security integrated into architecture + +--- + +## Multi-Layered Security Model + +```mermaid +graph TB + subgraph "External Threats" + DDoS[DDoS Attacks] + XSS[XSS Attacks] + CSRF[CSRF Attacks] + Injection[SQL Injection] + AuthBypass[Auth Bypass] + TokenReplay[Token Replay] + BruteForce[Brute Force] + end + + subgraph "Security Perimeter" + subgraph "Load Balancer Security" + NginxSec[Nginx Security
- Rate limiting
- SSL termination
- Request filtering] + end + + subgraph "Application Security Layer" + SecurityMW[Security Middleware] + + subgraph "Security Headers" + HSTS[HSTS Header
max-age=31536000] + ContentType[Content-Type-Options
nosniff] + FrameOptions[X-Frame-Options
DENY] + XSSProtection[XSS-Protection
1; mode=block] + CSP[Content-Security-Policy
Restrictive policy] + end + + subgraph "CORS Protection" + Origins[Allowed Origins
Whitelist validation] + Methods[Allowed Methods
GET, POST, PUT, DELETE] + Headers[Allowed Headers
Authorization, Content-Type] + Credentials[Credentials Policy
Same-origin only] + end + + subgraph "CSRF Protection" + CSRFToken[CSRF Token
Double-submit cookie] + CSRFHeader[X-CSRF-Token
Header validation] + SameSite[SameSite Cookie
Strict policy] + end + end + + subgraph "Rate Limiting Defense" + GlobalLimit[Global Rate Limit
100 RPS per IP] + EndpointLimit[Endpoint-specific
Limits per route] + BurstControl[Burst Control
200 request burst] + Backoff[Exponential Backoff
Failed attempts] + end + + subgraph "Authentication Security" + MultiAuth[Multi-Provider Auth
Header, JWT, OAuth2, SAML] + + subgraph "Token Security" + JWTSigning[JWT Signing
RS256 Algorithm] + TokenExpiry[Token Expiration
Configurable TTL] + TokenRevocation[Token Revocation
Blacklist in Redis] + TokenRotation[Token Rotation
Refresh mechanism] + end + + subgraph "Static Token Security" + HMACValidation[HMAC Validation
SHA-256 signature] + TimestampCheck[Timestamp Validation
5-minute window] + ReplayProtection[Replay Protection
Nonce tracking] + KeyRotation[Key Rotation
Configurable schedule] + end + + subgraph "Session Security" + SessionEncryption[Session Encryption
AES-256-GCM] + SessionTimeout[Session Timeout
Idle timeout] + SessionFixation[Session Fixation
Protection] + end + end + + subgraph "Authorization Security" + RBAC[Role-Based Access
Hierarchical permissions] + PermissionCheck[Permission Validation
Request-level checks] + ScopeValidation[Scope Validation
Token scope matching] + PrincipleOfLeastPrivilege[Least Privilege
Minimal permissions] + end + + subgraph "Data Protection" + Encryption[Data Encryption] + + subgraph "Encryption at Rest" + DBEncryption[Database Encryption
AES-256 column encryption] + KeyStorage[Key Management
Environment variables] + SaltedHashing[Salted Hashing
BCrypt cost 14] + end + + subgraph "Encryption in Transit" + TLS13[TLS 1.3
All communications] + CertPinning[Certificate Pinning
OAuth2/SAML providers] + MTLS[Mutual TLS
Service-to-service] + end + end + + subgraph "Input Validation" + JSONValidation[JSON Schema
Request validation] + SQLProtection[SQL Injection
Parameterized queries] + XSSProtectionValidation[XSS Protection
Input sanitization] + PathTraversal[Path Traversal
Prevention] + end + + subgraph "Monitoring & Detection" + SecurityAudit[Security Audit Log
All operations logged] + FailureDetection[Failure Detection
Failed auth attempts] + AnomalyDetection[Anomaly Detection
Unusual patterns] + AlertSystem[Alert System
Security incidents] + end + end + + subgraph "Database Security" + DBSecurity[PostgreSQL Security
- Connection encryption
- User isolation
- Query logging
- Backup encryption] + end + + subgraph "Infrastructure Security" + ContainerSecurity[Container Security
- Non-root user
- Minimal images
- Security scanning] + NetworkSecurity[Network Security
- Internal networks
- Firewall rules
- VPN access] + end + + %% Threat Mitigation Connections + DDoS -.->|Mitigated by| NginxSec + XSS -.->|Prevented by| SecurityMW + CSRF -.->|Blocked by| CSRFToken + Injection -.->|Stopped by| JSONValidation + AuthBypass -.->|Blocked by| MultiAuth + TokenReplay -.->|Prevented by| TimestampCheck + BruteForce -.->|Limited by| GlobalLimit + + %% Security Layer Flow + NginxSec --> SecurityMW + SecurityMW --> GlobalLimit + GlobalLimit --> MultiAuth + MultiAuth --> RBAC + RBAC --> Encryption + Encryption --> JSONValidation + JSONValidation --> SecurityAudit + + SecurityAudit --> DBSecurity + SecurityAudit --> ContainerSecurity + + %% Styling + classDef threat fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px + classDef security fill:#c8e6c9,stroke:#388e3c,stroke-width:2px + classDef auth fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px + classDef data fill:#fff9c4,stroke:#f57f17,stroke-width:2px + classDef infra fill:#e3f2fd,stroke:#1976d2,stroke-width:2px + + class DDoS,XSS,CSRF,Injection,AuthBypass,TokenReplay,BruteForce threat + class SecurityMW,HSTS,ContentType,FrameOptions,XSSProtection,CSP,Origins,Methods,Headers,Credentials,CSRFToken,CSRFHeader,SameSite security + class MultiAuth,JWTSigning,TokenExpiry,TokenRevocation,TokenRotation,HMACValidation,TimestampCheck,ReplayProtection,KeyRotation,SessionEncryption,SessionTimeout,SessionFixation,RBAC,PermissionCheck,ScopeValidation,PrincipleOfLeastPrivilege auth + class Encryption,DBEncryption,KeyStorage,SaltedHashing,TLS13,CertPinning,MTLS,JSONValidation,SQLProtection,XSSProtectionValidation,PathTraversal,SecurityAudit,FailureDetection,AnomalyDetection,AlertSystem,DBSecurity data + class ContainerSecurity,NetworkSecurity,NginxSec,GlobalLimit,EndpointLimit,BurstControl,Backoff infra +``` + +### Security Layer Breakdown + +1. **Perimeter Security**: Load balancer, rate limiting, DDoS protection +2. **Application Security**: Security headers, CORS, CSRF protection +3. 
**Authentication Security**: Multi-provider authentication with strong cryptography +4. **Authorization Security**: RBAC with hierarchical permissions +5. **Data Protection**: Encryption at rest and in transit +6. **Input Validation**: Comprehensive input sanitization and validation +7. **Monitoring**: Security event detection and alerting + +--- + +## Authentication Security + +### Multi-Provider Authentication Framework + +The KMS supports four authentication providers, each with specific security controls: + +#### **Header-based Authentication** +```go +// File: internal/auth/header_validator.go:42 +func (hv *HeaderValidator) ValidateAuthenticationHeaders(r *http.Request) (*ValidatedUserContext, error) +``` + +**Security Features:** +- **HMAC-SHA256 Signatures**: Request integrity validation +- **Timestamp Validation**: 5-minute window prevents replay attacks +- **Constant-time Comparison**: Prevents timing attacks +- **Email Format Validation**: Prevents injection attacks + +**Implementation Details:** +```go +// HMAC signature validation with constant-time comparison +func (hv *HeaderValidator) validateSignature(userEmail, timestamp, signature string) bool { + mac := hmac.New(sha256.New, []byte(signingKey)) + mac.Write([]byte(signingString)) + expectedSignature := hex.EncodeToString(mac.Sum(nil)) + return hmac.Equal([]byte(signature), []byte(expectedSignature)) +} +``` + +#### **JWT Authentication** +```go +// File: internal/auth/jwt.go:47 +func (j *JWTManager) GenerateToken(userToken *domain.UserToken) (string, error) +``` + +**Security Features:** +- **RSA Signatures**: Strong cryptographic signing +- **Token Revocation**: Redis-backed blacklist +- **Secure JTI Generation**: Cryptographically secure token IDs +- **Claims Validation**: Complete token verification + +**Token Structure:** +```json +{ + "iss": "kms-api-service", + "sub": "user@example.com", + "aud": ["app-id"], + "exp": 1642781234, + "iat": 1642694834, + "nbf": 1642694834, + "jti": 
"secure-random-id", + "user_id": "user@example.com", + "app_id": "app-id", + "permissions": ["app.read", "token.create"], + "token_type": "user", + "max_valid_at": 1643299634 +} +``` + +#### **OAuth2 Authentication** +```go +// File: internal/auth/oauth2.go +``` + +**Security Features:** +- **Authorization Code Flow**: Secure OAuth2 flow implementation +- **State Parameter**: CSRF protection for OAuth2 +- **Token Exchange**: Secure credential handling +- **Provider Validation**: Certificate pinning support + +#### **SAML Authentication** +```go +// File: internal/auth/saml.go +``` + +**Security Features:** +- **XML Signature Validation**: Assertion integrity verification +- **Certificate Validation**: Provider certificate verification +- **Attribute Extraction**: Secure claim processing +- **Replay Protection**: Timestamp and nonce validation + +--- + +## Token Security Models + +### Static Token Security (HMAC-based) + +```mermaid +stateDiagram-v2 + [*] --> TokenRequest: Client requests token + + state TokenRequest { + [*] --> ValidateApp: Validate app_id + ValidateApp --> ValidatePerms: Check permissions + ValidatePerms --> GenerateToken: Create token + GenerateToken --> [*] + } + + TokenRequest --> StaticToken: type=static + + state StaticToken { + [*] --> GenerateHMAC: Generate HMAC key + GenerateHMAC --> HashKey: BCrypt hash (cost 14) + HashKey --> StoreDB: Store in static_tokens table + StoreDB --> GrantPerms: Assign permissions + GrantPerms --> Active: Token ready + + Active --> Verify: Incoming request + Verify --> ValidateHMAC: Check HMAC signature + ValidateHMAC --> CheckTimestamp: Replay protection + CheckTimestamp --> CheckPerms: Validate permissions + CheckPerms --> Authorized: Permission check + CheckPerms --> Denied: Permission denied + + Active --> Revoke: Admin action + Revoke --> Revoked: Update granted_permissions.revoked=true + } + + Authorized --> TokenResponse: Success response + Denied --> ErrorResponse: Error response + Revoked --> [*]: 
Token lifecycle end + + note right of StaticToken + Static tokens use HMAC signatures + - Timestamp-based replay protection + - BCrypt hashed storage + - Permission-based access control + - No expiration (until revoked) + end note +``` + +#### **Token Format** +``` +Format: {PREFIX}_{BASE64_DATA} +Example: ABC_xyz123base64encodeddata... +Validation: HMAC-SHA256(request_data, hmac_key) +Storage: BCrypt hash (cost 14) +Lifetime: No expiration (until revoked) +``` + +#### **Security Implementation** +```go +// File: internal/crypto/token.go:95 +func (tg *TokenGenerator) HashToken(token string) (string, error) { + hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost) + if err != nil { + return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err) + } + return string(hash), nil +} +``` + +### User Token Security (JWT-based) + +```mermaid +stateDiagram-v2 + [*] --> UserTokenRequest: User requests token + + state UserTokenRequest { + [*] --> AuthUser: Authenticate user + AuthUser --> GenerateJWT: Create JWT with claims + GenerateJWT --> SetExpiry: Set expires_at, max_valid_at + SetExpiry --> StoreSession: Store user session + StoreSession --> Active: JWT issued + } + + UserTokenRequest --> UserToken: type=user + + state UserToken { + Active --> Verify: Incoming request + Verify --> ValidateJWT: Check JWT signature + ValidateJWT --> CheckExpiry: Verify expiration + CheckExpiry --> CheckRevoked: Check revocation list + CheckRevoked --> Authorized: Valid token + CheckRevoked --> Denied: Token invalid + CheckExpiry --> Expired: Token expired + + Active --> Renew: Renewal request + Renew --> CheckRenewal: Within renewal window? 
+ CheckRenewal --> GenerateNew: Issue new JWT + CheckRenewal --> Denied: Renewal denied + GenerateNew --> Active: New token active + + Active --> Logout: User logout + Logout --> Revoked: Add to revocation list + + Expired --> Renew: Attempt renewal + } + + Authorized --> TokenResponse: Success response + Denied --> ErrorResponse: Error response + Revoked --> [*]: Token lifecycle end + + note right of UserToken + User tokens are JWT-based + - Configurable expiration + - Renewable within limits + - Session tracking + - Hierarchical permissions + end note +``` + +#### **Token Format** +``` +Format: {CUSTOM_PREFIX}UT-{JWT_TOKEN} +Example: ABC123UT-eyJhbGciOiJSUzI1NiIs... +Validation: RSA signature + claims validation +Storage: Revocation list in Redis +Lifetime: Configurable with renewal window +``` + +#### **Security Features** +- **RS256 Signatures**: RSA signatures over SHA-256, verified on every request +- **Token Revocation**: Redis blacklist with TTL +- **Renewal Controls**: Limited renewal window +- **Session Tracking**: Database session management + +--- + +## Authorization Framework + +### Role-Based Access Control (RBAC) + +The KMS implements a hierarchical permission system with role-based access control: + +#### **Permission Hierarchy** +``` +admin +├── app.admin +│ ├── app.read +│ ├── app.write +│ │ ├── app.create +│ │ ├── app.update +│ │ └── app.delete +├── token.admin +│ ├── token.read +│ │ └── token.verify +│ ├── token.write +│ │ ├── token.create +│ │ └── token.revoke +├── permission.admin +│ ├── permission.read +│ ├── permission.write +│ │ ├── permission.grant +│ │ └── permission.revoke +└── user.admin + ├── user.read + └── user.write +``` + +#### **Role Definitions** +```go +// File: internal/auth/permissions.go:151 +func (h *PermissionHierarchy) initializeDefaultRoles() { + defaultRoles := []*Role{ + { + Name: "super_admin", + Description: "Super administrator with full access", + Permissions: []string{"admin"}, + Metadata: map[string]string{"level": "system"}, + }, + {
+ Name: "app_admin", + Description: "Application administrator", + Permissions: []string{"app.admin", "token.admin", "user.read"}, + Metadata: map[string]string{"level": "application"}, + }, + // Additional roles... + } +} +``` + +#### **Permission Evaluation** +```go +// File: internal/auth/permissions.go:201 +func (pm *PermissionManager) HasPermission(ctx context.Context, userID, appID, permission string) (*PermissionEvaluation, error) { + // Cache lookup first + cacheKey := cache.CacheKey(cache.KeyPrefixPermission, fmt.Sprintf("%s:%s:%s", userID, appID, permission)) + + // Evaluate permission with hierarchy + evaluation := pm.evaluatePermission(ctx, userID, appID, permission) + + // Cache result for 5 minutes + pm.cacheManager.SetJSON(ctx, cacheKey, evaluation, 5*time.Minute) + + return evaluation, nil +} +``` + +### Authorization Controls + +#### **Resource Ownership** +```go +// File: internal/authorization/rbac.go:100 +func (a *AuthorizationService) AuthorizeApplicationOwnership(userID string, app *domain.Application) error { + // System admins can access any application + if a.isSystemAdmin(userID) { + return nil + } + + // Check if user is the owner + if a.isOwner(userID, &app.Owner) { + return nil + } + + return errors.NewForbiddenError("You do not have permission to access this application") +} +``` + +#### **Token Scoping** +```go +// File: internal/services/token_service.go:400 +func (s *tokenService) verifyStaticToken(ctx context.Context, req *domain.VerifyRequest, app *domain.Application) (*domain.VerifyResponse, error) { + // Get granted permissions for this token + permissions, err := s.grantRepo.GetGrantedPermissionScopes(ctx, domain.TokenTypeStatic, matchedToken.ID) + + // Check specific permissions if requested + if len(req.Permissions) > 0 { + permissionResults, err := s.grantRepo.HasAnyPermission(ctx, domain.TokenTypeStatic, matchedToken.ID, req.Permissions) + // Validate all requested permissions are granted + } +} +``` + +--- + +## Data 
Protection + +### Encryption at Rest + +#### **Database Encryption** +```sql +-- Sensitive data encryption in PostgreSQL +CREATE TABLE applications ( + app_id VARCHAR(100) PRIMARY KEY, + hmac_key TEXT, -- Encrypted with application-level encryption + created_at TIMESTAMP DEFAULT NOW() +); + +CREATE TABLE static_tokens ( + id UUID PRIMARY KEY, + key_hash TEXT NOT NULL, -- BCrypt hashed with cost 14 + created_at TIMESTAMP DEFAULT NOW() +); +``` + +#### **BCrypt Hashing** +```go +// File: internal/crypto/token.go:22 +const BcryptCost = 14 // 2025 security standards (minimum 14) + +func (tg *TokenGenerator) HashToken(token string) (string, error) { + hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost) + if err != nil { + return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err) + } + return string(hash), nil +} +``` + +### Encryption in Transit + +#### **TLS Configuration** +```go +// TLS 1.3 configuration for all communications +tlsConfig := &tls.Config{ + MinVersion: tls.VersionTLS13, + CipherSuites: []uint16{ + tls.TLS_AES_256_GCM_SHA384, + tls.TLS_CHACHA20_POLY1305_SHA256, + tls.TLS_AES_128_GCM_SHA256, + }, +} +``` + +#### **Certificate Pinning** +```go +// OAuth2/SAML provider certificate pinning +func (o *OAuth2Provider) validateCertificate(cert *x509.Certificate) error { + expectedFingerprints := o.config.GetStringSlice("OAUTH2_CERT_FINGERPRINTS") + + fingerprint := sha256.Sum256(cert.Raw) + fingerprintHex := hex.EncodeToString(fingerprint[:]) + + for _, expected := range expectedFingerprints { + if fingerprintHex == expected { + return nil + } + } + + return fmt.Errorf("certificate fingerprint mismatch") +} +``` + +--- + +## Security Middleware Chain + +### Middleware Processing Order + +1. **Request Logger**: Structured logging with security context +2. **Security Headers**: XSS, clickjacking, MIME-type protection +3. **CORS Handler**: Cross-origin request validation +4. 
**CSRF Protection**: Double-submit token validation +5. **Rate Limiter**: DDoS and abuse protection +6. **Authentication**: Multi-provider authentication +7. **Authorization**: Permission validation +8. **Input Validation**: Request sanitization and validation + +### Rate Limiting Implementation + +```go +// File: internal/middleware/security.go:48 +func (s *SecurityMiddleware) RateLimitMiddleware(next http.Handler) http.Handler { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + clientIP := s.getClientIP(r) + limiter := s.getRateLimiter(clientIP) + + if !limiter.Allow() { + s.logger.Warn("Rate limit exceeded", + zap.String("client_ip", clientIP), + zap.String("path", r.URL.Path)) + + s.trackRateLimitViolation(clientIP) + + http.Error(w, `{"error":"rate_limit_exceeded"}`, http.StatusTooManyRequests) + return + } + + next.ServeHTTP(w, r) + }) +} +``` + +### Brute Force Protection + +```go +// File: internal/middleware/security.go:365 +func (s *SecurityMiddleware) checkAndBlockIP(clientIP string) { + var count int + err := s.cacheManager.GetJSON(ctx, key, &count) + + maxFailures := s.config.GetInt("MAX_AUTH_FAILURES") + if maxFailures <= 0 { + maxFailures = 5 // Default + } + + if count >= maxFailures { + // Block the IP for 1 hour + blockInfo := map[string]interface{}{ + "blocked_at": time.Now().Unix(), + "failure_count": count, + "reason": "excessive_auth_failures", + } + + s.cacheManager.SetJSON(ctx, blockKey, blockInfo, blockDuration) + } +} +``` + +--- + +## Threat Model + +### External Threats + +#### **Network-based Attacks** +- **DDoS**: Rate limiting and load balancer protection +- **Man-in-the-Middle**: TLS 1.3 encryption +- **Packet Sniffing**: End-to-end encryption + +#### **Application-level Attacks** +- **SQL Injection**: Parameterized queries only +- **XSS**: Input validation and CSP headers +- **CSRF**: Double-submit token protection +- **Session Hijacking**: Secure session management + +#### **Authentication Attacks** +- 
**Brute Force**: Rate limiting and account lockout +- **Credential Stuffing**: Multi-factor authentication support +- **Token Replay**: Timestamp validation and nonces +- **JWT Attacks**: Strong cryptographic signing + +### Insider Threats + +#### **Privileged User Abuse** +- **Audit Logging**: Complete audit trail +- **Role Separation**: Principle of least privilege +- **Session Monitoring**: Abnormal activity detection + +#### **Data Exfiltration** +- **Access Controls**: Resource-level authorization +- **Data Classification**: Sensitive data identification +- **Export Monitoring**: Large data access alerts + +--- + +## Security Controls + +### Preventive Controls + +#### **Authentication Controls** +- Multi-factor authentication support +- Strong password policies (where applicable) +- Account lockout mechanisms +- Session timeout enforcement + +#### **Authorization Controls** +- Role-based access control (RBAC) +- Resource-level permissions +- API endpoint protection +- Ownership validation + +#### **Data Protection Controls** +- Encryption at rest (AES-256) +- Encryption in transit (TLS 1.3) +- Key management procedures +- Secure data disposal + +### Detective Controls + +#### **Monitoring and Logging** +```go +// Comprehensive audit logging +type AuditLog struct { + ID uuid.UUID `json:"id"` + Timestamp time.Time `json:"timestamp"` + UserID string `json:"user_id"` + Action string `json:"action"` + ResourceType string `json:"resource_type"` + ResourceID string `json:"resource_id"` + IPAddress string `json:"ip_address"` + UserAgent string `json:"user_agent"` + Metadata map[string]interface{} `json:"metadata"` +} +``` + +#### **Security Event Detection** +- Failed authentication attempts +- Privilege escalation attempts +- Unusual access patterns +- Token misuse detection + +### Responsive Controls + +#### **Incident Response** +- Automated threat response +- Token revocation procedures +- User account suspension +- Alert escalation workflows + +#### **Recovery 
Controls** +- Backup and restore procedures +- Disaster recovery planning +- Business continuity measures +- Data integrity verification + +--- + +## Compliance Framework + +### OWASP Top 10 Protection + +The categories below follow the 2017 edition of the OWASP Top 10. + +1. **Injection**: Parameterized queries, input validation +2. **Broken Authentication**: Multi-provider auth, strong crypto +3. **Sensitive Data Exposure**: Encryption, secure storage +4. **XML External Entities**: JSON-only API +5. **Broken Access Control**: RBAC, authorization checks +6. **Security Misconfiguration**: Secure defaults +7. **Cross-Site Scripting**: Input validation, CSP +8. **Insecure Deserialization**: Safe JSON handling +9. **Using Components with Known Vulnerabilities**: Regular dependency updates +10. **Insufficient Logging & Monitoring**: Comprehensive audit trails + +### Regulatory Compliance + +#### **SOC 2 Type II Controls** +- **Security**: Access controls, encryption, monitoring +- **Availability**: High availability, disaster recovery +- **Processing Integrity**: Data validation, error handling +- **Confidentiality**: Data classification, access restrictions +- **Privacy**: Data minimization, consent management + +#### **GDPR Compliance** (where applicable) +- **Data Protection by Design**: Privacy-first architecture +- **Consent Management**: Granular consent controls +- **Right to Erasure**: Data deletion procedures +- **Data Portability**: Export capabilities +- **Breach Notification**: Automated incident response + +#### **NIST Cybersecurity Framework** +- **Identify**: Asset inventory, risk assessment +- **Protect**: Access controls, data security +- **Detect**: Monitoring, anomaly detection +- **Respond**: Incident response procedures +- **Recover**: Business continuity planning + +### Security Metrics + +#### **Key Security Indicators** +- Authentication success/failure rates +- Authorization denial rates +- Token usage patterns +- Security event frequency +- Incident response times + +#### **Security Monitoring** +```yaml +alerts: + - name: "High Authentication
Failure Rate" + condition: "auth_failures > 10 per minute per IP" + severity: "warning" + + - name: "Privilege Escalation Attempt" + condition: "permission_denied + elevated_permission" + severity: "critical" + + - name: "Token Abuse Detection" + condition: "token_usage_anomaly" + severity: "warning" +``` + +This security architecture document provides comprehensive coverage of the KMS security model, suitable for security audits, compliance reviews, and security team training. \ No newline at end of file