org

2025-08-26 19:16:41 -04:00
parent 7ca61eb712
commit 6725529b01
113 changed files with 0 additions and 337 deletions
--- a/kms/docs/API.md
+++ b/kms/docs/API.md
@ -0,0 +1,613 @@
+# API Key Management Service - API Documentation
+
+This document describes the REST API endpoints for the API Key Management Service.
+
+## Base URL
+
+```
+http://localhost:8080
+```
+
+## Authentication
+
+All protected endpoints require authentication via the `X-User-Email` header (when using the HeaderAuthenticationProvider).
+
+```
+X-User-Email: user@example.com
+```
+
+## Content Type
+
+All endpoints accept and return JSON data:
+
+```
+Content-Type: application/json
+```
+
+## Error Response Format
+
+All error responses follow this format:
+
+```json
+{
+  "error": "Error Type",
+  "message": "Detailed error message"
+}
+```
+
+Common HTTP status codes:
+- `400` - Bad Request (invalid input)
+- `401` - Unauthorized (authentication required)
+- `404` - Not Found (resource not found)
+- `429` - Too Many Requests (rate limit exceeded)
+- `500` - Internal Server Error
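A client can decode the error body above into a small Go type. The following is a minimal sketch, not the service's own code: `APIError` and `ParseAPIError` are hypothetical names, and the sample message is invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// APIError mirrors the documented error body: {"error": ..., "message": ...}.
// The type name is hypothetical; only the two JSON keys come from the docs.
type APIError struct {
	Type    string `json:"error"`
	Message string `json:"message"`
}

// Error implements the error interface so an APIError can be returned directly.
func (e *APIError) Error() string {
	return fmt.Sprintf("%s: %s", e.Type, e.Message)
}

// ParseAPIError turns a non-2xx response body into a Go error that carries
// the HTTP status code alongside the decoded fields.
func ParseAPIError(status int, body []byte) error {
	var apiErr APIError
	if err := json.Unmarshal(body, &apiErr); err != nil {
		return fmt.Errorf("HTTP %d: undecodable error body: %w", status, err)
	}
	return fmt.Errorf("HTTP %d: %w", status, &apiErr)
}

func main() {
	// Illustrative body; the message text is made up for this example.
	body := []byte(`{"error": "Bad Request", "message": "app_id is required"}`)
	fmt.Println(ParseAPIError(400, body))
}
```

Wrapping with `%w` keeps the typed error reachable through `errors.As`, so callers can branch on the `error` field without parsing strings.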
+
+## Health Check Endpoints
+
+### Health Check
+```
+GET /health
+```
+
+Basic health check for load balancers.
+
+**Response:**
+```json
+{
+  "status": "healthy",
+  "timestamp": "2023-01-01T00:00:00Z"
+}
+```
+
+### Readiness Check
+```
+GET /ready
+```
+
+Comprehensive readiness check including database connectivity.
+
+**Response:**
+```json
+{
+  "status": "ready",
+  "timestamp": "2023-01-01T00:00:00Z",
+  "checks": {
+    "database": "healthy"
+  }
+}
+```
+
+## Authentication Endpoints
+
+### User Login
+```
+POST /api/login
+```
+
+Initiates user authentication flow.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required for HeaderAuthenticationProvider)
+
+**Request Body:**
+```json
+{
+  "app_id": "com.example.app",
+  "permissions": ["repo.read", "repo.write"],
+  "redirect_uri": "https://example.com/callback"
+}
+```
+
+**Response:**
+```json
+{
+  "redirect_url": "https://example.com/callback?token=user-token-abc123"
+}
+```
+
+Or, if no `redirect_uri` is provided:
+```json
+{
+  "token": "user-token-abc123",
+  "user_id": "user@example.com",
+  "app_id": "com.example.app",
+  "expires_in": 604800
+}
+```
+
+### Token Verification
+```
+POST /api/verify
+```
+
+Verifies a token and returns its permissions.
+
+**Request Body:**
+```json
+{
+  "app_id": "com.example.app",
+  "type": "user",
+  "user_id": "user@example.com",
+  "token": "token-to-verify",
+  "permissions": ["repo.read", "repo.write"]
+}
+```
+
+**Response:**
+```json
+{
+  "valid": true,
+  "user_id": "user@example.com",
+  "permissions": ["repo.read", "repo.write"],
+  "permission_results": {
+    "repo.read": true,
+    "repo.write": true
+  },
+  "expires_at": "2023-01-08T00:00:00Z",
+  "max_valid_at": "2023-01-31T00:00:00Z",
+  "token_type": "user"
+}
+```
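A caller usually wants to know which of the requested permissions were not granted. Here is a hedged Go sketch that decodes the response shown above and reports the missing ones. The struct and function names are hypothetical, and treating an absent key in `permission_results` as "not granted" is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// VerifyResponse mirrors the documented /api/verify response body.
type VerifyResponse struct {
	Valid             bool            `json:"valid"`
	UserID            string          `json:"user_id"`
	Permissions       []string        `json:"permissions"`
	PermissionResults map[string]bool `json:"permission_results"`
	ExpiresAt         time.Time       `json:"expires_at"`
	MaxValidAt        time.Time       `json:"max_valid_at"`
	TokenType         string          `json:"token_type"`
}

// DecodeVerifyResponse parses a raw JSON body into a VerifyResponse.
func DecodeVerifyResponse(body []byte) (VerifyResponse, error) {
	var resp VerifyResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

// MissingPermissions returns the requested permissions that the response
// did not mark as true. Absent keys count as not granted (an assumption).
func MissingPermissions(resp VerifyResponse, requested []string) []string {
	var missing []string
	for _, p := range requested {
		if !resp.PermissionResults[p] {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	body := []byte(`{
	  "valid": true,
	  "user_id": "user@example.com",
	  "permission_results": {"repo.read": true, "repo.write": false},
	  "expires_at": "2023-01-08T00:00:00Z",
	  "max_valid_at": "2023-01-31T00:00:00Z",
	  "token_type": "user"
	}`)
	resp, err := DecodeVerifyResponse(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("valid:", resp.Valid, "missing:", MissingPermissions(resp, []string{"repo.read", "repo.write"}))
}
```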
+
+### Token Renewal
+```
+POST /api/renew
+```
+
+Renews a user token with extended expiration.
+
+**Request Body:**
+```json
+{
+  "app_id": "com.example.app",
+  "user_id": "user@example.com",
+  "token": "current-token"
+}
+```
+
+**Response:**
+```json
+{
+  "token": "new-renewed-token",
+  "expires_at": "2023-01-15T00:00:00Z",
+  "max_valid_at": "2023-01-31T00:00:00Z"
+}
+```
+
+## Application Management
+
+### List Applications
+```
+GET /api/applications?limit=50&offset=0
+```
+
+Retrieves a paginated list of applications.
+
+**Query Parameters:**
+- `limit` (optional): Number of results to return (default: 50, max: 100)
+- `offset` (optional): Number of results to skip (default: 0)
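The query parameters above can be assembled on the client as follows. This is a sketch under assumptions: `ListApplicationsURL` is a hypothetical helper, and clamping out-of-range values on the client is a choice made here, since the docs do not say how the server treats a `limit` above 100.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// Documented pagination bounds for list endpoints.
const (
	defaultLimit = 50
	maxLimit     = 100
)

// ListApplicationsURL builds the list URL, substituting the documented
// default for a non-positive limit and capping it at the documented maximum.
func ListApplicationsURL(base string, limit, offset int) string {
	if limit <= 0 {
		limit = defaultLimit
	}
	if limit > maxLimit {
		limit = maxLimit
	}
	if offset < 0 {
		offset = 0
	}
	q := url.Values{}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("offset", strconv.Itoa(offset))
	// Encode sorts keys, so "limit" always precedes "offset".
	return base + "/api/applications?" + q.Encode()
}

func main() {
	fmt.Println(ListApplicationsURL("http://localhost:8080", 250, 100))
}
```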
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Response:**
+```json
+{
+  "data": [
+    {
+      "app_id": "com.example.app",
+      "app_link": "https://example.com",
+      "type": ["static", "user"],
+      "callback_url": "https://example.com/callback",
+      "hmac_key": "hmac-key-hidden-in-responses",
+      "token_renewal_duration": 604800000000000,
+      "max_token_duration": 2592000000000000,
+      "owner": {
+        "type": "team",
+        "name": "Example Team",
+        "owner": "example-org"
+      },
+      "created_at": "2023-01-01T00:00:00Z",
+      "updated_at": "2023-01-01T00:00:00Z"
+    }
+  ],
+  "limit": 50,
+  "offset": 0,
+  "count": 1
+}
+```
+
+### Create Application
+```
+POST /api/applications
+```
+
+Creates a new application.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Request Body:**
+```json
+{
+  "app_id": "com.example.newapp",
+  "app_link": "https://newapp.example.com",
+  "type": ["static", "user"],
+  "callback_url": "https://newapp.example.com/callback",
+  "token_renewal_duration": "168h",
+  "max_token_duration": "720h",
+  "owner": {
+    "type": "team",
+    "name": "Development Team",
+    "owner": "example-org"
+  }
+}
+```
+
+**Response:**
+```json
+{
+  "app_id": "com.example.newapp",
+  "app_link": "https://newapp.example.com",
+  "type": ["static", "user"],
+  "callback_url": "https://newapp.example.com/callback",
+  "hmac_key": "generated-hmac-key",
+  "token_renewal_duration": 604800000000000,
+  "max_token_duration": 2592000000000000,
+  "owner": {
+    "type": "team",
+    "name": "Development Team",
+    "owner": "example-org"
+  },
+  "created_at": "2023-01-01T00:00:00Z",
+  "updated_at": "2023-01-01T00:00:00Z"
+}
+```
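Note that the request sends durations as strings (`"168h"`) while the response shows large integers. The integers equal those durations in nanoseconds, which matches how Go's `time.Duration` is commonly serialized. The docs do not state this explicitly, so treat it as an inference. A short check, with `DurationToNanos` as a hypothetical helper:

```go
package main

import (
	"fmt"
	"time"
)

// DurationToNanos parses a Go duration string such as "168h" and returns
// the same span expressed in nanoseconds.
func DurationToNanos(s string) (int64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return d.Nanoseconds(), nil
}

func main() {
	for _, s := range []string{"168h", "720h"} {
		n, err := DurationToNanos(s)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s = %d ns\n", s, n)
	}
}
```

Under that reading, `168h` (7 days) corresponds to `604800000000000` and `720h` (30 days) to `2592000000000000`, the values in the example response.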
+
+### Get Application
+```
+GET /api/applications/{app_id}
+```
+
+Retrieves a specific application by ID.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Response:**
+```json
+{
+  "app_id": "com.example.app",
+  "app_link": "https://example.com",
+  "type": ["static", "user"],
+  "callback_url": "https://example.com/callback",
+  "hmac_key": "hmac-key-value",
+  "token_renewal_duration": 604800000000000,
+  "max_token_duration": 2592000000000000,
+  "owner": {
+    "type": "team",
+    "name": "Example Team",
+    "owner": "example-org"
+  },
+  "created_at": "2023-01-01T00:00:00Z",
+  "updated_at": "2023-01-01T00:00:00Z"
+}
+```
+
+### Update Application
+```
+PUT /api/applications/{app_id}
+```
+
+Updates an existing application. Only provided fields will be updated.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Request Body:**
+```json
+{
+  "app_link": "https://updated.example.com",
+  "callback_url": "https://updated.example.com/callback",
+  "owner": {
+    "type": "individual",
+    "name": "John Doe",
+    "owner": "john.doe@example.com"
+  }
+}
+```
+
+**Response:**
+```json
+{
+  "app_id": "com.example.app",
+  "app_link": "https://updated.example.com",
+  "type": ["static", "user"],
+  "callback_url": "https://updated.example.com/callback",
+  "hmac_key": "existing-hmac-key",
+  "token_renewal_duration": 604800000000000,
+  "max_token_duration": 2592000000000000,
+  "owner": {
+    "type": "individual",
+    "name": "John Doe",
+    "owner": "john.doe@example.com"
+  },
+  "created_at": "2023-01-01T00:00:00Z",
+  "updated_at": "2023-01-01T12:00:00Z"
+}
+```
+
+### Delete Application
+```
+DELETE /api/applications/{app_id}
+```
+
+Deletes an application and all associated tokens.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Response:**
+```
+HTTP 204 No Content
+```
+
+## Static Token Management
+
+### List Tokens for Application
+```
+GET /api/applications/{app_id}/tokens?limit=50&offset=0
+```
+
+Retrieves all static tokens for a specific application.
+
+**Query Parameters:**
+- `limit` (optional): Number of results to return (default: 50, max: 100)
+- `offset` (optional): Number of results to skip (default: 0)
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Response:**
+```json
+{
+  "data": [
+    {
+      "id": "123e4567-e89b-12d3-a456-426614174000",
+      "app_id": "com.example.app",
+      "owner": {
+        "type": "individual",
+        "name": "John Doe",
+        "owner": "john.doe@example.com"
+      },
+      "type": "hmac",
+      "created_at": "2023-01-01T00:00:00Z",
+      "updated_at": "2023-01-01T00:00:00Z"
+    }
+  ],
+  "limit": 50,
+  "offset": 0,
+  "count": 1
+}
+```
+
+### Create Static Token
+```
+POST /api/applications/{app_id}/tokens
+```
+
+Creates a new static token for an application.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Request Body:**
+```json
+{
+  "owner": {
+    "type": "individual",
+    "name": "API Client",
+    "owner": "api-client@example.com"
+  },
+  "permissions": ["repo.read", "repo.write", "app.read"]
+}
+```
+
+**Response:**
+```json
+{
+  "id": "123e4567-e89b-12d3-a456-426614174000",
+  "token": "static-token-abc123xyz789",
+  "permissions": ["repo.read", "repo.write", "app.read"],
+  "created_at": "2023-01-01T00:00:00Z"
+}
+```
+
+**Note:** The `token` field is only returned once during creation for security reasons.
+
+### Delete Static Token
+```
+DELETE /api/tokens/{token_id}
+```
+
+Deletes a static token and revokes all its permissions.
+
+**Headers:**
+- `X-User-Email: user@example.com` (required)
+
+**Response:**
+```
+HTTP 204 No Content
+```
+
+## Permission Scopes
+
+The following permission scopes are available:
+
+### System Permissions
+- `internal` - Full access to internal system operations (system only)
+- `internal.read` - Read access to internal system data (system only)
+- `internal.write` - Write access to internal system data (system only)
+- `internal.admin` - Administrative access to internal system (system only)
+
+### Application Management
+- `app` - Access to application management
+- `app.read` - Read application information
+- `app.write` - Create and update applications
+- `app.delete` - Delete applications
+
+### Token Management
+- `token` - Access to token management
+- `token.read` - Read token information
+- `token.create` - Create new tokens
+- `token.revoke` - Revoke existing tokens
+
+### Permission Management
+- `permission` - Access to permission management
+- `permission.read` - Read permission information
+- `permission.write` - Create and update permissions
+- `permission.grant` - Grant permissions to tokens
+- `permission.revoke` - Revoke permissions from tokens
+
+### Repository Access (Example)
+- `repo` - Access to repository operations
+- `repo.read` - Read repository data
+- `repo.write` - Write to repositories
+- `repo.admin` - Administrative access to repositories
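The dotted scope names suggest that a parent scope such as `repo` covers its children (`repo.read`, `repo.write`). The docs do not spell out the evaluation rule, so the sketch below assumes simple prefix inheritance. `Grants` and `HasPermission` are hypothetical names, not functions from the service.

```go
package main

import (
	"fmt"
	"strings"
)

// Grants reports whether a held scope covers the required scope.
// ASSUMPTION: a scope covers itself and every scope below it in the
// dotted hierarchy ("repo" covers "repo.read" but not "repository.read").
func Grants(held, required string) bool {
	return held == required || strings.HasPrefix(required, held+".")
}

// HasPermission reports whether any of the held scopes covers the required one.
func HasPermission(heldScopes []string, required string) bool {
	for _, h := range heldScopes {
		if Grants(h, required) {
			return true
		}
	}
	return false
}

func main() {
	held := []string{"repo", "app.read"}
	for _, need := range []string{"repo.write", "app.read", "app.write", "token.create"} {
		fmt.Printf("%-13s %v\n", need, HasPermission(held, need))
	}
}
```

Appending the dot before the prefix check is what stops `repo` from accidentally matching an unrelated scope that merely starts with the same letters.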
+
+## Rate Limiting
+
+The API implements rate limiting with the following limits:
+
+- **General API endpoints**: 100 requests per minute with burst of 20
+- **Authentication endpoints** (`/api/login`, `/api/verify`, `/api/renew`): 10 requests per minute with burst of 5
+
+Rate limit headers are included in responses:
+- `X-RateLimit-Limit`: Request limit per window
+- `X-RateLimit-Remaining`: Remaining requests in current window
+- `X-RateLimit-Reset`: Unix timestamp when the window resets
+
+When rate limited:
+```json
+{
+  "error": "Rate limit exceeded",
+  "message": "Too many requests. Please try again later."
+}
+```
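A client that hits the limit can use `X-RateLimit-Reset` to decide how long to back off. A minimal sketch follows; `SecondsUntilReset` is a hypothetical helper, and it assumes the header holds whole Unix seconds as described above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// SecondsUntilReset converts an X-RateLimit-Reset header value (Unix seconds)
// into the number of seconds to wait, never returning a negative value.
func SecondsUntilReset(resetHeader string, nowUnix int64) (int64, error) {
	reset, err := strconv.ParseInt(strings.TrimSpace(resetHeader), 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid X-RateLimit-Reset %q: %w", resetHeader, err)
	}
	if reset <= nowUnix {
		return 0, nil // window already reset; retry immediately
	}
	return reset - nowUnix, nil
}

func main() {
	now := time.Now().Unix()
	// Simulated header value: the window resets 30 seconds from now.
	header := strconv.FormatInt(now+30, 10)
	wait, err := SecondsUntilReset(header, now)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("retry in %d seconds\n", wait)
}
```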
+
+## Testing Endpoints
+
+For testing purposes, different user scenarios are available through different ports:
+
+- **Port 80**: Regular user (`test@example.com`)
+- **Port 8081**: Admin user (`admin@example.com`) with higher rate limits
+- **Port 8082**: Limited user (`limited@example.com`) with lower rate limits
+
+## Example Workflows
+
+### Creating an Application and Static Token
+
+1. **Create Application:**
+```bash
+curl -X POST http://localhost/api/applications \
+  -H "Content-Type: application/json" \
+  -H "X-User-Email: admin@example.com" \
+  -d '{
+    "app_id": "com.mycompany.api",
+    "app_link": "https://api.mycompany.com",
+    "type": ["static", "user"],
+    "callback_url": "https://api.mycompany.com/callback",
+    "token_renewal_duration": "168h",
+    "max_token_duration": "720h",
+    "owner": {
+      "type": "team",
+      "name": "API Team",
+      "owner": "api-team@mycompany.com"
+    }
+  }'
+```
+
+2. **Create Static Token:**
+```bash
+curl -X POST http://localhost/api/applications/com.mycompany.api/tokens \
+  -H "Content-Type: application/json" \
+  -H "X-User-Email: admin@example.com" \
+  -d '{
+    "owner": {
+      "type": "individual",
+      "name": "Service Account",
+      "owner": "service@mycompany.com"
+    },
+    "permissions": ["repo.read", "repo.write"]
+  }'
+```
+
+3. **Verify Token:**
+```bash
+curl -X POST http://localhost/api/verify \
+  -H "Content-Type: application/json" \
+  -d '{
+    "app_id": "com.mycompany.api",
+    "type": "static",
+    "token": "static-token-abc123xyz789",
+    "permissions": ["repo.read"]
+  }'
+```
+
+### User Authentication Flow
+
+1. **Initiate Login:**
+```bash
+curl -X POST http://localhost/api/login \
+  -H "Content-Type: application/json" \
+  -H "X-User-Email: user@example.com" \
+  -d '{
+    "app_id": "com.mycompany.api",
+    "permissions": ["repo.read"],
+    "redirect_uri": "https://myapp.com/callback"
+  }'
+```
+
+2. **Verify User Token:**
+```bash
+curl -X POST http://localhost/api/verify \
+  -H "Content-Type: application/json" \
+  -d '{
+    "app_id": "com.mycompany.api",
+    "type": "user",
+    "user_id": "user@example.com",
+    "token": "user-token-from-login",
+    "permissions": ["repo.read"]
+  }'
+```
+
+3. **Renew Token Before Expiry:**
+```bash
+curl -X POST http://localhost/api/renew \
+  -H "Content-Type: application/json" \
+  -d '{
+    "app_id": "com.mycompany.api",
+    "user_id": "user@example.com",
+    "token": "current-user-token"
+  }'
+```
+
+## Security Considerations
+
+- All tokens should be transmitted over HTTPS in production
+- Static tokens are returned only once during creation - store them securely
+- User tokens have both renewal and maximum validity periods
+- HMAC keys are used for token signing and should be rotated regularly
+- Rate limiting helps prevent abuse
+- Permission scopes follow a hierarchical structure
+- All operations are logged with user attribution
+
+## Development and Testing
+
+The service includes comprehensive health checks, detailed logging, and metrics collection. When running with `LOG_LEVEL=debug`, additional debugging information is available in the logs.
+
+For local development, the service can be started with:
+```bash
+docker-compose up -d
+```
+
+This starts PostgreSQL, the API service, and an Nginx proxy with test user headers configured.
--- a/kms/docs/ARCHITECTURE.md
+++ b/kms/docs/ARCHITECTURE.md
@ -0,0 +1,677 @@
+# API Key Management Service (KMS) - System Architecture
+
+## Table of Contents
+1. [System Overview](#system-overview)
+2. [Architecture Principles](#architecture-principles)
+3. [Component Architecture](#component-architecture)
+4. [System Architecture Diagram](#system-architecture-diagram)
+5. [Request Flow Pipeline](#request-flow-pipeline)
+6. [Authentication Flow](#authentication-flow)
+7. [API Design](#api-design)
+8. [Technology Stack](#technology-stack)
+
+---
+
+## System Overview
+
+The API Key Management Service (KMS) is a secure, scalable platform for managing API authentication tokens across applications. Built with a Go backend and a React TypeScript frontend, it provides centralized token lifecycle management with enterprise-grade security features.
+
+### Key Capabilities
+- **Multi-Provider Authentication**: Header, JWT, OAuth2, SAML support
+- **Dual Token System**: Static HMAC tokens and renewable JWT user tokens
+- **Hierarchical Permissions**: Role-based access control with inheritance
+- **Enterprise Security**: Rate limiting, brute force protection, audit logging
+- **High Availability**: Containerized deployment with load balancing
+
+### Core Features
+- **Token Lifecycle Management**: Create, verify, renew, and revoke tokens
+- **Application Management**: Multi-tenant application configuration
+- **User Session Tracking**: Comprehensive session management
+- **Audit Logging**: Complete audit trail of all operations
+- **Health Monitoring**: Built-in health checks and metrics
+
+---
+
+## Architecture Principles
+
+### Clean Architecture
+The system follows clean architecture principles with clear separation of concerns:
+
+```
+┌─────────────────┐
+│   Handlers      │ ← HTTP request handling
+├─────────────────┤
+│   Services      │ ← Business logic
+├─────────────────┤
+│  Repositories   │ ← Data access
+├─────────────────┤
+│    Database     │ ← Data persistence
+└─────────────────┘
+```
+
+### Design Principles
+- **Dependency Injection**: All services receive dependencies through constructors
+- **Interface Segregation**: Repository interfaces enable testing and flexibility
+- **Single Responsibility**: Each component has one clear purpose
+- **Fail-Safe Defaults**: Security-first configuration with safe fallbacks
+- **Immutable Operations**: Database transactions and audit logging
+- **Defense in Depth**: Multiple security layers throughout the stack
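The layering and constructor-based dependency injection described above might look roughly like this in Go. Every type and function name here is illustrative rather than taken from the codebase, and an in-memory map stands in for PostgreSQL.

```go
package main

import (
	"errors"
	"fmt"
)

// Application is a trimmed-down stand-in for the real application record.
type Application struct {
	AppID   string
	AppLink string
}

// ErrNotFound is returned when no application matches the requested ID.
var ErrNotFound = errors.New("application not found")

// ApplicationRepository is the data-access boundary. The service depends on
// this interface, so tests can supply a fake instead of a database.
type ApplicationRepository interface {
	Get(appID string) (Application, error)
}

// memoryRepo is an in-memory implementation used here in place of PostgreSQL.
type memoryRepo struct {
	apps map[string]Application
}

func (r memoryRepo) Get(appID string) (Application, error) {
	app, ok := r.apps[appID]
	if !ok {
		return Application{}, ErrNotFound
	}
	return app, nil
}

// ApplicationService holds business rules and receives its repository
// through the constructor rather than creating it itself.
type ApplicationService struct {
	repo ApplicationRepository
}

func NewApplicationService(repo ApplicationRepository) *ApplicationService {
	return &ApplicationService{repo: repo}
}

// Lookup validates input before delegating to the repository layer.
func (s *ApplicationService) Lookup(appID string) (Application, error) {
	if appID == "" {
		return Application{}, errors.New("app_id is required")
	}
	return s.repo.Get(appID)
}

func main() {
	repo := memoryRepo{apps: map[string]Application{
		"com.example.app": {AppID: "com.example.app", AppLink: "https://example.com"},
	}}
	svc := NewApplicationService(repo)
	fmt.Println(svc.Lookup("com.example.app"))
	fmt.Println(svc.Lookup("com.example.missing"))
}
```

An HTTP handler would sit one level above and hold an `*ApplicationService` the same way, keeping each layer unaware of the layers above it.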
+
+---
+
+## Component Architecture
+
+### Backend Components (`internal/`)
+
+#### **Handlers Layer** (`internal/handlers/`)
+HTTP request processors implementing REST API endpoints:
+
+- **`application.go`**: Application CRUD operations
+  - Create, read, update, delete applications
+  - HMAC key management
+  - Ownership validation
+
+- **`auth.go`**: Authentication workflows
+  - User login and logout
+  - Token renewal and validation
+  - Multi-provider authentication
+
+- **`token.go`**: Token operations
+  - Static token creation
+  - Token verification
+  - Token revocation
+
+- **`health.go`**: System health checks
+  - Database connectivity
+  - Cache availability
+  - Service status
+
+- **`oauth2.go`, `saml.go`**: External authentication providers
+  - OAuth2 authorization code flow
+  - SAML assertion validation
+  - Provider callback handling
+
+#### **Services Layer** (`internal/services/`)
+Business logic implementation with transaction management:
+
+- **`auth_service.go`**: Authentication provider orchestration
+  - Multi-provider authentication
+  - Session management
+  - User context creation
+
+- **`token_service.go`**: Token lifecycle management
+  - Static token generation and validation
+  - JWT user token management
+  - Permission assignment
+
+- **`application_service.go`**: Application configuration management
+  - Application CRUD operations
+  - Configuration validation
+  - HMAC key rotation
+
+- **`session_service.go`**: User session tracking
+  - Session creation and validation
+  - Session timeout handling
+  - Cross-provider session management
+
+#### **Repository Layer** (`internal/repository/postgres/`)
+Data access with ACID transaction support:
+
+- **`application_repository.go`**: Application persistence
+  - Secure dynamic query building
+  - Parameterized queries
+  - Ownership validation
+
+- **`token_repository.go`**: Static token management
+  - BCrypt token hashing
+  - Token lookup and validation
+  - Permission relationship management
+
+- **`permission_repository.go`**: Permission catalog
+  - Hierarchical permission structure
+  - Permission validation
+  - Bulk permission operations
+
+- **`session_repository.go`**: User session storage
+  - Session persistence
+  - Expiration management
+  - Provider metadata storage
+
+#### **Authentication Providers** (`internal/auth/`)
+Pluggable authentication system:
+
+- **`header_validator.go`**: HMAC signature validation
+  - Timestamp-based replay protection
+  - Constant-time signature comparison
+  - Email format validation
+
+- **`jwt.go`**: JWT token management
+  - Token generation with secure JTI
+  - Signature validation
+  - Revocation list management
+
+- **`oauth2.go`**: OAuth2 authorization code flow
+  - State management
+  - Token exchange
+  - Provider integration
+
+- **`saml.go`**: SAML assertion validation
+  - XML signature validation
+  - Attribute extraction
+  - Provider configuration
+
+- **`permissions.go`**: Hierarchical permission evaluation
+  - Role-based access control
+  - Permission inheritance
+  - Bulk permission evaluation
+
+---
+
+## System Architecture Diagram
+
+```mermaid
+graph TB
+    %% External Components
+    Client[Client Applications]
+    Browser[Web Browser]
+    AuthProvider[OAuth2/SAML Provider]
+    
+    %% Load Balancer & Proxy
+    subgraph "Load Balancer Layer"
+        Nginx[Nginx Proxy<br/>:80, :8081]
+    end
+    
+    %% Frontend Layer
+    subgraph "Frontend Layer"
+        React[React TypeScript SPA<br/>Ant Design UI<br/>:3000]
+        AuthContext[Authentication Context]
+        APIService[API Service Client]
+    end
+    
+    %% API Gateway & Middleware
+    subgraph "API Layer"
+        API[Go API Server<br/>:8080]
+        
+        subgraph "Middleware Chain"
+            Logger[Request Logger]
+            Security[Security Headers<br/>CORS, CSRF]
+            RateLimit[Rate Limiter<br/>100 RPS]
+            Auth[Authentication<br/>Header/JWT/OAuth2/SAML]
+            Validation[Request Validator]
+        end
+    end
+    
+    %% Business Logic Layer
+    subgraph "Service Layer"
+        AuthService[Authentication Service]
+        TokenService[Token Service]
+        AppService[Application Service]
+        SessionService[Session Service]
+        PermService[Permission Service]
+    end
+    
+    %% Data Access Layer
+    subgraph "Repository Layer"
+        AppRepo[Application Repository]
+        TokenRepo[Token Repository]
+        PermRepo[Permission Repository]
+        SessionRepo[Session Repository]
+    end
+    
+    %% Infrastructure Layer
+    subgraph "Infrastructure"
+        PostgreSQL[(PostgreSQL 15<br/>:5432)]
+        Redis[(Redis Cache<br/>Optional)]
+        Metrics[Prometheus Metrics<br/>:9090]
+    end
+    
+    %% External Security
+    subgraph "Security & Crypto"
+        HMAC[HMAC Signature<br/>Validation]
+        BCrypt[BCrypt Hashing<br/>Cost 14]
+        JWT[JWT Token<br/>Generation]
+    end
+    
+    %% Flow Connections
+    Client -->|API Requests| Nginx
+    Browser -->|HTTPS| Nginx
+    AuthProvider -->|OAuth2/SAML| API
+    
+    Nginx -->|Proxy| React
+    Nginx -->|API Proxy| API
+    
+    React --> AuthContext
+    React --> APIService
+    APIService -->|REST API| API
+    
+    API --> Logger
+    Logger --> Security
+    Security --> RateLimit
+    RateLimit --> Auth
+    Auth --> Validation
+    Validation --> AuthService
+    Validation --> TokenService
+    Validation --> AppService
+    Validation --> SessionService
+    Validation --> PermService
+    
+    AuthService --> AppRepo
+    TokenService --> TokenRepo
+    AppService --> AppRepo
+    SessionService --> SessionRepo
+    PermService --> PermRepo
+    
+    AppRepo --> PostgreSQL
+    TokenRepo --> PostgreSQL
+    PermRepo --> PostgreSQL
+    SessionRepo --> PostgreSQL
+    
+    TokenService --> HMAC
+    TokenService --> BCrypt
+    TokenService --> JWT
+    AuthService --> Redis
+    
+    API --> Metrics
+    
+    %% Styling
+    classDef frontend fill:#e1f5fe,stroke:#0277bd,stroke-width:2px
+    classDef api fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
+    classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
+    classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
+    classDef security fill:#ffebee,stroke:#c62828,stroke-width:2px
+    
+    class React,AuthContext,APIService frontend
+    class API,Logger,Security,RateLimit,Auth,Validation api
+    class AuthService,TokenService,AppService,SessionService,PermService service
+    class PostgreSQL,Redis,Metrics,AppRepo,TokenRepo,PermRepo,SessionRepo data
+    class HMAC,BCrypt,JWT security
+```
+
+---
+
+## Request Flow Pipeline
+
+```mermaid
+flowchart TD
+    Start([HTTP Request]) --> Nginx{Nginx Proxy<br/>Load Balancer}
+    
+    Nginx -->|Static Assets| Frontend[React SPA<br/>Port 3000]
+    Nginx -->|API Routes| API[Go API Server<br/>Port 8080]
+    
+    API --> Logger[Request Logger<br/>Structured Logging]
+    Logger --> Security[Security Middleware<br/>Headers, CORS, CSRF]
+    Security --> RateLimit{Rate Limiter<br/>100 RPS, 200 Burst}
+    
+    RateLimit -->|Exceeded| RateResponse[429 Too Many Requests]
+    RateLimit -->|Within Limits| Auth[Authentication<br/>Middleware]
+    
+    Auth --> AuthHeader{Auth Provider}
+    AuthHeader -->|header| HeaderAuth[Header Validator<br/>X-User-Email]
+    AuthHeader -->|jwt| JWTAuth[JWT Validator<br/>Signature + Claims]
+    AuthHeader -->|oauth2| OAuth2Auth[OAuth2 Flow<br/>Authorization Code]
+    AuthHeader -->|saml| SAMLAuth[SAML Assertion<br/>XML Validation]
+    
+    HeaderAuth --> AuthCache{Check Cache<br/>Redis 5min TTL}
+    JWTAuth --> JWTValidation[Signature Validation<br/>Expiry Check]
+    OAuth2Auth --> OAuth2Exchange[Token Exchange<br/>User Info Retrieval]
+    SAMLAuth --> SAMLValidation[Assertion Validation<br/>Signature Check]
+    
+    AuthCache -->|Hit| AuthContext[Create AuthContext]
+    AuthCache -->|Miss| DBAuth[Database Lookup<br/>User Permissions]
+    JWTValidation --> RevocationCheck[Check Revocation List<br/>Redis Cache]
+    OAuth2Exchange --> SessionStore[Store User Session<br/>PostgreSQL]
+    SAMLValidation --> SessionStore
+    
+    DBAuth --> CacheStore[Store in Cache<br/>5min TTL]
+    RevocationCheck --> AuthContext
+    SessionStore --> AuthContext
+    CacheStore --> AuthContext
+    
+    AuthContext --> Validation[Request Validator<br/>JSON Schema]
+    Validation -->|Invalid| ValidationError[400 Bad Request]
+    Validation -->|Valid| Router{Route Handler}
+    
+    Router -->|/health| HealthHandler[Health Check<br/>DB + Cache Status]
+    Router -->|/api/applications| AppHandler[Application CRUD<br/>HMAC Key Management]
+    Router -->|/api/tokens| TokenHandler[Token Operations<br/>Create, Verify, Revoke]
+    Router -->|/api/login| AuthHandler[Authentication<br/>Login, Renewal]
+    Router -->|/api/oauth2| OAuth2Handler[OAuth2 Callbacks<br/>State Management]
+    Router -->|/api/saml| SAMLHandler[SAML Callbacks<br/>Assertion Processing]
+    
+    HealthHandler --> Service[Service Layer]
+    AppHandler --> Service
+    TokenHandler --> Service
+    AuthHandler --> Service
+    OAuth2Handler --> Service
+    SAMLHandler --> Service
+    
+    Service --> Repository[Repository Layer<br/>Database Operations]
+    Repository --> PostgreSQL[(PostgreSQL<br/>ACID Transactions)]
+    
+    Service --> CryptoOps[Cryptographic Operations]
+    CryptoOps --> HMAC[HMAC Signature<br/>Timestamp Validation]
+    CryptoOps --> BCrypt[BCrypt Hashing<br/>Cost 14]
+    CryptoOps --> JWT[JWT Generation<br/>RS256 Signing]
+    
+    Repository --> AuditLog[Audit Logging<br/>All Operations]
+    AuditLog --> AuditTable[(audit_logs table)]
+    
+    Service --> Response[HTTP Response]
+    Response --> Metrics[Prometheus Metrics<br/>Port 9090]
+    Response --> End([Response Sent])
+    
+    %% Error Paths
+    RateResponse --> End
+    ValidationError --> End
+    
+    %% Styling
+    classDef middleware fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
+    classDef auth fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
+    classDef handler fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
+    classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
+    classDef crypto fill:#ffebee,stroke:#c62828,stroke-width:2px
+    classDef error fill:#fce4ec,stroke:#ad1457,stroke-width:2px
+    
+    class Logger,Security,RateLimit,Validation middleware
+    class Auth,HeaderAuth,JWTAuth,OAuth2Auth,SAMLAuth,AuthContext auth
+    class HealthHandler,AppHandler,TokenHandler,AuthHandler,OAuth2Handler,SAMLHandler,Service handler
+    class Repository,PostgreSQL,AuditTable,AuthCache data
+    class CryptoOps,HMAC,BCrypt,JWT crypto
+    class RateResponse,ValidationError error
+```
+
+### Request Processing Pipeline
+
+1. **Load Balancer**: Nginx receives and routes requests
+2. **Static Assets**: React SPA served directly by Nginx
+3. **API Gateway**: Go server handles API requests
+4. **Middleware Chain**: Security, rate limiting, authentication
+5. **Route Handler**: Business logic processing
+6. **Service Layer**: Transaction management and orchestration
+7. **Repository Layer**: Database operations with audit logging
+8. **Response**: JSON response with metrics collection
+
+---
+
+## Authentication Flow
+
+```mermaid
+sequenceDiagram
+    participant Client as Client App
+    participant API as API Gateway
+    participant Auth as Auth Service
+    participant DB as PostgreSQL
+    participant Provider as OAuth2/SAML
+    participant Cache as Redis Cache
+    
+    %% Header-based Authentication
+    rect rgb(240, 248, 255)
+        Note over Client, Cache: Header-based Authentication Flow
+        Client->>API: Request with X-User-Email header
+        API->>Auth: Validate header auth
+        Auth->>DB: Check user permissions
+        DB-->>Auth: Return user context
+        Auth->>Cache: Cache auth result (5min TTL)
+        Auth-->>API: AuthContext{UserID, Permissions}
+        API-->>Client: Authenticated response
+    end
+    
+    %% JWT Authentication Flow
+    rect rgb(245, 255, 245)
+        Note over Client, Cache: JWT Authentication Flow
+        Client->>API: Login request {app_id, permissions}
+        API->>Auth: Generate JWT token
+        Auth->>DB: Validate app_id and permissions
+        DB-->>Auth: Application config
+        Auth->>Auth: Create JWT with claims<br/>{user_id, permissions, exp, iat}
+        Auth-->>API: JWT token + expires_at
+        API-->>Client: LoginResponse{token, expires_at}
+        
+        Note over Client, API: Subsequent requests with JWT
+        Client->>API: Request with Bearer JWT
+        API->>Auth: Verify JWT signature
+        Auth->>Auth: Check expiration & claims
+        Auth->>Cache: Check revocation list
+        Cache-->>Auth: Token status
+        Auth-->>API: Valid AuthContext
+        API-->>Client: Authorized response
+    end
+    
+    %% OAuth2/SAML Flow
+    rect rgb(255, 248, 240)
+        Note over Client, Provider: OAuth2/SAML Authentication Flow
+        Client->>API: POST /api/login {app_id, redirect_uri}
+        API->>Auth: Generate OAuth2 state
+        Auth->>DB: Store state + app context
+        Auth-->>API: Redirect URL + state
+        API-->>Client: {redirect_url, state}
+        
+        Client->>Provider: Redirect to OAuth2 provider
+        Provider-->>Client: Authorization code + state
+        
+        Client->>API: GET /api/oauth2/callback?code=xxx&state=yyy
+        API->>Auth: Validate state and exchange code
+        Auth->>Provider: Exchange code for tokens
+        Provider-->>Auth: Access token + ID token
+        Auth->>Provider: Get user info
+        Provider-->>Auth: User profile
+        Auth->>DB: Create/update user session
+        Auth->>Auth: Generate internal JWT
+        Auth-->>API: JWT token + user context
+        API-->>Client: Set-Cookie with JWT + redirect
+    end
+    
+    %% Token Renewal Flow
+    rect rgb(248, 245, 255)
+        Note over Client, DB: Token Renewal Flow
+        Client->>API: POST /api/renew {app_id, user_id, token}
+        API->>Auth: Validate current token
+        Auth->>Auth: Check token expiration<br/>and max_valid_at
+        Auth->>DB: Get application config
+        DB-->>Auth: TokenRenewalDuration, MaxTokenDuration
+        Auth->>Auth: Generate new JWT<br/>with extended expiry
+        Auth->>Cache: Invalidate old token
+        Auth-->>API: New token + expires_at
+        API-->>Client: RenewResponse{token, expires_at}
+    end
+```
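The renewal step ("check token expiration and max_valid_at", then "generate new JWT with extended expiry") implies that a renewed token's expiry is bounded by `max_valid_at`. One plausible way to compute it is sketched below. The exact rule the service applies is not documented, so `RenewedExpiry` and its capping behavior are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// exampleRenewal matches the 168h token_renewal_duration used in the examples.
const exampleRenewal = 168 * time.Hour

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
// It exists only to keep the demonstration short.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// RenewedExpiry returns now+renewal, capped at maxValidAt.
// ASSUMPTION: a token at or past maxValidAt cannot be renewed at all.
func RenewedExpiry(now, maxValidAt time.Time, renewal time.Duration) (time.Time, error) {
	if !now.Before(maxValidAt) {
		return time.Time{}, errors.New("token has reached max_valid_at and cannot be renewed")
	}
	exp := now.Add(renewal)
	if exp.After(maxValidAt) {
		exp = maxValidAt
	}
	return exp, nil
}

func main() {
	maxValidAt := mustParse("2023-01-31T00:00:00Z")
	for _, now := range []string{"2023-01-08T00:00:00Z", "2023-01-28T00:00:00Z", "2023-02-01T00:00:00Z"} {
		exp, err := RenewedExpiry(mustParse(now), maxValidAt, exampleRenewal)
		if err != nil {
			fmt.Println(now, "->", err)
			continue
		}
		fmt.Println(now, "->", exp.Format(time.RFC3339))
	}
}
```

With a 168h renewal window, renewing on 2023-01-08 yields 2023-01-15, which lines up with the `expires_at` and `max_valid_at` values in the API documentation's renewal example.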
+
+### Authentication Methods
+
+#### **Header-based Authentication**
+- **Use Case**: Service-to-service authentication
+- **Security**: HMAC-SHA256 signatures with timestamp validation
+- **Replay Protection**: 5-minute timestamp window
+- **Caching**: 5-minute Redis cache for performance
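The signature and replay checks listed above can be sketched with Go's standard library. The signed message layout (`email + "\n" + timestamp`), the shared key, and the function names are all assumptions made for illustration. Only HMAC-SHA256, the 5-minute window, and constant-time comparison come from this document.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// maxSkew is the documented 5-minute replay-protection window.
const maxSkew = 5 * time.Minute

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

// Sign returns "sha256=<hex>" over a HYPOTHETICAL message layout:
// the email and the timestamp joined by a newline.
func Sign(key []byte, email, timestamp string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(email + "\n" + timestamp))
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// Verify rejects timestamps more than maxSkew away from now (in either
// direction), then compares signatures in constant time.
func Verify(key []byte, email, timestamp, signature string, now time.Time) error {
	ts, err := time.Parse(time.RFC3339, timestamp)
	if err != nil {
		return fmt.Errorf("invalid timestamp: %w", err)
	}
	skew := now.Sub(ts)
	if skew < 0 {
		skew = -skew
	}
	if skew > maxSkew {
		return errors.New("timestamp outside the 5-minute window")
	}
	expected := Sign(key, email, timestamp)
	if !hmac.Equal([]byte(expected), []byte(signature)) {
		return errors.New("signature mismatch")
	}
	return nil
}

func main() {
	key := []byte("demo-shared-secret") // placeholder key for the example
	ts := "2024-01-15T10:30:00Z"
	sig := Sign(key, "user@example.com", ts)

	fmt.Println("fresh:", Verify(key, "user@example.com", ts, sig, mustParse("2024-01-15T10:32:00Z")))
	fmt.Println("stale:", Verify(key, "user@example.com", ts, sig, mustParse("2024-01-15T10:40:00Z")))
}
```

Checking the timestamp before computing the HMAC rejects replayed requests cheaply, and `hmac.Equal` avoids leaking how many leading bytes of a forged signature were correct.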
+
+#### **JWT Authentication**
+- **Use Case**: User authentication with session management
+- **Security**: RSA signatures with revocation checking
+- **Token Lifecycle**: Configurable expiration with renewal
+- **Claims**: User ID, permissions, application scope
+
+#### **OAuth2/SAML Authentication**
+- **Use Case**: External identity provider integration
+- **Security**: Authorization code flow with state validation
+- **Session Management**: Database-backed session storage
+- **Provider Support**: Configurable provider endpoints
+
+---
+
+## API Design
+
+### RESTful Endpoints
+
+#### **Authentication Endpoints**
+```
+POST /api/login          - Authenticate user, issue JWT
+POST /api/renew          - Renew JWT token
+POST /api/logout         - Revoke JWT token
+GET  /api/verify         - Verify token and permissions
+```
+
+#### **Application Management**
+```
+GET    /api/applications     - List applications (paginated)
+POST   /api/applications     - Create application
+GET    /api/applications/:id - Get application details
+PUT    /api/applications/:id - Update application
+DELETE /api/applications/:id - Delete application
+```
+
+#### **Token Management**
+```
+GET    /api/applications/:id/tokens - List application tokens
+POST   /api/applications/:id/tokens - Create new token
+DELETE /api/tokens/:id              - Revoke token
+```
+
+#### **OAuth2/SAML Integration**
+```
+POST /api/oauth2/login          - Initiate OAuth2 flow
+GET  /api/oauth2/callback       - OAuth2 callback handler
+POST /api/saml/login            - Initiate SAML flow
+POST /api/saml/callback         - SAML assertion handler
+```
+
+#### **System Endpoints**
+```
+GET /health   - System health check
+GET /ready    - Readiness probe
+GET /metrics  - Prometheus metrics
+```
+
+### Request/Response Patterns
+
+#### **Authentication Headers**
+```http
+X-User-Email: user@example.com
+X-Auth-Timestamp: 2024-01-15T10:30:00Z
+X-Auth-Signature: sha256=abc123...
+Authorization: Bearer eyJhbGciOiJSUzI1NiIs...
+Content-Type: application/json
+```
+
+#### **Error Response Format**
+```json
+{
+  "error": "validation_failed",
+  "message": "Request validation failed",
+  "details": [
+    {
+      "field": "permissions",
+      "message": "Invalid permission format",
+      "value": "invalid.perm"
+    }
+  ]
+}
+```
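A minimal sketch of how this error envelope could be modeled in Go. The type names `ErrorResponse` and `FieldError` and the helper `encode` are hypothetical, and `value` is assumed to be a string because the sample shows one; the service's real types may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FieldError describes one rejected field (hypothetical type name).
type FieldError struct {
	Field   string `json:"field"`
	Message string `json:"message"`
	Value   string `json:"value,omitempty"` // assumed to be a string, as in the sample
}

// ErrorResponse mirrors the JSON error envelope shown above.
type ErrorResponse struct {
	Error   string       `json:"error"`
	Message string       `json:"message"`
	Details []FieldError `json:"details,omitempty"`
}

// encode renders the envelope as compact JSON.
func encode(e ErrorResponse) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(ErrorResponse{
		Error:   "validation_failed",
		Message: "Request validation failed",
		Details: []FieldError{
			{Field: "permissions", Message: "Invalid permission format", Value: "invalid.perm"},
		},
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Marking `details` as `omitempty` keeps simple errors (for example a plain 404) free of an empty array, which is one way to keep the two error shapes in these docs compatible.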
+
+#### **Success Response Format**
+```json
+{
+  "data": {
+    "id": "550e8400-e29b-41d4-a716-446655440000",
+    "token": "ABC123_xyz789...",
+    "permissions": ["app.read", "token.create"],
+    "created_at": "2024-01-15T10:30:00Z"
+  }
+}
+```
+
+---
+
+## Technology Stack
+
+### Backend Technologies
+- **Language**: Go 1.21+ with modules
+- **Web Framework**: Gin HTTP framework
+- **Database**: PostgreSQL 15 with connection pooling
+- **Authentication**: JWT-Go library with RSA signing
+- **Cryptography**: Go standard crypto libraries
+- **Caching**: Redis for session and revocation storage
+- **Logging**: Zap structured logging
+- **Metrics**: Prometheus metrics collection
+
+### Frontend Technologies
+- **Framework**: React 18 with TypeScript
+- **UI Library**: Ant Design components
+- **State Management**: React Context API
+- **HTTP Client**: Axios with interceptors
+- **Routing**: React Router with protected routes
+- **Build Tool**: Create React App with TypeScript
+
+### Infrastructure
+- **Containerization**: Docker with multi-stage builds
+- **Orchestration**: Docker Compose for local development
+- **Reverse Proxy**: Nginx with load balancing
+- **Database Migrations**: Custom Go migration system
+- **Health Monitoring**: Built-in health check endpoints
+
+### Security Stack
+- **TLS**: TLS 1.3 for all communications
+- **Hashing**: BCrypt with cost 14 for production
+- **Signatures**: HMAC-SHA256 and RSA signatures
+- **Rate Limiting**: Token bucket algorithm
+- **CSRF**: Double-submit cookie pattern
+- **Headers**: Comprehensive security headers
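The token bucket algorithm named above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the service's limiter: the `bucket` type, `newBucket`, `admitted`, and `demoStart` are hypothetical names, and the 100/200 figures are taken from the `RATE_LIMIT_RPS` and `RATE_LIMIT_BURST` samples in this document.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoStart is a fixed reference instant so the example is deterministic.
var demoStart = time.Date(2024, 1, 15, 10, 30, 0, 0, time.UTC)

// bucket is a token bucket: it holds up to burst tokens and refills at
// rate tokens per second. Each admitted request spends one token.
type bucket struct {
	mu     sync.Mutex
	rate   float64
	burst  float64
	tokens float64
	last   time.Time
}

// newBucket starts with a full bucket at time now.
func newBucket(rate, burst float64, now time.Time) *bucket {
	return &bucket{rate: rate, burst: burst, tokens: burst, last: now}
}

// allow refills the bucket for the time elapsed since the last refill and
// reports whether a request arriving at now may proceed.
func (b *bucket) allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// admitted fires n requests at the same instant and counts how many pass.
func admitted(b *bucket, n int, now time.Time) int {
	count := 0
	for i := 0; i < n; i++ {
		if b.allow(now) {
			count++
		}
	}
	return count
}

func main() {
	b := newBucket(100, 200, demoStart)
	// A 250-request spike: the burst allowance is spent, the rest are rejected.
	fmt.Println(admitted(b, 250, demoStart))
	// One second later the bucket has refilled by one second's worth of tokens.
	fmt.Println(admitted(b, 250, demoStart.Add(time.Second)))
}
```

Passing the clock in as an argument keeps the limiter testable without sleeping; a per-IP limit would keep one such bucket per client address.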
+
+---
+
+## Deployment Considerations
+
+### Container Configuration
+```yaml
+services:
+  kms-api:
+    image: kms-api:latest
+    ports:
+      - "8080:8080"
+    environment:
+      - DB_HOST=postgres
+      - DB_PORT=5432
+      - JWT_SECRET=${JWT_SECRET}
+      - HMAC_KEY=${HMAC_KEY}
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+```
+
+### Environment Variables
+```bash
+# Database Configuration
+DB_HOST=localhost
+DB_PORT=5432
+DB_NAME=kms
+DB_USER=postgres
+DB_PASSWORD=postgres
+
+# Server Configuration
+SERVER_HOST=0.0.0.0
+SERVER_PORT=8080
+
+# Authentication
+AUTH_PROVIDER=header
+JWT_SECRET=your-jwt-secret
+HMAC_KEY=your-hmac-key
+
+# Security
+RATE_LIMIT_ENABLED=true
+RATE_LIMIT_RPS=100
+RATE_LIMIT_BURST=200
+```
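As a rough sketch of how the rate-limit variables above might be loaded and validated at startup: the `RateLimitConfig` type and both helpers are hypothetical, and falling back to the sample values when a variable is unset is an assumption, not documented behaviour. The lookup function is injected so the loader can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// RateLimitConfig holds the RATE_LIMIT_* settings (hypothetical type name).
type RateLimitConfig struct {
	Enabled bool
	RPS     int
	Burst   int
}

// positiveInt reads key through getenv, falling back to def when unset.
func positiveInt(getenv func(string) string, key string, def int) (int, error) {
	v := getenv(key)
	if v == "" {
		return def, nil
	}
	n, err := strconv.Atoi(v)
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("%s: want a positive integer, got %q", key, v)
	}
	return n, nil
}

// loadRateLimit builds the config from the environment. The fallback values
// (enabled, 100 RPS, burst 200) copy the sample above and are an assumption.
func loadRateLimit(getenv func(string) string) (RateLimitConfig, error) {
	cfg := RateLimitConfig{Enabled: true}
	var err error
	if v := getenv("RATE_LIMIT_ENABLED"); v != "" {
		if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
			return RateLimitConfig{}, fmt.Errorf("RATE_LIMIT_ENABLED: %w", err)
		}
	}
	if cfg.RPS, err = positiveInt(getenv, "RATE_LIMIT_RPS", 100); err != nil {
		return RateLimitConfig{}, err
	}
	if cfg.Burst, err = positiveInt(getenv, "RATE_LIMIT_BURST", 200); err != nil {
		return RateLimitConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := loadRateLimit(os.Getenv)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Failing fast on a malformed value is usually preferable to silently running with a default, since a typo in `RATE_LIMIT_RPS` would otherwise go unnoticed.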
+
+### Monitoring Setup
+```yaml
+prometheus:
+  scrape_configs:
+    - job_name: 'kms-api'
+      static_configs:
+        - targets: ['kms-api:9090']
+      scrape_interval: 15s
+      metrics_path: /metrics
+```
+
+This architecture documentation provides a comprehensive technical overview of the KMS system, suitable for development teams, system architects, and operations personnel who need to understand, deploy, or maintain the system.
--- a/kms/docs/DEPLOYMENT_GUIDE.md
+++ b/kms/docs/DEPLOYMENT_GUIDE.md
--- a/kms/docs/PRODUCTION_ROADMAP.md
+++ b/kms/docs/PRODUCTION_ROADMAP.md
@ -0,0 +1,319 @@
+# KMS API Service - Production Roadmap
+
+This document outlines the complete roadmap for making the API Key Management Service fully production-ready. Use the checkboxes to track progress and refer to the implementation notes at the bottom.
+
+## 🏗️ Core Infrastructure (COMPLETED)
+
+### Repository Layer
+- [x] Complete token repository implementation (CRUD operations)
+- [x] Complete permission repository implementation (core methods)
+- [x] Implement granted permission repository (authorization logic)
+- [x] Add database transaction support
+- [x] Implement proper error handling in repositories
+
+### Security & Cryptography
+- [x] Implement secure token generation using crypto/rand
+- [x] Add bcrypt-based token hashing for storage
+- [x] Implement HMAC token signing and verification
+- [x] Create token format validation utilities
+- [x] Add cryptographic key management
+
+### Service Layer
+- [x] Update token service with secure generation
+- [x] Implement permission validation in token creation
+- [x] Add application validation before token operations
+- [x] Implement proper error propagation
+- [x] Add comprehensive logging throughout services
+
+### Middleware & Validation
+- [x] Create comprehensive input validation middleware
+- [x] Implement struct-based validation with detailed errors
+- [x] Add UUID parameter validation
+- [x] Create permission scope format validation
+- [x] Implement request sanitization
+
+### Error Handling
+- [x] Create structured error framework with typed codes
+- [x] Implement HTTP status code mapping
+- [x] Add error context and chaining support
+- [x] Create consistent JSON error responses
+- [x] Add retry logic indicators
+
+### Monitoring & Metrics
+- [x] Implement comprehensive metrics collection
+- [x] Add Prometheus-compatible metrics export
+- [x] Create HTTP request monitoring middleware
+- [x] Add business metrics tracking
+- [x] Implement system health metrics
+
+## 🔐 Authentication & Authorization (HIGH PRIORITY)
+
+### JWT Implementation
+- [x] Complete JWT token generation and validation
+- [x] Implement token expiration and renewal logic
+- [x] Add JWT claims management
+- [x] Create token blacklisting mechanism
+- [x] Implement refresh token rotation
+- [x] Add comprehensive JWT unit tests with benchmarks
+- [x] Implement cache-based token revocation system
+
+### SSO Integration
+- [x] Implement OAuth2/OIDC provider integration
+- [x] Add OAuth2 authentication handlers with PKCE support
+- [x] Create OAuth2 discovery document fetching
+- [x] Implement authorization code exchange and token refresh
+- [x] Add user info retrieval from OAuth2 providers
+- [x] Create comprehensive OAuth2 unit tests with benchmarks
+- [x] Add SAML authentication support
+- [x] Create user session management
+- [x] Implement role-based access control (RBAC)
+- [x] Add multi-tenant authentication support
+
+### Permission System Enhancement
+- [x] Implement hierarchical permission inheritance
+- [x] Add dynamic permission evaluation
+- [x] Create permission caching mechanism
+- [x] Add bulk permission operations
+- [x] Implement default permission hierarchy (admin, read, write, app.*, token.*, etc.)
+- [x] Create role-based permission system with inheritance
+- [x] Add comprehensive permission unit tests with benchmarks
+- [ ] Implement permission audit logging
+
+## 🚀 Performance & Scalability (MEDIUM PRIORITY)
+
+### Caching Layer
+- [x] Implement basic caching layer with memory provider
+- [x] Add JSON serialization/deserialization support
+- [x] Create cache manager with TTL support
+- [x] Add cache key management and prefixes
+- [x] Implement Redis integration for caching
+- [x] Add token blacklist caching for revocation
+- [ ] Add permission result caching
+- [ ] Create application metadata caching
+- [ ] Implement token validation result caching
+- [ ] Add cache invalidation strategies
+
+### Database Optimization
+- [ ] Implement database connection pool tuning
+- [ ] Add query performance monitoring
+- [ ] Create database migration rollback procedures
+- [ ] Implement read replica support
+- [ ] Add database backup and recovery procedures
+
+### Load Balancing & Clustering
+- [ ] Implement horizontal scaling support
+- [ ] Add load balancer health checks
+- [ ] Create session affinity handling
+- [ ] Implement distributed rate limiting
+- [ ] Add circuit breaker patterns
+
+## 🔒 Security Hardening (HIGH PRIORITY)
+
+### Advanced Security Features
+- [ ] Implement API key rotation mechanisms
+- [x] Add brute force protection
+- [x] Create account lockout mechanisms
+- [x] Implement IP whitelisting/blacklisting
+- [x] Add request signing validation
+- [x] Implement rate limiting middleware
+- [x] Add security headers middleware
+- [x] Create authentication failure tracking
+
+### Audit & Compliance
+- [x] Implement comprehensive audit logging
+- [ ] Add compliance reporting features
+- [ ] Create data retention policies
+- [ ] Implement GDPR compliance features
+- [ ] Add security event alerting
+
+### Secrets Management
+- [ ] Integrate with HashiCorp Vault or similar
+- [ ] Implement automatic key rotation
+- [ ] Add encrypted configuration storage
+- [ ] Create secure backup procedures
+- [ ] Implement key escrow mechanisms
+
+## 🧪 Testing & Quality Assurance (MEDIUM PRIORITY)
+
+### Unit Testing
+- [x] Add comprehensive JWT authentication unit tests
+- [x] Create caching layer unit tests with benchmarks
+- [x] Implement authentication service unit tests
+- [x] Add comprehensive permission system unit tests
+- [ ] Add comprehensive unit tests for repositories
+- [ ] Create service layer unit tests
+- [ ] Implement middleware unit tests
+- [ ] Add crypto utility unit tests
+- [ ] Create error handling unit tests
+
+### Integration Testing
+- [ ] Expand integration test coverage
+- [ ] Add database integration tests
+- [ ] Create API endpoint integration tests
+- [ ] Implement authentication flow tests
+- [ ] Add permission validation tests
+
+### Performance Testing
+- [ ] Implement load testing scenarios
+- [ ] Add stress testing for concurrent operations
+- [ ] Create database performance benchmarks
+- [ ] Implement memory leak detection
+- [ ] Add latency and throughput testing
+
+### Security Testing
+- [ ] Implement penetration testing scenarios
+- [ ] Add vulnerability scanning automation
+- [ ] Create security regression tests
+- [ ] Implement fuzzing tests
+- [ ] Add compliance validation tests
+
+## 📦 Deployment & Operations (MEDIUM PRIORITY)
+
+### Containerization & Orchestration
+- [ ] Create optimized Docker images
+- [ ] Implement Kubernetes manifests
+- [ ] Add Helm charts for deployment
+- [ ] Create deployment automation scripts
+- [ ] Implement blue-green deployment strategy
+
+### Infrastructure as Code
+- [ ] Create Terraform configurations
+- [ ] Implement AWS/GCP/Azure resource definitions
+- [ ] Add infrastructure testing
+- [ ] Create disaster recovery procedures
+- [ ] Implement infrastructure monitoring
+
+### CI/CD Pipeline
+- [ ] Implement automated testing pipeline
+- [ ] Add security scanning in CI/CD
+- [ ] Create automated deployment pipeline
+- [ ] Implement rollback mechanisms
+- [ ] Add deployment notifications
+
+## 📊 Observability & Monitoring (LOW PRIORITY)
+
+### Advanced Monitoring
+- [ ] Implement distributed tracing
+- [ ] Add application performance monitoring (APM)
+- [ ] Create custom dashboards
+- [ ] Implement alerting rules
+- [ ] Add log aggregation and analysis
+
+### Business Intelligence
+- [ ] Create usage analytics
+- [ ] Implement cost tracking
+- [ ] Add capacity planning metrics
+- [ ] Create business KPI dashboards
+- [ ] Implement trend analysis
+
+## 🔧 Maintenance & Operations (ONGOING)
+
+### Documentation
+- [ ] Create comprehensive API documentation
+- [ ] Add deployment guides
+- [ ] Create troubleshooting runbooks
+- [ ] Implement architecture decision records (ADRs)
+- [ ] Add security best practices guide
+
+### Maintenance Procedures
+- [ ] Create backup and restore procedures
+- [ ] Implement log rotation and archival
+- [ ] Add database maintenance scripts
+- [ ] Create performance tuning guides
+- [ ] Implement capacity planning procedures
+
+---
+
+## 📝 Implementation Notes for Future Development
+
+### Code Organization Principles
+1. **Maintain Clean Architecture**: Keep clear separation between domain, service, and infrastructure layers
+2. **Interface-First Design**: Always define interfaces before implementations for better testability
+3. **Error Handling**: Use the established error framework (`internal/errors`) for consistent error handling
+4. **Logging**: Use structured logging with zap throughout the application
+5. **Configuration**: Add new config options to `internal/config/config.go` with proper validation
+
+### Security Guidelines
+1. **Input Validation**: Always validate inputs using the validation middleware (`internal/middleware/validation.go`)
+2. **Token Security**: Use the crypto utilities (`internal/crypto/token.go`) for all token operations
+3. **Permission Checks**: Always validate permissions using the repository layer before operations
+4. **Audit Logging**: Log all security-relevant operations with user context
+5. **Secrets**: Never hardcode secrets; use environment variables or secret management systems
+
+### Database Guidelines
+1. **Migrations**: Always create both up and down migrations for schema changes
+2. **Transactions**: Use database transactions for multi-step operations
+3. **Indexing**: Add appropriate indexes for query performance
+4. **Connection Management**: Use the existing connection pool configuration
+5. **Error Handling**: Wrap database errors with the application error framework
+
+### Testing Guidelines
+1. **Test Structure**: Follow the existing test structure in `test/` directory
+2. **Mock Dependencies**: Use interfaces for easy mocking in tests
+3. **Test Data**: Use the test helpers for consistent test data creation
+4. **Integration Tests**: Test against real database instances when possible
+5. **Coverage**: Aim for >80% test coverage for critical paths
+
+### Performance Guidelines
+1. **Metrics**: Use the metrics system (`internal/metrics`) to track performance
+2. **Caching**: Implement caching at the service layer, not repository layer
+3. **Database Queries**: Optimize queries and use appropriate indexes
+4. **Memory Management**: Be mindful of memory allocations in hot paths
+5. **Concurrency**: Use proper synchronization for shared resources
+
+### Deployment Guidelines
+1. **Environment Variables**: Use environment-based configuration for all deployments
+2. **Health Checks**: Ensure health endpoints are properly configured
+3. **Graceful Shutdown**: Implement proper shutdown procedures for all services
+4. **Resource Limits**: Set appropriate CPU and memory limits
+5. **Monitoring**: Ensure metrics and logging are properly configured
+
+### Code Quality Standards
+1. **Go Standards**: Follow standard Go conventions and best practices
+2. **Documentation**: Document all public APIs and complex business logic
+3. **Error Messages**: Provide clear, actionable error messages
+4. **Code Reviews**: Require code reviews for all changes
+5. **Static Analysis**: Use tools like golangci-lint for code quality
+
+### Security Best Practices
+1. **Principle of Least Privilege**: Grant minimum necessary permissions
+2. **Defense in Depth**: Implement multiple layers of security
+3. **Regular Updates**: Keep dependencies updated for security patches
+4. **Secure Defaults**: Use secure configurations by default
+5. **Security Testing**: Include security testing in the development process
+
+### Operational Considerations
+1. **Monitoring**: Implement comprehensive monitoring and alerting
+2. **Backup Strategy**: Ensure regular backups and test restore procedures
+3. **Disaster Recovery**: Have documented disaster recovery procedures
+4. **Capacity Planning**: Monitor resource usage and plan for growth
+5. **Documentation**: Keep operational documentation up to date
+
+---
+
+## 🎯 Priority Matrix
+
+### Immediate (Next Sprint)
+- Complete JWT implementation
+- Add comprehensive unit tests
+- Implement caching layer basics
+
+### Short Term (1-2 Months)
+- SSO integration
+- Security hardening features
+- Performance optimization
+
+### Medium Term (3-6 Months)
+- Advanced monitoring and observability
+- Deployment automation
+- Compliance features
+
+### Long Term (6+ Months)
+- Advanced analytics
+- Multi-region deployment
+- Advanced security features
+
+---
+
+*Version: 1.0*
--- a/kms/docs/SECURITY_ARCHITECTURE.md
+++ b/kms/docs/SECURITY_ARCHITECTURE.md
@ -0,0 +1,828 @@
+# KMS Security Architecture
+
+## Table of Contents
+1. [Security Overview](#security-overview)
+2. [Multi-Layered Security Model](#multi-layered-security-model)
+3. [Authentication Security](#authentication-security)
+4. [Token Security Models](#token-security-models)
+5. [Authorization Framework](#authorization-framework)
+6. [Data Protection](#data-protection)
+7. [Security Middleware Chain](#security-middleware-chain)
+8. [Threat Model](#threat-model)
+9. [Security Controls](#security-controls)
+10. [Compliance Framework](#compliance-framework)
+
+---
+
+## Security Overview
+
+The KMS implements a comprehensive security architecture based on defense-in-depth principles. Every layer of the system includes security controls designed to protect against both external threats and insider risks.
+
+### Security Objectives
+- **Confidentiality**: Protect sensitive data and credentials
+- **Integrity**: Ensure data accuracy and prevent tampering
+- **Availability**: Maintain service availability under attack
+- **Accountability**: Complete audit trail of all operations
+- **Non-repudiation**: Cryptographic proof of operations
+
+### Security Principles
+- **Zero Trust**: Verify every request regardless of source
+- **Fail Secure**: Safe defaults when systems fail
+- **Defense in Depth**: Multiple security layers
+- **Least Privilege**: Minimal permission grants
+- **Security by Design**: Security integrated into architecture
+
+---
+
+## Multi-Layered Security Model
+
+```mermaid
+graph TB
+    subgraph "External Threats"
+        DDoS[DDoS Attacks]
+        XSS[XSS Attacks]
+        CSRF[CSRF Attacks]
+        Injection[SQL Injection]
+        AuthBypass[Auth Bypass]
+        TokenReplay[Token Replay]
+        BruteForce[Brute Force]
+    end
+    
+    subgraph "Security Perimeter"
+        subgraph "Load Balancer Security"
+            NginxSec[Nginx Security<br/>- Rate limiting<br/>- SSL termination<br/>- Request filtering]
+        end
+        
+        subgraph "Application Security Layer"
+            SecurityMW[Security Middleware]
+            
+            subgraph "Security Headers"
+                HSTS[HSTS Header<br/>max-age=31536000]
+                ContentType[X-Content-Type-Options<br/>nosniff]
+                FrameOptions[X-Frame-Options<br/>DENY]
+                XSSProtection[X-XSS-Protection<br/>1; mode=block]
+                CSP[Content-Security-Policy<br/>Restrictive policy]
+            end
+            
+            subgraph "CORS Protection"
+                Origins[Allowed Origins<br/>Whitelist validation]
+                Methods[Allowed Methods<br/>GET, POST, PUT, DELETE]
+                Headers[Allowed Headers<br/>Authorization, Content-Type]
+                Credentials[Credentials Policy<br/>Same-origin only]
+            end
+            
+            subgraph "CSRF Protection"
+                CSRFToken[CSRF Token<br/>Double-submit cookie]
+                CSRFHeader[X-CSRF-Token<br/>Header validation]
+                SameSite[SameSite Cookie<br/>Strict policy]
+            end
+        end
+        
+        subgraph "Rate Limiting Defense"
+            GlobalLimit[Global Rate Limit<br/>100 RPS per IP]
+            EndpointLimit[Endpoint-specific<br/>Limits per route]
+            BurstControl[Burst Control<br/>200 request burst]
+            Backoff[Exponential Backoff<br/>Failed attempts]
+        end
+        
+        subgraph "Authentication Security"
+            MultiAuth[Multi-Provider Auth<br/>Header, JWT, OAuth2, SAML]
+            
+            subgraph "Token Security"
+                JWTSigning[JWT Signing<br/>RS256 Algorithm]
+                TokenExpiry[Token Expiration<br/>Configurable TTL]
+                TokenRevocation[Token Revocation<br/>Blacklist in Redis]
+                TokenRotation[Token Rotation<br/>Refresh mechanism]
+            end
+            
+            subgraph "Static Token Security"
+                HMACValidation[HMAC Validation<br/>SHA-256 signature]
+                TimestampCheck[Timestamp Validation<br/>5-minute window]
+                ReplayProtection[Replay Protection<br/>Nonce tracking]
+                KeyRotation[Key Rotation<br/>Configurable schedule]
+            end
+            
+            subgraph "Session Security"
+                SessionEncryption[Session Encryption<br/>AES-256-GCM]
+                SessionTimeout[Session Timeout<br/>Idle timeout]
+                SessionFixation[Session Fixation<br/>Protection]
+            end
+        end
+        
+        subgraph "Authorization Security"
+            RBAC[Role-Based Access<br/>Hierarchical permissions]
+            PermissionCheck[Permission Validation<br/>Request-level checks]
+            ScopeValidation[Scope Validation<br/>Token scope matching]
+            PrincipleOfLeastPrivilege[Least Privilege<br/>Minimal permissions]
+        end
+        
+        subgraph "Data Protection"
+            Encryption[Data Encryption]
+            
+            subgraph "Encryption at Rest"
+                DBEncryption[Database Encryption<br/>AES-256 column encryption]
+                KeyStorage[Key Management<br/>Environment variables]
+                SaltedHashing[Salted Hashing<br/>BCrypt cost 14]
+            end
+            
+            subgraph "Encryption in Transit"
+                TLS13[TLS 1.3<br/>All communications]
+                CertPinning[Certificate Pinning<br/>OAuth2/SAML providers]
+                MTLS[Mutual TLS<br/>Service-to-service]
+            end
+        end
+        
+        subgraph "Input Validation"
+            JSONValidation[JSON Schema<br/>Request validation]
+            SQLProtection[SQL Injection<br/>Parameterized queries]
+            XSSProtectionValidation[XSS Protection<br/>Input sanitization]
+            PathTraversal[Path Traversal<br/>Prevention]
+        end
+        
+        subgraph "Monitoring & Detection"
+            SecurityAudit[Security Audit Log<br/>All operations logged]
+            FailureDetection[Failure Detection<br/>Failed auth attempts]
+            AnomalyDetection[Anomaly Detection<br/>Unusual patterns]
+            AlertSystem[Alert System<br/>Security incidents]
+        end
+    end
+    
+    subgraph "Database Security"
+        DBSecurity[PostgreSQL Security<br/>- Connection encryption<br/>- User isolation<br/>- Query logging<br/>- Backup encryption]
+    end
+    
+    subgraph "Infrastructure Security"
+        ContainerSecurity[Container Security<br/>- Non-root user<br/>- Minimal images<br/>- Security scanning]
+        NetworkSecurity[Network Security<br/>- Internal networks<br/>- Firewall rules<br/>- VPN access]
+    end
+    
+    %% Threat Mitigation Connections
+    DDoS -.->|Mitigated by| NginxSec
+    XSS -.->|Prevented by| SecurityMW
+    CSRF -.->|Blocked by| CSRFToken
+    Injection -.->|Stopped by| SQLProtection
+    AuthBypass -.->|Blocked by| MultiAuth
+    TokenReplay -.->|Prevented by| TimestampCheck
+    BruteForce -.->|Limited by| GlobalLimit
+    
+    %% Security Layer Flow
+    NginxSec --> SecurityMW
+    SecurityMW --> GlobalLimit
+    GlobalLimit --> MultiAuth
+    MultiAuth --> RBAC
+    RBAC --> Encryption
+    Encryption --> JSONValidation
+    JSONValidation --> SecurityAudit
+    
+    SecurityAudit --> DBSecurity
+    SecurityAudit --> ContainerSecurity
+    
+    %% Styling
+    classDef threat fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px
+    classDef security fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
+    classDef auth fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px
+    classDef data fill:#fff9c4,stroke:#f57f17,stroke-width:2px
+    classDef infra fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
+    
+    class DDoS,XSS,CSRF,Injection,AuthBypass,TokenReplay,BruteForce threat
+    class SecurityMW,HSTS,ContentType,FrameOptions,XSSProtection,CSP,Origins,Methods,Headers,Credentials,CSRFToken,CSRFHeader,SameSite security
+    class MultiAuth,JWTSigning,TokenExpiry,TokenRevocation,TokenRotation,HMACValidation,TimestampCheck,ReplayProtection,KeyRotation,SessionEncryption,SessionTimeout,SessionFixation,RBAC,PermissionCheck,ScopeValidation,PrincipleOfLeastPrivilege auth
+    class Encryption,DBEncryption,KeyStorage,SaltedHashing,TLS13,CertPinning,MTLS,JSONValidation,SQLProtection,XSSProtectionValidation,PathTraversal,SecurityAudit,FailureDetection,AnomalyDetection,AlertSystem,DBSecurity data
+    class ContainerSecurity,NetworkSecurity,NginxSec,GlobalLimit,EndpointLimit,BurstControl,Backoff infra
+```
+
+### Security Layer Breakdown
+
+1. **Perimeter Security**: Load balancer, rate limiting, DDoS protection
+2. **Application Security**: Security headers, CORS, CSRF protection
+3. **Authentication Security**: Multi-provider authentication with strong cryptography
+4. **Authorization Security**: RBAC with hierarchical permissions
+5. **Data Protection**: Encryption at rest and in transit
+6. **Input Validation**: Comprehensive input sanitization and validation
+7. **Monitoring**: Security event detection and alerting
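The security headers in the diagram above could be applied by a small middleware. The sketch below uses only `net/http` to stay dependency-free (the service itself is described as using Gin); `securityHeaders` and `headerFor` are hypothetical names, and the `Content-Security-Policy` value is an assumed placeholder because the document only says "restrictive policy".

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders sets the response headers listed in the security model
// before handing the request to the wrapped handler.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Strict-Transport-Security", "max-age=31536000")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		h.Set("X-XSS-Protection", "1; mode=block")
		h.Set("Content-Security-Policy", "default-src 'self'") // assumed policy
		next.ServeHTTP(w, r)
	})
}

// headerFor sends one request through the middleware and returns the value
// of the named response header.
func headerFor(name string) string {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	securityHeaders(ok).ServeHTTP(rec, req)
	return rec.Result().Header.Get(name)
}

func main() {
	for _, name := range []string{
		"Strict-Transport-Security",
		"X-Content-Type-Options",
		"X-Frame-Options",
	} {
		fmt.Printf("%s: %s\n", name, headerFor(name))
	}
}
```

Setting the headers before calling the next handler matters: once the wrapped handler writes the status line, later header changes are ignored.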
+
+---
+
+## Authentication Security
+
+### Multi-Provider Authentication Framework
+
+The KMS supports four authentication providers, each with specific security controls:
+
+#### **Header-based Authentication**
+```go
+// File: internal/auth/header_validator.go:42
+func (hv *HeaderValidator) ValidateAuthenticationHeaders(r *http.Request) (*ValidatedUserContext, error)
+```
+
+**Security Features:**
+- **HMAC-SHA256 Signatures**: Request integrity validation
+- **Timestamp Validation**: 5-minute window prevents replay attacks
+- **Constant-time Comparison**: Prevents timing attacks
+- **Email Format Validation**: Prevents injection attacks
+
+**Implementation Details:**
+```go
+// validateSignature checks the request's HMAC-SHA256 signature.
+// signingKey and signingString are defined in code elided from this excerpt.
+func (hv *HeaderValidator) validateSignature(userEmail, timestamp, signature string) bool {
+    mac := hmac.New(sha256.New, []byte(signingKey))
+    mac.Write([]byte(signingString))
+    expectedSignature := hex.EncodeToString(mac.Sum(nil))
+    // hmac.Equal compares in constant time, so a mismatch leaks no timing information.
+    return hmac.Equal([]byte(signature), []byte(expectedSignature))
+}
+```
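The 5-minute timestamp window listed above can be sketched as follows. This is an illustration, not the code in `header_validator.go`: `timestampFresh`, `replayWindow`, and `demoNow` are hypothetical names, the RFC 3339 format is assumed from the `X-Auth-Timestamp` example elsewhere in these docs, and accepting timestamps slightly in the future (to tolerate clock skew) is also an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// replayWindow mirrors the 5-minute window described above.
const replayWindow = 5 * time.Minute

// demoNow is a fixed "current time" so the example is deterministic.
var demoNow = time.Date(2024, 1, 15, 10, 32, 0, 0, time.UTC)

// timestampFresh reports whether ts (RFC 3339) lies within window of now in
// either direction, so modest clock skew between client and server is tolerated.
func timestampFresh(ts string, now time.Time, window time.Duration) bool {
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return false // malformed timestamps are rejected outright
	}
	diff := now.Sub(t)
	if diff < 0 {
		diff = -diff
	}
	return diff <= window
}

func main() {
	fmt.Println(timestampFresh("2024-01-15T10:30:00Z", demoNow, replayWindow)) // 2 minutes old
	fmt.Println(timestampFresh("2024-01-15T10:20:00Z", demoNow, replayWindow)) // 12 minutes old
}
```

A timestamp window alone still allows a captured request to be replayed inside those five minutes, which is presumably why the security model also lists nonce tracking.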
+
+#### **JWT Authentication**
+```go
+// File: internal/auth/jwt.go:47
+func (j *JWTManager) GenerateToken(userToken *domain.UserToken) (string, error)
+```
+
+**Security Features:**
+- **RSA Signatures**: Tokens are signed with RS256 (RSA with SHA-256)
+- **Token Revocation**: Redis-backed blacklist
+- **Secure JTI Generation**: Cryptographically secure token IDs
+- **Claims Validation**: Complete token verification
+
+**Token Structure:**
+```json
+{
+  "iss": "kms-api-service",
+  "sub": "user@example.com",
+  "aud": ["app-id"],
+  "exp": 1642781234,
+  "iat": 1642694834,
+  "nbf": 1642694834,
+  "jti": "secure-random-id",
+  "user_id": "user@example.com",
+  "app_id": "app-id",
+  "permissions": ["app.read", "token.create"],
+  "token_type": "user",
+  "max_valid_at": 1643299634
+}
+```
+
+#### **OAuth2 Authentication**
+```go
+// File: internal/auth/oauth2.go
+```
+
+**Security Features:**
+- **Authorization Code Flow**: Secure OAuth2 flow implementation
+- **State Parameter**: CSRF protection for OAuth2
+- **Token Exchange**: Secure credential handling
+- **Provider Validation**: Certificate pinning support
+
+#### **SAML Authentication**
+```go
+// File: internal/auth/saml.go
+```
+
+**Security Features:**
+- **XML Signature Validation**: Assertion integrity verification
+- **Certificate Validation**: Provider certificate verification
+- **Attribute Extraction**: Secure claim processing
+- **Replay Protection**: Timestamp and nonce validation
+
+---
+
+## Token Security Models
+
+### Static Token Security (HMAC-based)
+
+```mermaid
+stateDiagram-v2
+    [*] --> TokenRequest: Client requests token
+    
+    state TokenRequest {
+        [*] --> ValidateApp: Validate app_id
+        ValidateApp --> ValidatePerms: Check permissions
+        ValidatePerms --> GenerateToken: Create token
+        GenerateToken --> [*]
+    }
+    
+    TokenRequest --> StaticToken: type=static
+    
+    state StaticToken {
+        [*] --> GenerateHMAC: Generate HMAC key
+        GenerateHMAC --> HashKey: BCrypt hash (cost 14)
+        HashKey --> StoreDB: Store in static_tokens table
+        StoreDB --> GrantPerms: Assign permissions
+        GrantPerms --> Active: Token ready
+        
+        Active --> Verify: Incoming request
+        Verify --> ValidateHMAC: Check HMAC signature
+        ValidateHMAC --> CheckTimestamp: Replay protection
+        CheckTimestamp --> CheckPerms: Validate permissions
+        CheckPerms --> Authorized: Permission granted
+        CheckPerms --> Denied: Permission denied
+        
+        Active --> Revoke: Admin action
+        Revoke --> Revoked: Update granted_permissions.revoked=true
+    }
+    
+    Authorized --> TokenResponse: Success response
+    Denied --> ErrorResponse: Error response
+    Revoked --> [*]: Token lifecycle end
+    
+    note right of StaticToken
+        Static tokens use HMAC signatures
+        - Timestamp-based replay protection
+        - BCrypt hashed storage
+        - Permission-based access control
+        - No expiration (until revoked)
+    end note
+```
+
+#### **Token Format**
+```
+Format: {PREFIX}_{BASE64_DATA}
+Example: ABC_xyz123base64encodeddata...
+Validation: HMAC-SHA256(request_data, hmac_key)
+Storage: BCrypt hash (cost 14)
+Lifetime: No expiration (until revoked)
+```
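A minimal sketch of generating and format-checking a token in the `{PREFIX}_{BASE64_DATA}` shape using `crypto/rand`. The function names are hypothetical, and the choice of unpadded URL-safe base64 and 32 random bytes are assumptions; the actual generator in `internal/crypto/token.go` may use a different encoding or length.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"strings"
)

// generateToken returns "<prefix>_<base64url of n random bytes>".
func generateToken(prefix string, n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	return prefix + "_" + base64.RawURLEncoding.EncodeToString(buf), nil
}

// validFormat checks only the shape of a token: the expected prefix, an
// underscore, and a non-empty base64url payload. It does not authenticate it.
func validFormat(token, prefix string) bool {
	head := prefix + "_"
	if !strings.HasPrefix(token, head) {
		return false
	}
	data := token[len(head):]
	if data == "" {
		return false
	}
	_, err := base64.RawURLEncoding.DecodeString(data)
	return err == nil
}

func main() {
	tok, err := generateToken("ABC", 32)
	if err != nil {
		fmt.Println("generation failed:", err)
		return
	}
	fmt.Println(len(tok), validFormat(tok, "ABC"))
}
```

With a 3-character prefix and 32 random bytes the token is 47 bytes long, which stays within the 72 bytes that bcrypt can hash.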
+
+#### **Security Implementation**
+```go
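+// File: internal/crypto/token.go:95
+// HashToken returns a bcrypt hash of the token for storage.
+// Note: bcrypt only operates on the first 72 bytes of its input, so tokens must stay within that length.
+func (tg *TokenGenerator) HashToken(token string) (string, error) {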
+    hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost)
+    if err != nil {
+        return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err)
+    }
+    return string(hash), nil
+}
+```
+
+### User Token Security (JWT-based)
+
+```mermaid
+stateDiagram-v2
+    [*] --> UserTokenRequest: User requests token
+    
+    state UserTokenRequest {
+        [*] --> AuthUser: Authenticate user
+        AuthUser --> GenerateJWT: Create JWT with claims
+        GenerateJWT --> SetExpiry: Set expires_at, max_valid_at
+        SetExpiry --> StoreSession: Store user session
+        StoreSession --> Active: JWT issued
+    }
+    
+    UserTokenRequest --> UserToken: type=user
+    
+    state UserToken {
+        Active --> Verify: Incoming request
+        Verify --> ValidateJWT: Check JWT signature
+        ValidateJWT --> CheckExpiry: Verify expiration
+        CheckExpiry --> CheckRevoked: Check revocation list
+        CheckRevoked --> Authorized: Valid token
+        CheckRevoked --> Denied: Token invalid
+        CheckExpiry --> Expired: Token expired
+        
+        Active --> Renew: Renewal request
+        Renew --> CheckRenewal: Within renewal window?
+        CheckRenewal --> GenerateNew: Issue new JWT
+        CheckRenewal --> Denied: Renewal denied
+        GenerateNew --> Active: New token active
+        
+        Active --> Logout: User logout
+        Logout --> Revoked: Add to revocation list
+        
+        Expired --> Renew: Attempt renewal
+    }
+    
+    Authorized --> TokenResponse: Success response
+    Denied --> ErrorResponse: Error response
+    Revoked --> [*]: Token lifecycle end
+    
+    note right of UserToken
+        User tokens are JWT-based
+        - Configurable expiration
+        - Renewable within limits
+        - Session tracking
+        - Hierarchical permissions
+    end note
+```
+
+#### **Token Format**
+```
+Format: {CUSTOM_PREFIX}UT-{JWT_TOKEN}
+Example: ABC123UT-eyJhbGciOiJSUzI1NiIs...
+Validation: RSA signature + claims validation
+Storage: Revocation list in Redis
+Lifetime: Configurable with renewal window
+```
+
+#### **Security Features**
+- **RS256 Signatures**: RSA (PKCS #1 v1.5) signatures with SHA-256, matching the `RS256` header in the example token
+- **Token Revocation**: Redis blacklist with TTL
+- **Renewal Controls**: Limited renewal window
+- **Session Tracking**: Database session management
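+
+The token format and renewal controls above can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the rule "renewable until `max_valid_at`, with the new expiry capped at `max_valid_at`" is inferred from the state diagram rather than taken from the implementation.
+
+```go
+package main
+
+import (
+    "errors"
+    "strings"
+    "time"
+)
+
+// userTokenMarker separates the custom prefix from the JWT in {CUSTOM_PREFIX}UT-{JWT_TOKEN}.
+const userTokenMarker = "UT-"
+
+// splitUserToken returns the custom prefix and the raw JWT.
+// It assumes the custom prefix itself never contains "UT-".
+func splitUserToken(token string) (prefix, jwt string, err error) {
+    i := strings.Index(token, userTokenMarker)
+    if i < 0 || i+len(userTokenMarker) == len(token) {
+        return "", "", errors.New("not a user token")
+    }
+    return token[:i], token[i+len(userTokenMarker):], nil
+}
+
+// canRenew reports whether renewal is still allowed at time now.
+func canRenew(now, maxValidAt time.Time) bool {
+    return now.Before(maxValidAt)
+}
+
+// nextExpiry returns the renewed expiry, never later than maxValidAt.
+func nextExpiry(now time.Time, ttl time.Duration, maxValidAt time.Time) time.Time {
+    exp := now.Add(ttl)
+    if exp.After(maxValidAt) {
+        return maxValidAt
+    }
+    return exp
+}
+```
+
+Signature and claims validation would run on the `jwt` part after the split; that step is omitted here because it depends on the JWT library in use.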
+
+---
+
+## Authorization Framework
+
+### Role-Based Access Control (RBAC)
+
+The KMS implements a hierarchical permission system with role-based access control:
+
+#### **Permission Hierarchy**
+```
+admin
+├── app.admin
+│   ├── app.read
+│   ├── app.write
+│   │   ├── app.create
+│   │   ├── app.update
+│   │   └── app.delete
+├── token.admin
+│   ├── token.read
+│   │   └── token.verify
+│   ├── token.write
+│   │   ├── token.create
+│   │   └── token.revoke
+├── permission.admin
+│   ├── permission.read
+│   ├── permission.write
+│   │   ├── permission.grant
+│   │   └── permission.revoke
+└── user.admin
+    ├── user.read
+    └── user.write
+```
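+
+One way to evaluate this tree is to store each permission's parent and walk upward from the requested permission. The sketch below is an assumption about how such a lookup could work, not the code in `internal/auth/permissions.go`; `parentOf`, `grants` and `hasPermission` are hypothetical names.
+
+```go
+package main
+
+// parentOf maps each permission to its parent in the hierarchy above.
+var parentOf = map[string]string{
+    "app.admin": "admin", "token.admin": "admin",
+    "permission.admin": "admin", "user.admin": "admin",
+    "app.read": "app.admin", "app.write": "app.admin",
+    "app.create": "app.write", "app.update": "app.write", "app.delete": "app.write",
+    "token.read": "token.admin", "token.write": "token.admin",
+    "token.verify": "token.read",
+    "token.create": "token.write", "token.revoke": "token.write",
+    "permission.read": "permission.admin", "permission.write": "permission.admin",
+    "permission.grant": "permission.write", "permission.revoke": "permission.write",
+    "user.read": "user.admin", "user.write": "user.admin",
+}
+
+// grants reports whether holding `held` implies `wanted`, i.e. whether
+// `held` is `wanted` itself or one of its ancestors.
+func grants(held, wanted string) bool {
+    // A missing map key yields "", which ends the walk at the root.
+    for p := wanted; p != ""; p = parentOf[p] {
+        if p == held {
+            return true
+        }
+    }
+    return false
+}
+
+// hasPermission checks a set of held permissions (for example a role's list).
+func hasPermission(heldPerms []string, wanted string) bool {
+    for _, held := range heldPerms {
+        if grants(held, wanted) {
+            return true
+        }
+    }
+    return false
+}
+```
+
+With this layout, a role holding `app.admin`, `token.admin` and `user.read` is granted `token.revoke` through `token.write` and `token.admin`, but not `user.write`.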
+
+#### **Role Definitions**
+```go
+// File: internal/auth/permissions.go:151
+func (h *PermissionHierarchy) initializeDefaultRoles() {
+    defaultRoles := []*Role{
+        {
+            Name:        "super_admin",
+            Description: "Super administrator with full access",
+            Permissions: []string{"admin"},
+            Metadata:    map[string]string{"level": "system"},
+        },
+        {
+            Name:        "app_admin", 
+            Description: "Application administrator",
+            Permissions: []string{"app.admin", "token.admin", "user.read"},
+            Metadata:    map[string]string{"level": "application"},
+        },
+        // Additional roles...
+    }
+}
+```
+
+#### **Permission Evaluation**
+```go
+// File: internal/auth/permissions.go:201
+func (pm *PermissionManager) HasPermission(ctx context.Context, userID, appID, permission string) (*PermissionEvaluation, error) {
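+    // Cache lookup first; any error (including a miss) falls through to evaluation
+    cacheKey := cache.CacheKey(cache.KeyPrefixPermission, fmt.Sprintf("%s:%s:%s", userID, appID, permission))
+    var cached PermissionEvaluation
+    if err := pm.cacheManager.GetJSON(ctx, cacheKey, &cached); err == nil {
+        return &cached, nil
+    }
+
+    // Evaluate permission with hierarchy
+    evaluation := pm.evaluatePermission(ctx, userID, appID, permission)
+
+    // Cache result for 5 minutes; a failed cache write must not fail the check.
+    // Note that a revoked permission can stay cached for up to this TTL.
+    _ = pm.cacheManager.SetJSON(ctx, cacheKey, evaluation, 5*time.Minute)
+
+    return evaluation, nil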
+}
+```
+
+### Authorization Controls
+
+#### **Resource Ownership**
+```go
+// File: internal/authorization/rbac.go:100
+func (a *AuthorizationService) AuthorizeApplicationOwnership(userID string, app *domain.Application) error {
+    // System admins can access any application
+    if a.isSystemAdmin(userID) {
+        return nil
+    }
+    
+    // Check if user is the owner
+    if a.isOwner(userID, &app.Owner) {
+        return nil
+    }
+    
+    return errors.NewForbiddenError("You do not have permission to access this application")
+}
+```
+
+#### **Token Scoping**
+```go
+// File: internal/services/token_service.go:400
+func (s *tokenService) verifyStaticToken(ctx context.Context, req *domain.VerifyRequest, app *domain.Application) (*domain.VerifyResponse, error) {
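+    // matchedToken (lookup elided in this excerpt) is the stored static token whose hash matched the request
+    // Get granted permissions for this token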
+    permissions, err := s.grantRepo.GetGrantedPermissionScopes(ctx, domain.TokenTypeStatic, matchedToken.ID)
+    
+    // Check specific permissions if requested
+    if len(req.Permissions) > 0 {
+        permissionResults, err := s.grantRepo.HasAnyPermission(ctx, domain.TokenTypeStatic, matchedToken.ID, req.Permissions)
+        // Validate all requested permissions are granted
+    }
+}
+```
+
+---
+
+## Data Protection
+
+### Encryption at Rest
+
+#### **Database Encryption**
+```sql
+-- Sensitive data encryption in PostgreSQL
+CREATE TABLE applications (
+    app_id VARCHAR(100) PRIMARY KEY,
+    hmac_key TEXT, -- Encrypted with application-level encryption
+    created_at TIMESTAMP DEFAULT NOW()
+);
+
+CREATE TABLE static_tokens (
+    id UUID PRIMARY KEY,
+    key_hash TEXT NOT NULL, -- BCrypt hashed with cost 14
+    created_at TIMESTAMP DEFAULT NOW()
+);
+```
+
+#### **BCrypt Hashing**
+```go
+// File: internal/crypto/token.go:22
+const BcryptCost = 14 // Project minimum work factor; each increment doubles hashing time
+
+func (tg *TokenGenerator) HashToken(token string) (string, error) {
+    hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost)
+    if err != nil {
+        return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err)
+    }
+    return string(hash), nil
+}
+```
+
+### Encryption in Transit
+
+#### **TLS Configuration**
+```go
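+// TLS 1.3 configuration for all communications
+tlsConfig := &tls.Config{
+    MinVersion: tls.VersionTLS13,
+    // CipherSuites is intentionally omitted: crypto/tls applies it only to
+    // TLS 1.0-1.2 and does not allow configuring TLS 1.3 suites. With TLS 1.3,
+    // Go negotiates among TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384
+    // and TLS_CHACHA20_POLY1305_SHA256.
+}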
+```
+
+#### **Certificate Pinning**
+```go
+// OAuth2/SAML provider certificate pinning
+func (o *OAuth2Provider) validateCertificate(cert *x509.Certificate) error {
+    expectedFingerprints := o.config.GetStringSlice("OAUTH2_CERT_FINGERPRINTS")
+    
+    fingerprint := sha256.Sum256(cert.Raw)
+    fingerprintHex := hex.EncodeToString(fingerprint[:])
+    
+    for _, expected := range expectedFingerprints {
+        if fingerprintHex == expected {
+            return nil
+        }
+    }
+    
+    return fmt.Errorf("certificate fingerprint mismatch")
+}
+```
+
+---
+
+## Security Middleware Chain
+
+### Middleware Processing Order
+
+1. **Request Logger**: Structured logging with security context
+2. **Security Headers**: XSS, clickjacking, MIME-type protection
+3. **CORS Handler**: Cross-origin request validation
+4. **CSRF Protection**: Double-submit token validation
+5. **Rate Limiter**: DDoS and abuse protection
+6. **Authentication**: Multi-provider authentication
+7. **Authorization**: Permission validation
+8. **Input Validation**: Request sanitization and validation
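+
+The ordering above can be expressed as a small `net/http` chain helper. This is a sketch only: `Chain`, `trace` and `buildChain` are hypothetical names, and each real stage is replaced by a stand-in that just records that it ran.
+
+```go
+package main
+
+import "net/http"
+
+// Middleware wraps a handler with one processing stage.
+type Middleware func(http.Handler) http.Handler
+
+// Chain applies mws so that mws[0] sees the request first.
+func Chain(final http.Handler, mws ...Middleware) http.Handler {
+    // Wrap from the inside out: the last middleware sits closest to the handler.
+    for i := len(mws) - 1; i >= 0; i-- {
+        final = mws[i](final)
+    }
+    return final
+}
+
+// trace is a stand-in for a real stage: it records its name, then continues.
+func trace(name string, log *[]string) Middleware {
+    return func(next http.Handler) http.Handler {
+        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+            *log = append(*log, name)
+            next.ServeHTTP(w, r)
+        })
+    }
+}
+
+// buildChain wires the eight stages in the documented order.
+func buildChain(final http.Handler, log *[]string) http.Handler {
+    return Chain(final,
+        trace("request_logger", log),
+        trace("security_headers", log),
+        trace("cors", log),
+        trace("csrf", log),
+        trace("rate_limiter", log),
+        trace("authentication", log),
+        trace("authorization", log),
+        trace("input_validation", log),
+    )
+}
+```
+
+Placing the rate limiter ahead of authentication means abusive clients are rejected before any credential check is performed.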
+
+### Rate Limiting Implementation
+
+```go
+// File: internal/middleware/security.go:48
+func (s *SecurityMiddleware) RateLimitMiddleware(next http.Handler) http.Handler {
+    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+        clientIP := s.getClientIP(r)
+        limiter := s.getRateLimiter(clientIP)
+        
+        if !limiter.Allow() {
+            s.logger.Warn("Rate limit exceeded", 
+                zap.String("client_ip", clientIP),
+                zap.String("path", r.URL.Path))
+            
+            s.trackRateLimitViolation(clientIP)
+            
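+            // http.Error would force Content-Type: text/plain, so write the JSON body directly
+            w.Header().Set("Content-Type", "application/json")
+            w.WriteHeader(http.StatusTooManyRequests)
+            w.Write([]byte(`{"error":"rate_limit_exceeded"}`))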
+            return
+        }
+        
+        next.ServeHTTP(w, r)
+    })
+}
+```
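+
+The excerpt does not show what `getRateLimiter` returns. As an illustration of the idea, a per-IP token bucket can be written with the standard library alone; `IPLimiter` and its fields are hypothetical and are not the middleware's actual limiter.
+
+```go
+package main
+
+import (
+    "sync"
+    "time"
+)
+
+// bucket holds the remaining tokens for one client and the last refill time.
+type bucket struct {
+    tokens float64
+    last   time.Time
+}
+
+// IPLimiter keeps one token bucket per client IP.
+type IPLimiter struct {
+    mu      sync.Mutex
+    rate    float64 // tokens added per second
+    burst   float64 // bucket capacity
+    buckets map[string]*bucket
+}
+
+// NewIPLimiter allows `burst` immediate requests and `rate` requests/second after that.
+func NewIPLimiter(rate float64, burst int) *IPLimiter {
+    return &IPLimiter{rate: rate, burst: float64(burst), buckets: make(map[string]*bucket)}
+}
+
+// Allow consumes one token for ip at time now and reports whether the request may proceed.
+func (l *IPLimiter) Allow(ip string, now time.Time) bool {
+    l.mu.Lock()
+    defer l.mu.Unlock()
+
+    b, ok := l.buckets[ip]
+    if !ok {
+        b = &bucket{tokens: l.burst, last: now}
+        l.buckets[ip] = b
+    }
+    // Refill in proportion to elapsed time, capped at the burst size.
+    if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
+        b.tokens += elapsed * l.rate
+        if b.tokens > l.burst {
+            b.tokens = l.burst
+        }
+        b.last = now
+    }
+    if b.tokens < 1 {
+        return false
+    }
+    b.tokens--
+    return true
+}
+```
+
+This sketch never evicts idle buckets, so a long-running service would also need periodic cleanup to bound memory.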
+
+### Brute Force Protection
+
+```go
+// File: internal/middleware/security.go:365
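+func (s *SecurityMiddleware) checkAndBlockIP(clientIP string) {
+    // ctx, key, blockKey and blockDuration are defined outside this excerpt;
+    // the values below are assumptions added so the snippet reads on its own.
+    ctx := context.Background()
+    key := "auth_failures:" + clientIP
+    blockKey := "blocked_ip:" + clientIP
+    blockDuration := time.Hour
+
+    var count int
+    if err := s.cacheManager.GetJSON(ctx, key, &count); err != nil {
+        return // no failure count recorded, or the cache is unavailable
+    }
+
+    maxFailures := s.config.GetInt("MAX_AUTH_FAILURES")
+    if maxFailures <= 0 {
+        maxFailures = 5 // Default
+    }
+
+    if count >= maxFailures {
+        // Block the IP for 1 hour
+        blockInfo := map[string]interface{}{
+            "blocked_at":    time.Now().Unix(),
+            "failure_count": count,
+            "reason":        "excessive_auth_failures",
+        }
+
+        if err := s.cacheManager.SetJSON(ctx, blockKey, blockInfo, blockDuration); err != nil {
+            s.logger.Warn("Failed to block IP", zap.String("client_ip", clientIP), zap.Error(err))
+        }
+    }
+}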
+```
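+
+The counting and blocking logic can be exercised in isolation with an in-memory tracker. This is a sketch under assumptions: `FailureTracker` is a hypothetical type, it keeps state in a map instead of the shared cache, and resetting the counter on success is a design choice not stated in the excerpt.
+
+```go
+package main
+
+import (
+    "sync"
+    "time"
+)
+
+// failureState tracks failed attempts and any active block for one IP.
+type failureState struct {
+    count        int
+    blockedUntil time.Time
+}
+
+// FailureTracker blocks an IP once it reaches maxFailures failed attempts.
+type FailureTracker struct {
+    mu            sync.Mutex
+    maxFailures   int
+    blockDuration time.Duration
+    state         map[string]*failureState
+}
+
+func NewFailureTracker(maxFailures int, blockDuration time.Duration) *FailureTracker {
+    if maxFailures <= 0 {
+        maxFailures = 5 // same default as the excerpt above
+    }
+    return &FailureTracker{
+        maxFailures:   maxFailures,
+        blockDuration: blockDuration,
+        state:         make(map[string]*failureState),
+    }
+}
+
+// RecordFailure counts a failed attempt and starts a block at the threshold.
+func (t *FailureTracker) RecordFailure(ip string, now time.Time) {
+    t.mu.Lock()
+    defer t.mu.Unlock()
+    s, ok := t.state[ip]
+    if !ok {
+        s = &failureState{}
+        t.state[ip] = s
+    }
+    s.count++
+    if s.count >= t.maxFailures {
+        s.blockedUntil = now.Add(t.blockDuration)
+        s.count = 0 // start counting afresh once the block expires
+    }
+}
+
+// RecordSuccess clears the failure count but leaves an active block in place.
+func (t *FailureTracker) RecordSuccess(ip string) {
+    t.mu.Lock()
+    defer t.mu.Unlock()
+    if s, ok := t.state[ip]; ok {
+        s.count = 0
+    }
+}
+
+// IsBlocked reports whether ip is blocked at time now.
+func (t *FailureTracker) IsBlocked(ip string, now time.Time) bool {
+    t.mu.Lock()
+    defer t.mu.Unlock()
+    s, ok := t.state[ip]
+    return ok && now.Before(s.blockedUntil)
+}
+```
+
+In a multi-instance deployment the counters would need to live in the shared cache, as in the excerpt, so that every instance sees the same failure count.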
+
+---
+
+## Threat Model
+
+### External Threats
+
+#### **Network-based Attacks**
+- **DDoS**: Rate limiting and load balancer protection
+- **Man-in-the-Middle**: TLS 1.3 encryption
+- **Packet Sniffing**: TLS encryption for all traffic in transit
+
+#### **Application-level Attacks**
+- **SQL Injection**: Parameterized queries only
+- **XSS**: Input validation and CSP headers
+- **CSRF**: Double-submit token protection
+- **Session Hijacking**: Secure session management
+
+#### **Authentication Attacks**
+- **Brute Force**: Rate limiting and account lockout
+- **Credential Stuffing**: Multi-factor authentication support
+- **Token Replay**: Timestamp validation and nonces
+- **JWT Attacks**: Strong cryptographic signing
+
+### Insider Threats
+
+#### **Privileged User Abuse**
+- **Audit Logging**: Complete audit trail
+- **Role Separation**: Principle of least privilege
+- **Session Monitoring**: Abnormal activity detection
+
+#### **Data Exfiltration**
+- **Access Controls**: Resource-level authorization
+- **Data Classification**: Sensitive data identification
+- **Export Monitoring**: Large data access alerts
+
+---
+
+## Security Controls
+
+### Preventive Controls
+
+#### **Authentication Controls**
+- Multi-factor authentication support
+- Strong password policies (where applicable)
+- Account lockout mechanisms
+- Session timeout enforcement
+
+#### **Authorization Controls**
+- Role-based access control (RBAC)
+- Resource-level permissions
+- API endpoint protection
+- Ownership validation
+
+#### **Data Protection Controls**
+- Encryption at rest (AES-256)
+- Encryption in transit (TLS 1.3)
+- Key management procedures
+- Secure data disposal
+
+### Detective Controls
+
+#### **Monitoring and Logging**
+```go
+// Comprehensive audit logging
+type AuditLog struct {
+    ID           uuid.UUID         `json:"id"`
+    Timestamp    time.Time         `json:"timestamp"`
+    UserID       string            `json:"user_id"`
+    Action       string            `json:"action"`
+    ResourceType string            `json:"resource_type"`
+    ResourceID   string            `json:"resource_id"`
+    IPAddress    string            `json:"ip_address"`
+    UserAgent    string            `json:"user_agent"`
+    Metadata     map[string]interface{} `json:"metadata"`
+}
+```
+
+#### **Security Event Detection**
+- Failed authentication attempts
+- Privilege escalation attempts
+- Unusual access patterns
+- Token misuse detection
+
+### Responsive Controls
+
+#### **Incident Response**
+- Automated threat response
+- Token revocation procedures
+- User account suspension
+- Alert escalation workflows
+
+#### **Recovery Controls**
+- Backup and restore procedures
+- Disaster recovery planning
+- Business continuity measures
+- Data integrity verification
+
+---
+
+## Compliance Framework
+
+### OWASP Top 10 Protection
+
+The categories below follow the 2017 edition of the OWASP Top 10.
+
+1. **Injection**: Parameterized queries, input validation
+2. **Broken Authentication**: Multi-provider auth, strong crypto
+3. **Sensitive Data Exposure**: Encryption, secure storage
+4. **XML External Entities**: JSON-only API
+5. **Broken Access Control**: RBAC, authorization checks
+6. **Security Misconfiguration**: Secure defaults
+7. **Cross-Site Scripting**: Input validation, CSP
+8. **Insecure Deserialization**: Safe JSON handling
+9. **Known Vulnerabilities**: Regular dependency updates
+10. **Logging & Monitoring**: Comprehensive audit trails
+
+### Regulatory Compliance
+
+#### **SOC 2 Type II Controls**
+- **Security**: Access controls, encryption, monitoring
+- **Availability**: High availability, disaster recovery
+- **Processing Integrity**: Data validation, error handling
+- **Confidentiality**: Data classification, access restrictions
+- **Privacy**: Data minimization, consent management
+
+#### **GDPR Compliance** (where applicable)
+- **Data Protection by Design**: Privacy-first architecture
+- **Consent Management**: Granular consent controls
+- **Right to Erasure**: Data deletion procedures
+- **Data Portability**: Export capabilities
+- **Breach Notification**: Automated incident response
+
+#### **NIST Cybersecurity Framework**
+- **Identify**: Asset inventory, risk assessment
+- **Protect**: Access controls, data security
+- **Detect**: Monitoring, anomaly detection
+- **Respond**: Incident response procedures
+- **Recover**: Business continuity planning
+
+### Security Metrics
+
+#### **Key Security Indicators**
+- Authentication success/failure rates
+- Authorization denial rates
+- Token usage patterns
+- Security event frequency
+- Incident response times
+
+#### **Security Monitoring**
+```yaml
+alerts:
+  - name: "High Authentication Failure Rate"
+    condition: "auth_failures > 10 per minute per IP"
+    severity: "warning"
+    
+  - name: "Privilege Escalation Attempt"
+    condition: "permission_denied + elevated_permission"
+    severity: "critical"
+    
+  - name: "Token Abuse Detection"
+    condition: "token_usage_anomaly"
+    severity: "warning"
+```
+
+This security architecture document provides comprehensive coverage of the KMS security model, suitable for security audits, compliance reviews, and security team training.
--- a/kms/docs/SYSTEM_IMPLEMENTATION_GUIDE.md
+++ b/kms/docs/SYSTEM_IMPLEMENTATION_GUIDE.md
@ -0,0 +1,879 @@
+# KMS System Implementation Guide
+
+This document provides detailed implementation guidance for the KMS system, covering areas not extensively documented in other files. It serves as a comprehensive reference for developers working on system components.
+
+## Table of Contents
+1. [Documentation Consistency Analysis](#documentation-consistency-analysis)
+2. [Audit System Implementation](#audit-system-implementation)
+3. [Multi-Tenancy Support](#multi-tenancy-support)
+4. [Cache Implementation Details](#cache-implementation-details)
+5. [Error Handling Framework](#error-handling-framework)
+6. [Validation System](#validation-system)
+7. [Metrics and Monitoring](#metrics-and-monitoring)
+8. [Database Migration System](#database-migration-system)
+9. [Frontend Architecture](#frontend-architecture)
+10. [Configuration Management](#configuration-management)
+
+---
+
+## Documentation Consistency Analysis
+
+### Current State Assessment
+
+The existing documentation is comprehensive but has some minor inconsistencies with the actual codebase:
+
+#### ✅ Accurate Documentation Areas:
+- **API endpoints** match the implementation in handlers
+- **Database schema** aligns with migrations (especially the new audit_events table)
+- **Authentication flows** are correctly documented
+- **Docker compose setup** matches actual configuration
+- **Security architecture** accurately reflects implementation
+- **Permission system** documentation is consistent with code
+
+#### ⚠️ Minor Inconsistencies Found:
+1. **Port references**: Some docs mention port 80 but actual nginx runs on 8081
+2. **Container names**: Documentation uses generic names, actual compose uses specific names like `kms-postgres`
+3. **Rate limiting values**: Docs show different values than actual middleware implementation
+4. **Frontend build process**: React version mentioned as 18, but package.json shows 19+
+
+#### ✨ Recently Added Features (Not in Original Docs):
+- **Audit system** with comprehensive event logging
+- **Multi-tenancy support** in database schema
+- **Advanced caching layer** with Redis integration
+- **SAML authentication** implementation
+- **Advanced security middleware** with brute force protection
+
+---
+
+## Audit System Implementation
+
+### Overview
+
+The KMS implements a comprehensive audit logging system that tracks all system events, user actions, and security-related activities.
+
+### Core Components
+
+#### Audit Event Structure
+```go
+// File: internal/audit/audit.go
+type AuditEvent struct {
+    ID           uuid.UUID              `json:"id"`
+    Type         EventType              `json:"type"`
+    Severity     Severity               `json:"severity"`
+    Status       Status                 `json:"status"`
+    Timestamp    time.Time              `json:"timestamp"`
+    
+    // Actor information
+    ActorID      string                 `json:"actor_id"`
+    ActorType    ActorType              `json:"actor_type"`
+    ActorIP      string                 `json:"actor_ip"`
+    UserAgent    string                 `json:"user_agent"`
+    
+    // Multi-tenancy support
+    TenantID     *uuid.UUID             `json:"tenant_id,omitempty"`
+    
+    // Resource information
+    ResourceID   string                 `json:"resource_id"`
+    ResourceType string                 `json:"resource_type"`
+    
+    // Event details
+    Action       string                 `json:"action"`
+    Description  string                 `json:"description"`
+    Details      map[string]interface{} `json:"details"`
+    
+    // Request context
+    RequestID    string                 `json:"request_id"`
+    SessionID    string                 `json:"session_id"`
+    
+    // Metadata
+    Tags         []string               `json:"tags"`
+    Metadata     map[string]interface{} `json:"metadata"`
+}
+```
+
+#### Event Types Taxonomy
+```
+auth.* - Authentication events
+├── auth.login - Successful user login
+├── auth.login_failed - Failed login attempt
+├── auth.logout - User logout
+├── auth.token_created - Token generation
+├── auth.token_revoked - Token revocation
+└── auth.token_validated - Token validation
+
+session.* - Session management
+├── session.created - New session created
+├── session.revoked - Session terminated
+└── session.expired - Session timeout
+
+app.* - Application management
+├── app.created - Application created
+├── app.updated - Application modified
+└── app.deleted - Application removed
+
+permission.* - Permission operations
+├── permission.granted - Permission assigned
+├── permission.revoked - Permission removed
+└── permission.denied - Access denied
+
+tenant.* - Multi-tenant operations
+├── tenant.created - New tenant
+├── tenant.updated - Tenant modified
+├── tenant.suspended - Tenant suspended
+└── tenant.activated - Tenant reactivated
+```
+
+#### Database Schema
+```sql
+-- File: migrations/004_add_audit_events.up.sql
+CREATE TABLE audit_events (
+    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
+    type VARCHAR(50) NOT NULL,
+    severity VARCHAR(20) NOT NULL CHECK (severity IN ('info', 'warning', 'error', 'critical')),
+    status VARCHAR(20) NOT NULL CHECK (status IN ('success', 'failure', 'pending')),
+    timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
+    
+    -- Actor information
+    actor_id VARCHAR(255),
+    actor_type VARCHAR(50) CHECK (actor_type IN ('user', 'system', 'service')),
+    actor_ip INET,
+    user_agent TEXT,
+    
+    -- Multi-tenancy
+    tenant_id UUID,
+    
+    -- Resource tracking
+    resource_id VARCHAR(255),
+    resource_type VARCHAR(100),
+    action VARCHAR(100) NOT NULL,
+    description TEXT NOT NULL,
+    details JSONB DEFAULT '{}',
+    
+    -- Request context
+    request_id VARCHAR(100),
+    session_id VARCHAR(255),
+    
+    -- Metadata
+    tags TEXT[],
+    metadata JSONB DEFAULT '{}'
+);
+```
+
+#### Frontend Integration
+```typescript
+// File: kms-frontend/src/components/Audit.tsx
+interface AuditEvent {
+  id: string;
+  type: string;
+  severity: 'info' | 'warning' | 'error' | 'critical';
+  status: 'success' | 'failure' | 'pending';
+  timestamp: string;
+  actor_id: string;
+  actor_type: string;
+  resource_type: string;
+  action: string;
+  description: string;
+}
+
+const Audit: React.FC = () => {
+  // Real-time audit log viewing with filtering
+  // Timeline view for event sequences
+  // Statistics dashboard for audit metrics
+};
+```
+
+### Implementation Guidelines
+
+#### Logging Best Practices
+1. **Log all security-relevant events**
+2. **Include sufficient context** for forensic analysis
+3. **Use structured logging** with consistent fields
+4. **Implement log retention policies**
+5. **Ensure tamper-evident logging**
+
+#### Performance Considerations
+1. **Asynchronous logging** to avoid blocking operations
+2. **Batch inserts** for high-volume events
+3. **Proper indexing** on commonly queried fields
+4. **Archival strategy** for historical data
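+
+The first two points (asynchronous logging and batch inserts) can be combined in a small buffered writer. The sketch below is an assumption about one possible design, not the audit package's code: `Event` stands in for `AuditEvent`, `BatchWriter` is a hypothetical type, and `flush` is where a multi-row `INSERT` would go.
+
+```go
+package main
+
+import (
+    "sync"
+    "time"
+)
+
+// Event is a placeholder for the AuditEvent struct.
+type Event struct{ Action string }
+
+// BatchWriter buffers events and hands them to flush in batches.
+type BatchWriter struct {
+    events chan Event
+    done   chan struct{}
+    once   sync.Once
+}
+
+// NewBatchWriter starts a worker that calls flush when batchSize events have
+// accumulated or interval has elapsed. batchSize and interval must be positive.
+func NewBatchWriter(batchSize int, interval time.Duration, flush func([]Event)) *BatchWriter {
+    w := &BatchWriter{events: make(chan Event, 4*batchSize), done: make(chan struct{})}
+    go func() {
+        defer close(w.done)
+        ticker := time.NewTicker(interval)
+        defer ticker.Stop()
+        batch := make([]Event, 0, batchSize)
+        emit := func() {
+            if len(batch) > 0 {
+                flush(batch)
+                batch = make([]Event, 0, batchSize)
+            }
+        }
+        for {
+            select {
+            case e, ok := <-w.events:
+                if !ok {
+                    emit() // channel closed: flush the remainder and stop
+                    return
+                }
+                batch = append(batch, e)
+                if len(batch) >= batchSize {
+                    emit()
+                }
+            case <-ticker.C:
+                emit()
+            }
+        }
+    }()
+    return w
+}
+
+// Log enqueues without blocking and reports false if the buffer is full.
+// It must not be called after Close.
+func (w *BatchWriter) Log(e Event) bool {
+    select {
+    case w.events <- e:
+        return true
+    default:
+        return false
+    }
+}
+
+// Close flushes pending events and waits for the worker to exit.
+func (w *BatchWriter) Close() {
+    w.once.Do(func() { close(w.events) })
+    <-w.done
+}
+```
+
+Dropping events when the buffer is full keeps request handlers non-blocking, but for security-relevant events a caller may prefer to fall back to a synchronous write when `Log` returns false.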
+
+---
+
+## Multi-Tenancy Support
+
+### Architecture
+
+The KMS implements a multi-tenant architecture where each tenant has isolated data and permissions while sharing the same application instance.
+
+### Database Design
+
+#### Tenant Model
+```go
+// File: internal/domain/tenant.go
+type Tenant struct {
+    ID              uuid.UUID              `json:"id" db:"id"`
+    Name            string                 `json:"name" db:"name"`
+    Slug            string                 `json:"slug" db:"slug"`
+    Status          TenantStatus           `json:"status" db:"status"`
+    Settings        TenantSettings         `json:"settings" db:"settings"`
+    Metadata        map[string]interface{} `json:"metadata" db:"metadata"`
+    CreatedAt       time.Time              `json:"created_at" db:"created_at"`
+    UpdatedAt       time.Time              `json:"updated_at" db:"updated_at"`
+}
+
+type TenantStatus string
+
+const (
+    TenantStatusActive    TenantStatus = "active"
+    TenantStatusSuspended TenantStatus = "suspended"
+    TenantStatusPending   TenantStatus = "pending"
+)
+```
+
+#### Data Isolation Strategy
+```sql
+-- All tenant-specific tables include tenant_id
+ALTER TABLE applications ADD COLUMN tenant_id UUID REFERENCES tenants(id);
+ALTER TABLE static_tokens ADD COLUMN tenant_id UUID REFERENCES tenants(id);
+ALTER TABLE user_sessions ADD COLUMN tenant_id UUID REFERENCES tenants(id);
+ALTER TABLE audit_events ADD COLUMN tenant_id UUID;
+
+-- Row-level security policies
+ALTER TABLE applications ENABLE ROW LEVEL SECURITY;
+CREATE POLICY tenant_isolation ON applications 
+    FOR ALL TO kms_user 
+    USING (tenant_id = current_setting('app.current_tenant')::UUID);
+```
+
+### Implementation Pattern
+
+#### Tenant Context Middleware
+```go
+// File: internal/middleware/tenant.go
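+func TenantMiddleware() gin.HandlerFunc {
+    return func(c *gin.Context) {
+        tenantID := extractTenantID(c)
+        if tenantID == "" {
+            c.AbortWithStatusJSON(400, gin.H{"error": "tenant_required"})
+            return
+        }
+
+        // Set tenant context
+        c.Set("tenant_id", tenantID)
+
+        // set_config(..., true) is transaction-local, so it must run inside the
+        // transaction that serves the request. A bare db.Exec would apply it to
+        // one pooled connection and lose it as soon as the statement finishes.
+        db := c.MustGet("db").(*sql.DB)
+        tx, err := db.BeginTx(c.Request.Context(), nil)
+        if err != nil {
+            c.AbortWithStatusJSON(500, gin.H{"error": "tenant_setup_failed"})
+            return
+        }
+        defer tx.Rollback() // no-op after a successful Commit
+
+        if _, err := tx.Exec("SELECT set_config('app.current_tenant', $1, true)", tenantID); err != nil {
+            c.AbortWithStatusJSON(500, gin.H{"error": "tenant_setup_failed"})
+            return
+        }
+
+        // Handlers must run their queries on this transaction ("tx" is an assumed key)
+        c.Set("tx", tx)
+        c.Next()
+
+        if len(c.Errors) == 0 {
+            _ = tx.Commit()
+        }
+    }
+}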
+```
+
+### Usage Guidelines
+
+1. **Always include tenant_id** in database queries
+2. **Validate tenant access** in middleware
+3. **Implement tenant-aware caching**
+4. **Audit cross-tenant operations**
+5. **Test tenant isolation thoroughly**
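+
+For the tenant-aware caching guideline, one simple approach is to put the tenant ID in front of every cache key. The `tenant:<id>:` layout and the `TenantCacheKey` helper below are assumptions for illustration; the prefix and suffix arguments mirror the `CacheKey(prefix, suffix)` helper used elsewhere in this guide.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+)
+
+// TenantCacheKey namespaces a cache key by tenant so that entries from
+// different tenants can never collide. It refuses to build a key without
+// a tenant, which turns a missing tenant context into an explicit error.
+func TenantCacheKey(tenantID, prefix, suffix string) (string, error) {
+    if tenantID == "" {
+        return "", errors.New("tenant id required for cache key")
+    }
+    return fmt.Sprintf("tenant:%s:%s%s", tenantID, prefix, suffix), nil
+}
+```
+
+Returning an error instead of falling back to an unscoped key avoids silently sharing cached permissions across tenants.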
+
+---
+
+## Cache Implementation Details
+
+### Architecture
+
+The KMS implements a layered caching system with multiple providers and configurable TTL policies.
+
+### Cache Interface
+```go
+// File: internal/cache/cache.go
+type CacheManager interface {
+    Get(ctx context.Context, key string) ([]byte, error)
+    Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
+    GetJSON(ctx context.Context, key string, dest interface{}) error
+    SetJSON(ctx context.Context, key string, value interface{}, ttl time.Duration) error
+    Delete(ctx context.Context, key string) error
+    Clear(ctx context.Context) error
+    Exists(ctx context.Context, key string) (bool, error)
+}
+```
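+
+The configuration section lists `memory` as an alternative provider. A minimal in-memory provider covering part of this interface might look like the sketch below; `MemoryCache`, its injectable clock, and the treatment of a non-positive TTL as "no expiry" are assumptions, and `Delete`, `Clear` and `Exists` are left out for brevity.
+
+```go
+package main
+
+import (
+    "context"
+    "encoding/json"
+    "errors"
+    "sync"
+    "time"
+)
+
+// ErrCacheMiss is returned when a key is absent or expired.
+var ErrCacheMiss = errors.New("cache miss")
+
+type entry struct {
+    value     []byte
+    expiresAt time.Time // zero value means no expiry
+}
+
+// MemoryCache is a process-local cache with per-key TTLs.
+type MemoryCache struct {
+    mu    sync.RWMutex
+    items map[string]entry
+    now   func() time.Time // injectable clock, useful in tests
+}
+
+func NewMemoryCache() *MemoryCache {
+    return &MemoryCache{items: make(map[string]entry), now: time.Now}
+}
+
+func (m *MemoryCache) Set(_ context.Context, key string, value []byte, ttl time.Duration) error {
+    // Copy the value so later changes by the caller do not affect the cache.
+    e := entry{value: append([]byte(nil), value...)}
+    if ttl > 0 {
+        e.expiresAt = m.now().Add(ttl)
+    }
+    m.mu.Lock()
+    m.items[key] = e
+    m.mu.Unlock()
+    return nil
+}
+
+func (m *MemoryCache) Get(_ context.Context, key string) ([]byte, error) {
+    m.mu.RLock()
+    e, ok := m.items[key]
+    m.mu.RUnlock()
+    if !ok || (!e.expiresAt.IsZero() && !m.now().Before(e.expiresAt)) {
+        return nil, ErrCacheMiss
+    }
+    return e.value, nil
+}
+
+func (m *MemoryCache) SetJSON(ctx context.Context, key string, value interface{}, ttl time.Duration) error {
+    data, err := json.Marshal(value)
+    if err != nil {
+        return err
+    }
+    return m.Set(ctx, key, data, ttl)
+}
+
+func (m *MemoryCache) GetJSON(ctx context.Context, key string, dest interface{}) error {
+    data, err := m.Get(ctx, key)
+    if err != nil {
+        return err
+    }
+    return json.Unmarshal(data, dest)
+}
+```
+
+Expired entries are only hidden, not removed, so a production version would also need a sweeper or a size bound.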
+
+### Redis Implementation
+```go
+// File: internal/cache/redis.go
+type RedisCacheManager struct {
+    client     *redis.Client
+    keyPrefix  string
+    serializer JSONSerializer
+    logger     *zap.Logger
+}
+
+func (r *RedisCacheManager) GetJSON(ctx context.Context, key string, dest interface{}) error {
+    prefixedKey := r.keyPrefix + key
+    data, err := r.client.Get(ctx, prefixedKey).Bytes()
+    if err != nil {
+        if err == redis.Nil {
+            return ErrCacheMiss
+        }
+        return fmt.Errorf("failed to get key %s: %w", prefixedKey, err)
+    }
+    
+    return r.serializer.Deserialize(data, dest)
+}
+```
+
+### Cache Key Management
+```go
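+// Key prefixes group cache entries by concern. (A `type CacheKey string` declaration
+// cannot coexist with the CacheKey function below in the same package.)
+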
+const (
+    KeyPrefixAuth       = "auth:"
+    KeyPrefixToken      = "token:"
+    KeyPrefixPermission = "perm:"
+    KeyPrefixSession    = "sess:"
+    KeyPrefixApp        = "app:"
+)
+
+func CacheKey(prefix, suffix string) string {
+    return fmt.Sprintf("%s%s", prefix, suffix)
+}
+```
+
+### Usage Patterns
+
+#### Authentication Caching
+```go
+// Cache authentication results for 5 minutes
+cacheKey := cache.CacheKey(cache.KeyPrefixAuth, fmt.Sprintf("%s:%s", userID, appID))
+err := cacheManager.SetJSON(ctx, cacheKey, authResult, 5*time.Minute)
+```
+
+#### Token Revocation List
+```go
+// Cache revoked tokens until their expiration
+revokedKey := cache.CacheKey(cache.KeyPrefixToken, "revoked:"+tokenID)
+err := cacheManager.Set(ctx, revokedKey, []byte("1"), time.Until(tokenExpiry))
+```
+
+### Configuration
+```bash
+# Cache configuration
+CACHE_ENABLED=true
+CACHE_PROVIDER=redis  # or memory
+REDIS_ADDR=localhost:6379
+REDIS_PASSWORD=
+REDIS_DB=0
+CACHE_DEFAULT_TTL=5m
+```
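+
+A loader for these variables might look like the following sketch. The `CacheConfig` struct, the `LoadCacheConfig` function, and the defaults (taken from the example values above) are assumptions; the lookup function is passed in so that `os.Getenv` can be used in production and a map in tests.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strconv"
+    "time"
+)
+
+// CacheConfig holds the parsed cache settings.
+type CacheConfig struct {
+    Enabled       bool
+    Provider      string
+    RedisAddr     string
+    RedisPassword string
+    RedisDB       int
+    DefaultTTL    time.Duration
+}
+
+// LoadCacheConfig reads the cache variables through getenv and validates them.
+func LoadCacheConfig(getenv func(string) string) (CacheConfig, error) {
+    cfg := CacheConfig{
+        Enabled:    true,
+        Provider:   "redis",
+        RedisAddr:  "localhost:6379",
+        DefaultTTL: 5 * time.Minute,
+    }
+    var err error
+    if v := getenv("CACHE_ENABLED"); v != "" {
+        if cfg.Enabled, err = strconv.ParseBool(v); err != nil {
+            return cfg, fmt.Errorf("CACHE_ENABLED: %w", err)
+        }
+    }
+    if v := getenv("CACHE_PROVIDER"); v != "" {
+        if v != "redis" && v != "memory" {
+            return cfg, fmt.Errorf("CACHE_PROVIDER: unsupported value %q", v)
+        }
+        cfg.Provider = v
+    }
+    if v := getenv("REDIS_ADDR"); v != "" {
+        cfg.RedisAddr = v
+    }
+    cfg.RedisPassword = getenv("REDIS_PASSWORD")
+    if v := getenv("REDIS_DB"); v != "" {
+        if cfg.RedisDB, err = strconv.Atoi(v); err != nil {
+            return cfg, fmt.Errorf("REDIS_DB: %w", err)
+        }
+    }
+    if v := getenv("CACHE_DEFAULT_TTL"); v != "" {
+        // time.ParseDuration accepts values such as "5m" or "90s".
+        if cfg.DefaultTTL, err = time.ParseDuration(v); err != nil {
+            return cfg, fmt.Errorf("CACHE_DEFAULT_TTL: %w", err)
+        }
+    }
+    return cfg, nil
+}
+```
+
+In production the call would be `LoadCacheConfig(os.Getenv)`. Failing at startup on a malformed value is usually preferable to silently falling back to a default.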
+
+---
+
+## Error Handling Framework
+
+### Error Type Hierarchy
+```go
+// File: internal/errors/errors.go
+type ErrorCode string
+
+const (
+    ErrorCodeValidation      ErrorCode = "validation_error"
+    ErrorCodeAuthentication  ErrorCode = "authentication_error"
+    ErrorCodeAuthorization   ErrorCode = "authorization_error"
+    ErrorCodeNotFound        ErrorCode = "not_found"
+    ErrorCodeConflict        ErrorCode = "conflict"
+    ErrorCodeInternal        ErrorCode = "internal_error"
+    ErrorCodeRateLimit       ErrorCode = "rate_limit_exceeded"
+    ErrorCodeBadRequest      ErrorCode = "bad_request"
+)
+
+type APIError struct {
+    Code       ErrorCode   `json:"code"`
+    Message    string      `json:"message"`
+    Details    interface{} `json:"details,omitempty"`
+    HTTPStatus int         `json:"-"`
+    Cause      error       `json:"-"`
+}
+```
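+
+For `errors.As` to accept a `*APIError` target, `*APIError` has to implement the `error` interface, and an `Unwrap` method lets `errors.Is` reach the `Cause`. The struct above does not show these methods, so the sketch below is an assumption about how they could look; `statusOf` and the body of `NewForbiddenError` are likewise hypothetical.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+    "net/http"
+)
+
+type ErrorCode string
+
+const ErrorCodeAuthorization ErrorCode = "authorization_error"
+
+type APIError struct {
+    Code       ErrorCode   `json:"code"`
+    Message    string      `json:"message"`
+    Details    interface{} `json:"details,omitempty"`
+    HTTPStatus int         `json:"-"`
+    Cause      error       `json:"-"`
+}
+
+// Error implements the error interface.
+func (e *APIError) Error() string {
+    if e.Cause != nil {
+        return fmt.Sprintf("%s: %s: %v", e.Code, e.Message, e.Cause)
+    }
+    return fmt.Sprintf("%s: %s", e.Code, e.Message)
+}
+
+// Unwrap exposes the underlying cause to errors.Is and errors.As.
+func (e *APIError) Unwrap() error { return e.Cause }
+
+// NewForbiddenError follows the same factory pattern with HTTP 403.
+func NewForbiddenError(message string) *APIError {
+    return &APIError{
+        Code:       ErrorCodeAuthorization,
+        Message:    message,
+        HTTPStatus: http.StatusForbidden,
+    }
+}
+
+// statusOf finds an APIError anywhere in the chain, defaulting to 500.
+func statusOf(err error) int {
+    var apiErr *APIError
+    if errors.As(err, &apiErr) {
+        return apiErr.HTTPStatus
+    }
+    return http.StatusInternalServerError
+}
+```
+
+Because `Cause` is tagged `json:"-"` and only surfaces through `Error()`, internal details stay in logs and out of API responses.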
+
+### Error Factory Functions
+```go
+func NewValidationError(message string, details interface{}) *APIError {
+    return &APIError{
+        Code:       ErrorCodeValidation,
+        Message:    message,
+        Details:    details,
+        HTTPStatus: http.StatusBadRequest,
+    }
+}
+
+func NewAuthenticationError(message string) *APIError {
+    return &APIError{
+        Code:       ErrorCodeAuthentication,
+        Message:    message,
+        HTTPStatus: http.StatusUnauthorized,
+    }
+}
+```
+
+### Error Handler Middleware
+```go
+// File: internal/errors/secure_responses.go
+func (e *ErrorHandler) HandleError(c *gin.Context, err error) {
+    var apiErr *APIError
+    if errors.As(err, &apiErr) {
+        // Log error with context
+        e.logger.Error("API error",
+            zap.String("error_code", string(apiErr.Code)),
+            zap.String("message", apiErr.Message),
+            zap.Int("http_status", apiErr.HTTPStatus),
+            zap.Error(apiErr.Cause))
+        
+        c.JSON(apiErr.HTTPStatus, gin.H{
+            "error":   apiErr.Code,
+            "message": apiErr.Message,
+            "details": apiErr.Details,
+        })
+        return
+    }
+    
+    // Handle unexpected errors
+    e.logger.Error("Unexpected error", zap.Error(err))
+    c.JSON(http.StatusInternalServerError, gin.H{
+        "error":   ErrorCodeInternal,
+        "message": "An internal error occurred",
+    })
+}
+```
+
+---
+
+## Validation System
+
+### Validator Implementation
+```go
+// File: internal/validation/validator.go
+type Validator struct {
+    validator *validator.Validate
+    logger    *zap.Logger
+}
+
+func NewValidator(logger *zap.Logger) *Validator {
+    v := validator.New()
+    
+    // Register custom validators
+    v.RegisterValidation("app_id", validateAppID)
+    v.RegisterValidation("token_type", validateTokenType)
+    v.RegisterValidation("permission_scope", validatePermissionScope)
+    
+    return &Validator{
+        validator: v,
+        logger:    logger,
+    }
+}
+```
+
+### Custom Validation Rules
+```go
+// Patterns are compiled once at package init instead of on every validation call
+var (
+    appIDPattern           = regexp.MustCompile(`^[a-z0-9]+(\.[a-z0-9]+)*\.[a-z0-9]+$`)
+    permissionScopePattern = regexp.MustCompile(`^[a-z_]+(\.[a-z_]+)*$`)
+)
+
+func validateAppID(fl validator.FieldLevel) bool {
+    appID := fl.Field().String()
+    // App ID format: domain.app (e.g., com.example.app)
+    return len(appID) >= 3 && len(appID) <= 100 && appIDPattern.MatchString(appID)
+}
+
+func validatePermissionScope(fl validator.FieldLevel) bool {
+    scope := fl.Field().String()
+    // Permission format: dot-separated lowercase segments (e.g., app.read, admin)
+    return len(scope) >= 1 && len(scope) <= 50 && permissionScopePattern.MatchString(scope)
+}
+```
+
+### Middleware Integration
+```go
+// File: internal/middleware/validation.go
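+func (v *ValidationMiddleware) ValidateJSON(schema interface{}) gin.HandlerFunc {
+    // schema only supplies the type: binding into it directly would share one
+    // value across concurrent requests (a data race).
+    schemaType := reflect.TypeOf(schema)
+    if schemaType.Kind() == reflect.Ptr {
+        schemaType = schemaType.Elem()
+    }
+
+    return func(c *gin.Context) {
+        payload := reflect.New(schemaType).Interface()
+        if err := c.ShouldBindJSON(payload); err != nil {
+            var validationErrors []ValidationError
+
+            if errs, ok := err.(validator.ValidationErrors); ok {
+                for _, e := range errs {
+                    validationErrors = append(validationErrors, ValidationError{
+                        Field:   e.Field(),
+                        Message: e.Tag(),
+                        Value:   e.Value(),
+                    })
+                }
+            }
+
+            apiErr := errors.NewValidationError("Request validation failed", validationErrors)
+            v.errorHandler.HandleError(c, apiErr)
+            c.Abort() // returning alone does not stop gin from running later handlers
+            return
+        }
+
+        // "validated_body" is an assumed key; handlers read the bound payload from it
+        c.Set("validated_body", payload)
+        c.Next()
+    }
+}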
+```
+
+---
+
+## Metrics and Monitoring
+
+### Prometheus Integration
+```go
+// File: internal/metrics/metrics.go
+type Metrics struct {
+    // HTTP metrics
+    httpRequestsTotal     *prometheus.CounterVec
+    httpRequestDuration   *prometheus.HistogramVec
+    httpRequestsInFlight  prometheus.Gauge
+    
+    // Auth metrics
+    authAttemptsTotal     *prometheus.CounterVec
+    authSuccessTotal      *prometheus.CounterVec
+    authFailuresTotal     *prometheus.CounterVec
+    
+    // Token metrics
+    tokensIssuedTotal     *prometheus.CounterVec
+    tokenValidationsTotal *prometheus.CounterVec
+    
+    // Business metrics
+    applicationsTotal     prometheus.Gauge
+    activeSessionsTotal   prometheus.Gauge
+}
+```
+
+### Metrics Collection
+```go
+func (m *Metrics) RecordHTTPRequest(method, path string, statusCode int, duration time.Duration) {
+    m.httpRequestsTotal.WithLabelValues(method, path, strconv.Itoa(statusCode)).Inc()
+    m.httpRequestDuration.WithLabelValues(method, path).Observe(duration.Seconds())
+}
+
+func (m *Metrics) RecordAuthAttempt(provider, result string) {
+    m.authAttemptsTotal.WithLabelValues(provider, result).Inc()
+    if result == "success" {
+        m.authSuccessTotal.WithLabelValues(provider).Inc()
+    } else {
+        m.authFailuresTotal.WithLabelValues(provider).Inc()
+    }
+}
+```
+
+### Dashboard Configuration
+```yaml
+# Grafana dashboard config
+panels:
+  - title: "Request Rate"
+    type: "graph"
+    targets:
+      - expr: "rate(http_requests_total[5m])"
+        legendFormat: "{{method}} {{path}}"
+  
+  - title: "Authentication Success Rate"
+    type: "stat"
+    targets:
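+      # Aggregate both sides: the two counters carry different label sets,
+      # so an unaggregated division would match no series.
+      - expr: "sum(rate(auth_success_total[5m])) / sum(rate(auth_attempts_total[5m])) * 100"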
+        legendFormat: "Success Rate %"
+  
+  - title: "Active Applications"
+    type: "stat"
+    targets:
+      - expr: "applications_total"
+        legendFormat: "Applications"
+```
+
+---
+
+## Database Migration System
+
+### Migration Structure
+```
+migrations/
+├── 001_initial_schema.up.sql
+├── 001_initial_schema.down.sql
+├── 002_user_sessions.up.sql
+├── 002_user_sessions.down.sql
+├── 003_add_token_prefix.up.sql
+├── 003_add_token_prefix.down.sql
+├── 004_add_audit_events.up.sql
+└── 004_add_audit_events.down.sql
+```
+
+### Migration Runner
+```go
+// File: internal/database/postgres.go
+func RunMigrations(db *sql.DB, migrationPath string) error {
+    driver, err := postgres.WithInstance(db, &postgres.Config{})
+    if err != nil {
+        return fmt.Errorf("failed to create migration driver: %w", err)
+    }
+    
+    m, err := migrate.NewWithDatabaseInstance(
+        fmt.Sprintf("file://%s", migrationPath),
+        "postgres", driver)
+    if err != nil {
+        return fmt.Errorf("failed to create migration instance: %w", err)
+    }
+    
+    // errors.Is (standard library "errors") also matches a wrapped ErrNoChange.
+    if err := m.Up(); err != nil && !errors.Is(err, migrate.ErrNoChange) {
+        return fmt.Errorf("failed to run migrations: %w", err)
+    }
+    
+    return nil
+}
+```
+
+### Migration Best Practices
+
+1. **Always create both up and down migrations**
+2. **Test migrations on a copy of production data**
+3. **Make migrations idempotent**
+4. **Add proper indexes for performance**
+5. **Include rollback procedures**
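+
+The first practice can be checked mechanically. The following is a minimal sketch, assuming the `NNN_name.up.sql` / `NNN_name.down.sql` naming shown under Migration Structure; `unpairedMigrations` is a hypothetical helper, not an existing function in the repository.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sort"
+    "strings"
+)
+
+// unpairedMigrations (hypothetical helper) returns the migration names that
+// have an .up.sql file without a matching .down.sql file, or vice versa.
+// Files with any other suffix are ignored.
+func unpairedMigrations(files []string) []string {
+    type pair struct{ up, down bool }
+    seen := map[string]pair{}
+    for _, f := range files {
+        switch {
+        case strings.HasSuffix(f, ".up.sql"):
+            name := strings.TrimSuffix(f, ".up.sql")
+            p := seen[name]
+            p.up = true
+            seen[name] = p
+        case strings.HasSuffix(f, ".down.sql"):
+            name := strings.TrimSuffix(f, ".down.sql")
+            p := seen[name]
+            p.down = true
+            seen[name] = p
+        }
+    }
+    var missing []string
+    for name, p := range seen {
+        if !p.up || !p.down {
+            missing = append(missing, name)
+        }
+    }
+    sort.Strings(missing) // map iteration order is random; sort for stable output
+    return missing
+}
+
+func main() {
+    files := []string{
+        "001_initial_schema.up.sql",
+        "001_initial_schema.down.sql",
+        "002_user_sessions.up.sql",
+    }
+    fmt.Println(unpairedMigrations(files)) // prints [002_user_sessions]
+}
+```
+
+A check like this could run in CI against the contents of the `migrations/` directory and fail the build when the returned slice is non-empty.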
+
+### Example Migration
+```sql
+-- 005_add_oauth_providers.up.sql
+-- Assumes the uuid-ossp extension is already enabled; uuid_generate_v4() depends on it.
+CREATE TABLE IF NOT EXISTS oauth_providers (
+    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
+    name VARCHAR(100) NOT NULL UNIQUE,
+    client_id VARCHAR(255) NOT NULL,
+    client_secret_encrypted TEXT NOT NULL,
+    authorization_url TEXT NOT NULL,
+    token_url TEXT NOT NULL,
+    user_info_url TEXT NOT NULL,
+    scopes TEXT[] DEFAULT ARRAY['openid', 'profile', 'email'],
+    enabled BOOLEAN DEFAULT true,
+    created_at TIMESTAMP DEFAULT NOW(),
+    updated_at TIMESTAMP DEFAULT NOW()
+);
+
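+-- IF NOT EXISTS keeps the index statements as safe to rerun as the table statement.
+CREATE INDEX IF NOT EXISTS idx_oauth_providers_name ON oauth_providers(name);
+CREATE INDEX IF NOT EXISTS idx_oauth_providers_enabled ON oauth_providers(enabled) WHERE enabled = true;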
+```
+
+---
+
+## Frontend Architecture
+
+### Component Structure
+```
+src/
+├── components/
+│   ├── Applications.tsx      # Application management
+│   ├── Tokens.tsx           # Token operations  
+│   ├── Users.tsx            # User management
+│   ├── Audit.tsx            # Audit log viewer
+│   ├── Dashboard.tsx        # Main dashboard
+│   ├── Login.tsx            # Authentication
+│   ├── TokenTester.tsx      # Token testing utility
+│   └── TokenTesterCallback.tsx
+├── contexts/
+│   └── AuthContext.tsx      # Authentication state
+├── services/
+│   └── apiService.ts        # API client
+├── App.tsx                  # Main application
+└── index.tsx                # Entry point
+```
+
+### API Service Implementation
+```typescript
+// File: kms-frontend/src/services/apiService.ts
+class APIService {
+  private baseURL: string;
+  private token: string | null = null;
+
+  constructor(baseURL: string) {
+    this.baseURL = baseURL;
+  }
+
+  async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
+    const url = `${this.baseURL}${endpoint}`;
+    const headers = {
+      'Content-Type': 'application/json',
+      'X-User-Email': this.getUserEmail(),
+      ...options.headers,
+    };
+
+    const response = await fetch(url, {
+      ...options,
+      headers,
+    });
+
+    if (!response.ok) {
+      const error = await response.json().catch(() => ({}));
+      throw new APIError(error.message || 'Request failed', response.status);
+    }
+
+    return response.json();
+  }
+
+  // Application management
+  async getApplications(): Promise<Application[]> {
+    return this.request<Application[]>('/api/applications');
+  }
+
+  // Audit log access
+  async getAuditEvents(params: AuditQueryParams): Promise<AuditEvent[]> {
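+    // URLSearchParams only accepts string values: drop unset fields and stringify the rest.
+    const entries = Object.entries(params)
+      .filter(([, value]) => value !== undefined && value !== null)
+      .map(([key, value]) => [key, String(value)]);
+    const queryString = new URLSearchParams(entries).toString();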
+    return this.request<AuditEvent[]>(`/api/audit/events?${queryString}`);
+  }
+}
+```
+
+### Authentication Context
+```typescript
+// File: kms-frontend/src/contexts/AuthContext.tsx
+interface AuthContextType {
+  user: User | null;
+  login: (email: string) => Promise<void>;
+  logout: () => void;
+  isAuthenticated: boolean;
+  isLoading: boolean;
+}
+
+export const AuthContext = React.createContext<AuthContextType | null>(null);
+
+export const AuthProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
+  const [user, setUser] = useState<User | null>(null);
+  const [isLoading, setIsLoading] = useState(true);
+
+  const login = async (email: string) => {
+    setIsLoading(true);
+    try {
+      const response = await apiService.login(email);
+      setUser({ email, token: response.token });
+      // Only the email is persisted; the token is kept in memory.
+      localStorage.setItem('kms_user', JSON.stringify({ email }));
+    } finally {
+      // Errors propagate to the caller; the loading flag is always cleared.
+      setIsLoading(false);
+    }
+  };
+
+  // ... rest of implementation
+};
+```
+
+---
+
+## Configuration Management
+
+### Configuration Interface
+```go
+// File: internal/config/config.go
+type ConfigProvider interface {
+    GetString(key string) string
+    GetInt(key string) int
+    GetBool(key string) bool
+    GetDuration(key string) time.Duration
+    GetStringSlice(key string) []string
+    IsSet(key string) bool
+    Validate() error
+    GetDatabaseDSN() string
+    GetServerAddress() string
+    IsDevelopment() bool
+    IsProduction() bool
+}
+```
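+
+To illustrate how the typed getters might behave, here is a minimal sketch of an environment-backed implementation of part of the interface. `envConfig` and its `lookup` field are hypothetical, and returning the zero value on a missing or malformed setting is an assumption; the real `Config` type may apply defaults or report errors differently.
+
+```go
+package main
+
+import (
+    "fmt"
+    "os"
+    "strconv"
+    "time"
+)
+
+// envConfig is a hypothetical, partial stand-in for a ConfigProvider.
+// lookup is injectable so the getters can be exercised without touching
+// the process environment.
+type envConfig struct {
+    lookup func(key string) (string, bool)
+}
+
+// IsSet reports whether the key is present with a non-empty value.
+func (c envConfig) IsSet(key string) bool {
+    value, ok := c.lookup(key)
+    return ok && value != ""
+}
+
+// GetString returns the raw value, or "" when the key is missing.
+func (c envConfig) GetString(key string) string {
+    value, _ := c.lookup(key)
+    return value
+}
+
+// GetInt returns 0 when the value is missing or not an integer.
+func (c envConfig) GetInt(key string) int {
+    n, err := strconv.Atoi(c.GetString(key))
+    if err != nil {
+        return 0
+    }
+    return n
+}
+
+// GetBool returns false when the value is missing or not a boolean.
+func (c envConfig) GetBool(key string) bool {
+    b, err := strconv.ParseBool(c.GetString(key))
+    return err == nil && b
+}
+
+// GetDuration parses values such as "30s" or "5m"; it returns 0 on failure.
+func (c envConfig) GetDuration(key string) time.Duration {
+    d, err := time.ParseDuration(c.GetString(key))
+    if err != nil {
+        return 0
+    }
+    return d
+}
+
+func main() {
+    cfg := envConfig{lookup: os.LookupEnv}
+    fmt.Println(cfg.GetInt("DB_PORT"), cfg.GetBool("METRICS_ENABLED"))
+}
+```
+
+Silently falling back to zero values keeps the getters simple, which is one reason a separate `Validate()` step is useful for catching missing or malformed settings at startup.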
+
+### Configuration Validation
+```go
+func (c *Config) Validate() error {
+    // Named "problems" so the slice does not shadow the standard errors package.
+    var problems []string
+    
+    // Required configuration
+    required := []string{
+        "INTERNAL_HMAC_KEY",
+        "JWT_SECRET",
+        "AUTH_SIGNING_KEY",
+        "DB_HOST",
+        "DB_NAME",
+    }
+    
+    for _, key := range required {
+        if !c.IsSet(key) {
+            problems = append(problems, fmt.Sprintf("required configuration %s is not set", key))
+        }
+    }
+    
+    // Validate key lengths (skipped when the key is unset, which is already reported above)
+    if c.IsSet("INTERNAL_HMAC_KEY") && len(c.GetString("INTERNAL_HMAC_KEY")) < 32 {
+        problems = append(problems, "INTERNAL_HMAC_KEY must be at least 32 characters")
+    }
+    
+    if len(problems) > 0 {
+        return fmt.Errorf("configuration validation failed: %s", strings.Join(problems, ", "))
+    }
+    
+    return nil
+}
+```
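+
+A key that passes the length check above can be produced from the operating system's random source. This is a sketch rather than the project's own tooling: `generateKey` is a hypothetical helper, and hex encoding is an assumption chosen to match the 64-character example values used in this guide.
+
+```go
+package main
+
+import (
+    "crypto/rand"
+    "encoding/hex"
+    "fmt"
+)
+
+// generateKey (hypothetical helper) returns n cryptographically random
+// bytes, hex-encoded, so the resulting string is 2*n characters long.
+func generateKey(n int) (string, error) {
+    buf := make([]byte, n)
+    if _, err := rand.Read(buf); err != nil {
+        return "", fmt.Errorf("reading random bytes: %w", err)
+    }
+    return hex.EncodeToString(buf), nil
+}
+
+func main() {
+    key, err := generateKey(32)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("INTERNAL_HMAC_KEY=%s\n", key)
+}
+```
+
+With `n = 32` the output is 64 hex characters, comfortably above the 32-character minimum enforced by `Validate()`.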
+
+### Environment Configuration
+```bash
+# Security Configuration (example values only; generate a fresh secret per environment, e.g. with `openssl rand -hex 32`)
+INTERNAL_HMAC_KEY=3924f352b7ea63b27db02bf4b0014f2961a5d2f7c27643853a4581bb3a5457cb
+JWT_SECRET=7f5e11d55e957988b00ce002418680af384219ef98c50d08cbbbdd541978450c
+AUTH_SIGNING_KEY=484f921b39c383e6b3e0cc5a7cef3c2cec3d7c8d474ab5102891dc4c2bf63a68
+
+# Database Configuration  
+DB_HOST=postgres
+DB_PORT=5432
+DB_NAME=kms
+DB_USER=postgres
+DB_PASSWORD=postgres
+
+# Feature Flags
+RATE_LIMIT_ENABLED=true
+CACHE_ENABLED=false
+METRICS_ENABLED=true
+SAML_ENABLED=false
+```
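+
+The `DB_*` values above are what a method like `GetDatabaseDSN()` would combine into a connection string. The sketch below shows one plausible approach using a URL-style DSN, which common Go PostgreSQL drivers accept; `dbSettings` and `buildDSN` are hypothetical names, and the real implementation may use a different format or add options such as `sslmode`.
+
+```go
+package main
+
+import (
+    "fmt"
+    "net"
+    "net/url"
+)
+
+// dbSettings mirrors the DB_* environment variables (hypothetical type).
+type dbSettings struct {
+    Host, Port, Name, User, Password string
+}
+
+// buildDSN (hypothetical helper) assembles a URL-style PostgreSQL DSN.
+// Building it with net/url escapes reserved characters in the password.
+func buildDSN(s dbSettings) string {
+    u := url.URL{
+        Scheme: "postgres",
+        User:   url.UserPassword(s.User, s.Password),
+        Host:   net.JoinHostPort(s.Host, s.Port),
+        Path:   "/" + s.Name,
+    }
+    return u.String()
+}
+
+func main() {
+    dsn := buildDSN(dbSettings{
+        Host:     "postgres",
+        Port:     "5432",
+        Name:     "kms",
+        User:     "postgres",
+        Password: "postgres",
+    })
+    fmt.Println(dsn) // prints postgres://postgres:postgres@postgres:5432/kms
+}
+```
+
+Assembling the string with `fmt.Sprintf` would also work for simple values, but it breaks as soon as the password contains characters such as `@` or `/`.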
+
+---
+
+## Implementation Best Practices
+
+### Code Organization
+1. **Follow clean architecture principles**
+2. **Use dependency injection throughout**
+3. **Implement comprehensive error handling**
+4. **Add structured logging to all components**
+5. **Write unit tests for business logic**
+
+### Security Guidelines
+1. **Always validate input at API boundaries**
+2. **Use parameterized database queries**
+3. **Implement proper authentication and authorization**
+4. **Log all security-relevant events**
+5. **Follow the principle of least privilege**
+
+### Performance Considerations
+1. **Implement caching for frequently accessed data**
+2. **Use database indexes appropriately**
+3. **Monitor and optimize slow queries**
+4. **Implement proper connection pooling**
+5. **Use asynchronous operations where beneficial**
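+
+As an illustration of the caching point, the following is a minimal in-memory cache with time-based expiry. It is a sketch under stated assumptions, not the KMS cache layer: `ttlCache` and its methods are hypothetical, values are plain strings, and expired entries are only evicted when they are read.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sync"
+    "time"
+)
+
+type cacheEntry struct {
+    value     string
+    expiresAt time.Time
+}
+
+// ttlCache is a hypothetical in-memory cache whose entries expire after ttl.
+type ttlCache struct {
+    mu    sync.Mutex
+    ttl   time.Duration
+    now   func() time.Time // injectable clock, so expiry can be tested
+    items map[string]cacheEntry
+}
+
+func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
+    return &ttlCache{ttl: ttl, now: now, items: make(map[string]cacheEntry)}
+}
+
+// Set stores value under key and stamps it with an expiry time.
+func (c *ttlCache) Set(key, value string) {
+    c.mu.Lock()
+    defer c.mu.Unlock()
+    c.items[key] = cacheEntry{value: value, expiresAt: c.now().Add(c.ttl)}
+}
+
+// Get returns the cached value, or false if it is missing or expired.
+func (c *ttlCache) Get(key string) (string, bool) {
+    c.mu.Lock()
+    defer c.mu.Unlock()
+    entry, ok := c.items[key]
+    if !ok {
+        return "", false
+    }
+    if !c.now().Before(entry.expiresAt) {
+        delete(c.items, key) // lazily evict the expired entry
+        return "", false
+    }
+    return entry.value, true
+}
+
+func main() {
+    cache := newTTLCache(30*time.Second, time.Now)
+    cache.Set("app:com.example.app", "cached application record")
+    value, ok := cache.Get("app:com.example.app")
+    fmt.Println(value, ok)
+}
+```
+
+A short TTL bounds how stale a cached record can be, which matters for security-sensitive data such as revoked tokens or disabled applications.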
+
+### Testing Strategy
+1. **Unit tests for business logic**
+2. **Integration tests for API endpoints**
+3. **End-to-end tests for critical workflows**
+4. **Load testing for performance validation**
+5. **Security testing for vulnerability assessment**
+
+---
+
+*This document serves as a comprehensive implementation guide for the KMS system. It should be updated as the system evolves and new features are added.*