skybridge/docs/ARCHITECTURE.md
2025-08-24 10:09:58 -04:00


API Key Management Service (KMS) - System Architecture

Table of Contents

  1. System Overview
  2. Architecture Principles
  3. Component Architecture
  4. System Architecture Diagram
  5. Request Flow Pipeline
  6. Authentication Flow
  7. API Design
  8. Technology Stack
  9. Deployment Considerations

System Overview

The API Key Management Service (KMS) is a secure, scalable platform for managing API authentication tokens across applications. Built with a Go backend and a React TypeScript frontend, it provides centralized token lifecycle management with enterprise-grade security features.

Key Capabilities

  • Multi-Provider Authentication: Header, JWT, OAuth2, SAML support
  • Dual Token System: Static HMAC tokens and renewable JWT user tokens
  • Hierarchical Permissions: Role-based access control with inheritance
  • Enterprise Security: Rate limiting, brute force protection, audit logging
  • High Availability: Containerized deployment with load balancing

Core Features

  • Token Lifecycle Management: Create, verify, renew, and revoke tokens
  • Application Management: Multi-tenant application configuration
  • User Session Tracking: Comprehensive session management
  • Audit Logging: Complete audit trail of all operations
  • Health Monitoring: Built-in health checks and metrics

Architecture Principles

Clean Architecture

The system follows clean architecture principles with clear separation of concerns:

┌─────────────────┐
│   Handlers      │ ← HTTP request handling
├─────────────────┤
│   Services      │ ← Business logic
├─────────────────┤
│  Repositories   │ ← Data access
├─────────────────┤
│    Database     │ ← Data persistence
└─────────────────┘

Design Principles

  • Dependency Injection: All services receive dependencies through constructors
  • Interface Segregation: Repository interfaces enable testing and flexibility
  • Single Responsibility: Each component has one clear purpose
  • Fail-Safe Defaults: Security-first configuration with safe fallbacks
  • Immutable Operations: State changes run inside database transactions and are recorded in the audit log
  • Defense in Depth: Multiple security layers throughout the stack
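
The dependency injection and interface segregation principles above can be sketched as follows. This is a minimal illustration, not the service's actual code: `Application`, `ApplicationRepository`, `memoryRepo`, and `GetOwned` are hypothetical names, and the in-memory map stands in for the PostgreSQL repository.

```go
package main

import (
	"errors"
	"fmt"
)

// Application is a hypothetical domain type used only in this sketch.
type Application struct {
	ID    string
	Owner string
}

// ApplicationRepository is the narrow data-access interface the service needs.
type ApplicationRepository interface {
	GetByID(id string) (Application, error)
}

var (
	ErrNotFound  = errors.New("application not found")
	ErrForbidden = errors.New("caller does not own application")
)

// memoryRepo is an in-memory stand-in for the PostgreSQL repository.
type memoryRepo struct{ apps map[string]Application }

func (r memoryRepo) GetByID(id string) (Application, error) {
	app, ok := r.apps[id]
	if !ok {
		return Application{}, ErrNotFound
	}
	return app, nil
}

// ApplicationService holds business logic. It receives its repository
// through the constructor instead of creating one itself.
type ApplicationService struct{ repo ApplicationRepository }

func NewApplicationService(repo ApplicationRepository) *ApplicationService {
	return &ApplicationService{repo: repo}
}

// GetOwned returns the application only if caller is its owner.
func (s *ApplicationService) GetOwned(id, caller string) (Application, error) {
	app, err := s.repo.GetByID(id)
	if err != nil {
		return Application{}, err
	}
	if app.Owner != caller {
		return Application{}, ErrForbidden
	}
	return app, nil
}

func main() {
	repo := memoryRepo{apps: map[string]Application{
		"app-1": {ID: "app-1", Owner: "alice@example.com"},
	}}
	svc := NewApplicationService(repo)
	_, err := svc.GetOwned("app-1", "bob@example.com")
	fmt.Println(err)
}
```

Because the service depends only on the interface, a test can pass a fake repository without touching a database.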

Component Architecture

Backend Components (internal/)

Handlers Layer (internal/handlers/)

HTTP request processors implementing REST API endpoints:

  • application.go: Application CRUD operations

    • Create, read, update, delete applications
    • HMAC key management
    • Ownership validation
  • auth.go: Authentication workflows

    • User login and logout
    • Token renewal and validation
    • Multi-provider authentication
  • token.go: Token operations

    • Static token creation
    • Token verification
    • Token revocation
  • health.go: System health checks

    • Database connectivity
    • Cache availability
    • Service status
  • oauth2.go, saml.go: External authentication providers

    • OAuth2 authorization code flow
    • SAML assertion validation
    • Provider callback handling

Services Layer (internal/services/)

Business logic implementation with transaction management:

  • auth_service.go: Authentication provider orchestration

    • Multi-provider authentication
    • Session management
    • User context creation
  • token_service.go: Token lifecycle management

    • Static token generation and validation
    • JWT user token management
    • Permission assignment
  • application_service.go: Application configuration management

    • Application CRUD operations
    • Configuration validation
    • HMAC key rotation
  • session_service.go: User session tracking

    • Session creation and validation
    • Session timeout handling
    • Cross-provider session management

Repository Layer (internal/repository/postgres/)

Data access with ACID transaction support:

  • application_repository.go: Application persistence

    • Secure dynamic query building
    • Parameterized queries
    • Ownership validation
  • token_repository.go: Static token management

    • BCrypt token hashing
    • Token lookup and validation
    • Permission relationship management
  • permission_repository.go: Permission catalog

    • Hierarchical permission structure
    • Permission validation
    • Bulk permission operations
  • session_repository.go: User session storage

    • Session persistence
    • Expiration management
    • Provider metadata storage
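
One way to combine dynamic query building with parameterized queries, as listed for `application_repository.go`, is sketched below. The table name `applications`, the column allow-list, and `buildUpdate` are assumptions made for illustration; the real schema and helper names may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// allowedColumns is a hypothetical allow-list of updatable columns.
var allowedColumns = map[string]bool{
	"name":                   true,
	"callback_url":           true,
	"token_renewal_duration": true,
}

// buildUpdate returns an UPDATE statement with PostgreSQL placeholders and
// the matching argument slice. Column names come only from the allow-list;
// values never become part of the SQL text.
func buildUpdate(id string, fields map[string]any) (string, []any, error) {
	if len(fields) == 0 {
		return "", nil, fmt.Errorf("no fields to update")
	}
	cols := make([]string, 0, len(fields))
	for col := range fields {
		if !allowedColumns[col] {
			return "", nil, fmt.Errorf("column %q is not updatable", col)
		}
		cols = append(cols, col)
	}
	sort.Strings(cols) // deterministic order gives stable SQL text

	sets := make([]string, 0, len(cols))
	args := make([]any, 0, len(cols)+1)
	for i, col := range cols {
		sets = append(sets, fmt.Sprintf("%s = $%d", col, i+1))
		args = append(args, fields[col])
	}
	args = append(args, id) // the id is always the last placeholder
	query := fmt.Sprintf("UPDATE applications SET %s WHERE id = $%d",
		strings.Join(sets, ", "), len(args))
	return query, args, nil
}

func main() {
	query, args, err := buildUpdate("app-1", map[string]any{"name": "billing"})
	fmt.Println(query, args, err)
}
```

The returned query and arguments can be passed unchanged to a `database/sql` `Exec` call, so user input reaches the database only as bound parameters.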

Authentication Providers (internal/auth/)

Pluggable authentication system:

  • header_validator.go: HMAC signature validation

    • Timestamp-based replay protection
    • Constant-time signature comparison
    • Email format validation
  • jwt.go: JWT token management

    • Token generation with secure JTI
    • Signature validation
    • Revocation list management
  • oauth2.go: OAuth2 authorization code flow

    • State management
    • Token exchange
    • Provider integration
  • saml.go: SAML assertion validation

    • XML signature validation
    • Attribute extraction
    • Provider configuration
  • permissions.go: Hierarchical permission evaluation

    • Role-based access control
    • Permission inheritance
    • Bulk permission evaluation
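
A minimal sketch of hierarchical permission evaluation follows. It assumes that permissions are dot-separated names (as in `app.read` and `token.create`) and that a granted permission covers itself and every dotted descendant; the actual inheritance rules in `permissions.go` may be different.

```go
package main

import (
	"fmt"
	"strings"
)

// covers reports whether a granted permission implies the required one.
// Assumed rule: a permission covers itself and all dotted descendants,
// so "app" covers "app.read" but not "application.read".
func covers(granted, required string) bool {
	return granted == required || strings.HasPrefix(required, granted+".")
}

// hasPermission checks one required permission against all grants.
func hasPermission(grants []string, required string) bool {
	for _, g := range grants {
		if covers(g, required) {
			return true
		}
	}
	return false
}

// missingPermissions evaluates several requirements in one call and
// returns the ones that no grant satisfies.
func missingPermissions(grants, required []string) []string {
	var missing []string
	for _, r := range required {
		if !hasPermission(grants, r) {
			missing = append(missing, r)
		}
	}
	return missing
}

func main() {
	grants := []string{"app", "token.create"}
	required := []string{"app.read", "token.create", "token.revoke"}
	fmt.Println(missingPermissions(grants, required))
}
```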

System Architecture Diagram

graph TB
    %% External Components
    Client[Client Applications]
    Browser[Web Browser]
    AuthProvider[OAuth2/SAML Provider]
    
    %% Load Balancer & Proxy
    subgraph "Load Balancer Layer"
        Nginx[Nginx Proxy<br/>:80, :8081]
    end
    
    %% Frontend Layer
    subgraph "Frontend Layer"
        React[React TypeScript SPA<br/>Ant Design UI<br/>:3000]
        AuthContext[Authentication Context]
        APIService[API Service Client]
    end
    
    %% API Gateway & Middleware
    subgraph "API Layer"
        API[Go API Server<br/>:8080]
        
        subgraph "Middleware Chain"
            Logger[Request Logger]
            Security[Security Headers<br/>CORS, CSRF]
            RateLimit[Rate Limiter<br/>100 RPS]
            Auth[Authentication<br/>Header/JWT/OAuth2/SAML]
            Validation[Request Validator]
        end
    end
    
    %% Business Logic Layer
    subgraph "Service Layer"
        AuthService[Authentication Service]
        TokenService[Token Service]
        AppService[Application Service]
        SessionService[Session Service]
        PermService[Permission Service]
    end
    
    %% Data Access Layer
    subgraph "Repository Layer"
        AppRepo[Application Repository]
        TokenRepo[Token Repository]
        PermRepo[Permission Repository]
        SessionRepo[Session Repository]
    end
    
    %% Infrastructure Layer
    subgraph "Infrastructure"
        PostgreSQL[(PostgreSQL 15<br/>:5432)]
        Redis[(Redis Cache<br/>Optional)]
        Metrics[Prometheus Metrics<br/>:9090]
    end
    
    %% External Security
    subgraph "Security & Crypto"
        HMAC[HMAC Signature<br/>Validation]
        BCrypt[BCrypt Hashing<br/>Cost 14]
        JWT[JWT Token<br/>Generation]
    end
    
    %% Flow Connections
    Client -->|API Requests| Nginx
    Browser -->|HTTPS| Nginx
    AuthProvider -->|OAuth2/SAML| API
    
    Nginx -->|Proxy| React
    Nginx -->|API Proxy| API
    
    React --> AuthContext
    React --> APIService
    APIService -->|REST API| API
    
    API --> Logger
    Logger --> Security
    Security --> RateLimit
    RateLimit --> Auth
    Auth --> Validation
    Validation --> AuthService
    Validation --> TokenService
    Validation --> AppService
    Validation --> SessionService
    Validation --> PermService
    
    AuthService --> AppRepo
    TokenService --> TokenRepo
    AppService --> AppRepo
    SessionService --> SessionRepo
    PermService --> PermRepo
    
    AppRepo --> PostgreSQL
    TokenRepo --> PostgreSQL
    PermRepo --> PostgreSQL
    SessionRepo --> PostgreSQL
    
    TokenService --> HMAC
    TokenService --> BCrypt
    TokenService --> JWT
    AuthService --> Redis
    
    API --> Metrics
    
    %% Styling
    classDef frontend fill:#e1f5fe,stroke:#0277bd,stroke-width:2px
    classDef api fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
    classDef security fill:#ffebee,stroke:#c62828,stroke-width:2px
    
    class React,AuthContext,APIService frontend
    class API,Logger,Security,RateLimit,Auth,Validation api
    class AuthService,TokenService,AppService,SessionService,PermService service
    class PostgreSQL,Redis,Metrics,AppRepo,TokenRepo,PermRepo,SessionRepo data
    class HMAC,BCrypt,JWT security

Request Flow Pipeline

flowchart TD
    Start([HTTP Request]) --> Nginx{Nginx Proxy<br/>Load Balancer}
    
    Nginx -->|Static Assets| Frontend[React SPA<br/>Port 3000]
    Nginx -->|API Routes| API[Go API Server<br/>Port 8080]
    
    API --> Logger[Request Logger<br/>Structured Logging]
    Logger --> Security[Security Middleware<br/>Headers, CORS, CSRF]
    Security --> RateLimit{Rate Limiter<br/>100 RPS, 200 Burst}
    
    RateLimit -->|Exceeded| RateResponse[429 Too Many Requests]
    RateLimit -->|Within Limits| Auth[Authentication<br/>Middleware]
    
    Auth --> AuthHeader{Auth Provider}
    AuthHeader -->|header| HeaderAuth[Header Validator<br/>X-User-Email]
    AuthHeader -->|jwt| JWTAuth[JWT Validator<br/>Signature + Claims]
    AuthHeader -->|oauth2| OAuth2Auth[OAuth2 Flow<br/>Authorization Code]
    AuthHeader -->|saml| SAMLAuth[SAML Assertion<br/>XML Validation]
    
    HeaderAuth --> AuthCache{Check Cache<br/>Redis 5min TTL}
    JWTAuth --> JWTValidation[Signature Validation<br/>Expiry Check]
    OAuth2Auth --> OAuth2Exchange[Token Exchange<br/>User Info Retrieval]
    SAMLAuth --> SAMLValidation[Assertion Validation<br/>Signature Check]
    
    AuthCache -->|Hit| AuthContext[Create AuthContext]
    AuthCache -->|Miss| DBAuth[Database Lookup<br/>User Permissions]
    JWTValidation --> RevocationCheck[Check Revocation List<br/>Redis Cache]
    OAuth2Exchange --> SessionStore[Store User Session<br/>PostgreSQL]
    SAMLValidation --> SessionStore
    
    DBAuth --> CacheStore[Store in Cache<br/>5min TTL]
    RevocationCheck --> AuthContext
    SessionStore --> AuthContext
    CacheStore --> AuthContext
    
    AuthContext --> Validation[Request Validator<br/>JSON Schema]
    Validation -->|Invalid| ValidationError[400 Bad Request]
    Validation -->|Valid| Router{Route Handler}
    
    Router -->|/health| HealthHandler[Health Check<br/>DB + Cache Status]
    Router -->|/api/applications| AppHandler[Application CRUD<br/>HMAC Key Management]
    Router -->|/api/tokens| TokenHandler[Token Operations<br/>Create, Verify, Revoke]
    Router -->|/api/login| AuthHandler[Authentication<br/>Login, Renewal]
    Router -->|/api/oauth2| OAuth2Handler[OAuth2 Callbacks<br/>State Management]
    Router -->|/api/saml| SAMLHandler[SAML Callbacks<br/>Assertion Processing]
    
    HealthHandler --> Service[Service Layer]
    AppHandler --> Service
    TokenHandler --> Service
    AuthHandler --> Service
    OAuth2Handler --> Service
    SAMLHandler --> Service
    
    Service --> Repository[Repository Layer<br/>Database Operations]
    Repository --> PostgreSQL[(PostgreSQL<br/>ACID Transactions)]
    
    Service --> CryptoOps[Cryptographic Operations]
    CryptoOps --> HMAC[HMAC Signature<br/>Timestamp Validation]
    CryptoOps --> BCrypt[BCrypt Hashing<br/>Cost 14]
    CryptoOps --> JWT[JWT Generation<br/>RS256 Signing]
    
    Repository --> AuditLog[Audit Logging<br/>All Operations]
    AuditLog --> AuditTable[(audit_logs table)]
    
    Service --> Response[HTTP Response]
    Response --> Metrics[Prometheus Metrics<br/>Port 9090]
    Response --> End([Response Sent])
    
    %% Error Paths
    RateResponse --> End
    ValidationError --> End
    
    %% Styling
    classDef middleware fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef auth fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef handler fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
    classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
    classDef crypto fill:#ffebee,stroke:#c62828,stroke-width:2px
    classDef error fill:#fce4ec,stroke:#ad1457,stroke-width:2px
    
    class Logger,Security,RateLimit,Validation middleware
    class Auth,HeaderAuth,JWTAuth,OAuth2Auth,SAMLAuth,AuthContext auth
    class HealthHandler,AppHandler,TokenHandler,AuthHandler,OAuth2Handler,SAMLHandler,Service handler
    class Repository,PostgreSQL,AuditTable,AuthCache data
    class CryptoOps,HMAC,BCrypt,JWT crypto
    class RateResponse,ValidationError error

Request Processing Pipeline

  1. Load Balancer: Nginx receives and routes requests
  2. Static Assets: Nginx routes static asset requests to the React SPA (port 3000)
  3. API Gateway: Go server handles API requests
  4. Middleware Chain: Request logging, security headers, rate limiting, authentication, request validation
  5. Route Handler: Business logic processing
  6. Service Layer: Transaction management and orchestration
  7. Repository Layer: Database operations with audit logging
  8. Response: JSON response with metrics collection

Authentication Flow

sequenceDiagram
    participant Client as Client App
    participant API as API Gateway
    participant Auth as Auth Service
    participant DB as PostgreSQL
    participant Provider as OAuth2/SAML
    participant Cache as Redis Cache
    
    %% Header-based Authentication
    rect rgb(240, 248, 255)
        Note over Client, Cache: Header-based Authentication Flow
        Client->>API: Request with X-User-Email header
        API->>Auth: Validate header auth
        Auth->>DB: Check user permissions
        DB-->>Auth: Return user context
        Auth->>Cache: Cache auth result (5min TTL)
        Auth-->>API: AuthContext{UserID, Permissions}
        API-->>Client: Authenticated response
    end
    
    %% JWT Authentication Flow
    rect rgb(245, 255, 245)
        Note over Client, Cache: JWT Authentication Flow
        Client->>API: Login request {app_id, permissions}
        API->>Auth: Generate JWT token
        Auth->>DB: Validate app_id and permissions
        DB-->>Auth: Application config
        Auth->>Auth: Create JWT with claims<br/>{user_id, permissions, exp, iat}
        Auth-->>API: JWT token + expires_at
        API-->>Client: LoginResponse{token, expires_at}
        
        Note over Client, API: Subsequent requests with JWT
        Client->>API: Request with Bearer JWT
        API->>Auth: Verify JWT signature
        Auth->>Auth: Check expiration & claims
        Auth->>Cache: Check revocation list
        Cache-->>Auth: Token status
        Auth-->>API: Valid AuthContext
        API-->>Client: Authorized response
    end
    
    %% OAuth2/SAML Flow
    rect rgb(255, 248, 240)
        Note over Client, Provider: OAuth2/SAML Authentication Flow
        Client->>API: POST /api/login {app_id, redirect_uri}
        API->>Auth: Generate OAuth2 state
        Auth->>DB: Store state + app context
        Auth-->>API: Redirect URL + state
        API-->>Client: {redirect_url, state}
        
        Client->>Provider: Redirect to OAuth2 provider
        Provider-->>Client: Authorization code + state
        
        Client->>API: GET /api/oauth2/callback?code=xxx&state=yyy
        API->>Auth: Validate state and exchange code
        Auth->>Provider: Exchange code for tokens
        Provider-->>Auth: Access token + ID token
        Auth->>Provider: Get user info
        Provider-->>Auth: User profile
        Auth->>DB: Create/update user session
        Auth->>Auth: Generate internal JWT
        Auth-->>API: JWT token + user context
        API-->>Client: Set-Cookie with JWT + redirect
    end
    
    %% Token Renewal Flow
    rect rgb(248, 245, 255)
        Note over Client, DB: Token Renewal Flow
        Client->>API: POST /api/renew {app_id, user_id, token}
        API->>Auth: Validate current token
        Auth->>Auth: Check token expiration<br/>and max_valid_at
        Auth->>DB: Get application config
        DB-->>Auth: TokenRenewalDuration, MaxTokenDuration
        Auth->>Auth: Generate new JWT<br/>with extended expiry
        Auth->>Cache: Invalidate old token
        Auth-->>API: New token + expires_at
        API-->>Client: RenewResponse{token, expires_at}
    end

Authentication Methods

Header-based Authentication

  • Use Case: Service-to-service authentication
  • Security: HMAC-SHA256 signatures with timestamp validation
  • Replay Protection: 5-minute timestamp window
  • Caching: 5-minute Redis cache for performance

JWT Authentication

  • Use Case: User authentication with session management
  • Security: RSA signatures with revocation checking
  • Token Lifecycle: Configurable expiration with renewal
  • Claims: User ID, permissions, application scope

OAuth2/SAML Authentication

  • Use Case: External identity provider integration
  • Security: Authorization code flow with state validation
  • Session Management: Database-backed session storage
  • Provider Support: Configurable provider endpoints

API Design

RESTful Endpoints

Authentication Endpoints

POST /api/login          - Authenticate user, issue JWT
POST /api/renew          - Renew JWT token
POST /api/logout         - Revoke JWT token
GET  /api/verify         - Verify token and permissions

Application Management

GET    /api/applications     - List applications (paginated)
POST   /api/applications     - Create application
GET    /api/applications/:id - Get application details
PUT    /api/applications/:id - Update application
DELETE /api/applications/:id - Delete application

Token Management

GET    /api/applications/:id/tokens - List application tokens
POST   /api/applications/:id/tokens - Create new token
DELETE /api/tokens/:id              - Revoke token

OAuth2/SAML Integration

POST /api/oauth2/login          - Initiate OAuth2 flow
GET  /api/oauth2/callback       - OAuth2 callback handler
POST /api/saml/login            - Initiate SAML flow
POST /api/saml/callback         - SAML assertion handler

System Endpoints

GET /health   - System health check
GET /ready    - Readiness probe
GET /metrics  - Prometheus metrics

Request/Response Patterns

Authentication Headers

X-User-Email: user@example.com
X-Auth-Timestamp: 2024-01-15T10:30:00Z
X-Auth-Signature: sha256=abc123...
Authorization: Bearer eyJhbGciOiJSUzI1NiIs...
Content-Type: application/json

Error Response Format

{
  "error": "validation_failed",
  "message": "Request validation failed",
  "details": [
    {
      "field": "permissions",
      "message": "Invalid permission format",
      "value": "invalid.perm"
    }
  ]
}

Success Response Format

{
  "data": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "token": "ABC123_xyz789...",
    "permissions": ["app.read", "token.create"],
    "created_at": "2024-01-15T10:30:00Z"
  }
}

Technology Stack

Backend Technologies

  • Language: Go 1.21+ with modules
  • Web Framework: Gin HTTP framework
  • Database: PostgreSQL 15 with connection pooling
  • Authentication: JWT-Go library with RSA signing
  • Cryptography: Go standard crypto libraries
  • Caching: Redis for session and revocation storage
  • Logging: Zap structured logging
  • Metrics: Prometheus metrics collection

Frontend Technologies

  • Framework: React 18 with TypeScript
  • UI Library: Ant Design components
  • State Management: React Context API
  • HTTP Client: Axios with interceptors
  • Routing: React Router with protected routes
  • Build Tool: Create React App with TypeScript

Infrastructure

  • Containerization: Docker with multi-stage builds
  • Orchestration: Docker Compose for local development
  • Reverse Proxy: Nginx with load balancing
  • Database Migrations: Custom Go migration system
  • Health Monitoring: Built-in health check endpoints

Security Stack

  • TLS: TLS 1.3 for all communications
  • Hashing: BCrypt with cost 14 for production
  • Signatures: HMAC-SHA256 and RSA signatures
  • Rate Limiting: Token bucket algorithm
  • CSRF: Double-submit cookie pattern
  • Headers: Comprehensive security headers

Deployment Considerations

Container Configuration

services:
  kms-api:
    image: kms-api:latest
    ports:
      - "8080:8080"
    environment:
      - DB_HOST=postgres
      - DB_PORT=5432
      - JWT_SECRET=${JWT_SECRET}
      - HMAC_KEY=${HMAC_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3

Environment Variables

# Database Configuration
DB_HOST=localhost
DB_PORT=5432
DB_NAME=kms
DB_USER=postgres
DB_PASSWORD=postgres

# Server Configuration
SERVER_HOST=0.0.0.0
SERVER_PORT=8080

# Authentication
AUTH_PROVIDER=header
JWT_SECRET=your-jwt-secret
HMAC_KEY=your-hmac-key

# Security
RATE_LIMIT_ENABLED=true
RATE_LIMIT_RPS=100
RATE_LIMIT_BURST=200

Monitoring Setup

# prometheus.yml
scrape_configs:
  - job_name: 'kms-api'
    static_configs:
      - targets: ['kms-api:9090']
    scrape_interval: 15s
    metrics_path: /metrics

This architecture documentation provides a comprehensive technical overview of the KMS system, suitable for development teams, system architects, and operations personnel who need to understand, deploy, or maintain the system.