2025-08-26 19:16:41 -04:00
parent 7ca61eb712
commit 6725529b01
113 changed files with 0 additions and 337 deletions

613
kms/docs/API.md Normal file

@ -0,0 +1,613 @@
# API Key Management Service - API Documentation
This document describes the REST API endpoints for the API Key Management Service.
## Base URL
```
http://localhost:8080
```
## Authentication
All protected endpoints require authentication via the `X-User-Email` header (when using the HeaderAuthenticationProvider).
```
X-User-Email: user@example.com
```
## Content Type
All endpoints accept and return JSON data:
```
Content-Type: application/json
```
## Error Response Format
All error responses follow this format:
```json
{
"error": "Error Type",
"message": "Detailed error message"
}
```
Common HTTP status codes:
- `400` - Bad Request (invalid input)
- `401` - Unauthorized (authentication required)
- `404` - Not Found (resource not found)
- `500` - Internal Server Error
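A client can decode this error body into a small struct. The following is a minimal sketch rather than the service's own code; the `APIError` type and the `decodeAPIError` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// APIError mirrors the documented error body ("error" and "message").
// The type name is an assumption made for this sketch.
type APIError struct {
	Type    string `json:"error"`
	Message string `json:"message"`
}

// Error lets APIError be returned as an ordinary Go error.
func (e APIError) Error() string {
	return fmt.Sprintf("%s: %s", e.Type, e.Message)
}

// decodeAPIError parses an error response body (hypothetical helper).
func decodeAPIError(body []byte) (APIError, error) {
	var apiErr APIError
	if err := json.Unmarshal(body, &apiErr); err != nil {
		return APIError{}, fmt.Errorf("decode error body: %w", err)
	}
	return apiErr, nil
}

func main() {
	body := []byte(`{"error": "Bad Request", "message": "app_id is required"}`)
	apiErr, err := decodeAPIError(body)
	if err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Println(apiErr.Error())
}
```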
## Health Check Endpoints
### Health Check
```
GET /health
```
Basic health check for load balancers.
**Response:**
```json
{
"status": "healthy",
"timestamp": "2023-01-01T00:00:00Z"
}
```
### Readiness Check
```
GET /ready
```
Comprehensive readiness check including database connectivity.
**Response:**
```json
{
"status": "ready",
"timestamp": "2023-01-01T00:00:00Z",
"checks": {
"database": "healthy"
}
}
```
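One way a readiness check like this could map a database ping result onto the documented body is sketched below. The failure values (`"not ready"`, `"unhealthy"`) and the `503` status are assumptions; the document only shows the success case.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// readiness mirrors the documented /ready response body.
type readiness struct {
	Status    string            `json:"status"`
	Timestamp string            `json:"timestamp"`
	Checks    map[string]string `json:"checks"`
}

// Sample values used by main; both are illustrative only.
var (
	errDBDown  = errors.New("connection refused")
	sampleTime = time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
)

// evaluateReadiness turns the result of a database ping into a status code
// and response body. The failure strings and 503 are assumptions.
func evaluateReadiness(dbErr error, now time.Time) (int, readiness) {
	r := readiness{
		Status:    "ready",
		Timestamp: now.UTC().Format(time.RFC3339),
		Checks:    map[string]string{"database": "healthy"},
	}
	if dbErr != nil {
		r.Status = "not ready"
		r.Checks["database"] = "unhealthy"
		return http.StatusServiceUnavailable, r
	}
	return http.StatusOK, r
}

func main() {
	code, body := evaluateReadiness(nil, sampleTime)
	out, _ := json.Marshal(body)
	fmt.Println(code, string(out))

	code, _ = evaluateReadiness(errDBDown, sampleTime)
	fmt.Println(code)
}
```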
## Authentication Endpoints
### User Login
```
POST /api/login
```
Initiates user authentication flow.
**Headers:**
- `X-User-Email: user@example.com` (required for HeaderAuthenticationProvider)
**Request Body:**
```json
{
"app_id": "com.example.app",
"permissions": ["repo.read", "repo.write"],
"redirect_uri": "https://example.com/callback"
}
```
**Response:**
```json
{
"redirect_url": "https://example.com/callback?token=user-token-abc123"
}
```
Or, if no `redirect_uri` is provided:
```json
{
"token": "user-token-abc123",
"user_id": "user@example.com",
"app_id": "com.example.app",
"expires_in": 604800
}
```
### Token Verification
```
POST /api/verify
```
Verifies a token and returns its permissions.
**Request Body:**
```json
{
"app_id": "com.example.app",
"type": "user",
"user_id": "user@example.com",
"token": "token-to-verify",
"permissions": ["repo.read", "repo.write"]
}
```
**Response:**
```json
{
"valid": true,
"user_id": "user@example.com",
"permissions": ["repo.read", "repo.write"],
"permission_results": {
"repo.read": true,
"repo.write": true
},
"expires_at": "2023-01-08T00:00:00Z",
"max_valid_at": "2023-01-31T00:00:00Z",
"token_type": "user"
}
```
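A caller will usually want to know which requested permissions were not granted. The following sketch works from the `valid` and `permission_results` fields shown above; the `verifyResponse` type and `missingPermissions` helper are hypothetical names.

```go
package main

import "fmt"

// verifyResponse holds the subset of /api/verify fields used in this sketch.
type verifyResponse struct {
	Valid             bool            `json:"valid"`
	PermissionResults map[string]bool `json:"permission_results"`
}

// missingPermissions returns the requested permissions that were not granted.
// An invalid token is treated as granting nothing.
func missingPermissions(resp verifyResponse, requested []string) []string {
	var missing []string
	for _, p := range requested {
		if !resp.Valid || !resp.PermissionResults[p] {
			missing = append(missing, p)
		}
	}
	return missing
}

func main() {
	resp := verifyResponse{
		Valid:             true,
		PermissionResults: map[string]bool{"repo.read": true, "repo.write": false},
	}
	fmt.Println(missingPermissions(resp, []string{"repo.read", "repo.write"}))
}
```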
### Token Renewal
```
POST /api/renew
```
Renews a user token with extended expiration.
**Request Body:**
```json
{
"app_id": "com.example.app",
"user_id": "user@example.com",
"token": "current-token"
}
```
**Response:**
```json
{
"token": "new-renewed-token",
"expires_at": "2023-01-15T00:00:00Z",
"max_valid_at": "2023-01-31T00:00:00Z"
}
```
## Application Management
### List Applications
```
GET /api/applications?limit=50&offset=0
```
Retrieves a paginated list of applications.
**Query Parameters:**
- `limit` (optional): Number of results to return (default: 50, max: 100)
- `offset` (optional): Number of results to skip (default: 0)
**Headers:**
- `X-User-Email: user@example.com` (required)
**Response:**
```json
{
"data": [
{
"app_id": "com.example.app",
"app_link": "https://example.com",
"type": ["static", "user"],
"callback_url": "https://example.com/callback",
"hmac_key": "hmac-key-hidden-in-responses",
"token_renewal_duration": 604800000000000,
"max_token_duration": 2592000000000000,
"owner": {
"type": "team",
"name": "Example Team",
"owner": "example-org"
},
"created_at": "2023-01-01T00:00:00Z",
"updated_at": "2023-01-01T00:00:00Z"
}
],
"limit": 50,
"offset": 0,
"count": 1
}
```
### Create Application
```
POST /api/applications
```
Creates a new application.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Request Body:**
```json
{
"app_id": "com.example.newapp",
"app_link": "https://newapp.example.com",
"type": ["static", "user"],
"callback_url": "https://newapp.example.com/callback",
"token_renewal_duration": "168h",
"max_token_duration": "720h",
"owner": {
"type": "team",
"name": "Development Team",
"owner": "example-org"
}
}
```
**Response:**
```json
{
"app_id": "com.example.newapp",
"app_link": "https://newapp.example.com",
"type": ["static", "user"],
"callback_url": "https://newapp.example.com/callback",
"hmac_key": "generated-hmac-key",
"token_renewal_duration": 604800000000000,
"max_token_duration": 2592000000000000,
"owner": {
"type": "team",
"name": "Development Team",
"owner": "example-org"
},
"created_at": "2023-01-01T00:00:00Z",
"updated_at": "2023-01-01T00:00:00Z"
}
```
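Note that the request sends durations as strings such as `"168h"`, while the response shows large integers. The numbers match what Go's `time.Duration` produces when serialized as nanoseconds, which the short check below illustrates. That the service uses `time.Duration` for these fields is an inference from the values, not something the document states.

```go
package main

import (
	"fmt"
	"time"
)

// durationNanos parses a Go duration string and returns it in nanoseconds,
// the unit time.Duration uses internally.
func durationNanos(s string) (int64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("parse duration %q: %w", s, err)
	}
	return int64(d), nil
}

func main() {
	for _, s := range []string{"168h", "720h"} {
		ns, err := durationNanos(s)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s = %d ns\n", s, ns)
	}
}
```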
### Get Application
```
GET /api/applications/{app_id}
```
Retrieves a specific application by ID.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Response:**
```json
{
"app_id": "com.example.app",
"app_link": "https://example.com",
"type": ["static", "user"],
"callback_url": "https://example.com/callback",
"hmac_key": "hmac-key-value",
"token_renewal_duration": 604800000000000,
"max_token_duration": 2592000000000000,
"owner": {
"type": "team",
"name": "Example Team",
"owner": "example-org"
},
"created_at": "2023-01-01T00:00:00Z",
"updated_at": "2023-01-01T00:00:00Z"
}
```
### Update Application
```
PUT /api/applications/{app_id}
```
Updates an existing application. Only provided fields will be updated.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Request Body:**
```json
{
"app_link": "https://updated.example.com",
"callback_url": "https://updated.example.com/callback",
"owner": {
"type": "individual",
"name": "John Doe",
"owner": "john.doe@example.com"
}
}
```
**Response:**
```json
{
"app_id": "com.example.app",
"app_link": "https://updated.example.com",
"type": ["static", "user"],
"callback_url": "https://updated.example.com/callback",
"hmac_key": "existing-hmac-key",
"token_renewal_duration": 604800000000000,
"max_token_duration": 2592000000000000,
"owner": {
"type": "individual",
"name": "John Doe",
"owner": "john.doe@example.com"
},
"created_at": "2023-01-01T00:00:00Z",
"updated_at": "2023-01-01T12:00:00Z"
}
```
### Delete Application
```
DELETE /api/applications/{app_id}
```
Deletes an application and all associated tokens.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Response:**
```
HTTP 204 No Content
```
## Static Token Management
### List Tokens for Application
```
GET /api/applications/{app_id}/tokens?limit=50&offset=0
```
Retrieves all static tokens for a specific application.
**Query Parameters:**
- `limit` (optional): Number of results to return (default: 50, max: 100)
- `offset` (optional): Number of results to skip (default: 0)
**Headers:**
- `X-User-Email: user@example.com` (required)
**Response:**
```json
{
"data": [
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"app_id": "com.example.app",
"owner": {
"type": "individual",
"name": "John Doe",
"owner": "john.doe@example.com"
},
"type": "hmac",
"created_at": "2023-01-01T00:00:00Z",
"updated_at": "2023-01-01T00:00:00Z"
}
],
"limit": 50,
"offset": 0,
"count": 1
}
```
### Create Static Token
```
POST /api/applications/{app_id}/tokens
```
Creates a new static token for an application.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Request Body:**
```json
{
"owner": {
"type": "individual",
"name": "API Client",
"owner": "api-client@example.com"
},
"permissions": ["repo.read", "repo.write", "app.read"]
}
```
**Response:**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"token": "static-token-abc123xyz789",
"permissions": ["repo.read", "repo.write", "app.read"],
"created_at": "2023-01-01T00:00:00Z"
}
```
**Note:** The `token` field is only returned once during creation for security reasons.
### Delete Static Token
```
DELETE /api/tokens/{token_id}
```
Deletes a static token and revokes all its permissions.
**Headers:**
- `X-User-Email: user@example.com` (required)
**Response:**
```
HTTP 204 No Content
```
## Permission Scopes
The following permission scopes are available:
### System Permissions
- `internal` - Full access to internal system operations (system only)
- `internal.read` - Read access to internal system data (system only)
- `internal.write` - Write access to internal system data (system only)
- `internal.admin` - Administrative access to internal system (system only)
### Application Management
- `app` - Access to application management
- `app.read` - Read application information
- `app.write` - Create and update applications
- `app.delete` - Delete applications
### Token Management
- `token` - Access to token management
- `token.read` - Read token information
- `token.create` - Create new tokens
- `token.revoke` - Revoke existing tokens
### Permission Management
- `permission` - Access to permission management
- `permission.read` - Read permission information
- `permission.write` - Create and update permissions
- `permission.grant` - Grant permissions to tokens
- `permission.revoke` - Revoke permissions from tokens
### Repository Access (Example)
- `repo` - Access to repository operations
- `repo.read` - Read repository data
- `repo.write` - Write to repositories
- `repo.admin` - Administrative access to repositories
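The dotted scope names suggest that a parent scope such as `repo` covers its children such as `repo.read`. The sketch below evaluates permissions under that assumption; the actual inheritance rules are not spelled out here, and `grants` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// grants reports whether any held scope satisfies the required scope.
// A held scope matches itself and, by assumption, every scope beneath it
// in the dotted hierarchy ("repo" covers "repo.read").
func grants(held []string, required string) bool {
	for _, h := range held {
		if h == required || strings.HasPrefix(required, h+".") {
			return true
		}
	}
	return false
}

func main() {
	held := []string{"repo", "app.read"}
	for _, required := range []string{"repo.write", "app.read", "app.write"} {
		fmt.Printf("%s: %v\n", required, grants(held, required))
	}
}
```

Matching on `h + "."` rather than a bare prefix keeps `repo` from accidentally covering an unrelated scope such as `repository.read`.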
## Rate Limiting
The API implements rate limiting with the following limits:
- **General API endpoints**: 100 requests per minute with burst of 20
- **Authentication endpoints** (`/api/login`, `/api/verify`, `/api/renew`): 10 requests per minute with burst of 5
Rate limit headers are included in responses:
- `X-RateLimit-Limit`: Request limit per window
- `X-RateLimit-Remaining`: Remaining requests in current window
- `X-RateLimit-Reset`: Unix timestamp when the window resets
When a request is rate limited, the API responds with `429 Too Many Requests` and the following body:
```json
{
"error": "Rate limit exceeded",
"message": "Too many requests. Please try again later."
}
```
## Testing Endpoints
For testing purposes, different user scenarios are available through different ports:
- **Port 80**: Regular user (`test@example.com`)
- **Port 8081**: Admin user (`admin@example.com`) with higher rate limits
- **Port 8082**: Limited user (`limited@example.com`) with lower rate limits
## Example Workflows
### Creating an Application and Static Token
1. **Create Application:**
```bash
curl -X POST http://localhost/api/applications \
-H "Content-Type: application/json" \
-H "X-User-Email: admin@example.com" \
-d '{
"app_id": "com.mycompany.api",
"app_link": "https://api.mycompany.com",
"type": ["static", "user"],
"callback_url": "https://api.mycompany.com/callback",
"token_renewal_duration": "168h",
"max_token_duration": "720h",
"owner": {
"type": "team",
"name": "API Team",
"owner": "api-team@mycompany.com"
}
}'
```
2. **Create Static Token:**
```bash
curl -X POST http://localhost/api/applications/com.mycompany.api/tokens \
-H "Content-Type: application/json" \
-H "X-User-Email: admin@example.com" \
-d '{
"owner": {
"type": "individual",
"name": "Service Account",
"owner": "service@mycompany.com"
},
"permissions": ["repo.read", "repo.write"]
}'
```
3. **Verify Token:**
```bash
curl -X POST http://localhost/api/verify \
-H "Content-Type: application/json" \
-d '{
"app_id": "com.mycompany.api",
"type": "static",
"token": "static-token-abc123xyz789",
"permissions": ["repo.read"]
}'
```
### User Authentication Flow
1. **Initiate Login:**
```bash
curl -X POST http://localhost/api/login \
-H "Content-Type: application/json" \
-H "X-User-Email: user@example.com" \
-d '{
"app_id": "com.mycompany.api",
"permissions": ["repo.read"],
"redirect_uri": "https://myapp.com/callback"
}'
```
2. **Verify User Token:**
```bash
curl -X POST http://localhost/api/verify \
-H "Content-Type: application/json" \
-d '{
"app_id": "com.mycompany.api",
"type": "user",
"user_id": "user@example.com",
"token": "user-token-from-login",
"permissions": ["repo.read"]
}'
```
3. **Renew Token Before Expiry:**
```bash
curl -X POST http://localhost/api/renew \
-H "Content-Type: application/json" \
-d '{
"app_id": "com.mycompany.api",
"user_id": "user@example.com",
"token": "current-user-token"
}'
```
## Security Considerations
- All tokens should be transmitted over HTTPS in production
- Static tokens are returned only once during creation - store them securely
- User tokens have both renewal and maximum validity periods
- HMAC keys are used for token signing and should be rotated regularly
- Rate limiting helps prevent abuse
- Permission scopes follow a hierarchical structure
- All operations are logged with user attribution
## Development and Testing
The service includes comprehensive health checks, detailed logging, and metrics collection. When running with `LOG_LEVEL=debug`, additional debugging information is available in the logs.
For local development, the service can be started with:
```bash
docker-compose up -d
```
This starts PostgreSQL, the API service, and Nginx proxy with test user headers configured.

677
kms/docs/ARCHITECTURE.md Normal file

@ -0,0 +1,677 @@
# API Key Management Service (KMS) - System Architecture
## Table of Contents
1. [System Overview](#system-overview)
2. [Architecture Principles](#architecture-principles)
3. [Component Architecture](#component-architecture)
4. [System Architecture Diagram](#system-architecture-diagram)
5. [Request Flow Pipeline](#request-flow-pipeline)
6. [Authentication Flow](#authentication-flow)
7. [API Design](#api-design)
8. [Technology Stack](#technology-stack)
---
## System Overview
The API Key Management Service (KMS) is a secure, scalable platform for managing API authentication tokens across applications. Built with a Go backend and a React TypeScript frontend, it provides centralized token lifecycle management with enterprise-grade security features.
### Key Capabilities
- **Multi-Provider Authentication**: Header, JWT, OAuth2, SAML support
- **Dual Token System**: Static HMAC tokens and renewable JWT user tokens
- **Hierarchical Permissions**: Role-based access control with inheritance
- **Enterprise Security**: Rate limiting, brute force protection, audit logging
- **High Availability**: Containerized deployment with load balancing
### Core Features
- **Token Lifecycle Management**: Create, verify, renew, and revoke tokens
- **Application Management**: Multi-tenant application configuration
- **User Session Tracking**: Comprehensive session management
- **Audit Logging**: Complete audit trail of all operations
- **Health Monitoring**: Built-in health checks and metrics
---
## Architecture Principles
### Clean Architecture
The system follows clean architecture principles with clear separation of concerns:
```
┌─────────────────┐
│ Handlers │ ← HTTP request handling
├─────────────────┤
│ Services │ ← Business logic
├─────────────────┤
│ Repositories │ ← Data access
├─────────────────┤
│ Database │ ← Data persistence
└─────────────────┘
```
### Design Principles
- **Dependency Injection**: All services receive dependencies through constructors
- **Interface Segregation**: Repository interfaces enable testing and flexibility
- **Single Responsibility**: Each component has one clear purpose
- **Fail-Safe Defaults**: Security-first configuration with safe fallbacks
- **Immutable Operations**: Database transactions and audit logging
- **Defense in Depth**: Multiple security layers throughout the stack
---
## Component Architecture
### Backend Components (`internal/`)
#### **Handlers Layer** (`internal/handlers/`)
HTTP request processors implementing REST API endpoints:
- **`application.go`**: Application CRUD operations
  - Create, read, update, delete applications
  - HMAC key management
  - Ownership validation
- **`auth.go`**: Authentication workflows
  - User login and logout
  - Token renewal and validation
  - Multi-provider authentication
- **`token.go`**: Token operations
  - Static token creation
  - Token verification
  - Token revocation
- **`health.go`**: System health checks
  - Database connectivity
  - Cache availability
  - Service status
- **`oauth2.go`, `saml.go`**: External authentication providers
  - OAuth2 authorization code flow
  - SAML assertion validation
  - Provider callback handling
#### **Services Layer** (`internal/services/`)
Business logic implementation with transaction management:
- **`auth_service.go`**: Authentication provider orchestration
  - Multi-provider authentication
  - Session management
  - User context creation
- **`token_service.go`**: Token lifecycle management
  - Static token generation and validation
  - JWT user token management
  - Permission assignment
- **`application_service.go`**: Application configuration management
  - Application CRUD operations
  - Configuration validation
  - HMAC key rotation
- **`session_service.go`**: User session tracking
  - Session creation and validation
  - Session timeout handling
  - Cross-provider session management
#### **Repository Layer** (`internal/repository/postgres/`)
Data access with ACID transaction support:
- **`application_repository.go`**: Application persistence
  - Secure dynamic query building
  - Parameterized queries
  - Ownership validation
- **`token_repository.go`**: Static token management
  - BCrypt token hashing
  - Token lookup and validation
  - Permission relationship management
- **`permission_repository.go`**: Permission catalog
  - Hierarchical permission structure
  - Permission validation
  - Bulk permission operations
- **`session_repository.go`**: User session storage
  - Session persistence
  - Expiration management
  - Provider metadata storage
#### **Authentication Providers** (`internal/auth/`)
Pluggable authentication system:
- **`header_validator.go`**: HMAC signature validation
  - Timestamp-based replay protection
  - Constant-time signature comparison
  - Email format validation
- **`jwt.go`**: JWT token management
  - Token generation with secure JTI
  - Signature validation
  - Revocation list management
- **`oauth2.go`**: OAuth2 authorization code flow
  - State management
  - Token exchange
  - Provider integration
- **`saml.go`**: SAML assertion validation
  - XML signature validation
  - Attribute extraction
  - Provider configuration
- **`permissions.go`**: Hierarchical permission evaluation
  - Role-based access control
  - Permission inheritance
  - Bulk permission evaluation
---
## System Architecture Diagram
```mermaid
graph TB
%% External Components
Client[Client Applications]
Browser[Web Browser]
AuthProvider[OAuth2/SAML Provider]
%% Load Balancer & Proxy
subgraph "Load Balancer Layer"
Nginx[Nginx Proxy<br/>:80, :8081]
end
%% Frontend Layer
subgraph "Frontend Layer"
React[React TypeScript SPA<br/>Ant Design UI<br/>:3000]
AuthContext[Authentication Context]
APIService[API Service Client]
end
%% API Gateway & Middleware
subgraph "API Layer"
API[Go API Server<br/>:8080]
subgraph "Middleware Chain"
Logger[Request Logger]
Security[Security Headers<br/>CORS, CSRF]
RateLimit[Rate Limiter<br/>100 RPS]
Auth[Authentication<br/>Header/JWT/OAuth2/SAML]
Validation[Request Validator]
end
end
%% Business Logic Layer
subgraph "Service Layer"
AuthService[Authentication Service]
TokenService[Token Service]
AppService[Application Service]
SessionService[Session Service]
PermService[Permission Service]
end
%% Data Access Layer
subgraph "Repository Layer"
AppRepo[Application Repository]
TokenRepo[Token Repository]
PermRepo[Permission Repository]
SessionRepo[Session Repository]
end
%% Infrastructure Layer
subgraph "Infrastructure"
PostgreSQL[(PostgreSQL 15<br/>:5432)]
Redis[(Redis Cache<br/>Optional)]
Metrics[Prometheus Metrics<br/>:9090]
end
%% External Security
subgraph "Security & Crypto"
HMAC[HMAC Signature<br/>Validation]
BCrypt[BCrypt Hashing<br/>Cost 14]
JWT[JWT Token<br/>Generation]
end
%% Flow Connections
Client -->|API Requests| Nginx
Browser -->|HTTPS| Nginx
AuthProvider -->|OAuth2/SAML| API
Nginx -->|Proxy| React
Nginx -->|API Proxy| API
React --> AuthContext
React --> APIService
APIService -->|REST API| API
API --> Logger
Logger --> Security
Security --> RateLimit
RateLimit --> Auth
Auth --> Validation
Validation --> AuthService
Validation --> TokenService
Validation --> AppService
Validation --> SessionService
Validation --> PermService
AuthService --> AppRepo
TokenService --> TokenRepo
AppService --> AppRepo
SessionService --> SessionRepo
PermService --> PermRepo
AppRepo --> PostgreSQL
TokenRepo --> PostgreSQL
PermRepo --> PostgreSQL
SessionRepo --> PostgreSQL
TokenService --> HMAC
TokenService --> BCrypt
TokenService --> JWT
AuthService --> Redis
API --> Metrics
%% Styling
classDef frontend fill:#e1f5fe,stroke:#0277bd,stroke-width:2px
classDef api fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef service fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
classDef security fill:#ffebee,stroke:#c62828,stroke-width:2px
class React,AuthContext,APIService frontend
class API,Logger,Security,RateLimit,Auth,Validation api
class AuthService,TokenService,AppService,SessionService,PermService service
class PostgreSQL,Redis,Metrics,AppRepo,TokenRepo,PermRepo,SessionRepo data
class HMAC,BCrypt,JWT security
```
---
## Request Flow Pipeline
```mermaid
flowchart TD
Start([HTTP Request]) --> Nginx{Nginx Proxy<br/>Load Balancer}
Nginx -->|Static Assets| Frontend[React SPA<br/>Port 3000]
Nginx -->|API Routes| API[Go API Server<br/>Port 8080]
API --> Logger[Request Logger<br/>Structured Logging]
Logger --> Security[Security Middleware<br/>Headers, CORS, CSRF]
Security --> RateLimit{Rate Limiter<br/>100 RPS, 200 Burst}
RateLimit -->|Exceeded| RateResponse[429 Too Many Requests]
RateLimit -->|Within Limits| Auth[Authentication<br/>Middleware]
Auth --> AuthHeader{Auth Provider}
AuthHeader -->|header| HeaderAuth[Header Validator<br/>X-User-Email]
AuthHeader -->|jwt| JWTAuth[JWT Validator<br/>Signature + Claims]
AuthHeader -->|oauth2| OAuth2Auth[OAuth2 Flow<br/>Authorization Code]
AuthHeader -->|saml| SAMLAuth[SAML Assertion<br/>XML Validation]
HeaderAuth --> AuthCache{Check Cache<br/>Redis 5min TTL}
JWTAuth --> JWTValidation[Signature Validation<br/>Expiry Check]
OAuth2Auth --> OAuth2Exchange[Token Exchange<br/>User Info Retrieval]
SAMLAuth --> SAMLValidation[Assertion Validation<br/>Signature Check]
AuthCache -->|Hit| AuthContext[Create AuthContext]
AuthCache -->|Miss| DBAuth[Database Lookup<br/>User Permissions]
JWTValidation --> RevocationCheck[Check Revocation List<br/>Redis Cache]
OAuth2Exchange --> SessionStore[Store User Session<br/>PostgreSQL]
SAMLValidation --> SessionStore
DBAuth --> CacheStore[Store in Cache<br/>5min TTL]
RevocationCheck --> AuthContext
SessionStore --> AuthContext
CacheStore --> AuthContext
AuthContext --> Validation[Request Validator<br/>JSON Schema]
Validation -->|Invalid| ValidationError[400 Bad Request]
Validation -->|Valid| Router{Route Handler}
Router -->|/health| HealthHandler[Health Check<br/>DB + Cache Status]
Router -->|/api/applications| AppHandler[Application CRUD<br/>HMAC Key Management]
Router -->|/api/tokens| TokenHandler[Token Operations<br/>Create, Verify, Revoke]
Router -->|/api/login| AuthHandler[Authentication<br/>Login, Renewal]
Router -->|/api/oauth2| OAuth2Handler[OAuth2 Callbacks<br/>State Management]
Router -->|/api/saml| SAMLHandler[SAML Callbacks<br/>Assertion Processing]
HealthHandler --> Service[Service Layer]
AppHandler --> Service
TokenHandler --> Service
AuthHandler --> Service
OAuth2Handler --> Service
SAMLHandler --> Service
Service --> Repository[Repository Layer<br/>Database Operations]
Repository --> PostgreSQL[(PostgreSQL<br/>ACID Transactions)]
Service --> CryptoOps[Cryptographic Operations]
CryptoOps --> HMAC[HMAC Signature<br/>Timestamp Validation]
CryptoOps --> BCrypt[BCrypt Hashing<br/>Cost 14]
CryptoOps --> JWT[JWT Generation<br/>RS256 Signing]
Repository --> AuditLog[Audit Logging<br/>All Operations]
AuditLog --> AuditTable[(audit_logs table)]
Service --> Response[HTTP Response]
Response --> Metrics[Prometheus Metrics<br/>Port 9090]
Response --> End([Response Sent])
%% Error Paths
RateResponse --> End
ValidationError --> End
%% Styling
classDef middleware fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
classDef auth fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef handler fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
classDef data fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
classDef crypto fill:#ffebee,stroke:#c62828,stroke-width:2px
classDef error fill:#fce4ec,stroke:#ad1457,stroke-width:2px
class Logger,Security,RateLimit,Validation middleware
class Auth,HeaderAuth,JWTAuth,OAuth2Auth,SAMLAuth,AuthContext auth
class HealthHandler,AppHandler,TokenHandler,AuthHandler,OAuth2Handler,SAMLHandler,Service handler
class Repository,PostgreSQL,AuditTable,AuthCache data
class CryptoOps,HMAC,BCrypt,JWT crypto
class RateResponse,ValidationError error
```
### Request Processing Pipeline
1. **Load Balancer**: Nginx receives and routes requests
2. **Static Assets**: React SPA served directly by Nginx
3. **API Gateway**: Go server handles API requests
4. **Middleware Chain**: Security, rate limiting, authentication
5. **Route Handler**: Business logic processing
6. **Service Layer**: Transaction management and orchestration
7. **Repository Layer**: Database operations with audit logging
8. **Response**: JSON response with metrics collection
---
## Authentication Flow
```mermaid
sequenceDiagram
participant Client as Client App
participant API as API Gateway
participant Auth as Auth Service
participant DB as PostgreSQL
participant Provider as OAuth2/SAML
participant Cache as Redis Cache
%% Header-based Authentication
rect rgb(240, 248, 255)
Note over Client, Cache: Header-based Authentication Flow
Client->>API: Request with X-User-Email header
API->>Auth: Validate header auth
Auth->>DB: Check user permissions
DB-->>Auth: Return user context
Auth->>Cache: Cache auth result (5min TTL)
Auth-->>API: AuthContext{UserID, Permissions}
API-->>Client: Authenticated response
end
%% JWT Authentication Flow
rect rgb(245, 255, 245)
Note over Client, Cache: JWT Authentication Flow
Client->>API: Login request {app_id, permissions}
API->>Auth: Generate JWT token
Auth->>DB: Validate app_id and permissions
DB-->>Auth: Application config
Auth->>Auth: Create JWT with claims<br/>{user_id, permissions, exp, iat}
Auth-->>API: JWT token + expires_at
API-->>Client: LoginResponse{token, expires_at}
Note over Client, API: Subsequent requests with JWT
Client->>API: Request with Bearer JWT
API->>Auth: Verify JWT signature
Auth->>Auth: Check expiration & claims
Auth->>Cache: Check revocation list
Cache-->>Auth: Token status
Auth-->>API: Valid AuthContext
API-->>Client: Authorized response
end
%% OAuth2/SAML Flow
rect rgb(255, 248, 240)
Note over Client, Provider: OAuth2/SAML Authentication Flow
Client->>API: POST /api/login {app_id, redirect_uri}
API->>Auth: Generate OAuth2 state
Auth->>DB: Store state + app context
Auth-->>API: Redirect URL + state
API-->>Client: {redirect_url, state}
Client->>Provider: Redirect to OAuth2 provider
Provider-->>Client: Authorization code + state
Client->>API: GET /api/oauth2/callback?code=xxx&state=yyy
API->>Auth: Validate state and exchange code
Auth->>Provider: Exchange code for tokens
Provider-->>Auth: Access token + ID token
Auth->>Provider: Get user info
Provider-->>Auth: User profile
Auth->>DB: Create/update user session
Auth->>Auth: Generate internal JWT
Auth-->>API: JWT token + user context
API-->>Client: Set-Cookie with JWT + redirect
end
%% Token Renewal Flow
rect rgb(248, 245, 255)
Note over Client, DB: Token Renewal Flow
Client->>API: POST /api/renew {app_id, user_id, token}
API->>Auth: Validate current token
Auth->>Auth: Check token expiration<br/>and max_valid_at
Auth->>DB: Get application config
DB-->>Auth: TokenRenewalDuration, MaxTokenDuration
Auth->>Auth: Generate new JWT<br/>with extended expiry
Auth->>Cache: Invalidate old token
Auth-->>API: New token + expires_at
API-->>Client: RenewResponse{token, expires_at}
end
```
### Authentication Methods
#### **Header-based Authentication**
- **Use Case**: Service-to-service authentication
- **Security**: HMAC-SHA256 signatures with timestamp validation
- **Replay Protection**: 5-minute timestamp window
- **Caching**: 5-minute Redis cache for performance
#### **JWT Authentication**
- **Use Case**: User authentication with session management
- **Security**: RSA signatures with revocation checking
- **Token Lifecycle**: Configurable expiration with renewal
- **Claims**: User ID, permissions, application scope
#### **OAuth2/SAML Authentication**
- **Use Case**: External identity provider integration
- **Security**: Authorization code flow with state validation
- **Session Management**: Database-backed session storage
- **Provider Support**: Configurable provider endpoints
---
## API Design
### RESTful Endpoints
#### **Authentication Endpoints**
```
POST /api/login - Authenticate user, issue JWT
POST /api/renew - Renew JWT token
POST /api/logout - Revoke JWT token
POST /api/verify - Verify token and permissions
```
#### **Application Management**
```
GET /api/applications - List applications (paginated)
POST /api/applications - Create application
GET /api/applications/:id - Get application details
PUT /api/applications/:id - Update application
DELETE /api/applications/:id - Delete application
```
#### **Token Management**
```
GET /api/applications/:id/tokens - List application tokens
POST /api/applications/:id/tokens - Create new token
DELETE /api/tokens/:id - Revoke token
```
#### **OAuth2/SAML Integration**
```
POST /api/oauth2/login - Initiate OAuth2 flow
GET /api/oauth2/callback - OAuth2 callback handler
POST /api/saml/login - Initiate SAML flow
POST /api/saml/callback - SAML assertion handler
```
#### **System Endpoints**
```
GET /health - System health check
GET /ready - Readiness probe
GET /metrics - Prometheus metrics
```
### Request/Response Patterns
#### **Authentication Headers**
```http
X-User-Email: user@example.com
X-Auth-Timestamp: 2024-01-15T10:30:00Z
X-Auth-Signature: sha256=abc123...
Authorization: Bearer eyJhbGciOiJSUzI1NiIs...
Content-Type: application/json
```
#### **Error Response Format**
```json
{
"error": "validation_failed",
"message": "Request validation failed",
"details": [
{
"field": "permissions",
"message": "Invalid permission format",
"value": "invalid.perm"
}
]
}
```
#### **Success Response Format**
```json
{
"data": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"token": "ABC123_xyz789...",
"permissions": ["app.read", "token.create"],
"created_at": "2024-01-15T10:30:00Z"
}
}
```
---
## Technology Stack
### Backend Technologies
- **Language**: Go 1.21+ with modules
- **Web Framework**: Gin HTTP framework
- **Database**: PostgreSQL 15 with connection pooling
- **Authentication**: JWT-Go library with RSA signing
- **Cryptography**: Go standard crypto libraries
- **Caching**: Redis for session and revocation storage
- **Logging**: Zap structured logging
- **Metrics**: Prometheus metrics collection
### Frontend Technologies
- **Framework**: React 18 with TypeScript
- **UI Library**: Ant Design components
- **State Management**: React Context API
- **HTTP Client**: Axios with interceptors
- **Routing**: React Router with protected routes
- **Build Tool**: Create React App with TypeScript
### Infrastructure
- **Containerization**: Docker with multi-stage builds
- **Orchestration**: Docker Compose for local development
- **Reverse Proxy**: Nginx with load balancing
- **Database Migrations**: Custom Go migration system
- **Health Monitoring**: Built-in health check endpoints
### Security Stack
- **TLS**: TLS 1.3 for all communications
- **Hashing**: BCrypt with cost 14 for production
- **Signatures**: HMAC-SHA256 and RSA signatures
- **Rate Limiting**: Token bucket algorithm
- **CSRF**: Double-submit cookie pattern
- **Headers**: Comprehensive security headers
---
## Deployment Considerations
### Container Configuration
```yaml
services:
  kms-api:
    image: kms-api:latest
    ports:
      - "8080:8080"
    environment:
      - DB_HOST=postgres
      - DB_PORT=5432
      - JWT_SECRET=${JWT_SECRET}
      - HMAC_KEY=${HMAC_KEY}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
```
### Environment Variables
```bash
# Database Configuration
DB_HOST=localhost
DB_PORT=5432
DB_NAME=kms
DB_USER=postgres
DB_PASSWORD=postgres
# Server Configuration
SERVER_HOST=0.0.0.0
SERVER_PORT=8080
# Authentication
AUTH_PROVIDER=header
JWT_SECRET=your-jwt-secret
HMAC_KEY=your-hmac-key
# Security
RATE_LIMIT_ENABLED=true
RATE_LIMIT_RPS=100
RATE_LIMIT_BURST=200
```
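As a rough illustration of how these variables might be consumed, the sketch below reads a subset of them with the defaults from the listing above. It is not the service's actual `internal/config/config.go`; the `Config` struct, `LoadConfig`, and the helper names are hypothetical, and the lookup function is injected only so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds a subset of the settings from the environment listing above.
// The struct and its field names are assumptions made for this sketch.
type Config struct {
	DBHost         string
	DBPort         int
	ServerPort     int
	AuthProvider   string
	RateLimitRPS   int
	RateLimitBurst int
}

// intOr parses raw as an integer, returning def when raw is empty.
func intOr(name, raw string, def int) (int, error) {
	if raw == "" {
		return def, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s: invalid integer %q", name, raw)
	}
	return n, nil
}

// strOr returns raw, or def when raw is empty.
func strOr(raw, def string) string {
	if raw == "" {
		return def
	}
	return raw
}

// LoadConfig reads settings through lookup (os.Getenv in normal use).
func LoadConfig(lookup func(string) string) (Config, error) {
	cfg := Config{
		DBHost:       strOr(lookup("DB_HOST"), "localhost"),
		AuthProvider: strOr(lookup("AUTH_PROVIDER"), "header"),
	}
	ints := []struct {
		name string
		dst  *int
		def  int
	}{
		{"DB_PORT", &cfg.DBPort, 5432},
		{"SERVER_PORT", &cfg.ServerPort, 8080},
		{"RATE_LIMIT_RPS", &cfg.RateLimitRPS, 100},
		{"RATE_LIMIT_BURST", &cfg.RateLimitBurst, 200},
	}
	for _, f := range ints {
		n, err := intOr(f.name, lookup(f.name), f.def)
		if err != nil {
			return Config{}, err // fail fast on malformed values
		}
		*f.dst = n
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Failing at startup on a malformed value, rather than silently falling back to a default, tends to surface deployment mistakes earlier.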
### Monitoring Setup
```yaml
prometheus:
  scrape_configs:
    - job_name: 'kms-api'
      static_configs:
        - targets: ['kms-api:9090']
      scrape_interval: 15s
      metrics_path: /metrics
```
This architecture documentation provides a comprehensive technical overview of the KMS system, suitable for development teams, system architects, and operations personnel who need to understand, deploy, or maintain the system.

1233
kms/docs/DEPLOYMENT_GUIDE.md Normal file

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,319 @@
# KMS API Service - Production Roadmap
This document outlines the complete roadmap for making the API Key Management Service fully production-ready. Use the checkboxes to track progress and refer to the implementation notes at the bottom.
## 🏗️ Core Infrastructure (COMPLETED)
### Repository Layer
- [x] Complete token repository implementation (CRUD operations)
- [x] Complete permission repository implementation (core methods)
- [x] Implement granted permission repository (authorization logic)
- [x] Add database transaction support
- [x] Implement proper error handling in repositories
### Security & Cryptography
- [x] Implement secure token generation using crypto/rand
- [x] Add bcrypt-based token hashing for storage
- [x] Implement HMAC token signing and verification
- [x] Create token format validation utilities
- [x] Add cryptographic key management
### Service Layer
- [x] Update token service with secure generation
- [x] Implement permission validation in token creation
- [x] Add application validation before token operations
- [x] Implement proper error propagation
- [x] Add comprehensive logging throughout services
### Middleware & Validation
- [x] Create comprehensive input validation middleware
- [x] Implement struct-based validation with detailed errors
- [x] Add UUID parameter validation
- [x] Create permission scope format validation
- [x] Implement request sanitization
### Error Handling
- [x] Create structured error framework with typed codes
- [x] Implement HTTP status code mapping
- [x] Add error context and chaining support
- [x] Create consistent JSON error responses
- [x] Add retry logic indicators
### Monitoring & Metrics
- [x] Implement comprehensive metrics collection
- [x] Add Prometheus-compatible metrics export
- [x] Create HTTP request monitoring middleware
- [x] Add business metrics tracking
- [x] Implement system health metrics
## 🔐 Authentication & Authorization (HIGH PRIORITY)
### JWT Implementation
- [x] Complete JWT token generation and validation
- [x] Implement token expiration and renewal logic
- [x] Add JWT claims management
- [x] Create token blacklisting mechanism
- [x] Implement refresh token rotation
- [x] Add comprehensive JWT unit tests with benchmarks
- [x] Implement cache-based token revocation system
### SSO Integration
- [x] Implement OAuth2/OIDC provider integration
- [x] Add OAuth2 authentication handlers with PKCE support
- [x] Create OAuth2 discovery document fetching
- [x] Implement authorization code exchange and token refresh
- [x] Add user info retrieval from OAuth2 providers
- [x] Create comprehensive OAuth2 unit tests with benchmarks
- [x] Add SAML authentication support
- [x] Create user session management
- [x] Implement role-based access control (RBAC)
- [x] Add multi-tenant authentication support
### Permission System Enhancement
- [x] Implement hierarchical permission inheritance
- [x] Add dynamic permission evaluation
- [x] Create permission caching mechanism
- [x] Add bulk permission operations
- [x] Implement default permission hierarchy (admin, read, write, app.*, token.*, etc.)
- [x] Create role-based permission system with inheritance
- [x] Add comprehensive permission unit tests with benchmarks
- [ ] Implement permission audit logging
## 🚀 Performance & Scalability (MEDIUM PRIORITY)
### Caching Layer
- [x] Implement basic caching layer with memory provider
- [x] Add JSON serialization/deserialization support
- [x] Create cache manager with TTL support
- [x] Add cache key management and prefixes
- [x] Implement Redis integration for caching
- [x] Add token blacklist caching for revocation
- [ ] Add permission result caching
- [ ] Create application metadata caching
- [ ] Implement token validation result caching
- [ ] Add cache invalidation strategies
### Database Optimization
- [ ] Implement database connection pool tuning
- [ ] Add query performance monitoring
- [ ] Create database migration rollback procedures
- [ ] Implement read replica support
- [ ] Add database backup and recovery procedures
### Load Balancing & Clustering
- [ ] Implement horizontal scaling support
- [ ] Add load balancer health checks
- [ ] Create session affinity handling
- [ ] Implement distributed rate limiting
- [ ] Add circuit breaker patterns
## 🔒 Security Hardening (HIGH PRIORITY)
### Advanced Security Features
- [ ] Implement API key rotation mechanisms
- [x] Add brute force protection
- [x] Create account lockout mechanisms
- [x] Implement IP whitelisting/blacklisting
- [x] Add request signing validation
- [x] Implement rate limiting middleware
- [x] Add security headers middleware
- [x] Create authentication failure tracking
### Audit & Compliance
- [x] Implement comprehensive audit logging
- [ ] Add compliance reporting features
- [ ] Create data retention policies
- [ ] Implement GDPR compliance features
- [ ] Add security event alerting
### Secrets Management
- [ ] Integrate with HashiCorp Vault or similar
- [ ] Implement automatic key rotation
- [ ] Add encrypted configuration storage
- [ ] Create secure backup procedures
- [ ] Implement key escrow mechanisms
## 🧪 Testing & Quality Assurance (MEDIUM PRIORITY)
### Unit Testing
- [x] Add comprehensive JWT authentication unit tests
- [x] Create caching layer unit tests with benchmarks
- [x] Implement authentication service unit tests
- [x] Add comprehensive permission system unit tests
- [ ] Add comprehensive unit tests for repositories
- [ ] Create service layer unit tests
- [ ] Implement middleware unit tests
- [ ] Add crypto utility unit tests
- [ ] Create error handling unit tests
### Integration Testing
- [ ] Expand integration test coverage
- [ ] Add database integration tests
- [ ] Create API endpoint integration tests
- [ ] Implement authentication flow tests
- [ ] Add permission validation tests
### Performance Testing
- [ ] Implement load testing scenarios
- [ ] Add stress testing for concurrent operations
- [ ] Create database performance benchmarks
- [ ] Implement memory leak detection
- [ ] Add latency and throughput testing
### Security Testing
- [ ] Implement penetration testing scenarios
- [ ] Add vulnerability scanning automation
- [ ] Create security regression tests
- [ ] Implement fuzzing tests
- [ ] Add compliance validation tests
## 📦 Deployment & Operations (MEDIUM PRIORITY)
### Containerization & Orchestration
- [ ] Create optimized Docker images
- [ ] Implement Kubernetes manifests
- [ ] Add Helm charts for deployment
- [ ] Create deployment automation scripts
- [ ] Implement blue-green deployment strategy
### Infrastructure as Code
- [ ] Create Terraform configurations
- [ ] Implement AWS/GCP/Azure resource definitions
- [ ] Add infrastructure testing
- [ ] Create disaster recovery procedures
- [ ] Implement infrastructure monitoring
### CI/CD Pipeline
- [ ] Implement automated testing pipeline
- [ ] Add security scanning in CI/CD
- [ ] Create automated deployment pipeline
- [ ] Implement rollback mechanisms
- [ ] Add deployment notifications
## 📊 Observability & Monitoring (LOW PRIORITY)
### Advanced Monitoring
- [ ] Implement distributed tracing
- [ ] Add application performance monitoring (APM)
- [ ] Create custom dashboards
- [ ] Implement alerting rules
- [ ] Add log aggregation and analysis
### Business Intelligence
- [ ] Create usage analytics
- [ ] Implement cost tracking
- [ ] Add capacity planning metrics
- [ ] Create business KPI dashboards
- [ ] Implement trend analysis
## 🔧 Maintenance & Operations (ONGOING)
### Documentation
- [ ] Create comprehensive API documentation
- [ ] Add deployment guides
- [ ] Create troubleshooting runbooks
- [ ] Implement architecture decision records (ADRs)
- [ ] Add security best practices guide
### Maintenance Procedures
- [ ] Create backup and restore procedures
- [ ] Implement log rotation and archival
- [ ] Add database maintenance scripts
- [ ] Create performance tuning guides
- [ ] Implement capacity planning procedures
---
## 📝 Implementation Notes for Future Development
### Code Organization Principles
1. **Maintain Clean Architecture**: Keep clear separation between domain, service, and infrastructure layers
2. **Interface-First Design**: Always define interfaces before implementations for better testability
3. **Error Handling**: Use the established error framework (`internal/errors`) for consistent error handling
4. **Logging**: Use structured logging with zap throughout the application
5. **Configuration**: Add new config options to `internal/config/config.go` with proper validation
### Security Guidelines
1. **Input Validation**: Always validate inputs using the validation middleware (`internal/middleware/validation.go`)
2. **Token Security**: Use the crypto utilities (`internal/crypto/token.go`) for all token operations
3. **Permission Checks**: Always validate permissions using the repository layer before operations
4. **Audit Logging**: Log all security-relevant operations with user context
5. **Secrets**: Never hardcode secrets; use environment variables or secret management systems
### Database Guidelines
1. **Migrations**: Always create both up and down migrations for schema changes
2. **Transactions**: Use database transactions for multi-step operations
3. **Indexing**: Add appropriate indexes for query performance
4. **Connection Management**: Use the existing connection pool configuration
5. **Error Handling**: Wrap database errors with the application error framework
### Testing Guidelines
1. **Test Structure**: Follow the existing test structure in `test/` directory
2. **Mock Dependencies**: Use interfaces for easy mocking in tests
3. **Test Data**: Use the test helpers for consistent test data creation
4. **Integration Tests**: Test against real database instances when possible
5. **Coverage**: Aim for >80% test coverage for critical paths
### Performance Guidelines
1. **Metrics**: Use the metrics system (`internal/metrics`) to track performance
2. **Caching**: Implement caching at the service layer, not repository layer
3. **Database Queries**: Optimize queries and use appropriate indexes
4. **Memory Management**: Be mindful of memory allocations in hot paths
5. **Concurrency**: Use proper synchronization for shared resources
### Deployment Guidelines
1. **Environment Variables**: Use environment-based configuration for all deployments
2. **Health Checks**: Ensure health endpoints are properly configured
3. **Graceful Shutdown**: Implement proper shutdown procedures for all services
4. **Resource Limits**: Set appropriate CPU and memory limits
5. **Monitoring**: Ensure metrics and logging are properly configured
### Code Quality Standards
1. **Go Standards**: Follow standard Go conventions and best practices
2. **Documentation**: Document all public APIs and complex business logic
3. **Error Messages**: Provide clear, actionable error messages
4. **Code Reviews**: Require code reviews for all changes
5. **Static Analysis**: Use tools like golangci-lint for code quality
### Security Best Practices
1. **Principle of Least Privilege**: Grant minimum necessary permissions
2. **Defense in Depth**: Implement multiple layers of security
3. **Regular Updates**: Keep dependencies updated for security patches
4. **Secure Defaults**: Use secure configurations by default
5. **Security Testing**: Include security testing in the development process
### Operational Considerations
1. **Monitoring**: Implement comprehensive monitoring and alerting
2. **Backup Strategy**: Ensure regular backups and test restore procedures
3. **Disaster Recovery**: Have documented disaster recovery procedures
4. **Capacity Planning**: Monitor resource usage and plan for growth
5. **Documentation**: Keep operational documentation up to date
---
## 🎯 Priority Matrix
### Immediate (Next Sprint)
- Complete JWT implementation
- Add comprehensive unit tests
- Implement caching layer basics
### Short Term (1-2 Months)
- SSO integration
- Security hardening features
- Performance optimization
### Medium Term (3-6 Months)
- Advanced monitoring and observability
- Deployment automation
- Compliance features
### Long Term (6+ Months)
- Advanced analytics
- Multi-region deployment
- Advanced security features
---
*Last Updated: [Current Date]*
*Version: 1.0*

View File

@ -0,0 +1,828 @@
# KMS Security Architecture
## Table of Contents
1. [Security Overview](#security-overview)
2. [Multi-Layered Security Model](#multi-layered-security-model)
3. [Authentication Security](#authentication-security)
4. [Token Security Models](#token-security-models)
5. [Authorization Framework](#authorization-framework)
6. [Data Protection](#data-protection)
7. [Security Middleware Chain](#security-middleware-chain)
8. [Threat Model](#threat-model)
9. [Security Controls](#security-controls)
10. [Compliance Framework](#compliance-framework)
---
## Security Overview
The KMS implements a comprehensive security architecture based on defense-in-depth principles. Every layer of the system includes security controls designed to protect against both external threats and insider risks.
### Security Objectives
- **Confidentiality**: Protect sensitive data and credentials
- **Integrity**: Ensure data accuracy and prevent tampering
- **Availability**: Maintain service availability under attack
- **Accountability**: Complete audit trail of all operations
- **Non-repudiation**: Cryptographic proof of operations
### Security Principles
- **Zero Trust**: Verify every request regardless of source
- **Fail Secure**: Safe defaults when systems fail
- **Defense in Depth**: Multiple security layers
- **Least Privilege**: Minimal permission grants
- **Security by Design**: Security integrated into architecture
---
## Multi-Layered Security Model
```mermaid
graph TB
subgraph "External Threats"
DDoS[DDoS Attacks]
XSS[XSS Attacks]
CSRF[CSRF Attacks]
Injection[SQL Injection]
AuthBypass[Auth Bypass]
TokenReplay[Token Replay]
BruteForce[Brute Force]
end
subgraph "Security Perimeter"
subgraph "Load Balancer Security"
NginxSec[Nginx Security<br/>- Rate limiting<br/>- SSL termination<br/>- Request filtering]
end
subgraph "Application Security Layer"
SecurityMW[Security Middleware]
subgraph "Security Headers"
HSTS[HSTS Header<br/>max-age=31536000]
ContentType[X-Content-Type-Options<br/>nosniff]
FrameOptions[X-Frame-Options<br/>DENY]
XSSProtection[X-XSS-Protection<br/>1; mode=block]
CSP[Content-Security-Policy<br/>Restrictive policy]
end
subgraph "CORS Protection"
Origins[Allowed Origins<br/>Whitelist validation]
Methods[Allowed Methods<br/>GET, POST, PUT, DELETE]
Headers[Allowed Headers<br/>Authorization, Content-Type]
Credentials[Credentials Policy<br/>Same-origin only]
end
subgraph "CSRF Protection"
CSRFToken[CSRF Token<br/>Double-submit cookie]
CSRFHeader[X-CSRF-Token<br/>Header validation]
SameSite[SameSite Cookie<br/>Strict policy]
end
end
subgraph "Rate Limiting Defense"
GlobalLimit[Global Rate Limit<br/>100 RPS per IP]
EndpointLimit[Endpoint-specific<br/>Limits per route]
BurstControl[Burst Control<br/>200 request burst]
Backoff[Exponential Backoff<br/>Failed attempts]
end
subgraph "Authentication Security"
MultiAuth[Multi-Provider Auth<br/>Header, JWT, OAuth2, SAML]
subgraph "Token Security"
JWTSigning[JWT Signing<br/>RS256 Algorithm]
TokenExpiry[Token Expiration<br/>Configurable TTL]
TokenRevocation[Token Revocation<br/>Blacklist in Redis]
TokenRotation[Token Rotation<br/>Refresh mechanism]
end
subgraph "Static Token Security"
HMACValidation[HMAC Validation<br/>SHA-256 signature]
TimestampCheck[Timestamp Validation<br/>5-minute window]
ReplayProtection[Replay Protection<br/>Nonce tracking]
KeyRotation[Key Rotation<br/>Configurable schedule]
end
subgraph "Session Security"
SessionEncryption[Session Encryption<br/>AES-256-GCM]
SessionTimeout[Session Timeout<br/>Idle timeout]
SessionFixation[Session Fixation<br/>Protection]
end
end
subgraph "Authorization Security"
RBAC[Role-Based Access<br/>Hierarchical permissions]
PermissionCheck[Permission Validation<br/>Request-level checks]
ScopeValidation[Scope Validation<br/>Token scope matching]
PrincipleOfLeastPrivilege[Least Privilege<br/>Minimal permissions]
end
subgraph "Data Protection"
Encryption[Data Encryption]
subgraph "Encryption at Rest"
DBEncryption[Database Encryption<br/>AES-256 column encryption]
KeyStorage[Key Management<br/>Environment variables]
SaltedHashing[Salted Hashing<br/>BCrypt cost 14]
end
subgraph "Encryption in Transit"
TLS13[TLS 1.3<br/>All communications]
CertPinning[Certificate Pinning<br/>OAuth2/SAML providers]
MTLS[Mutual TLS<br/>Service-to-service]
end
end
subgraph "Input Validation"
JSONValidation[JSON Schema<br/>Request validation]
SQLProtection[SQL Injection<br/>Parameterized queries]
XSSProtectionValidation[XSS Protection<br/>Input sanitization]
PathTraversal[Path Traversal<br/>Prevention]
end
subgraph "Monitoring & Detection"
SecurityAudit[Security Audit Log<br/>All operations logged]
FailureDetection[Failure Detection<br/>Failed auth attempts]
AnomalyDetection[Anomaly Detection<br/>Unusual patterns]
AlertSystem[Alert System<br/>Security incidents]
end
end
subgraph "Database Security"
DBSecurity[PostgreSQL Security<br/>- Connection encryption<br/>- User isolation<br/>- Query logging<br/>- Backup encryption]
end
subgraph "Infrastructure Security"
ContainerSecurity[Container Security<br/>- Non-root user<br/>- Minimal images<br/>- Security scanning]
NetworkSecurity[Network Security<br/>- Internal networks<br/>- Firewall rules<br/>- VPN access]
end
%% Threat Mitigation Connections
DDoS -.->|Mitigated by| NginxSec
XSS -.->|Prevented by| SecurityMW
CSRF -.->|Blocked by| CSRFToken
Injection -.->|Stopped by| SQLProtection
AuthBypass -.->|Blocked by| MultiAuth
TokenReplay -.->|Prevented by| TimestampCheck
BruteForce -.->|Limited by| GlobalLimit
%% Security Layer Flow
NginxSec --> SecurityMW
SecurityMW --> GlobalLimit
GlobalLimit --> MultiAuth
MultiAuth --> RBAC
RBAC --> Encryption
Encryption --> JSONValidation
JSONValidation --> SecurityAudit
SecurityAudit --> DBSecurity
SecurityAudit --> ContainerSecurity
%% Styling
classDef threat fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px
classDef security fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
classDef auth fill:#e1bee7,stroke:#7b1fa2,stroke-width:2px
classDef data fill:#fff9c4,stroke:#f57f17,stroke-width:2px
classDef infra fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
class DDoS,XSS,CSRF,Injection,AuthBypass,TokenReplay,BruteForce threat
class SecurityMW,HSTS,ContentType,FrameOptions,XSSProtection,CSP,Origins,Methods,Headers,Credentials,CSRFToken,CSRFHeader,SameSite security
class MultiAuth,JWTSigning,TokenExpiry,TokenRevocation,TokenRotation,HMACValidation,TimestampCheck,ReplayProtection,KeyRotation,SessionEncryption,SessionTimeout,SessionFixation,RBAC,PermissionCheck,ScopeValidation,PrincipleOfLeastPrivilege auth
class Encryption,DBEncryption,KeyStorage,SaltedHashing,TLS13,CertPinning,MTLS,JSONValidation,SQLProtection,XSSProtectionValidation,PathTraversal,SecurityAudit,FailureDetection,AnomalyDetection,AlertSystem,DBSecurity data
class ContainerSecurity,NetworkSecurity,NginxSec,GlobalLimit,EndpointLimit,BurstControl,Backoff infra
```
### Security Layer Breakdown
1. **Perimeter Security**: Load balancer, rate limiting, DDoS protection
2. **Application Security**: Security headers, CORS, CSRF protection
3. **Authentication Security**: Multi-provider authentication with strong cryptography
4. **Authorization Security**: RBAC with hierarchical permissions
5. **Data Protection**: Encryption at rest and in transit
6. **Input Validation**: Comprehensive input sanitization and validation
7. **Monitoring**: Security event detection and alerting
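The security headers in the application layer can be sketched as a small `net/http` middleware. This is an illustration under assumptions, not the service's own middleware: the header values come from the diagram above (written with their standard HTTP header names), while the `Content-Security-Policy` value is a placeholder because the document only says "restrictive policy". `securityHeaders` and `headerFor` are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// securityHeaders sets the response headers listed in the security model
// before passing the request on to the next handler.
func securityHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		h.Set("Strict-Transport-Security", "max-age=31536000")
		h.Set("X-Content-Type-Options", "nosniff")
		h.Set("X-Frame-Options", "DENY")
		h.Set("X-XSS-Protection", "1; mode=block")
		// Assumed policy value; the real policy is not given in this document.
		h.Set("Content-Security-Policy", "default-src 'self'")
		next.ServeHTTP(w, r)
	})
}

// headerFor sends one request through the middleware and returns the named
// response header, so the behaviour can be checked without a running server.
func headerFor(name string) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	securityHeaders(http.NotFoundHandler()).ServeHTTP(rec, req)
	return rec.Result().Header.Get(name)
}

func main() {
	fmt.Println(headerFor("X-Frame-Options"))
	fmt.Println(headerFor("Strict-Transport-Security"))
}
```

Setting the headers before calling the next handler matters: once a handler has started writing the response, later header changes are not sent.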
---
## Authentication Security
### Multi-Provider Authentication Framework
The KMS supports four authentication providers, each with specific security controls:
#### **Header-based Authentication**
```go
// File: internal/auth/header_validator.go:42
func (hv *HeaderValidator) ValidateAuthenticationHeaders(r *http.Request) (*ValidatedUserContext, error)
```
**Security Features:**
- **HMAC-SHA256 Signatures**: Request integrity validation
- **Timestamp Validation**: 5-minute window prevents replay attacks
- **Constant-time Comparison**: Prevents timing attacks
- **Email Format Validation**: Prevents injection attacks
**Implementation Details:**
```go
// HMAC signature validation with constant-time comparison.
// Excerpt: signingKey and signingString are set up in code not shown here.
func (hv *HeaderValidator) validateSignature(userEmail, timestamp, signature string) bool {
	mac := hmac.New(sha256.New, []byte(signingKey))
	mac.Write([]byte(signingString))
	expectedSignature := hex.EncodeToString(mac.Sum(nil))
	// hmac.Equal compares in constant time, so the check does not leak how many bytes matched
	return hmac.Equal([]byte(signature), []byte(expectedSignature))
}
```
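To make the interaction between the signature and the 5-minute window concrete, here is a self-contained sketch of both sides of the exchange. The signing-string layout (`email:timestamp`), the Unix-seconds timestamp format, and the function names `sign` and `verify` are assumptions for illustration; the actual format used by `header_validator.go` is not shown in this document.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// maxSkewSeconds is the 5-minute acceptance window described above.
const maxSkewSeconds = 5 * 60

// sign returns hex(HMAC-SHA256(key, email + ":" + timestamp)).
// The "email:timestamp" layout is an assumption made for this sketch.
func sign(key []byte, email, timestamp string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(email + ":" + timestamp))
	return hex.EncodeToString(mac.Sum(nil))
}

// verify rejects timestamps outside the window, then compares the
// signature in constant time. nowUnix is the verifier's current time.
func verify(key []byte, email, timestamp, signature string, nowUnix int64) bool {
	sent, err := strconv.ParseInt(timestamp, 10, 64)
	if err != nil {
		return false
	}
	age := nowUnix - sent
	if age < -maxSkewSeconds || age > maxSkewSeconds {
		return false
	}
	expected := sign(key, email, timestamp)
	return hmac.Equal([]byte(signature), []byte(expected))
}

func main() {
	key := []byte("demo-key-not-for-production")
	const now = int64(1700000000)
	ts := strconv.FormatInt(now, 10)
	sig := sign(key, "user@example.com", ts)

	fmt.Println(verify(key, "user@example.com", ts, sig, now))      // true
	fmt.Println(verify(key, "user@example.com", ts, sig, now+600))  // false: outside the window
	fmt.Println(verify(key, "other@example.com", ts, sig, now))     // false: signed data changed
}
```

A timestamp window alone still lets a captured request be replayed within those five minutes; the nonce tracking mentioned elsewhere in this document is what closes that gap.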
#### **JWT Authentication**
```go
// File: internal/auth/jwt.go:47
func (j *JWTManager) GenerateToken(userToken *domain.UserToken) (string, error)
```
**Security Features:**
- **RSA Signatures (RS256)**: Tokens are signed with RSA using SHA-256
- **Token Revocation**: Redis-backed blacklist
- **Secure JTI Generation**: Cryptographically secure token IDs
- **Claims Validation**: Complete token verification
**Token Structure:**
```json
{
"iss": "kms-api-service",
"sub": "user@example.com",
"aud": ["app-id"],
"exp": 1642781234,
"iat": 1642694834,
"nbf": 1642694834,
"jti": "secure-random-id",
"user_id": "user@example.com",
"app_id": "app-id",
"permissions": ["app.read", "token.create"],
"token_type": "user",
"max_valid_at": 1643299634
}
```
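The time-related claims in this structure can be checked with a few comparisons once the signature has been verified. The sketch below is illustrative only: the `Claims` struct, `parseClaims`, and `checkTimes` are hypothetical names, signature verification is deliberately out of scope, and `max_valid_at` is decoded but not enforced here because its exact semantics are not defined in this section.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Claims mirrors a subset of the token structure shown above.
type Claims struct {
	Subject     string   `json:"sub"`
	ExpiresAt   int64    `json:"exp"`
	NotBefore   int64    `json:"nbf"`
	MaxValidAt  int64    `json:"max_valid_at"` // decoded only; not enforced in this sketch
	Permissions []string `json:"permissions"`
}

var (
	ErrNotYetValid = errors.New("token not yet valid")
	ErrExpired     = errors.New("token expired")
)

// parseClaims decodes a JWT payload that has ALREADY passed signature
// verification. It must not be used on untrusted, unverified input.
func parseClaims(payload []byte) (Claims, error) {
	var c Claims
	err := json.Unmarshal(payload, &c)
	return c, err
}

// checkTimes applies the nbf and exp rules at Unix time now:
// the token is valid when nbf <= now < exp.
func checkTimes(c Claims, now int64) error {
	if now < c.NotBefore {
		return ErrNotYetValid
	}
	if now >= c.ExpiresAt {
		return ErrExpired
	}
	return nil
}

func main() {
	payload := []byte(`{"sub":"user@example.com","exp":1642781234,"nbf":1642694834,
		"max_valid_at":1643299634,"permissions":["app.read","token.create"]}`)
	c, err := parseClaims(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(checkTimes(c, 1642700000)) // <nil>
	fmt.Println(checkTimes(c, 1642781234)) // token expired
}
```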
#### **OAuth2 Authentication**
```go
// File: internal/auth/oauth2.go
```
**Security Features:**
- **Authorization Code Flow**: Secure OAuth2 flow implementation
- **State Parameter**: CSRF protection for OAuth2
- **Token Exchange**: Secure credential handling
- **Provider Validation**: Certificate pinning support
#### **SAML Authentication**
```go
// File: internal/auth/saml.go
```
**Security Features:**
- **XML Signature Validation**: Assertion integrity verification
- **Certificate Validation**: Provider certificate verification
- **Attribute Extraction**: Secure claim processing
- **Replay Protection**: Timestamp and nonce validation
---
## Token Security Models
### Static Token Security (HMAC-based)
```mermaid
stateDiagram-v2
[*] --> TokenRequest: Client requests token
state TokenRequest {
[*] --> ValidateApp: Validate app_id
ValidateApp --> ValidatePerms: Check permissions
ValidatePerms --> GenerateToken: Create token
GenerateToken --> [*]
}
TokenRequest --> StaticToken: type=static
state StaticToken {
[*] --> GenerateHMAC: Generate HMAC key
GenerateHMAC --> HashKey: BCrypt hash (cost 14)
HashKey --> StoreDB: Store in static_tokens table
StoreDB --> GrantPerms: Assign permissions
GrantPerms --> Active: Token ready
Active --> Verify: Incoming request
Verify --> ValidateHMAC: Check HMAC signature
ValidateHMAC --> CheckTimestamp: Replay protection
CheckTimestamp --> CheckPerms: Validate permissions
CheckPerms --> Authorized: Permission granted
CheckPerms --> Denied: Permission denied
Active --> Revoke: Admin action
Revoke --> Revoked: Update granted_permissions.revoked=true
}
Authorized --> TokenResponse: Success response
Denied --> ErrorResponse: Error response
Revoked --> [*]: Token lifecycle end
note right of StaticToken
Static tokens use HMAC signatures
- Timestamp-based replay protection
- BCrypt hashed storage
- Permission-based access control
- No expiration (until revoked)
end note
```
#### **Token Format**
```
Format: {PREFIX}_{BASE64_DATA}
Example: ABC_xyz123base64encodeddata...
Validation: HMAC-SHA256(request_data, hmac_key)
Storage: BCrypt hash (cost 14)
Lifetime: No expiration (until revoked)
```
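A token in this `{PREFIX}_{BASE64_DATA}` shape could be produced roughly as follows. This is a sketch, not the code in `internal/crypto/token.go`: the 32-byte payload size, the unpadded URL-safe base64 alphabet, and the names `newStaticToken` and `splitStaticToken` are all assumptions.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"strings"
)

// newStaticToken returns "{prefix}_{base64url(random bytes)}" using the
// operating system's cryptographically secure random source.
func newStaticToken(prefix string, nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return prefix + "_" + base64.RawURLEncoding.EncodeToString(buf), nil
}

// splitStaticToken separates the prefix from the encoded payload at the
// FIRST underscore. The payload itself may contain '_' because that
// character is part of the URL-safe base64 alphabet, so this assumes the
// prefix never contains an underscore.
func splitStaticToken(token string) (prefix, data string, ok bool) {
	prefix, data, ok = strings.Cut(token, "_")
	return prefix, data, ok && prefix != "" && data != ""
}

func main() {
	tok, err := newStaticToken("ABC", 32)
	if err != nil {
		panic(err)
	}
	prefix, data, ok := splitStaticToken(tok)
	fmt.Println(prefix, len(data), ok)
}
```

One practical constraint when the whole token is stored as a bcrypt hash: bcrypt only considers the first 72 bytes of its input, and recent versions of `golang.org/x/crypto/bcrypt` reject longer inputs, so the prefix plus encoded payload should stay under that length. The 47-character tokens produced here are well within it.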
#### **Security Implementation**
```go
// File: internal/crypto/token.go:95
func (tg *TokenGenerator) HashToken(token string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost)
	if err != nil {
		return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err)
	}
	return string(hash), nil
}
```
### User Token Security (JWT-based)
```mermaid
stateDiagram-v2
[*] --> UserTokenRequest: User requests token
state UserTokenRequest {
[*] --> AuthUser: Authenticate user
AuthUser --> GenerateJWT: Create JWT with claims
GenerateJWT --> SetExpiry: Set expires_at, max_valid_at
SetExpiry --> StoreSession: Store user session
StoreSession --> Active: JWT issued
}
UserTokenRequest --> UserToken: type=user
state UserToken {
Active --> Verify: Incoming request
Verify --> ValidateJWT: Check JWT signature
ValidateJWT --> CheckExpiry: Verify expiration
CheckExpiry --> CheckRevoked: Check revocation list
CheckRevoked --> Authorized: Valid token
CheckRevoked --> Denied: Token invalid
CheckExpiry --> Expired: Token expired
Active --> Renew: Renewal request
Renew --> CheckRenewal: Within renewal window?
CheckRenewal --> GenerateNew: Issue new JWT
CheckRenewal --> Denied: Renewal denied
GenerateNew --> Active: New token active
Active --> Logout: User logout
Logout --> Revoked: Add to revocation list
Expired --> Renew: Attempt renewal
}
Authorized --> TokenResponse: Success response
Denied --> ErrorResponse: Error response
Revoked --> [*]: Token lifecycle end
note right of UserToken
User tokens are JWT-based
- Configurable expiration
- Renewable within limits
- Session tracking
- Hierarchical permissions
end note
```
#### **Token Format**
```
Format: {CUSTOM_PREFIX}UT-{JWT_TOKEN}
Example: ABC123UT-eyJhbGciOiJSUzI1NiIs...
Validation: RSA signature + claims validation
Storage: Revocation list in Redis
Lifetime: Configurable with renewal window
```
#### **Security Features**
- **RS256 Signatures**: RSA with SHA-256 signature validation
- **Token Revocation**: Redis blacklist with TTL
- **Renewal Controls**: Limited renewal window
- **Session Tracking**: Database session management
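The renewal control can be pictured as a cap on how far a token's expiry may be pushed. The sketch below assumes that `max_valid_at` is an absolute upper bound that renewals cannot exceed; that reading, the `renew` function, and the one-day TTL are assumptions for illustration, with the timestamps taken from the sample token structure earlier in this document.

```go
package main

import "fmt"

// renew returns the new expiry for a token renewed at Unix time now.
// Renewal is refused once now has reached maxValidAt, and the new expiry
// never extends past maxValidAt.
func renew(now, ttl, maxValidAt int64) (newExpiry int64, ok bool) {
	if now >= maxValidAt {
		return 0, false
	}
	newExpiry = now + ttl
	if newExpiry > maxValidAt {
		newExpiry = maxValidAt // clamp to the absolute lifetime
	}
	return newExpiry, true
}

func main() {
	const day = int64(86400)
	issued := int64(1642694834)
	maxValid := issued + 7*day // 1643299634, as in the sample token

	exp, ok := renew(issued+6*day+day/2, day, maxValid)
	fmt.Println(exp, ok) // 1643299634 true: clamped to max_valid_at

	exp, ok = renew(issued+8*day, day, maxValid)
	fmt.Println(exp, ok) // 0 false: past the absolute lifetime
}
```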
---
## Authorization Framework
### Role-Based Access Control (RBAC)
The KMS implements a hierarchical permission system with role-based access control:
#### **Permission Hierarchy**
```
admin
├── app.admin
│ ├── app.read
│ ├── app.write
│ │ ├── app.create
│ │ ├── app.update
│ │ └── app.delete
├── token.admin
│ ├── token.read
│ │ └── token.verify
│ ├── token.write
│ │ ├── token.create
│ │ └── token.revoke
├── permission.admin
│ ├── permission.read
│ ├── permission.write
│ │ ├── permission.grant
│ │ └── permission.revoke
└── user.admin
├── user.read
└── user.write
```
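One simple way to evaluate a hierarchy like this is to store each permission's parent and walk upward from the requested permission. The sketch below encodes the tree above as a child-to-parent map; the `parent`, `implies`, and `hasPermission` names are hypothetical, and the service's real `PermissionHierarchy` may be organised differently.

```go
package main

import "fmt"

// parent maps each permission to the permission directly above it in the
// tree shown above. "admin" is the root and has no entry.
var parent = map[string]string{
	"app.admin": "admin", "token.admin": "admin",
	"permission.admin": "admin", "user.admin": "admin",

	"app.read": "app.admin", "app.write": "app.admin",
	"app.create": "app.write", "app.update": "app.write", "app.delete": "app.write",

	"token.read": "token.admin", "token.write": "token.admin",
	"token.verify": "token.read",
	"token.create": "token.write", "token.revoke": "token.write",

	"permission.read": "permission.admin", "permission.write": "permission.admin",
	"permission.grant": "permission.write", "permission.revoke": "permission.write",

	"user.read": "user.admin", "user.write": "user.admin",
}

// implies reports whether holding `held` grants `wanted`, i.e. whether
// held is wanted itself or one of its ancestors in the tree.
func implies(held, wanted string) bool {
	for p := wanted; p != ""; p = parent[p] {
		if p == held {
			return true
		}
	}
	return false
}

// hasPermission checks wanted against every permission a caller holds.
func hasPermission(granted []string, wanted string) bool {
	for _, g := range granted {
		if implies(g, wanted) {
			return true
		}
	}
	return false
}

func main() {
	appAdmin := []string{"app.admin", "token.admin", "user.read"}
	fmt.Println(hasPermission(appAdmin, "token.revoke")) // true: inherited from token.admin
	fmt.Println(hasPermission(appAdmin, "user.write"))   // false: only user.read is held
}
```

Walking from the requested permission up to the root keeps each check proportional to the depth of the tree rather than its size.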
#### **Role Definitions**
```go
// File: internal/auth/permissions.go:151
func (h *PermissionHierarchy) initializeDefaultRoles() {
	defaultRoles := []*Role{
		{
			Name:        "super_admin",
			Description: "Super administrator with full access",
			Permissions: []string{"admin"},
			Metadata:    map[string]string{"level": "system"},
		},
		{
			Name:        "app_admin",
			Description: "Application administrator",
			Permissions: []string{"app.admin", "token.admin", "user.read"},
			Metadata:    map[string]string{"level": "application"},
		},
		// Additional roles...
	}
}
```
#### **Permission Evaluation**
```go
// File: internal/auth/permissions.go:201
func (pm *PermissionManager) HasPermission(ctx context.Context, userID, appID, permission string) (*PermissionEvaluation, error) {
	// Cache lookup first (the read itself is elided in this excerpt)
	cacheKey := cache.CacheKey(cache.KeyPrefixPermission, fmt.Sprintf("%s:%s:%s", userID, appID, permission))

	// On a cache miss, evaluate the permission against the hierarchy
	evaluation := pm.evaluatePermission(ctx, userID, appID, permission)

	// Cache result for 5 minutes
	pm.cacheManager.SetJSON(ctx, cacheKey, evaluation, 5*time.Minute)
	return evaluation, nil
}
```
### Authorization Controls
#### **Resource Ownership**
```go
// File: internal/authorization/rbac.go:100
func (a *AuthorizationService) AuthorizeApplicationOwnership(userID string, app *domain.Application) error {
	// System admins can access any application
	if a.isSystemAdmin(userID) {
		return nil
	}
	// Check if user is the owner
	if a.isOwner(userID, &app.Owner) {
		return nil
	}
	return errors.NewForbiddenError("You do not have permission to access this application")
}
```
#### **Token Scoping**
```go
// File: internal/services/token_service.go:400
// Excerpt: the token lookup that sets matchedToken, error handling, and the
// construction of the response are not shown here.
func (s *tokenService) verifyStaticToken(ctx context.Context, req *domain.VerifyRequest, app *domain.Application) (*domain.VerifyResponse, error) {
	// Get granted permissions for this token
	permissions, err := s.grantRepo.GetGrantedPermissionScopes(ctx, domain.TokenTypeStatic, matchedToken.ID)
	// Check specific permissions if requested
	if len(req.Permissions) > 0 {
		permissionResults, err := s.grantRepo.HasAnyPermission(ctx, domain.TokenTypeStatic, matchedToken.ID, req.Permissions)
		// Validate all requested permissions are granted
	}
}
```
---
## Data Protection
### Encryption at Rest
#### **Database Encryption**
```sql
-- Sensitive data encryption in PostgreSQL
CREATE TABLE applications (
app_id VARCHAR(100) PRIMARY KEY,
hmac_key TEXT, -- Encrypted with application-level encryption
created_at TIMESTAMP DEFAULT NOW()
);
CREATE TABLE static_tokens (
id UUID PRIMARY KEY,
key_hash TEXT NOT NULL, -- BCrypt hashed with cost 14
created_at TIMESTAMP DEFAULT NOW()
);
```
#### **BCrypt Hashing**
```go
// File: internal/crypto/token.go:22
const BcryptCost = 14 // 2025 security standards (minimum 14)

func (tg *TokenGenerator) HashToken(token string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(token), tg.bcryptCost)
	if err != nil {
		return "", fmt.Errorf("failed to hash token with bcrypt cost %d: %w", tg.bcryptCost, err)
	}
	return string(hash), nil
}
```
### Encryption in Transit
#### **TLS Configuration**
```go
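// TLS 1.3 configuration for all communications
tlsConfig := &tls.Config{
	MinVersion: tls.VersionTLS13,
	// CipherSuites is intentionally not set: Go's crypto/tls does not allow
	// TLS 1.3 cipher suites to be configured and ignores the field for
	// TLS 1.3 connections. The suites it negotiates include
	// TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, and
	// TLS_AES_128_GCM_SHA256.
}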
```
#### **Certificate Pinning**
```go
// OAuth2/SAML provider certificate pinning
func (o *OAuth2Provider) validateCertificate(cert *x509.Certificate) error {
	expectedFingerprints := o.config.GetStringSlice("OAUTH2_CERT_FINGERPRINTS")
	fingerprint := sha256.Sum256(cert.Raw)
	fingerprintHex := hex.EncodeToString(fingerprint[:])
	for _, expected := range expectedFingerprints {
		if fingerprintHex == expected {
			return nil
		}
	}
	return fmt.Errorf("certificate fingerprint mismatch")
}
```
---
## Security Middleware Chain
### Middleware Processing Order
1. **Request Logger**: Structured logging with security context
2. **Security Headers**: XSS, clickjacking, MIME-type protection
3. **CORS Handler**: Cross-origin request validation
4. **CSRF Protection**: Double-submit token validation
5. **Rate Limiter**: DDoS and abuse protection
6. **Authentication**: Multi-provider authentication
7. **Authorization**: Permission validation
8. **Input Validation**: Request sanitization and validation
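The ordering above can be expressed by wrapping the final handler so that the first middleware in the list becomes the outermost layer. The sketch below uses placeholder middlewares that only record their names; `chain`, `trace`, and `runOrder` are hypothetical helpers, not part of the service.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

type middleware func(http.Handler) http.Handler

// chain wraps h so that mws[0] runs first (outermost) and the last
// middleware runs immediately before h.
func chain(h http.Handler, mws ...middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// trace returns a placeholder middleware that records its name and then
// calls the next handler. A real middleware could stop the request instead.
func trace(name string, order *[]string) middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// runOrder sends one request through the chain and returns the order in
// which the layers were invoked.
func runOrder(names ...string) []string {
	var order []string
	mws := make([]middleware, 0, len(names))
	for _, n := range names {
		mws = append(mws, trace(n, &order))
	}
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "handler")
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	chain(final, mws...).ServeHTTP(httptest.NewRecorder(), req)
	return order
}

func main() {
	fmt.Println(runOrder("logger", "security-headers", "cors", "csrf",
		"rate-limit", "authn", "authz", "validation"))
}
```

Placing the rate limiter ahead of authentication means abusive clients are rejected before any signature or token verification work is done on their behalf.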
### Rate Limiting Implementation
```go
// File: internal/middleware/security.go:48
func (s *SecurityMiddleware) RateLimitMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		clientIP := s.getClientIP(r)
		limiter := s.getRateLimiter(clientIP)
		if !limiter.Allow() {
			s.logger.Warn("Rate limit exceeded",
				zap.String("client_ip", clientIP),
				zap.String("path", r.URL.Path))
			s.trackRateLimitViolation(clientIP)
			// http.Error would force Content-Type: text/plain, so the JSON
			// body is written directly with the correct content type.
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusTooManyRequests)
			w.Write([]byte(`{"error":"rate_limit_exceeded"}`))
			return
		}
		next.ServeHTTP(w, r)
	})
}
```
### Brute Force Protection
```go
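// File: internal/middleware/security.go:365
// Excerpt: ctx, key, blockKey, and blockDuration are set up in parts of the
// function that are not shown here.
func (s *SecurityMiddleware) checkAndBlockIP(clientIP string) {
	var count int
	if err := s.cacheManager.GetJSON(ctx, key, &count); err != nil {
		return // no recorded failures for this IP, or the cache is unavailable
	}
	maxFailures := s.config.GetInt("MAX_AUTH_FAILURES")
	if maxFailures <= 0 {
		maxFailures = 5 // Default
	}
	if count >= maxFailures {
		// Block the IP for blockDuration (1 hour)
		blockInfo := map[string]interface{}{
			"blocked_at":    time.Now().Unix(),
			"failure_count": count,
			"reason":        "excessive_auth_failures",
		}
		if err := s.cacheManager.SetJSON(ctx, blockKey, blockInfo, blockDuration); err != nil {
			s.logger.Error("Failed to block IP",
				zap.String("client_ip", clientIP),
				zap.Error(err))
		}
	}
}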
```
---
## Threat Model
### External Threats
#### **Network-based Attacks**
- **DDoS**: Rate limiting and load balancer protection
- **Man-in-the-Middle**: TLS 1.3 encryption
- **Packet Sniffing**: End-to-end encryption
#### **Application-level Attacks**
- **SQL Injection**: Parameterized queries only
- **XSS**: Input validation and CSP headers
- **CSRF**: Double-submit token protection
- **Session Hijacking**: Secure session management
#### **Authentication Attacks**
- **Brute Force**: Rate limiting and account lockout
- **Credential Stuffing**: Multi-factor authentication support
- **Token Replay**: Timestamp validation and nonces
- **JWT Attacks**: Strong cryptographic signing
### Insider Threats
#### **Privileged User Abuse**
- **Audit Logging**: Complete audit trail
- **Role Separation**: Principle of least privilege
- **Session Monitoring**: Abnormal activity detection
#### **Data Exfiltration**
- **Access Controls**: Resource-level authorization
- **Data Classification**: Sensitive data identification
- **Export Monitoring**: Large data access alerts
---
## Security Controls
### Preventive Controls
#### **Authentication Controls**
- Multi-factor authentication support
- Strong password policies (where applicable)
- Account lockout mechanisms
- Session timeout enforcement
#### **Authorization Controls**
- Role-based access control (RBAC)
- Resource-level permissions
- API endpoint protection
- Ownership validation
#### **Data Protection Controls**
- Encryption at rest (AES-256)
- Encryption in transit (TLS 1.3)
- Key management procedures
- Secure data disposal
### Detective Controls
#### **Monitoring and Logging**
```go
// Comprehensive audit logging
type AuditLog struct {
ID uuid.UUID `json:"id"`
Timestamp time.Time `json:"timestamp"`
UserID string `json:"user_id"`
Action string `json:"action"`
ResourceType string `json:"resource_type"`
ResourceID string `json:"resource_id"`
IPAddress string `json:"ip_address"`
UserAgent string `json:"user_agent"`
Metadata map[string]interface{} `json:"metadata"`
}
```
#### **Security Event Detection**
- Failed authentication attempts
- Privilege escalation attempts
- Unusual access patterns
- Token misuse detection
### Responsive Controls
#### **Incident Response**
- Automated threat response
- Token revocation procedures
- User account suspension
- Alert escalation workflows
#### **Recovery Controls**
- Backup and restore procedures
- Disaster recovery planning
- Business continuity measures
- Data integrity verification
---
## Compliance Framework
### OWASP Top 10 Protection
1. **Injection**: Parameterized queries, input validation
2. **Broken Authentication**: Multi-provider auth, strong crypto
3. **Sensitive Data Exposure**: Encryption, secure storage
4. **XML External Entities**: JSON-only API
5. **Broken Access Control**: RBAC, authorization checks
6. **Security Misconfiguration**: Secure defaults
7. **Cross-Site Scripting**: Input validation, CSP
8. **Insecure Deserialization**: Safe JSON handling
9. **Using Components with Known Vulnerabilities**: Regular dependency updates
10. **Insufficient Logging & Monitoring**: Comprehensive audit trails
### Regulatory Compliance
#### **SOC 2 Type II Controls**
- **Security**: Access controls, encryption, monitoring
- **Availability**: High availability, disaster recovery
- **Processing Integrity**: Data validation, error handling
- **Confidentiality**: Data classification, access restrictions
- **Privacy**: Data minimization, consent management
#### **GDPR Compliance** (where applicable)
- **Data Protection by Design**: Privacy-first architecture
- **Consent Management**: Granular consent controls
- **Right to Erasure**: Data deletion procedures
- **Data Portability**: Export capabilities
- **Breach Notification**: Automated incident response
#### **NIST Cybersecurity Framework**
- **Identify**: Asset inventory, risk assessment
- **Protect**: Access controls, data security
- **Detect**: Monitoring, anomaly detection
- **Respond**: Incident response procedures
- **Recover**: Business continuity planning
### Security Metrics
#### **Key Security Indicators**
- Authentication success/failure rates
- Authorization denial rates
- Token usage patterns
- Security event frequency
- Incident response times
#### **Security Monitoring**
```yaml
alerts:
  - name: "High Authentication Failure Rate"
    condition: "auth_failures > 10 per minute per IP"
    severity: "warning"
  - name: "Privilege Escalation Attempt"
    condition: "permission_denied + elevated_permission"
    severity: "critical"
  - name: "Token Abuse Detection"
    condition: "token_usage_anomaly"
    severity: "warning"
```
This security architecture document provides comprehensive coverage of the KMS security model, suitable for security audits, compliance reviews, and security team training.

@ -0,0 +1,879 @@
# KMS System Implementation Guide
This document provides detailed implementation guidance for the KMS system, covering areas not extensively documented in other files. It serves as a comprehensive reference for developers working on system components.
## Table of Contents
1. [Documentation Consistency Analysis](#documentation-consistency-analysis)
2. [Audit System Implementation](#audit-system-implementation)
3. [Multi-Tenancy Support](#multi-tenancy-support)
4. [Cache Implementation Details](#cache-implementation-details)
5. [Error Handling Framework](#error-handling-framework)
6. [Validation System](#validation-system)
7. [Metrics and Monitoring](#metrics-and-monitoring)
8. [Database Migration System](#database-migration-system)
9. [Frontend Architecture](#frontend-architecture)
10. [Configuration Management](#configuration-management)
---
## Documentation Consistency Analysis
### Current State Assessment
The existing documentation is comprehensive but has some minor inconsistencies with the actual codebase:
#### ✅ Accurate Documentation Areas:
- **API endpoints** match the implementation in handlers
- **Database schema** aligns with migrations (especially the new audit_events table)
- **Authentication flows** are correctly documented
- **Docker compose setup** matches actual configuration
- **Security architecture** accurately reflects implementation
- **Permission system** documentation is consistent with code
#### ⚠️ Minor Inconsistencies Found:
1. **Port references**: Some docs mention port 80, but nginx actually listens on 8081
2. **Container names**: Documentation uses generic names, while the compose file uses specific names such as `kms-postgres`
3. **Rate limiting values**: Docs show different values from those in the middleware implementation
4. **Frontend build process**: Docs give the React version as 18, but package.json specifies 19+
#### ✨ Recently Added Features (Not in Original Docs):
- **Audit system** with comprehensive event logging
- **Multi-tenancy support** in database schema
- **Advanced caching layer** with Redis integration
- **SAML authentication** implementation
- **Advanced security middleware** with brute force protection
---
## Audit System Implementation
### Overview
The KMS implements a comprehensive audit logging system that tracks all system events, user actions, and security-related activities.
### Core Components
#### Audit Event Structure
```go
// File: internal/audit/audit.go
type AuditEvent struct {
ID uuid.UUID `json:"id"`
Type EventType `json:"type"`
Severity Severity `json:"severity"`
Status Status `json:"status"`
Timestamp time.Time `json:"timestamp"`
// Actor information
ActorID string `json:"actor_id"`
ActorType ActorType `json:"actor_type"`
ActorIP string `json:"actor_ip"`
UserAgent string `json:"user_agent"`
// Multi-tenancy support
TenantID *uuid.UUID `json:"tenant_id,omitempty"`
// Resource information
ResourceID string `json:"resource_id"`
ResourceType string `json:"resource_type"`
// Event details
Action string `json:"action"`
Description string `json:"description"`
Details map[string]interface{} `json:"details"`
// Request context
RequestID string `json:"request_id"`
SessionID string `json:"session_id"`
// Metadata
Tags []string `json:"tags"`
Metadata map[string]interface{} `json:"metadata"`
}
```
#### Event Types Taxonomy
```
auth.* - Authentication events
├── auth.login - Successful user login
├── auth.login_failed - Failed login attempt
├── auth.logout - User logout
├── auth.token_created - Token generation
├── auth.token_revoked - Token revocation
└── auth.token_validated - Token validation
session.* - Session management
├── session.created - New session created
├── session.revoked - Session terminated
└── session.expired - Session timeout
app.* - Application management
├── app.created - Application created
├── app.updated - Application modified
└── app.deleted - Application removed
permission.* - Permission operations
├── permission.granted - Permission assigned
├── permission.revoked - Permission removed
└── permission.denied - Access denied
tenant.* - Multi-tenant operations
├── tenant.created - New tenant
├── tenant.updated - Tenant modified
├── tenant.suspended - Tenant suspended
└── tenant.activated - Tenant reactivated
```
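The `EventType` field in `AuditEvent` could carry these dotted names as typed constants. The constant names and the `Category` helper below are assumptions for illustration; only the string values come from the taxonomy.

```go
package main

import (
	"fmt"
	"strings"
)

// EventType mirrors the dotted names in the taxonomy.
type EventType string

// A few representative values; the Go identifiers are hypothetical.
const (
	EventAuthLogin        EventType = "auth.login"
	EventAuthLoginFailed  EventType = "auth.login_failed"
	EventSessionExpired   EventType = "session.expired"
	EventPermissionDenied EventType = "permission.denied"
	EventTenantSuspended  EventType = "tenant.suspended"
)

// Category returns the part before the first dot, e.g. "auth" for "auth.login".
// A value without a dot is returned unchanged.
func (e EventType) Category() string {
	category, _, _ := strings.Cut(string(e), ".")
	return category
}

func main() {
	fmt.Println(EventAuthLoginFailed.Category())
	fmt.Println(EventTenantSuspended.Category())
	// prints auth, then tenant
}
```

Grouping by category makes filters such as "all `auth.*` events" a simple string comparison on the prefix.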
#### Database Schema
```sql
-- File: migrations/004_add_audit_events.up.sql
CREATE TABLE audit_events (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
type VARCHAR(50) NOT NULL,
severity VARCHAR(20) NOT NULL CHECK (severity IN ('info', 'warning', 'error', 'critical')),
status VARCHAR(20) NOT NULL CHECK (status IN ('success', 'failure', 'pending')),
timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
-- Actor information
actor_id VARCHAR(255),
actor_type VARCHAR(50) CHECK (actor_type IN ('user', 'system', 'service')),
actor_ip INET,
user_agent TEXT,
-- Multi-tenancy
tenant_id UUID,
-- Resource tracking
resource_id VARCHAR(255),
resource_type VARCHAR(100),
action VARCHAR(100) NOT NULL,
description TEXT NOT NULL,
details JSONB DEFAULT '{}',
-- Request context
request_id VARCHAR(100),
session_id VARCHAR(255),
-- Metadata
tags TEXT[],
metadata JSONB DEFAULT '{}'
);
```
#### Frontend Integration
```typescript
// File: kms-frontend/src/components/Audit.tsx
interface AuditEvent {
id: string;
type: string;
severity: 'info' | 'warning' | 'error' | 'critical';
status: 'success' | 'failure' | 'pending';
timestamp: string;
actor_id: string;
actor_type: string;
resource_type: string;
action: string;
description: string;
}
const Audit: React.FC = () => {
// Real-time audit log viewing with filtering
// Timeline view for event sequences
// Statistics dashboard for audit metrics
return null; // rendering elided in this excerpt
};
```
### Implementation Guidelines
#### Logging Best Practices
1. **Log all security-relevant events**
2. **Include sufficient context** for forensic analysis
3. **Use structured logging** with consistent fields
4. **Implement log retention policies**
5. **Ensure tamper-evident logging**
#### Performance Considerations
1. **Asynchronous logging** to avoid blocking operations
2. **Batch inserts** for high-volume events
3. **Proper indexing** on commonly queried fields
4. **Archival strategy** for historical data
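Asynchronous logging and batch inserts can be combined with a buffered channel and a single worker goroutine. The sketch below is not the KMS audit logger: `AsyncLogger`, `Event`, and the `flush` callback are hypothetical, and a real `flush` would perform a multi-row insert into `audit_events`.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a reduced stand-in for AuditEvent.
type Event struct {
	Type   string
	Action string
}

// AsyncLogger queues events on a channel and hands them to flush in batches.
type AsyncLogger struct {
	events    chan Event
	batchSize int
	flush     func([]Event)
	wg        sync.WaitGroup
}

func NewAsyncLogger(buffer, batchSize int, flush func([]Event)) *AsyncLogger {
	l := &AsyncLogger{events: make(chan Event, buffer), batchSize: batchSize, flush: flush}
	l.wg.Add(1)
	go l.run()
	return l
}

// Log enqueues an event without waiting for storage. It reports false when
// the buffer is full so the caller can decide whether to drop or retry.
func (l *AsyncLogger) Log(e Event) bool {
	select {
	case l.events <- e:
		return true
	default:
		return false
	}
}

// Close stops accepting events and waits until the remaining ones are flushed.
// Log must not be called after Close.
func (l *AsyncLogger) Close() {
	close(l.events)
	l.wg.Wait()
}

func (l *AsyncLogger) run() {
	defer l.wg.Done()
	batch := make([]Event, 0, l.batchSize)
	for e := range l.events {
		batch = append(batch, e)
		if len(batch) == l.batchSize {
			l.flush(batch)
			batch = make([]Event, 0, l.batchSize)
		}
	}
	if len(batch) > 0 {
		l.flush(batch) // flush the final partial batch on shutdown
	}
}

func main() {
	total := 0
	l := NewAsyncLogger(16, 2, func(b []Event) { total += len(b) })
	for i := 0; i < 5; i++ {
		l.Log(Event{Type: "auth.login", Action: "login"})
	}
	l.Close()
	fmt.Println(total) // prints 5
}
```

This sketch flushes only when a batch fills or on shutdown. A production logger would likely add a timer so that a quiet period does not delay events indefinitely, and would decide explicitly what to do when `Log` reports a full buffer, since silently dropping audit events is rarely acceptable.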
---
## Multi-Tenancy Support
### Architecture
The KMS implements a multi-tenant architecture where each tenant has isolated data and permissions while sharing the same application instance.
### Database Design
#### Tenant Model
```go
// File: internal/domain/tenant.go
type Tenant struct {
ID uuid.UUID `json:"id" db:"id"`
Name string `json:"name" db:"name"`
Slug string `json:"slug" db:"slug"`
Status TenantStatus `json:"status" db:"status"`
Settings TenantSettings `json:"settings" db:"settings"`
Metadata map[string]interface{} `json:"metadata" db:"metadata"`
CreatedAt time.Time `json:"created_at" db:"created_at"`
UpdatedAt time.Time `json:"updated_at" db:"updated_at"`
}
type TenantStatus string
const (
TenantStatusActive TenantStatus = "active"
TenantStatusSuspended TenantStatus = "suspended"
TenantStatusPending TenantStatus = "pending"
)
```
#### Data Isolation Strategy
```sql
-- All tenant-specific tables include tenant_id
ALTER TABLE applications ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE static_tokens ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE user_sessions ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE audit_events ADD COLUMN tenant_id UUID;
-- Row-level security policies
ALTER TABLE applications ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON applications
FOR ALL TO kms_user
USING (tenant_id = current_setting('app.current_tenant')::UUID);
```
### Implementation Pattern
#### Tenant Context Middleware
```go
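// File: internal/middleware/tenant.go
func TenantMiddleware() gin.HandlerFunc {
	return func(c *gin.Context) {
		tenantID := extractTenantID(c)
		if tenantID == "" {
			c.AbortWithStatusJSON(400, gin.H{"error": "tenant_required"})
			return
		}
		// Set tenant context
		c.Set("tenant_id", tenantID)

		// set_config(..., true) only lasts for the current transaction, so it
		// must run in the same transaction as the tenant-scoped queries.
		// Executing it on the *sql.DB pool would hit an arbitrary connection.
		db := c.MustGet("db").(*sql.DB)
		ctx := c.Request.Context()
		tx, err := db.BeginTx(ctx, nil)
		if err != nil {
			c.AbortWithStatusJSON(500, gin.H{"error": "tenant_setup_failed"})
			return
		}
		defer tx.Rollback() // no-op once the transaction is committed

		if _, err := tx.ExecContext(ctx, "SELECT set_config('app.current_tenant', $1, true)", tenantID); err != nil {
			c.AbortWithStatusJSON(500, gin.H{"error": "tenant_setup_failed"})
			return
		}
		// "tx" is an illustrative context key; handlers must query through it.
		c.Set("tx", tx)
		c.Next()
		if len(c.Errors) == 0 {
			_ = tx.Commit()
		}
	}
}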
```
### Usage Guidelines
1. **Always include tenant_id** in database queries
2. **Validate tenant access** in middleware
3. **Implement tenant-aware caching**
4. **Audit cross-tenant operations**
5. **Test tenant isolation thoroughly**
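For tenant-aware caching, one simple approach is to put the tenant ID into every cache key so that entries from different tenants cannot collide. The `TenantCacheKey` helper and the `tenant:` key layout are assumptions for illustration, not part of the documented cache package.

```go
package main

import (
	"errors"
	"fmt"
)

// TenantCacheKey builds "tenant:<tenantID>:<prefix><suffix>".
// It refuses to build a key without a tenant so that a missing tenant ID
// cannot silently fall back to a shared namespace.
func TenantCacheKey(tenantID, prefix, suffix string) (string, error) {
	if tenantID == "" {
		return "", errors.New("tenant id is required for tenant-scoped cache keys")
	}
	return "tenant:" + tenantID + ":" + prefix + suffix, nil
}

func main() {
	key, err := TenantCacheKey("3f1c", "app:", "com.example.app")
	fmt.Println(key, err)
	// prints tenant:3f1c:app:com.example.app <nil>
}
```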
---
## Cache Implementation Details
### Architecture
The KMS implements a layered caching system with multiple providers and configurable TTL policies.
### Cache Interface
```go
// File: internal/cache/cache.go
type CacheManager interface {
Get(ctx context.Context, key string) ([]byte, error)
Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
GetJSON(ctx context.Context, key string, dest interface{}) error
SetJSON(ctx context.Context, key string, value interface{}, ttl time.Duration) error
Delete(ctx context.Context, key string) error
Clear(ctx context.Context) error
Exists(ctx context.Context, key string) (bool, error)
}
```
### Redis Implementation
```go
// File: internal/cache/redis.go
type RedisCacheManager struct {
client *redis.Client
keyPrefix string
serializer JSONSerializer
logger *zap.Logger
}
func (r *RedisCacheManager) GetJSON(ctx context.Context, key string, dest interface{}) error {
prefixedKey := r.keyPrefix + key
data, err := r.client.Get(ctx, prefixedKey).Bytes()
if err != nil {
if err == redis.Nil {
return ErrCacheMiss
}
return fmt.Errorf("failed to get key %s: %w", prefixedKey, err)
}
return r.serializer.Deserialize(data, dest)
}
```
### Cache Key Management
```go
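const (
	KeyPrefixAuth       = "auth:"
	KeyPrefixToken      = "token:"
	KeyPrefixPermission = "perm:"
	KeyPrefixSession    = "sess:"
	KeyPrefixApp        = "app:"
)

// CacheKey joins a key prefix and a suffix, e.g. CacheKey(KeyPrefixAuth, "user:app").
// (A type with the same name cannot coexist with this function in one package.)
func CacheKey(prefix, suffix string) string {
	return prefix + suffix
}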
```
### Usage Patterns
#### Authentication Caching
```go
// Cache authentication results for 5 minutes
cacheKey := cache.CacheKey(cache.KeyPrefixAuth, fmt.Sprintf("%s:%s", userID, appID))
err := cacheManager.SetJSON(ctx, cacheKey, authResult, 5*time.Minute)
```
#### Token Revocation List
```go
// Cache revoked tokens until their expiration
revokedKey := cache.CacheKey(cache.KeyPrefixToken, "revoked:"+tokenID)
err := cacheManager.Set(ctx, revokedKey, []byte("1"), time.Until(tokenExpiry))
```
### Configuration
```bash
# Cache configuration
CACHE_ENABLED=true
CACHE_PROVIDER=redis # or memory
REDIS_ADDR=localhost:6379
REDIS_PASSWORD=
REDIS_DB=0
CACHE_DEFAULT_TTL=5m
```
---
## Error Handling Framework
### Error Type Hierarchy
```go
// File: internal/errors/errors.go
type ErrorCode string
const (
ErrorCodeValidation ErrorCode = "validation_error"
ErrorCodeAuthentication ErrorCode = "authentication_error"
ErrorCodeAuthorization ErrorCode = "authorization_error"
ErrorCodeNotFound ErrorCode = "not_found"
ErrorCodeConflict ErrorCode = "conflict"
ErrorCodeInternal ErrorCode = "internal_error"
ErrorCodeRateLimit ErrorCode = "rate_limit_exceeded"
ErrorCodeBadRequest ErrorCode = "bad_request"
)
type APIError struct {
Code ErrorCode `json:"code"`
Message string `json:"message"`
Details interface{} `json:"details,omitempty"`
HTTPStatus int `json:"-"`
Cause error `json:"-"`
}
```
### Error Factory Functions
```go
func NewValidationError(message string, details interface{}) *APIError {
return &APIError{
Code: ErrorCodeValidation,
Message: message,
Details: details,
HTTPStatus: http.StatusBadRequest,
}
}
func NewAuthenticationError(message string) *APIError {
return &APIError{
Code: ErrorCodeAuthentication,
Message: message,
HTTPStatus: http.StatusUnauthorized,
}
}
```
### Error Handler Middleware
```go
// File: internal/errors/secure_responses.go
func (e *ErrorHandler) HandleError(c *gin.Context, err error) {
var apiErr *APIError
if errors.As(err, &apiErr) {
// Log error with context
e.logger.Error("API error",
zap.String("error_code", string(apiErr.Code)),
zap.String("message", apiErr.Message),
zap.Int("http_status", apiErr.HTTPStatus),
zap.Error(apiErr.Cause))
c.JSON(apiErr.HTTPStatus, gin.H{
"error": apiErr.Code,
"message": apiErr.Message,
"details": apiErr.Details,
})
return
}
// Handle unexpected errors
e.logger.Error("Unexpected error", zap.Error(err))
c.JSON(http.StatusInternalServerError, gin.H{
"error": ErrorCodeInternal,
"message": "An internal error occurred",
})
}
```
---
## Validation System
### Validator Implementation
```go
// File: internal/validation/validator.go
type Validator struct {
validator *validator.Validate
logger *zap.Logger
}
func NewValidator(logger *zap.Logger) *Validator {
v := validator.New()
// Register custom validators
v.RegisterValidation("app_id", validateAppID)
v.RegisterValidation("token_type", validateTokenType)
v.RegisterValidation("permission_scope", validatePermissionScope)
return &Validator{
validator: v,
logger: logger,
}
}
```
### Custom Validation Rules
```go
// Patterns are compiled once at package init instead of on every validation call.
var (
	// App ID format: lowercase dotted name with at least two segments (e.g., com.example.app)
	appIDPattern = regexp.MustCompile(`^[a-z0-9]+(\.[a-z0-9]+)*\.[a-z0-9]+$`)
	// Permission format: dot-separated lowercase segments (e.g., app.read).
	// A single segment without a dot also matches.
	permissionScopePattern = regexp.MustCompile(`^[a-z_]+(\.[a-z_]+)*$`)
)

func validateAppID(fl validator.FieldLevel) bool {
	appID := fl.Field().String()
	return len(appID) >= 3 && len(appID) <= 100 && appIDPattern.MatchString(appID)
}

func validatePermissionScope(fl validator.FieldLevel) bool {
	scope := fl.Field().String()
	return len(scope) >= 1 && len(scope) <= 50 && permissionScopePattern.MatchString(scope)
}
```
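To make the effect of the two patterns concrete, the following standalone sketch applies them to a few sample inputs. The patterns and length limits are copied from the validators; `isValidAppID` and `isValidScope` are hypothetical wrappers that take plain strings instead of `validator.FieldLevel`.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	appIDPattern = regexp.MustCompile(`^[a-z0-9]+(\.[a-z0-9]+)*\.[a-z0-9]+$`)
	scopePattern = regexp.MustCompile(`^[a-z_]+(\.[a-z_]+)*$`)
)

// isValidAppID applies the app_id pattern and the 3..100 length bounds.
func isValidAppID(s string) bool {
	return len(s) >= 3 && len(s) <= 100 && appIDPattern.MatchString(s)
}

// isValidScope applies the permission_scope pattern and the 1..50 length bounds.
func isValidScope(s string) bool {
	return len(s) >= 1 && len(s) <= 50 && scopePattern.MatchString(s)
}

func main() {
	for _, id := range []string{"com.example.app", "app", "Com.Example.App", "com..app"} {
		fmt.Printf("app_id %q: %v\n", id, isValidAppID(id))
	}
	for _, scope := range []string{"repo.read", "read", "repo.read2", "repo."} {
		fmt.Printf("scope %q: %v\n", scope, isValidScope(scope))
	}
}
```

Two consequences are worth noting: an app ID needs at least one dot, so `app` alone is rejected, while a permission scope with a single segment such as `read` is accepted, and digits are not allowed anywhere in a scope.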
### Middleware Integration
```go
// File: internal/middleware/validation.go
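func (v *ValidationMiddleware) ValidateJSON(schema interface{}) gin.HandlerFunc {
	// schema only supplies the request type. Binding into the shared value
	// itself would be a data race, so each request gets a fresh instance.
	schemaType := reflect.Indirect(reflect.ValueOf(schema)).Type()
	return func(c *gin.Context) {
		req := reflect.New(schemaType).Interface()
		if err := c.ShouldBindJSON(req); err != nil {
			var validationErrors []ValidationError
			if errs, ok := err.(validator.ValidationErrors); ok {
				for _, e := range errs {
					validationErrors = append(validationErrors, ValidationError{
						Field:   e.Field(),
						Message: e.Tag(),
						Value:   e.Value(),
					})
				}
			}
			apiErr := errors.NewValidationError("Request validation failed", validationErrors)
			v.errorHandler.HandleError(c, apiErr)
			c.Abort() // without Abort, gin would still run the remaining handlers
			return
		}
		// "validated_request" is an illustrative key; handlers read the bound value from it.
		c.Set("validated_request", req)
		c.Next()
	}
}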
```
---
## Metrics and Monitoring
### Prometheus Integration
```go
// File: internal/metrics/metrics.go
type Metrics struct {
// HTTP metrics
httpRequestsTotal *prometheus.CounterVec
httpRequestDuration *prometheus.HistogramVec
httpRequestsInFlight prometheus.Gauge
// Auth metrics
authAttemptsTotal *prometheus.CounterVec
authSuccessTotal *prometheus.CounterVec
authFailuresTotal *prometheus.CounterVec
// Token metrics
tokensIssuedTotal *prometheus.CounterVec
tokenValidationsTotal *prometheus.CounterVec
// Business metrics
applicationsTotal prometheus.Gauge
activeSessionsTotal prometheus.Gauge
}
```
### Metrics Collection
```go
func (m *Metrics) RecordHTTPRequest(method, path string, statusCode int, duration time.Duration) {
m.httpRequestsTotal.WithLabelValues(method, path, strconv.Itoa(statusCode)).Inc()
m.httpRequestDuration.WithLabelValues(method, path).Observe(duration.Seconds())
}
func (m *Metrics) RecordAuthAttempt(provider, result string) {
m.authAttemptsTotal.WithLabelValues(provider, result).Inc()
if result == "success" {
m.authSuccessTotal.WithLabelValues(provider).Inc()
} else {
m.authFailuresTotal.WithLabelValues(provider).Inc()
}
}
```
### Dashboard Configuration
```yaml
# Grafana dashboard config
panels:
  - title: "Request Rate"
    type: "graph"
    targets:
      - expr: "rate(http_requests_total[5m])"
        legendFormat: "{{method}} {{path}}"
  - title: "Authentication Success Rate"
    type: "stat"
    targets:
      # Both sides are summed because the two counters carry different label sets
      # (provider vs. provider and result) and would not match in a plain division.
      - expr: "sum(rate(auth_success_total[5m])) / sum(rate(auth_attempts_total[5m])) * 100"
        legendFormat: "Success Rate %"
  - title: "Active Applications"
    type: "stat"
    targets:
      - expr: "applications_total"
        legendFormat: "Applications"
```
---
## Database Migration System
### Migration Structure
```
migrations/
├── 001_initial_schema.up.sql
├── 001_initial_schema.down.sql
├── 002_user_sessions.up.sql
├── 002_user_sessions.down.sql
├── 003_add_token_prefix.up.sql
├── 003_add_token_prefix.down.sql
├── 004_add_audit_events.up.sql
└── 004_add_audit_events.down.sql
```
### Migration Runner
```go
// File: internal/database/postgres.go
func RunMigrations(db *sql.DB, migrationPath string) error {
driver, err := postgres.WithInstance(db, &postgres.Config{})
if err != nil {
return fmt.Errorf("failed to create migration driver: %w", err)
}
m, err := migrate.NewWithDatabaseInstance(
fmt.Sprintf("file://%s", migrationPath),
"postgres", driver)
if err != nil {
return fmt.Errorf("failed to create migration instance: %w", err)
}
if err := m.Up(); err != nil && err != migrate.ErrNoChange {
return fmt.Errorf("failed to run migrations: %w", err)
}
return nil
}
```
### Migration Best Practices
1. **Always create both up and down migrations**
2. **Test migrations on a copy of production data**
3. **Make migrations idempotent**
4. **Add proper indexes for performance**
5. **Include rollback procedures**
### Example Migration
```sql
-- 005_add_oauth_providers.up.sql
CREATE TABLE IF NOT EXISTS oauth_providers (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name VARCHAR(100) NOT NULL UNIQUE,
client_id VARCHAR(255) NOT NULL,
client_secret_encrypted TEXT NOT NULL,
authorization_url TEXT NOT NULL,
token_url TEXT NOT NULL,
user_info_url TEXT NOT NULL,
scopes TEXT[] DEFAULT ARRAY['openid', 'profile', 'email'],
enabled BOOLEAN DEFAULT true,
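created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- IF NOT EXISTS keeps the index creation idempotent, like the table creation above
CREATE INDEX IF NOT EXISTS idx_oauth_providers_name ON oauth_providers(name);
CREATE INDEX IF NOT EXISTS idx_oauth_providers_enabled ON oauth_providers(enabled) WHERE enabled = true;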
```
---
## Frontend Architecture
### Component Structure
```
src/
├── components/
│ ├── Applications.tsx # Application management
│ ├── Tokens.tsx # Token operations
│ ├── Users.tsx # User management
│ ├── Audit.tsx # Audit log viewer
│ ├── Dashboard.tsx # Main dashboard
│ ├── Login.tsx # Authentication
│ ├── TokenTester.tsx # Token testing utility
│ └── TokenTesterCallback.tsx
├── contexts/
│ └── AuthContext.tsx # Authentication state
├── services/
│ └── apiService.ts # API client
├── App.tsx # Main application
└── index.tsx # Entry point
```
### API Service Implementation
```typescript
// File: kms-frontend/src/services/apiService.ts
class APIService {
private baseURL: string;
private token: string | null = null;
constructor(baseURL: string) {
this.baseURL = baseURL;
}
async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
const url = `${this.baseURL}${endpoint}`;
const headers = {
'Content-Type': 'application/json',
'X-User-Email': this.getUserEmail(),
...options.headers,
};
const response = await fetch(url, {
...options,
headers,
});
if (!response.ok) {
const error = await response.json().catch(() => ({}));
throw new APIError(error.message || 'Request failed', response.status);
}
return response.json();
}
// Application management
async getApplications(): Promise<Application[]> {
return this.request<Application[]>('/api/applications');
}
// Audit log access
async getAuditEvents(params: AuditQueryParams): Promise<AuditEvent[]> {
const queryString = new URLSearchParams(params).toString();
return this.request<AuditEvent[]>(`/api/audit/events?${queryString}`);
}
}
```
### Authentication Context
```typescript
// File: kms-frontend/src/contexts/AuthContext.tsx
interface AuthContextType {
user: User | null;
login: (email: string) => Promise<void>;
logout: () => void;
isAuthenticated: boolean;
isLoading: boolean;
}
export const AuthContext = React.createContext<AuthContextType | null>(null);
export const AuthProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
const [user, setUser] = useState<User | null>(null);
const [isLoading, setIsLoading] = useState(true);
const login = async (email: string) => {
setIsLoading(true);
try {
const response = await apiService.login(email);
setUser({ email, token: response.token });
localStorage.setItem('kms_user', JSON.stringify({ email }));
} finally {
// Errors propagate to the caller; the loading flag is always reset
setIsLoading(false);
}
};
// ... rest of implementation
};
```
---
## Configuration Management
### Configuration Interface
```go
// File: internal/config/config.go
type ConfigProvider interface {
GetString(key string) string
GetInt(key string) int
GetBool(key string) bool
GetDuration(key string) time.Duration
GetStringSlice(key string) []string
IsSet(key string) bool
Validate() error
GetDatabaseDSN() string
GetServerAddress() string
IsDevelopment() bool
IsProduction() bool
}
```
### Configuration Validation
```go
func (c *Config) Validate() error {
var errors []string
// Required configuration
required := []string{
"INTERNAL_HMAC_KEY",
"JWT_SECRET",
"AUTH_SIGNING_KEY",
"DB_HOST",
"DB_NAME",
}
for _, key := range required {
if !c.IsSet(key) {
errors = append(errors, fmt.Sprintf("required configuration %s is not set", key))
}
}
// Validate key lengths
if len(c.GetString("INTERNAL_HMAC_KEY")) < 32 {
errors = append(errors, "INTERNAL_HMAC_KEY must be at least 32 characters")
}
if len(errors) > 0 {
return fmt.Errorf("configuration validation failed: %s", strings.Join(errors, ", "))
}
return nil
}
```
### Environment Configuration
```bash
# Security Configuration (example values only; generate fresh secrets for each deployment)
INTERNAL_HMAC_KEY=3924f352b7ea63b27db02bf4b0014f2961a5d2f7c27643853a4581bb3a5457cb
JWT_SECRET=7f5e11d55e957988b00ce002418680af384219ef98c50d08cbbbdd541978450c
AUTH_SIGNING_KEY=484f921b39c383e6b3e0cc5a7cef3c2cec3d7c8d474ab5102891dc4c2bf63a68
# Database Configuration
DB_HOST=postgres
DB_PORT=5432
DB_NAME=kms
DB_USER=postgres
DB_PASSWORD=postgres
# Feature Flags
RATE_LIMIT_ENABLED=true
CACHE_ENABLED=false
METRICS_ENABLED=true
SAML_ENABLED=false
```
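The three secrets above are 64 hexadecimal characters each, which corresponds to 32 random bytes and satisfies the 32-character minimum checked in `Validate`. A small sketch of generating values in that format with `crypto/rand` follows; `newSecret` is a hypothetical helper, not part of the KMS codebase.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newSecret returns n cryptographically random bytes as a hex string of length 2*n.
func newSecret(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	for _, name := range []string{"INTERNAL_HMAC_KEY", "JWT_SECRET", "AUTH_SIGNING_KEY"} {
		secret, err := newSecret(32)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s=%s\n", name, secret)
	}
}
```

The output can be pasted into an environment file or a secret manager. Each key should get its own value rather than sharing one secret across purposes.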
---
## Implementation Best Practices
### Code Organization
1. **Follow clean architecture principles**
2. **Use dependency injection throughout**
3. **Implement comprehensive error handling**
4. **Add structured logging to all components**
5. **Write unit tests for business logic**
### Security Guidelines
1. **Always validate input at API boundaries**
2. **Use parameterized database queries**
3. **Implement proper authentication and authorization**
4. **Log all security-relevant events**
5. **Follow principle of least privilege**
### Performance Considerations
1. **Implement caching for frequently accessed data**
2. **Use database indexes appropriately**
3. **Monitor and optimize slow queries**
4. **Implement proper connection pooling**
5. **Use asynchronous operations where beneficial**
### Testing Strategy
1. **Unit tests for business logic**
2. **Integration tests for API endpoints**
3. **End-to-end tests for critical workflows**
4. **Load testing for performance validation**
5. **Security testing for vulnerability assessment**
---
*This document serves as a comprehensive implementation guide for the KMS system. It should be updated as the system evolves and new features are added.*