# KMS System Implementation Guide

This document provides detailed implementation guidance for the KMS system, covering areas not extensively documented in other files. It serves as a comprehensive reference for developers working on system components.

## Table of Contents

1. [Documentation Consistency Analysis](#documentation-consistency-analysis)
2. [Audit System Implementation](#audit-system-implementation)
3. [Multi-Tenancy Support](#multi-tenancy-support)
4. [Cache Implementation Details](#cache-implementation-details)
5. [Error Handling Framework](#error-handling-framework)
6. [Validation System](#validation-system)
7. [Metrics and Monitoring](#metrics-and-monitoring)
8. [Database Migration System](#database-migration-system)
9. [Frontend Architecture](#frontend-architecture)
10. [Configuration Management](#configuration-management)
11. [Implementation Best Practices](#implementation-best-practices)

---

## Documentation Consistency Analysis

### Current State Assessment

The existing documentation is comprehensive but has some minor inconsistencies with the actual codebase:

#### ✅ Accurate Documentation Areas:

- **API endpoints** match the implementation in handlers
- **Database schema** aligns with migrations (especially the new audit_events table)
- **Authentication flows** are correctly documented
- **Docker compose setup** matches actual configuration
- **Security architecture** accurately reflects implementation
- **Permission system** documentation is consistent with code

#### ⚠️ Minor Inconsistencies Found:

1. **Port references**: Some docs mention port 80, but nginx actually runs on 8081
2. **Container names**: Documentation uses generic names, while the actual compose file uses specific names like `kms-postgres`
3. **Rate limiting values**: Docs show different values than the actual middleware implementation
4. **Frontend build process**: React version is mentioned as 18, but package.json shows 19+

#### ✨ Recently Added Features (Not in Original Docs):

- **Audit system** with comprehensive event logging
- **Multi-tenancy support** in database schema
- **Advanced caching layer** with Redis integration
- **SAML authentication** implementation
- **Advanced security middleware** with brute force protection

---

## Audit System Implementation

### Overview

The KMS implements a comprehensive audit logging system that tracks all system events, user actions, and security-related activities.

### Core Components

#### Audit Event Structure

```go
// File: internal/audit/audit.go
type AuditEvent struct {
    ID        uuid.UUID `json:"id"`
    Type      EventType `json:"type"`
    Severity  Severity  `json:"severity"`
    Status    Status    `json:"status"`
    Timestamp time.Time `json:"timestamp"`

    // Actor information
    ActorID   string    `json:"actor_id"`
    ActorType ActorType `json:"actor_type"`
    ActorIP   string    `json:"actor_ip"`
    UserAgent string    `json:"user_agent"`

    // Multi-tenancy support
    TenantID *uuid.UUID `json:"tenant_id,omitempty"`

    // Resource information
    ResourceID   string `json:"resource_id"`
    ResourceType string `json:"resource_type"`

    // Event details
    Action      string                 `json:"action"`
    Description string                 `json:"description"`
    Details     map[string]interface{} `json:"details"`

    // Request context
    RequestID string `json:"request_id"`
    SessionID string `json:"session_id"`

    // Metadata
    Tags     []string               `json:"tags"`
    Metadata map[string]interface{} `json:"metadata"`
}
```

#### Event Types Taxonomy

```
auth.*                    - Authentication events
├── auth.login            - Successful user login
├── auth.login_failed     - Failed login attempt
├── auth.logout           - User logout
├── auth.token_created    - Token generation
├── auth.token_revoked    - Token revocation
└── auth.token_validated  - Token validation

session.*                 - Session management
├── session.created       - New session created
├── session.revoked       - Session terminated
└── session.expired       - Session timeout

app.*                     - Application management
├── app.created           - Application created
├── app.updated           - Application modified
└── app.deleted           - Application removed

permission.*              - Permission operations
├── permission.granted    - Permission assigned
├── permission.revoked    - Permission removed
└── permission.denied     - Access denied

tenant.*                  - Multi-tenant operations
├── tenant.created        - New tenant
├── tenant.updated        - Tenant modified
├── tenant.suspended      - Tenant suspended
└── tenant.activated      - Tenant reactivated
```

#### Database Schema

```sql
-- File: migrations/004_add_audit_events.up.sql
CREATE TABLE audit_events (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    type VARCHAR(50) NOT NULL,
    severity VARCHAR(20) NOT NULL CHECK (severity IN ('info', 'warning', 'error', 'critical')),
    status VARCHAR(20) NOT NULL CHECK (status IN ('success', 'failure', 'pending')),
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),

    -- Actor information
    actor_id VARCHAR(255),
    actor_type VARCHAR(50) CHECK (actor_type IN ('user', 'system', 'service')),
    actor_ip INET,
    user_agent TEXT,

    -- Multi-tenancy
    tenant_id UUID,

    -- Resource tracking
    resource_id VARCHAR(255),
    resource_type VARCHAR(100),
    action VARCHAR(100) NOT NULL,
    description TEXT NOT NULL,
    details JSONB DEFAULT '{}',

    -- Request context
    request_id VARCHAR(100),
    session_id VARCHAR(255),

    -- Metadata
    tags TEXT[],
    metadata JSONB DEFAULT '{}'
);
```

#### Frontend Integration

```typescript
// File: kms-frontend/src/components/Audit.tsx
interface AuditEvent {
  id: string;
  type: string;
  severity: 'info' | 'warning' | 'error' | 'critical';
  status: 'success' | 'failure' | 'pending';
  timestamp: string;
  actor_id: string;
  actor_type: string;
  resource_type: string;
  action: string;
  description: string;
}

const Audit: React.FC = () => {
  // Real-time audit log viewing with filtering
  // Timeline view for event sequences
  // Statistics dashboard for audit metrics
};
```

### Implementation Guidelines

#### Logging Best Practices

1. **Log all security-relevant events**
2. **Include sufficient context** for forensic analysis
3. **Use structured logging** with consistent fields
4. **Implement log retention policies**
5. **Ensure tamper-evident logging**

#### Performance Considerations

1. **Asynchronous logging** to avoid blocking operations
2. **Batch inserts** for high-volume events
3. **Proper indexing** on commonly queried fields
4. **Archival strategy** for historical data

---

## Multi-Tenancy Support

### Architecture

The KMS implements a multi-tenant architecture where each tenant has isolated data and permissions while sharing the same application instance.

### Database Design

#### Tenant Model

```go
// File: internal/domain/tenant.go
type Tenant struct {
    ID        uuid.UUID              `json:"id" db:"id"`
    Name      string                 `json:"name" db:"name"`
    Slug      string                 `json:"slug" db:"slug"`
    Status    TenantStatus           `json:"status" db:"status"`
    Settings  TenantSettings         `json:"settings" db:"settings"`
    Metadata  map[string]interface{} `json:"metadata" db:"metadata"`
    CreatedAt time.Time              `json:"created_at" db:"created_at"`
    UpdatedAt time.Time              `json:"updated_at" db:"updated_at"`
}

type TenantStatus string

const (
    TenantStatusActive    TenantStatus = "active"
    TenantStatusSuspended TenantStatus = "suspended"
    TenantStatusPending   TenantStatus = "pending"
)
```

#### Data Isolation Strategy

```sql
-- All tenant-specific tables include tenant_id
ALTER TABLE applications ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE static_tokens ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE user_sessions ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE audit_events ADD COLUMN tenant_id UUID;

-- Row-level security policies
ALTER TABLE applications ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON applications
    FOR ALL TO kms_user
    USING (tenant_id = current_setting('app.current_tenant')::UUID);
```

### Implementation Pattern

#### Tenant Context Middleware

```go
// File: internal/middleware/tenant.go
func TenantMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        tenantID := extractTenantID(c)
        if tenantID == "" {
            c.AbortWithStatusJSON(400, gin.H{"error": "tenant_required"})
            return
        }

        // Set tenant context
        c.Set("tenant_id", tenantID)

        // Set database session variable.
        // NOTE: with is_local = true, set_config lasts only for the current
        // transaction, and db.Exec on a *sql.DB pool may run on a different
        // connection than later queries. The setting must be applied on the
        // same transaction or connection that runs the tenant-scoped queries.
        db := c.MustGet("db").(*sql.DB)
        _, err := db.Exec("SELECT set_config('app.current_tenant', $1, true)", tenantID)
        if err != nil {
            c.AbortWithStatusJSON(500, gin.H{"error": "tenant_setup_failed"})
            return
        }

        c.Next()
    }
}
```

### Usage Guidelines

1. **Always include tenant_id** in database queries
2. **Validate tenant access** in middleware
3. **Implement tenant-aware caching**
4. **Audit cross-tenant operations**
5. **Test tenant isolation thoroughly**

---

## Cache Implementation Details

### Architecture

The KMS implements a layered caching system with multiple providers and configurable TTL policies.

### Cache Interface

```go
// File: internal/cache/cache.go
type CacheManager interface {
    Get(ctx context.Context, key string) ([]byte, error)
    Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
    GetJSON(ctx context.Context, key string, dest interface{}) error
    SetJSON(ctx context.Context, key string, value interface{}, ttl time.Duration) error
    Delete(ctx context.Context, key string) error
    Clear(ctx context.Context) error
    Exists(ctx context.Context, key string) (bool, error)
}
```

### Redis Implementation

```go
// File: internal/cache/redis.go
type RedisCacheManager struct {
    client     *redis.Client
    keyPrefix  string
    serializer JSONSerializer
    logger     *zap.Logger
}

func (r *RedisCacheManager) GetJSON(ctx context.Context, key string, dest interface{}) error {
    prefixedKey := r.keyPrefix + key

    data, err := r.client.Get(ctx, prefixedKey).Bytes()
    if err != nil {
        if err == redis.Nil {
            return ErrCacheMiss
        }
        return fmt.Errorf("failed to get key %s: %w", prefixedKey, err)
    }

    return r.serializer.Deserialize(data, dest)
}
```

### Cache Key Management

```go
const (
    KeyPrefixAuth       = "auth:"
    KeyPrefixToken      = "token:"
    KeyPrefixPermission = "perm:"
    KeyPrefixSession    = "sess:"
    KeyPrefixApp        = "app:"
)

// CacheKey joins a key prefix and a suffix into a full cache key.
func CacheKey(prefix, suffix string) string {
    return fmt.Sprintf("%s%s", prefix, suffix)
}
```

### Usage Patterns

#### Authentication Caching

```go
// Cache authentication results for 5 minutes
cacheKey := cache.CacheKey(cache.KeyPrefixAuth, fmt.Sprintf("%s:%s", userID, appID))
err := cacheManager.SetJSON(ctx, cacheKey, authResult, 5*time.Minute)
```

#### Token Revocation List

```go
// Cache revoked tokens until their expiration
revokedKey := cache.CacheKey(cache.KeyPrefixToken, "revoked:"+tokenID)
err := cacheManager.Set(ctx, revokedKey, []byte("1"), time.Until(tokenExpiry))
```

### Configuration

```bash
# Cache configuration
CACHE_ENABLED=true
CACHE_PROVIDER=redis  # or memory
REDIS_ADDR=localhost:6379
REDIS_PASSWORD=
REDIS_DB=0
CACHE_DEFAULT_TTL=5m
```

---

## Error Handling Framework

### Error Type Hierarchy

```go
// File: internal/errors/errors.go
type ErrorCode string

const (
    ErrorCodeValidation     ErrorCode = "validation_error"
    ErrorCodeAuthentication ErrorCode = "authentication_error"
    ErrorCodeAuthorization  ErrorCode = "authorization_error"
    ErrorCodeNotFound       ErrorCode = "not_found"
    ErrorCodeConflict       ErrorCode = "conflict"
    ErrorCodeInternal       ErrorCode = "internal_error"
    ErrorCodeRateLimit      ErrorCode = "rate_limit_exceeded"
    ErrorCodeBadRequest     ErrorCode = "bad_request"
)

type APIError struct {
    Code       ErrorCode   `json:"code"`
    Message    string      `json:"message"`
    Details    interface{} `json:"details,omitempty"`
    HTTPStatus int         `json:"-"`
    Cause      error       `json:"-"`
}
```

### Error Factory Functions

```go
func NewValidationError(message string, details interface{}) *APIError {
    return &APIError{
        Code:       ErrorCodeValidation,
        Message:    message,
        Details:    details,
        HTTPStatus: http.StatusBadRequest,
    }
}

func NewAuthenticationError(message string) *APIError {
    return &APIError{
        Code:       ErrorCodeAuthentication,
        Message:    message,
        HTTPStatus: http.StatusUnauthorized,
    }
}
```

### Error Handler Middleware

```go
// File: internal/errors/secure_responses.go
func (e *ErrorHandler) HandleError(c *gin.Context, err error) {
    var apiErr *APIError
    if errors.As(err, &apiErr) {
        // Log error with context
        e.logger.Error("API error",
            zap.String("error_code", string(apiErr.Code)),
            zap.String("message", apiErr.Message),
            zap.Int("http_status", apiErr.HTTPStatus),
            zap.Error(apiErr.Cause))

        c.JSON(apiErr.HTTPStatus, gin.H{
            "error":   apiErr.Code,
            "message": apiErr.Message,
            "details": apiErr.Details,
        })
        return
    }

    // Handle unexpected errors
    e.logger.Error("Unexpected error", zap.Error(err))
    c.JSON(http.StatusInternalServerError, gin.H{
        "error":   ErrorCodeInternal,
        "message": "An internal error occurred",
    })
}
```

---

## Validation System

### Validator Implementation

```go
// File: internal/validation/validator.go
type Validator struct {
    validator *validator.Validate
    logger    *zap.Logger
}

func NewValidator(logger *zap.Logger) *Validator {
    v := validator.New()

    // Register custom validators
    v.RegisterValidation("app_id", validateAppID)
    v.RegisterValidation("token_type", validateTokenType)
    v.RegisterValidation("permission_scope", validatePermissionScope)

    return &Validator{
        validator: v,
        logger:    logger,
    }
}
```

### Custom Validation Rules

```go
func validateAppID(fl validator.FieldLevel) bool {
    appID := fl.Field().String()
    // App ID format: domain.app (e.g., com.example.app)
    pattern := `^[a-z0-9]+(\.[a-z0-9]+)*\.[a-z0-9]+$`
    match, _ := regexp.MatchString(pattern, appID)
    return match && len(appID) >= 3 && len(appID) <= 100
}

func validatePermissionScope(fl validator.FieldLevel) bool {
    scope := fl.Field().String()
    // Permission format: domain.action (e.g., app.read)
    pattern := `^[a-z_]+(\.[a-z_]+)*$`
    match, _ := regexp.MatchString(pattern, scope)
    return match && len(scope) >= 1 && len(scope) <= 50
}
```

### Middleware Integration

```go
// File: internal/middleware/validation.go
func (v *ValidationMiddleware) ValidateJSON(schema interface{}) gin.HandlerFunc {
    return func(c *gin.Context) {
        // NOTE: schema is captured once and shared by every request, so
        // binding into it is not safe under concurrent requests.
        if err := c.ShouldBindJSON(schema); err != nil {
            var validationErrors []ValidationError

            if errs, ok := err.(validator.ValidationErrors); ok {
                for _, e := range errs {
                    validationErrors = append(validationErrors, ValidationError{
                        Field:   e.Field(),
                        Message: e.Tag(),
                        Value:   e.Value(),
                    })
                }
            }

            apiErr := errors.NewValidationError("Request validation failed", validationErrors)
            v.errorHandler.HandleError(c, apiErr)
            // Stop the handler chain; returning alone does not prevent
            // later handlers from running.
            c.Abort()
            return
        }

        c.Next()
    }
}
```

---

## Metrics and Monitoring

### Prometheus Integration

```go
// File: internal/metrics/metrics.go
type Metrics struct {
    // HTTP metrics
    httpRequestsTotal    *prometheus.CounterVec
    httpRequestDuration  *prometheus.HistogramVec
    httpRequestsInFlight prometheus.Gauge

    // Auth metrics
    authAttemptsTotal *prometheus.CounterVec
    authSuccessTotal  *prometheus.CounterVec
    authFailuresTotal *prometheus.CounterVec

    // Token metrics
    tokensIssuedTotal     *prometheus.CounterVec
    tokenValidationsTotal *prometheus.CounterVec

    // Business metrics
    applicationsTotal   prometheus.Gauge
    activeSessionsTotal prometheus.Gauge
}
```

### Metrics Collection

```go
func (m *Metrics) RecordHTTPRequest(method, path string, statusCode int, duration time.Duration) {
    m.httpRequestsTotal.WithLabelValues(method, path, strconv.Itoa(statusCode)).Inc()
    m.httpRequestDuration.WithLabelValues(method, path).Observe(duration.Seconds())
}

func (m *Metrics) RecordAuthAttempt(provider, result string) {
    m.authAttemptsTotal.WithLabelValues(provider, result).Inc()

    if result == "success" {
        m.authSuccessTotal.WithLabelValues(provider).Inc()
    } else {
        m.authFailuresTotal.WithLabelValues(provider).Inc()
    }
}
```

### Dashboard Configuration

```yaml
# Grafana dashboard config
panels:
  - title: "Request Rate"
    type: "graph"
    targets:
      - expr: "rate(http_requests_total[5m])"
        legendFormat: "{{method}} {{path}}"

  - title: "Authentication Success Rate"
    type: "stat"
    targets:
      - expr: "rate(auth_success_total[5m]) / rate(auth_attempts_total[5m]) * 100"
        legendFormat: "Success Rate %"

  - title: "Active Applications"
    type: "stat"
    targets:
      - expr: "applications_total"
        legendFormat: "Applications"
```

---

## Database Migration System

### Migration Structure

```
migrations/
├── 001_initial_schema.up.sql
├── 001_initial_schema.down.sql
├── 002_user_sessions.up.sql
├── 002_user_sessions.down.sql
├── 003_add_token_prefix.up.sql
├── 003_add_token_prefix.down.sql
├── 004_add_audit_events.up.sql
└── 004_add_audit_events.down.sql
```

### Migration Runner

```go
// File: internal/database/postgres.go
func RunMigrations(db *sql.DB, migrationPath string) error {
    driver, err := postgres.WithInstance(db, &postgres.Config{})
    if err != nil {
        return fmt.Errorf("failed to create migration driver: %w", err)
    }

    m, err := migrate.NewWithDatabaseInstance(
        fmt.Sprintf("file://%s", migrationPath),
        "postgres", driver)
    if err != nil {
        return fmt.Errorf("failed to create migration instance: %w", err)
    }

    if err := m.Up(); err != nil && err != migrate.ErrNoChange {
        return fmt.Errorf("failed to run migrations: %w", err)
    }

    return nil
}
```

### Migration Best Practices

1. **Always create both up and down migrations**
2. **Test migrations on a copy of production data**
3. **Make migrations idempotent**
4. **Add proper indexes for performance**
5. **Include rollback procedures**

### Example Migration

```sql
-- 005_add_oauth_providers.up.sql
CREATE TABLE IF NOT EXISTS oauth_providers (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    name VARCHAR(100) NOT NULL UNIQUE,
    client_id VARCHAR(255) NOT NULL,
    client_secret_encrypted TEXT NOT NULL,
    authorization_url TEXT NOT NULL,
    token_url TEXT NOT NULL,
    user_info_url TEXT NOT NULL,
    scopes TEXT[] DEFAULT ARRAY['openid', 'profile', 'email'],
    enabled BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- IF NOT EXISTS keeps the index creation idempotent, like the table above
CREATE INDEX IF NOT EXISTS idx_oauth_providers_name ON oauth_providers(name);
CREATE INDEX IF NOT EXISTS idx_oauth_providers_enabled ON oauth_providers(enabled) WHERE enabled = true;
```

---

## Frontend Architecture

### Component Structure

```
src/
├── components/
│   ├── Applications.tsx         # Application management
│   ├── Tokens.tsx               # Token operations
│   ├── Users.tsx                # User management
│   ├── Audit.tsx                # Audit log viewer
│   ├── Dashboard.tsx            # Main dashboard
│   ├── Login.tsx                # Authentication
│   ├── TokenTester.tsx          # Token testing utility
│   └── TokenTesterCallback.tsx
├── contexts/
│   └── AuthContext.tsx          # Authentication state
├── services/
│   └── apiService.ts            # API client
├── App.tsx                      # Main application
└── index.tsx                    # Entry point
```

### API Service Implementation

```typescript
// File: kms-frontend/src/services/apiService.ts
// Application, AuditEvent, AuditQueryParams and APIError are assumed to be
// declared elsewhere; the generic type arguments shown here are assumptions.
class APIService {
  private baseURL: string;
  private token: string | null = null;

  constructor(baseURL: string) {
    this.baseURL = baseURL;
  }

  async request<T>(endpoint: string, options: RequestInit = {}): Promise<T> {
    const url = `${this.baseURL}${endpoint}`;
    const headers = {
      'Content-Type': 'application/json',
      'X-User-Email': this.getUserEmail(),
      ...options.headers,
    };

    const response = await fetch(url, {
      ...options,
      headers,
    });

    if (!response.ok) {
      const error = await response.json().catch(() => ({}));
      throw new APIError(error.message || 'Request failed', response.status);
    }

    return response.json();
  }

  // Application management
  async getApplications(): Promise<Application[]> {
    return this.request('/api/applications');
  }

  // Audit log access
  async getAuditEvents(params: AuditQueryParams): Promise<AuditEvent[]> {
    const queryString = new URLSearchParams(params).toString();
    return this.request(`/api/audit/events?${queryString}`);
  }
}
```

### Authentication Context

```typescript
// File: kms-frontend/src/contexts/AuthContext.tsx
interface AuthContextType {
  user: User | null;
  login: (email: string) => Promise<void>;
  logout: () => void;
  isAuthenticated: boolean;
  isLoading: boolean;
}

export const AuthContext = React.createContext<AuthContextType | null>(null);

export const AuthProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
  const [user, setUser] = useState<User | null>(null);
  const [isLoading, setIsLoading] = useState(true);

  const login = async (email: string) => {
    try {
      setIsLoading(true);
      const response = await apiService.login(email);
      setUser({ email, token: response.token });
      localStorage.setItem('kms_user', JSON.stringify({ email }));
    } finally {
      // Errors propagate to the caller; the loading flag is always reset
      setIsLoading(false);
    }
  };

  // ... rest of implementation
};
```

---

## Configuration Management

### Configuration Interface

```go
// File: internal/config/config.go
type ConfigProvider interface {
    GetString(key string) string
    GetInt(key string) int
    GetBool(key string) bool
    GetDuration(key string) time.Duration
    GetStringSlice(key string) []string
    IsSet(key string) bool
    Validate() error
    GetDatabaseDSN() string
    GetServerAddress() string
    IsDevelopment() bool
    IsProduction() bool
}
```

### Configuration Validation

```go
func (c *Config) Validate() error {
    var errs []string

    // Required configuration
    required := []string{
        "INTERNAL_HMAC_KEY",
        "JWT_SECRET",
        "AUTH_SIGNING_KEY",
        "DB_HOST",
        "DB_NAME",
    }

    for _, key := range required {
        if !c.IsSet(key) {
            errs = append(errs, fmt.Sprintf("required configuration %s is not set", key))
        }
    }

    // Validate key lengths
    if len(c.GetString("INTERNAL_HMAC_KEY")) < 32 {
        errs = append(errs, "INTERNAL_HMAC_KEY must be at least 32 characters")
    }

    if len(errs) > 0 {
        return fmt.Errorf("configuration validation failed: %s", strings.Join(errs, ", "))
    }

    return nil
}
```

### Environment Configuration

```bash
# Security Configuration
INTERNAL_HMAC_KEY=3924f352b7ea63b27db02bf4b0014f2961a5d2f7c27643853a4581bb3a5457cb
JWT_SECRET=7f5e11d55e957988b00ce002418680af384219ef98c50d08cbbbdd541978450c
AUTH_SIGNING_KEY=484f921b39c383e6b3e0cc5a7cef3c2cec3d7c8d474ab5102891dc4c2bf63a68

# Database Configuration
DB_HOST=postgres
DB_PORT=5432
DB_NAME=kms
DB_USER=postgres
DB_PASSWORD=postgres

# Feature Flags
RATE_LIMIT_ENABLED=true
CACHE_ENABLED=false
METRICS_ENABLED=true
SAML_ENABLED=false
```

---

## Implementation Best Practices

### Code Organization

1. **Follow clean architecture principles**
2. **Use dependency injection throughout**
3. **Implement comprehensive error handling**
4. **Add structured logging to all components**
5. **Write unit tests for business logic**

### Security Guidelines

1. **Always validate input at API boundaries**
2. **Use parameterized database queries**
3. **Implement proper authentication and authorization**
4. **Log all security-relevant events**
5. **Follow principle of least privilege**

### Performance Considerations

1. **Implement caching for frequently accessed data**
2. **Use database indexes appropriately**
3. **Monitor and optimize slow queries**
4. **Implement proper connection pooling**
5. **Use asynchronous operations where beneficial**

### Testing Strategy

1. **Unit tests for business logic**
2. **Integration tests for API endpoints**
3. **End-to-end tests for critical workflows**
4. **Load testing for performance validation**
5. **Security testing for vulnerability assessment**

---

*This document serves as a comprehensive implementation guide for the KMS system. It should be updated as the system evolves and new features are added.*