Debugging and Troubleshooting

Imagine you are a detective who has to solve a puzzling case. You have clues, but they are scattered across different locations, sometimes contradictory, and often incomplete. That is what debugging a complex NestJS application feels like. The difference is that, as a developer, you are not only the detective but also the person who can build the tools and strategies to solve such puzzles systematically.

Debugging and troubleshooting are skills that go beyond the code itself. They require analytical thinking, patience, and above all the right tools and techniques. In modern NestJS applications, which often consist of dozens of modules, communicate with external APIs, and run on complex infrastructure, effective debugging becomes a critical competency.

What makes debugging NestJS applications particularly challenging? The answer lies in the architecture itself. Dependency injection, decorators, guards, interceptors, and pipes create an elegant, modular structure, but they can also obscure the execution flow. A request may pass through several layers before it reaches the actual business logic, and problems can occur at each of them.

Think of a NestJS application as an orchestra. When a note sounds wrong, you have to find out which instrument is causing the problem and also understand how all the instruments work together and influence one another. This holistic view is the key to successful debugging.

31.1 Debugging Strategies for NestJS

The foundation of every successful debugging session is a systematic approach. Like a scientist running an experiment, you should form hypotheses, test them, and plan your next steps based on the results. This methodical approach keeps you out of the trap of “trial-and-error” debugging, in which you make random changes and hope that the problem disappears.

The first step of an effective debugging strategy is understanding the context of the problem. Does it occur only under certain conditions? Does it affect particular users or data? Is it new, or has it existed for a while? These questions help you narrow the search area and focus your debugging effort.

A proven technique is the “divide and conquer” principle. Instead of trying to understand the whole system at once, you split it into smaller, manageable parts. In NestJS this often means isolating and testing one module at a time.
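
Divide and conquer can also be applied mechanically. The following is a minimal sketch in plain TypeScript, not NestJS code: it bisects an ordered list of processing steps to find the first step after which a health check fails. The names `Step` and `findFirstBrokenStep` are hypothetical, and the sketch assumes that once a step has corrupted the data, the check keeps failing for every later step.

```typescript
// Hypothetical sketch: bisect an ordered list of steps to find the first one
// after which a health check fails. Assumes the failure is "sticky".
type Step<T> = { name: string; apply: (input: T) => T };

function findFirstBrokenStep<T>(
  steps: Step<T>[],
  input: T,
  isHealthy: (value: T) => boolean,
): string | undefined {
  // Run the first `count` steps and report whether the result is still healthy.
  const healthyAfter = (count: number): boolean =>
    isHealthy(steps.slice(0, count).reduce((value, step) => step.apply(value), input));

  if (healthyAfter(steps.length)) return undefined; // nothing is broken

  let lo = 0; // healthy after running `lo` steps (assumed for the raw input)
  let hi = steps.length; // unhealthy after running `hi` steps
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    if (healthyAfter(mid)) lo = mid;
    else hi = mid;
  }
  return steps[hi - 1]?.name;
}

const steps: Step<number>[] = [
  { name: 'parse', apply: (n) => n + 1 },
  { name: 'validate', apply: (n) => n * 2 },
  { name: 'enrich', apply: (n) => -n }, // deliberate bug: flips the sign
  { name: 'persist', apply: (n) => n - 1 },
];

const culprit = findFirstBrokenStep(steps, 1, (n) => n >= 0);
console.log(`first broken step: ${culprit}`);
```

The same idea carries over to modules: disable or stub half of the suspects, check whether the symptom persists, and repeat on the half that still fails.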

// debugging/src/utils/debug-logger.service.ts
// A specialized logger for debugging purposes
import { Injectable, Logger, LogLevel } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';

interface DebugContext {
  requestId?: string;
  userId?: string;
  module?: string;
  method?: string;
  metadata?: Record<string, any>;
}

@Injectable()
export class DebugLoggerService extends Logger {
  private readonly isDebugMode: boolean;
  private readonly debugModules: Set<string>;

  constructor(private readonly configService: ConfigService) {
    super('DebugLogger');
    
    // Determine debug mode from environment variables.
    // Environment values arrive as strings, so compare explicitly: the string 'false' would otherwise be truthy.
    this.isDebugMode = this.configService.get<string>('NODE_ENV') === 'development' ||
                       String(this.configService.get('DEBUG_MODE', 'false')).toLowerCase() === 'true';
    
    // Enable debugging for specific modules (comma-separated list)
    const debugModulesStr = this.configService.get<string>('DEBUG_MODULES', '');
    this.debugModules = new Set(debugModulesStr.split(',').map((name) => name.trim()).filter(Boolean));
  }

  /**
   * Extended debug logging method with context.
   * Helps you follow the execution flow.
   */
  debugWithContext(message: string, context: DebugContext, data?: any): void {
    if (!this.shouldLog(context.module)) {
      return;
    }

    const enrichedMessage = this.buildContextualMessage(message, context);
    
    if (data !== undefined) {
      // Complex objects are pretty-printed for readability
      this.debug(enrichedMessage);
      this.debug(`Data: ${JSON.stringify(data, null, 2)}`);
    } else {
      this.debug(enrichedMessage);
    }
  }

  /**
   * Traces the entry into a method.
   * Useful for understanding the execution flow.
   */
  traceMethodEntry(className: string, methodName: string, params?: any[], context?: DebugContext): void {
    if (!this.isDebugMode) return;

    const baseMessage = `→ Entering ${className}.${methodName}()`;
    const fullContext = { ...context, module: className, method: methodName };
    
    if (params && params.length > 0) {
      this.debugWithContext(baseMessage, fullContext, { parameters: params });
    } else {
      this.debugWithContext(baseMessage, fullContext);
    }
  }

  /**
   * Traces the exit from a method.
   * Helpful for inspecting return values and execution time.
   */
  traceMethodExit(className: string, methodName: string, result?: any, executionTime?: number, context?: DebugContext): void {
    if (!this.isDebugMode) return;

    const baseMessage = `← Exiting ${className}.${methodName}()`;
    const fullContext = { ...context, module: className, method: methodName };
    
    const data: any = {};
    if (result !== undefined) data.result = result;
    if (executionTime !== undefined) data.executionTimeMs = executionTime;
    
    this.debugWithContext(baseMessage, fullContext, Object.keys(data).length > 0 ? data : undefined);
  }

  /**
   * Logs database operations for performance analysis.
   */
  traceQuery(query: string, parameters?: any[], executionTime?: number, context?: DebugContext): void {
    if (!this.shouldLog('database')) return;

    const message = `🗄️ Database Query`;
    const fullContext = { ...context, module: 'database' };
    
    this.debugWithContext(message, fullContext, {
      query: query.replace(/\s+/g, ' ').trim(), // normalize whitespace
      parameters,
      executionTimeMs: executionTime,
    });
  }

  /**
   * Logs HTTP requests to external services.
   */
  traceHttpRequest(method: string, url: string, headers?: any, body?: any, context?: DebugContext): void {
    if (!this.shouldLog('http')) return;

    const message = `🌐 HTTP Request: ${method} ${url}`;
    const fullContext = { ...context, module: 'http' };
    
    this.debugWithContext(message, fullContext, {
      method,
      url,
      headers: this.sanitizeHeaders(headers),
      body: this.sanitizeBody(body),
    });
  }

  /**
   * Logs errors with additional context.
   * Essential for understanding error states.
   */
  traceError(error: Error, context: DebugContext, additionalData?: any): void {
    const message = `❌ Error in ${context.module || 'unknown'}.${context.method || 'unknown'}`;
    
    this.error(this.buildContextualMessage(message, context));
    this.error(`Error Type: ${error.name}`);
    this.error(`Error Message: ${error.message}`);
    
    if (error.stack) {
      this.error(`Stack Trace:\n${error.stack}`);
    }
    
    if (additionalData) {
      this.error(`Additional Data: ${JSON.stringify(additionalData, null, 2)}`);
    }
  }

  private shouldLog(module?: string): boolean {
    if (!this.isDebugMode) return false;
    if (!module) return true;
    if (this.debugModules.size === 0) return true;
    return this.debugModules.has(module) || this.debugModules.has('*');
  }

  private buildContextualMessage(message: string, context: DebugContext): string {
    const parts = [message];
    
    if (context.requestId) parts.push(`[Req: ${context.requestId}]`);
    if (context.userId) parts.push(`[User: ${context.userId}]`);
    if (context.metadata) {
      const metaStr = Object.entries(context.metadata)
        .map(([key, value]) => `${key}=${value}`)
        .join(', ');
      parts.push(`[${metaStr}]`);
    }
    
    return parts.join(' ');
  }

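  private sanitizeHeaders(headers: any): any {
    if (!headers) return undefined;
    
    // Drop sensitive headers before logging. Header names are case-insensitive,
    // so 'Authorization' and 'authorization' must both be caught.
    const sensitive = ['authorization', 'cookie', 'x-api-key'];
    const sanitized: Record<string, unknown> = {};
    for (const [name, value] of Object.entries(headers)) {
      if (!sensitive.includes(name.toLowerCase())) sanitized[name] = value;
    }
    
    return sanitized;
  }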

  private sanitizeBody(body: any): any {
    if (!body) return undefined;
    if (typeof body !== 'object') return body;
    
    const sanitized = { ...body };
    // Remove sensitive top-level fields (nested objects are not inspected)
    delete sanitized.password;
    delete sanitized.token;
    delete sanitized.secret;
    
    return sanitized;
  }
}
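
The sanitizers in the logger above only look at top-level keys. Where request bodies are nested, a recursive variant may be more appropriate. The following is a sketch under that assumption; `redactDeep` and `SENSITIVE_KEYS` are hypothetical names and not part of the service above.

```typescript
// Hypothetical sketch: recursive redaction of sensitive keys before logging.
const SENSITIVE_KEYS = new Set(['password', 'token', 'secret', 'authorization', 'cookie', 'x-api-key']);

function redactDeep(value: unknown, seen = new WeakSet<object>()): unknown {
  if (typeof value !== 'object' || value === null) return value;

  // Objects that were already visited (cycles or shared references) are
  // replaced by a marker so that the recursion always terminates.
  if (seen.has(value)) return '[Circular]';
  seen.add(value);

  if (Array.isArray(value)) return value.map((item) => redactDeep(item, seen));

  const result: Record<string, unknown> = {};
  for (const [key, inner] of Object.entries(value)) {
    // Key comparison is case-insensitive, so 'Authorization' is caught as well.
    result[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : redactDeep(inner, seen);
  }
  return result;
}

const redacted = redactDeep({
  user: { name: 'alice', password: 'hunter2' },
  headers: { Authorization: 'Bearer abc' },
  items: [{ token: 't1' }, 'plain'],
});
const redactedJson = JSON.stringify(redacted);
console.log(redactedJson);
```

The sketch copies plain objects and arrays only; class instances such as `Date` would lose their type in the copy, so a production version would need to special-case them.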

An important aspect of debugging in NestJS is understanding the decorator pipeline. Every request passes through several stages: guards, interceptors, pipes, and finally the route handler itself. A decorator built specifically for debugging can help you make this flow visible.
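
The order of these stages can be illustrated with a deliberately simplified model. The sketch below is plain TypeScript rather than NestJS code, and all names in it are made up for illustration. It shows the one property that matters most when reading trace output: an interceptor wraps the pipe and the handler, so it runs code both before and after them.

```typescript
// Simplified model of the request pipeline order, not actual NestJS code.
function runPipeline(input: string, trace: string[]): string {
  const guard = (): boolean => {
    trace.push('guard');
    return true; // a real guard would check authentication or roles
  };
  const pipe = (raw: string): string => {
    trace.push('pipe');
    return raw.trim(); // a real pipe would validate or transform the input
  };
  const handler = (value: string): string => {
    trace.push('handler');
    return `Hello ${value}`;
  };
  // The interceptor receives "everything that comes next" as a callback.
  const interceptor = (next: () => string): string => {
    trace.push('interceptor:before');
    const result = next();
    trace.push('interceptor:after');
    return result;
  };

  if (!guard()) throw new Error('Forbidden');
  return interceptor(() => handler(pipe(input)));
}

const trace: string[] = [];
const response = runPipeline('  Nest ', trace);
console.log(response, trace.join(' -> '));
```

In a real application, middleware runs before the guards and exception filters handle whatever is thrown along the way; both are omitted here to keep the model small.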

// debugging/src/decorators/debug-trace.decorator.ts
// A decorator for automatic method tracing
import { SetMetadata } from '@nestjs/common';

export const DEBUG_TRACE_KEY = 'debug_trace';

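/**
 * Decorator for automatic tracing of method calls.
 * Apply it to controller route handlers to get entry/exit logs. It only stores
 * metadata; the logging is done by DebugTraceInterceptor, and interceptors run
 * only for route handlers, so plain service methods are not traced this way.
 */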
export const DebugTrace = (options?: {
  logParams?: boolean;
  logResult?: boolean;
  logExecutionTime?: boolean;
}) => {
  const defaultOptions = {
    logParams: true,
    logResult: true,
    logExecutionTime: true,
    ...options,
  };
  
  return SetMetadata(DEBUG_TRACE_KEY, defaultOptions);
};

// debugging/src/interceptors/debug-trace.interceptor.ts
// The matching interceptor for the DebugTrace decorator
import { Injectable, NestInterceptor, ExecutionContext, CallHandler } from '@nestjs/common';
import { Reflector } from '@nestjs/core';
import { Observable } from 'rxjs';
import { tap } from 'rxjs/operators';
import { DebugLoggerService } from '../utils/debug-logger.service';
import { DEBUG_TRACE_KEY } from '../decorators/debug-trace.decorator';

@Injectable()
export class DebugTraceInterceptor implements NestInterceptor {
  constructor(
    private readonly debugLogger: DebugLoggerService,
    private readonly reflector: Reflector,
  ) {}

  intercept(context: ExecutionContext, next: CallHandler): Observable<any> {
    // Check whether the handler is marked with @DebugTrace
    const traceOptions = this.reflector.get<any>(DEBUG_TRACE_KEY, context.getHandler());
    
    // Only HTTP handlers are traced here; RPC and WebSocket contexts have no HTTP request object
    if (!traceOptions || context.getType() !== 'http') {
      return next.handle();
    }

    const className = context.getClass().name;
    const methodName = context.getHandler().name;
    const request = context.switchToHttp().getRequest();
    
    // Extract the relevant context information
    const debugContext = {
      requestId: request.requestId || request.headers['x-request-id'],
      userId: request.user?.id,
      module: className,
      method: methodName,
    };

    const startTime = Date.now();
    
    // Log the method entry
    if (traceOptions.logParams) {
      const params = this.extractMethodParameters(context);
      this.debugLogger.traceMethodEntry(className, methodName, params, debugContext);
    } else {
      this.debugLogger.traceMethodEntry(className, methodName, undefined, debugContext);
    }

    return next.handle().pipe(
      tap({
        next: (result) => {
          const executionTime = traceOptions.logExecutionTime ? Date.now() - startTime : undefined;
          const resultToLog = traceOptions.logResult ? result : undefined;
          
          this.debugLogger.traceMethodExit(className, methodName, resultToLog, executionTime, debugContext);
        },
        error: (error) => {
          const executionTime = Date.now() - startTime;
          
          this.debugLogger.traceError(error, debugContext, {
            executionTimeMs: executionTime,
            methodParameters: traceOptions.logParams ? this.extractMethodParameters(context) : undefined,
          });
        },
      }),
    );
  }

  private extractMethodParameters(context: ExecutionContext): any[] {
    const request = context.switchToHttp().getRequest();
    const params: any[] = [];
    
    // Collect the different kinds of parameters
    if (request.params && Object.keys(request.params).length > 0) {
      params.push({ type: 'params', data: request.params });
    }
    
    if (request.query && Object.keys(request.query).length > 0) {
      params.push({ type: 'query', data: request.query });
    }
    
    if (request.body && Object.keys(request.body).length > 0) {
      // Redact sensitive fields in the body before logging
      const sanitizedBody = this.sanitizeObject(request.body);
      params.push({ type: 'body', data: sanitizedBody });
    }
    
    return params;
  }

  private sanitizeObject(obj: any): any {
    if (typeof obj !== 'object' || obj === null) {
      return obj;
    }
    
    const sanitized = { ...obj };
    const sensitiveFields = ['password', 'token', 'secret', 'apiKey', 'authorization'];
    
    for (const field of sensitiveFields) {
      if (field in sanitized) {
        sanitized[field] = '[REDACTED]';
      }
    }
    
    return sanitized;
  }
}

Understanding NestJS exception flows is another critical aspect of debugging. Exceptions can be thrown at different points in the request pipeline, and it is important to understand how they propagate through the system.
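
One way to make such a flow visible is to tag every error with the stage in which it was first thrown. The following is a minimal sketch in plain TypeScript under simplifying assumptions: the stages are ordinary synchronous functions, and `StageError` and `inStage` are hypothetical helpers, not NestJS APIs.

```typescript
// Hypothetical sketch: record the pipeline stage in which an error originated.
class StageError extends Error {
  constructor(readonly stage: string, readonly original: unknown) {
    super(`[${stage}] ${original instanceof Error ? original.message : String(original)}`);
    this.name = 'StageError';
    // Keeps instanceof working even when compiling to older targets.
    Object.setPrototypeOf(this, StageError.prototype);
  }
}

function inStage<T>(stage: string, work: () => T): T {
  try {
    return work();
  } catch (error) {
    // Keep the innermost stage: an already tagged error passes through unchanged.
    if (error instanceof StageError) throw error;
    throw new StageError(stage, error);
  }
}

function handleRequest(body: { age?: unknown }): string {
  // The interceptor stage wraps the pipe and the handler.
  return inStage('interceptor', () => {
    const age = inStage('pipe', () => {
      if (typeof body.age !== 'number') throw new TypeError('age must be a number');
      return body.age;
    });
    return inStage('handler', () => `age=${age}`);
  });
}

let failedStage = 'none';
let failureMessage = '';
try {
  handleRequest({ age: 'forty' });
} catch (error) {
  if (error instanceof StageError) {
    failedStage = error.stage;
    failureMessage = error.message;
  }
}
console.log(failedStage, failureMessage);
```

The error surfaces in the outer stage but is still attributed to the inner one, which is the information you usually need when an exception filter reports a failure far from its origin.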

31.2 Performance Profiling

Performance problems are often like invisible illnesses: the symptoms are obvious (slow response times, high CPU usage), but the causes can be buried deep in the code. Performance profiling is the process of tracking down these hidden bottlenecks and understanding where your application spends time and resources.

Think of performance profiling as a health check for your application. Just as a doctor runs different tests to find out why you feel unwell, you use different profiling techniques to find out why your application is slow.

The first step in performance profiling is collecting baseline metrics. Without knowing how your application behaves under normal conditions, you cannot tell what is abnormal.
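
A baseline can be as simple as a few percentiles of response times recorded during normal operation. The sketch below assumes that latency samples in milliseconds are already available; the names `buildBaseline` and `isRegression` and the 20 % tolerance are illustrative choices, not recommendations from a particular tool.

```typescript
// Hypothetical sketch: derive a latency baseline and compare a later run against it.
interface Baseline {
  p50: number;
  p95: number;
}

// Nearest-rank percentile on an ascending, non-empty array.
function percentile(sortedValues: number[], p: number): number {
  if (sortedValues.length === 0) throw new Error('no samples');
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(0, rank - 1)];
}

function buildBaseline(samplesMs: number[]): Baseline {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return { p50: percentile(sorted, 50), p95: percentile(sorted, 95) };
}

// Flags a regression when the current p95 exceeds the baseline p95 by more than the tolerance.
function isRegression(baseline: Baseline, current: Baseline, tolerance = 0.2): boolean {
  return current.p95 > baseline.p95 * (1 + tolerance);
}

const baseline = buildBaseline([12, 15, 11, 14, 13, 40, 12, 16, 13, 14]);
const today = buildBaseline([18, 22, 19, 25, 21, 90, 20, 23, 19, 24]);
const regressed = isRegression(baseline, today);
console.log(baseline, today, regressed);
```

Percentiles are used instead of the mean because a single slow outlier can dominate an average while leaving the typical request unchanged.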

// profiling/src/services/performance-profiler.service.ts
// A comprehensive performance profiling system
import { Injectable, Logger } from '@nestjs/common';
import { performance, PerformanceObserver } from 'perf_hooks';
import * as os from 'os';
import * as process from 'process';

interface PerformanceMetrics {
  timestamp: Date;
  cpuUsage: NodeJS.CpuUsage;
  memoryUsage: NodeJS.MemoryUsage;
  systemLoad: number[];
  eventLoopDelay: number;
  activeHandles: number;
  activeRequests: number;
}

interface MethodPerformance {
  methodName: string;
  className: string;
  executionTime: number;
  memoryDelta: number;
  cpuTime: number;
  callCount: number;
  lastCalled: Date;
}

@Injectable()
export class PerformanceProfilerService {
  private readonly logger = new Logger(PerformanceProfilerService.name);
  
  // Stores performance metrics per method
  private readonly methodMetrics = new Map<string, MethodPerformance>();
  
  // Collects system metrics over time
  private readonly systemMetrics: PerformanceMetrics[] = [];
  private readonly maxMetricsHistory = 1000; // keep only the last 1000 entries
  
  // Performance observer for automatic measurements (assigned in initializePerformanceObserver)
  private performanceObserver!: PerformanceObserver;

  constructor() {
    this.initializePerformanceObserver();
    this.startSystemMetricsCollection();
  }

  /**
   * Starts a profiling measurement for a specific operation.
   * Call this at the beginning of a method you want to profile.
   */
  startProfiling(className: string, methodName: string): string {
    const profilingId = `${className}.${methodName}_${Date.now()}_${Math.random()}`;
    
    // Mark the start time
    performance.mark(`${profilingId}_start`);
    
    return profilingId;
  }

  /**
   * Ends the profiling measurement and logs the results.
   * Call this at the end of the method, passing the ID returned by startProfiling.
   */
  endProfiling(profilingId: string, additionalData?: Record<string, any>): PerformanceMetrics {
    const endMark = `${profilingId}_end`;
    const measureName = `${profilingId}_duration`;
    
    // Mark the end time
    performance.mark(endMark);
    
    // Measure the duration between start and end
    performance.measure(measureName, `${profilingId}_start`, endMark);
    
    const measure = performance.getEntriesByName(measureName)[0];
    const executionTime = measure.duration;
    
    // Recover class and method name from the profiling ID. The ID ends with
    // "_<timestamp>_<random>", so drop the last two segments instead of splitting
    // at the first underscore, which would truncate names such as "find_all".
    const classAndMethod = profilingId.split('_').slice(0, -2).join('_');
    const [className, methodName] = classAndMethod.split('.');
    
    // Update the per-method metrics
    this.updateMethodMetrics(className, methodName, executionTime);
    
    // Collect the current system metrics
    const currentMetrics = this.getCurrentSystemMetrics();
    
    // Log the performance data for later analysis
    this.logger.debug(`Performance: ${classAndMethod} executed in ${executionTime.toFixed(2)}ms`, {
      className,
      methodName,
      executionTime,
      systemMetrics: currentMetrics,
      additionalData,
    });
    
    // Clean up: remove the performance marks and the measure
    performance.clearMarks(`${profilingId}_start`);
    performance.clearMarks(endMark);
    performance.clearMeasures(measureName);
    
    return currentMetrics;
  }

  /**
   * Builds a detailed performance profile of the application.
   * Useful for analyzing performance trends and bottlenecks.
   */
  getPerformanceProfile(): {
    systemMetrics: PerformanceMetrics;
    methodMetrics: MethodPerformance[];
    topSlowMethods: MethodPerformance[];
    memoryHotspots: MethodPerformance[];
    recommendations: string[];
  } {
    const currentSystemMetrics = this.getCurrentSystemMetrics();
    const allMethodMetrics = Array.from(this.methodMetrics.values());
    
    // Identify the slowest methods (copy first: sort() mutates the array in place)
    const topSlowMethods = [...allMethodMetrics]
      .sort((a, b) => b.executionTime - a.executionTime)
      .slice(0, 10);
    
    // Identify memory hotspots
    const memoryHotspots = allMethodMetrics
      .filter(m => m.memoryDelta > 0)
      .sort((a, b) => b.memoryDelta - a.memoryDelta)
      .slice(0, 10);
    
    // Generate performance recommendations
    const recommendations = this.generatePerformanceRecommendations(currentSystemMetrics, allMethodMetrics);
    
    return {
      systemMetrics: currentSystemMetrics,
      methodMetrics: allMethodMetrics,
      topSlowMethods,
      memoryHotspots,
      recommendations,
    };
  }

  /**
   * Analyzes performance trends over time.
   * Helpful for detecting performance degradation.
   */
  getPerformanceTrends(timeWindowMinutes: number = 60): {
    averageExecutionTime: number;
    memoryTrend: 'increasing' | 'decreasing' | 'stable';
    cpuTrend: 'increasing' | 'decreasing' | 'stable';
    alertsGenerated: string[];
  } {
    const cutoffTime = new Date(Date.now() - timeWindowMinutes * 60 * 1000);
    const recentMetrics = this.systemMetrics.filter(m => m.timestamp > cutoffTime);
    
    if (recentMetrics.length < 2) {
      return {
        averageExecutionTime: 0,
        memoryTrend: 'stable',
        cpuTrend: 'stable',
        alertsGenerated: ['Insufficient data for trend analysis'],
      };
    }
    
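    // Compute averages
    const avgExecutionTime = this.calculateAverageExecutionTime(recentMetrics);
    
    // Analyze trends. process.cpuUsage() reports cumulative CPU time since process start,
    // so compare the per-interval deltas; the raw totals would always look 'increasing'.
    const memoryTrend = this.analyzeTrend(recentMetrics.map(m => m.memoryUsage.heapUsed));
    const cpuTotals = recentMetrics.map(m => m.cpuUsage.user + m.cpuUsage.system);
    const cpuTrend = this.analyzeTrend(cpuTotals.slice(1).map((total, i) => total - cpuTotals[i]));
    
    // Generate alerts for problematic trends
    const alerts = this.generatePerformanceAlerts(recentMetrics);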
    
    return {
      averageExecutionTime: avgExecutionTime,
      memoryTrend,
      cpuTrend,
      alertsGenerated: alerts,
    };
  }

  private initializePerformanceObserver(): void {
    this.performanceObserver = new PerformanceObserver((list) => {
      const entries = list.getEntries();
      
      for (const entry of entries) {
        // Automatically log slow operations
        if (entry.duration > 100) { // more than 100 ms
          this.logger.warn(`Slow operation detected: ${entry.name} took ${entry.duration.toFixed(2)}ms`);
        }
      }
    });
    
    // Observe several performance entry types
    this.performanceObserver.observe({ entryTypes: ['measure', 'function'] });
  }

  private startSystemMetricsCollection(): void {
    // Collect system metrics every 30 seconds
    setInterval(() => {
      const metrics = this.getCurrentSystemMetrics();
      
      this.systemMetrics.push(metrics);
      
      // Keep only the last N entries
      if (this.systemMetrics.length > this.maxMetricsHistory) {
        this.systemMetrics.shift();
      }
      
      // Log critical system states
      this.checkSystemHealth(metrics);
      
    }, 30000).unref(); // unref(): the sampling timer alone should not keep the process alive
  }

  private getCurrentSystemMetrics(): PerformanceMetrics {
    return {
      timestamp: new Date(),
      cpuUsage: process.cpuUsage(),
      memoryUsage: process.memoryUsage(),
      systemLoad: os.loadavg(),
      eventLoopDelay: this.measureEventLoopDelay(),
      activeHandles: (process as any)._getActiveHandles().length,
      activeRequests: (process as any)._getActiveRequests().length,
    };
  }

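  private measureEventLoopDelay(): number {
    // Simplified placeholder: the value computed inside the setImmediate callback
    // cannot be returned synchronously, so this method always returns 0.
    // A production application would use a dedicated tool, for example
    // monitorEventLoopDelay from perf_hooks.
    const start = process.hrtime.bigint();
    setImmediate(() => {
      const delay = Number(process.hrtime.bigint() - start) / 1000000; // convert nanoseconds to milliseconds
      return delay; // discarded: setImmediate ignores the callback's return value
    });
    return 0; // placeholder for this simplified implementation
  }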

  private updateMethodMetrics(className: string, methodName: string, executionTime: number): void {
    const key = `${className}.${methodName}`;
    const existing = this.methodMetrics.get(key);
    
    if (existing) {
      // Update existing metrics with an incremental mean over all calls.
      // (Averaging only the old value and the new sample would weight the latest call at 50%.)
      existing.callCount += 1;
      existing.executionTime += (executionTime - existing.executionTime) / existing.callCount;
      existing.lastCalled = new Date();
    } else {
      // Create new metrics
      this.methodMetrics.set(key, {
        methodName,
        className,
        executionTime,
        memoryDelta: 0, // would be computed in a complete implementation
        cpuTime: 0, // would be computed in a complete implementation
        callCount: 1,
        lastCalled: new Date(),
      });
    }
  }

  private checkSystemHealth(metrics: PerformanceMetrics): void {
    // Note: heapTotal is the heap V8 has currently reserved, not the configured limit,
    // so this ratio can be high even in a healthy process.
    const memoryUsagePercent = (metrics.memoryUsage.heapUsed / metrics.memoryUsage.heapTotal) * 100;
    const loadAverage = metrics.systemLoad[0]; // 1-minute load average
    
    // Memory warning
    if (memoryUsagePercent > 85) {
      this.logger.warn(`High memory usage detected: ${memoryUsagePercent.toFixed(1)}%`);
    }
    
    // CPU load warning
    const cpuCount = os.cpus().length;
    if (loadAverage > cpuCount * 0.8) {
      this.logger.warn(`High CPU load detected: ${loadAverage.toFixed(2)} (${cpuCount} cores available)`);
    }
    
    // Event loop blocking warning
    if (metrics.eventLoopDelay > 50) {
      this.logger.warn(`Event loop delay detected: ${metrics.eventLoopDelay.toFixed(2)}ms`);
    }
  }

  private calculateAverageExecutionTime(metrics: PerformanceMetrics[]): number {
    // This implementation is simplified: it averages the sampled event loop delay.
    // A real application would track the execution times of specific methods instead.
    return metrics.reduce((sum, m) => sum + m.eventLoopDelay, 0) / metrics.length;
  }

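  private analyzeTrend(values: number[]): 'increasing' | 'decreasing' | 'stable' {
    if (values.length < 3) return 'stable';
    
    // Compare the mean of the first third with the mean of the last third.
    // An integer window size keeps the divisor equal to the number of values summed.
    const windowSize = Math.floor(values.length / 3);
    const mean = (chunk: number[]) => chunk.reduce((a, b) => a + b, 0) / chunk.length;
    const first = mean(values.slice(0, windowSize));
    const last = mean(values.slice(-windowSize));
    
    // Avoid dividing by zero when the early values are all zero
    if (first === 0) return last > 0 ? 'increasing' : last < 0 ? 'decreasing' : 'stable';
    
    const percentChange = ((last - first) / Math.abs(first)) * 100;
    
    if (percentChange > 10) return 'increasing';
    if (percentChange < -10) return 'decreasing';
    return 'stable';
  }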

  private generatePerformanceRecommendations(
    systemMetrics: PerformanceMetrics,
    methodMetrics: MethodPerformance[]
  ): string[] {
    const recommendations: string[] = [];
    
    const memoryUsagePercent = (systemMetrics.memoryUsage.heapUsed / systemMetrics.memoryUsage.heapTotal) * 100;
    
    if (memoryUsagePercent > 80) {
      recommendations.push('Heap usage is high - review in-memory caches and long-lived collections for unbounded growth');
    }
    
    const slowMethods = methodMetrics.filter(m => m.executionTime > 100);
    if (slowMethods.length > 0) {
      recommendations.push(`Optimize ${slowMethods.length} methods with execution times > 100ms`);
    }
    
    if (systemMetrics.activeHandles > 100) {
      recommendations.push('High number of active handles detected - check for potential resource leaks');
    }
    
    return recommendations;
  }

  private generatePerformanceAlerts(metrics: PerformanceMetrics[]): string[] {
    const alerts: string[] = [];
    
    const latestMetrics = metrics[metrics.length - 1];
    const memoryUsagePercent = (latestMetrics.memoryUsage.heapUsed / latestMetrics.memoryUsage.heapTotal) * 100;
    
    if (memoryUsagePercent > 90) {
      alerts.push('CRITICAL: Memory usage above 90%');
    }
    
    if (latestMetrics.systemLoad[0] > os.cpus().length) {
      alerts.push('WARNING: System load exceeds CPU core count');
    }
    
    return alerts;
  }
}
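
The event loop delay in the service above is only a placeholder, because a delay measured inside a callback cannot be returned synchronously. A promise-based variant is one way around that. The sketch below uses only `setTimeout` and `Date.now()`, so its resolution is about one millisecond; the function name and the 5 ms default are assumptions made for illustration. Node.js also ships `monitorEventLoopDelay` in `perf_hooks`, which maintains a histogram and is usually the better choice for continuous monitoring.

```typescript
// Hypothetical sketch: measure how late a timer fires compared with its requested delay.
function measureEventLoopLag(requestedDelayMs = 5): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      // Anything beyond the requested delay is time the callback spent waiting
      // for the event loop to become free.
      const lag = Date.now() - scheduledAt - requestedDelayMs;
      resolve(Math.max(0, lag));
    }, requestedDelayMs);
  });
}

// Demonstration: schedule a measurement, then block the event loop for about 50 ms.
const pendingLag = measureEventLoopLag(5);
const blockUntil = Date.now() + 50;
while (Date.now() < blockUntil) {
  // busy wait, standing in for a long synchronous computation
}
pendingLag.then((lag) => console.log(`event loop lag: ${lag} ms`));
```

Because the timer cannot fire while the synchronous loop is running, the reported lag reflects the time the loop was blocked.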

Integrating performance profiling into an existing NestJS application should be as seamless as possible. A decorator-based approach lets you apply profiling selectively to critical methods without touching the method bodies.
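
The core of such an approach is a wrapper that times a function and records the result. The sketch below shows that wrapper as a plain higher-order function rather than as a decorator, so that it does not depend on a particular decorator configuration; a method decorator would wrap the original method in the same way. The names `profiled`, `record`, and `callStats` are hypothetical.

```typescript
// Hypothetical sketch: a timing wrapper that a profiling decorator could build on.
interface CallStats {
  calls: number;
  totalMs: number;
  maxMs: number;
}

const callStats = new Map<string, CallStats>();

function record(label: string, elapsedMs: number): void {
  const entry = callStats.get(label) ?? { calls: 0, totalMs: 0, maxMs: 0 };
  entry.calls += 1;
  entry.totalMs += elapsedMs;
  entry.maxMs = Math.max(entry.maxMs, elapsedMs);
  callStats.set(label, entry);
}

function profiled<A extends unknown[], R>(label: string, fn: (...args: A) => R): (...args: A) => R {
  return (...args: A): R => {
    const start = Date.now();
    let isAsync = false;
    try {
      const result = fn(...args);
      if (result instanceof Promise) {
        isAsync = true;
        // For async functions, record once the promise settles, success or failure.
        return result.then(
          (value) => { record(label, Date.now() - start); return value; },
          (error) => { record(label, Date.now() - start); throw error; },
        ) as unknown as R;
      }
      return result;
    } finally {
      // Synchronous path, including functions that throw.
      if (!isAsync) record(label, Date.now() - start);
    }
  };
}

const sumUpTo = profiled('sumUpTo', (n: number) => {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
});

const sum = sumUpTo(1000);
sumUpTo(10);
console.log(sum, callStats.get('sumUpTo'));
```

Keeping the wrapper separate from the decorator syntax also makes it easy to unit test and to reuse for functions that are not class methods.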

31.3 Memory Leak Detection

Memory leaks are like leaking pipes in your house: they may go unnoticed at first, but over time they can cause serious damage. In Node.js applications, memory leaks can cause memory consumption to rise continuously until the application finally crashes or is terminated by the system.

What makes memory leaks treacherous is that they often become visible only in production, under real load. In the development environment, where requests are sporadic and the application is restarted regularly, they usually go undetected.

First, understand how garbage collection works in Node.js. Node.js uses the V8 JavaScript engine, which automatically frees memory that is no longer reachable. A memory leak occurs when objects are no longer needed but remain in memory because something still references them.
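
A typical example is a cache that is only ever written to. The sketch below contrasts such a cache with a size-bounded variant; both classes are hypothetical illustrations and are not part of the detector service in this section. The bounded variant relies on the fact that a `Map` iterates its keys in insertion order.

```typescript
// Hypothetical sketch: an unbounded cache leaks, a bounded one does not.
class UnboundedCache<V> {
  private readonly entries = new Map<string, V>();

  set(key: string, value: V): void {
    // Every entry stays reachable through the map, so the garbage collector
    // can never reclaim it, even if nobody reads it again.
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}

class BoundedCache<V> {
  private readonly entries = new Map<string, V>();

  constructor(private readonly maxEntries: number) {}

  set(key: string, value: V): void {
    // Re-inserting moves the key to the end of the map's insertion order.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the one written longest ago.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

// Simulate 1000 requests that each cache a result under a unique key.
const leaky = new UnboundedCache<number[]>();
const bounded = new BoundedCache<number[]>(100);
for (let i = 0; i < 1000; i++) {
  const payload = [i, i + 1, i + 2];
  leaky.set(`req-${i}`, payload);
  bounded.set(`req-${i}`, payload);
}
console.log(`unbounded: ${leaky.size} entries, bounded: ${bounded.size} entries`);
```

The unbounded cache is not a bug in the garbage collector: every entry is still referenced, which is exactly the situation described above.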

// memory-analysis/src/services/memory-leak-detector.service.ts
// A system for detecting and analyzing memory leaks
import { Injectable, Logger, OnModuleInit } from '@nestjs/common';
import * as v8 from 'v8';
import * as fs from 'fs';
import * as path from 'path';

interface MemorySnapshot {
  timestamp: Date;
  heapUsed: number;
  heapTotal: number;
  external: number;
  rss: number;
  arrayBuffers: number;
  objectCounts: Map<string, number>;
}

interface LeakDetectionResult {
  isLeakDetected: boolean;
  leakRate: number; // MB per hour
  suspiciousObjects: string[];
  recommendations: string[];
  snapshotPath?: string;
}

@Injectable()
export class MemoryLeakDetectorService implements OnModuleInit {
  private readonly logger = new Logger(MemoryLeakDetectorService.name);
  
  // Stores memory snapshots over time
  private readonly memorySnapshots: MemorySnapshot[] = [];
  private readonly maxSnapshots = 100;
  
  // Tracking for specific object types
  private readonly objectCounters = new Map<string, number>();
  
  // Leak detection configuration
  private readonly leakDetectionConfig = {
    snapshotInterval: 5 * 60 * 1000, // 5 minutes
    leakThreshold: 10, // MB per hour
    minSnapshotsForDetection: 6, // at least 30 minutes of data
  };

  async onModuleInit() {
    // Start automatic memory monitoring
    this.startMemoryMonitoring();
    
    // Register process event handlers for memory warnings
    this.registerMemoryWarningHandlers();
  }

  /**
   * Creates a detailed memory snapshot.
   * Collects comprehensive information about the current memory state.
   */
  async createMemorySnapshot(): Promise<MemorySnapshot> {
    // Force garbage collection for more accurate measurements (development only;
    // global.gc is defined only when Node.js is started with --expose-gc)
    if (process.env.NODE_ENV === 'development' && global.gc) {
      global.gc();
    }
    
    const memoryUsage = process.memoryUsage();
    const heapStatistics = v8.getHeapStatistics();
    
    // Analyze the object types on the heap
    const objectCounts = await this.analyzeObjectTypes();
    
    const snapshot: MemorySnapshot = {
      timestamp: new Date(),
      heapUsed: memoryUsage.heapUsed,
      heapTotal: memoryUsage.heapTotal,
      external: memoryUsage.external,
      rss: memoryUsage.rss,
      arrayBuffers: memoryUsage.arrayBuffers,
      objectCounts,
    };
    
    // Store the snapshot in the history
    this.memorySnapshots.push(snapshot);
    
    // Keep only the last N snapshots
    if (this.memorySnapshots.length > this.maxSnapshots) {
      this.memorySnapshots.shift();
    }
    
    this.logger.debug('Memory snapshot created', {
      heapUsedMB: (snapshot.heapUsed / 1024 / 1024).toFixed(2),
      heapTotalMB: (snapshot.heapTotal / 1024 / 1024).toFixed(2),
      rssMB: (snapshot.rss / 1024 / 1024).toFixed(2),
    });
    
    return snapshot;
  }

  /**
   * Analyzes memory trends and detects potential leaks.
   * Uses statistical analysis to distinguish normal fluctuations from real leaks.
   */
  async detectMemoryLeaks(): Promise<LeakDetectionResult> {
    if (this.memorySnapshots.length < this.leakDetectionConfig.minSnapshotsForDetection) {
      return {
        isLeakDetected: false,
        leakRate: 0,
        suspiciousObjects: [],
        recommendations: ['Insufficient data for leak detection - need at least 30 minutes of monitoring'],
      };
    }
    
    // Analyze the heap size trend
    const heapTrend = this.analyzeHeapTrend();
    const objectGrowthAnalysis = this.analyzeObjectGrowth();
    
    // Compute the leak rate (MB per hour)
    const leakRate = this.calculateLeakRate();
    
    const isLeakDetected = leakRate > this.leakDetectionConfig.leakThreshold;
    
    let snapshotPath: string | undefined;
    if (isLeakDetected) {
      // Create a detailed heap snapshot for further analysis
      snapshotPath = await this.createHeapSnapshot();
    }
    
    const recommendations = this.generateLeakRecommendations(heapTrend, objectGrowthAnalysis, leakRate);
    
    return {
      isLeakDetected,
      leakRate,
      suspiciousObjects: objectGrowthAnalysis.rapidlyGrowingObjects,
      recommendations,
      snapshotPath,
    };
  }

  /**
   * Tracks instance counts of specific objects.
   * Useful for keeping an eye on objects that are known to be problematic.
   */
  trackObjectInstances(objectName: string, currentCount: number): void {
    const previousCount = this.objectCounters.get(objectName) || 0;
    this.objectCounters.set(objectName, currentCount);
    
    // Warn on an unusual increase
    if (currentCount > previousCount * 2 && previousCount > 0) {
      this.logger.warn(`Rapid increase in ${objectName} instances detected`, {
        previous: previousCount,
        current: currentCount,
        increasePercent: ((currentCount - previousCount) / previousCount * 100).toFixed(1),
      });
    }
  }

  /**
   * Creates a detailed report on the current memory state.
   * Helpful for analyzing and debugging memory problems.
   */
  generateMemoryReport(): {
    currentState: MemorySnapshot;
    trends: any;
    suspiciousPatterns: string[];
    historicalData: MemorySnapshot[];
  } {
    const currentState = this.memorySnapshots[this.memorySnapshots.length - 1];
    const trends = this.analyzeMemoryTrends();
    const suspiciousPatterns = this.identifySuspiciousPatterns();
    
    return {
      currentState,
      trends,
      suspiciousPatterns,
      historicalData: this.memorySnapshots.slice(-20), // Last 20 snapshots
    };
  }

  private startMemoryMonitoring(): void {
    // Automatic snapshot creation
    setInterval(async () => {
      try {
        await this.createMemorySnapshot();
        
        // Periodic leak detection
        const leakDetection = await this.detectMemoryLeaks();
        if (leakDetection.isLeakDetected) {
          this.logger.error('Memory leak detected!', {
            leakRate: leakDetection.leakRate,
            suspiciousObjects: leakDetection.suspiciousObjects,
            snapshotPath: leakDetection.snapshotPath,
          });
        }
        
      } catch (error) {
        this.logger.error('Error during memory monitoring:', error);
      }
    }, this.leakDetectionConfig.snapshotInterval);
  }

  private registerMemoryWarningHandlers(): void {
    // Handler for process warnings emitted by Node.js
    process.on('warning', (warning) => {
      if (warning.name === 'MaxListenersExceededWarning') {
        this.logger.warn('Potential EventEmitter memory leak detected', {
          warning: warning.message,
        });
      }
    });
    
    // Handler for non-critical memory pressure situations.
    // heapTotal grows on demand, so usage is measured against V8's heap size limit instead.
    if (process.memoryUsage && typeof process.memoryUsage === 'function') {
      setInterval(() => {
        const usage = process.memoryUsage();
        const heapLimit = v8.getHeapStatistics().heap_size_limit;
        const heapUsagePercent = (usage.heapUsed / heapLimit) * 100;
        
        if (heapUsagePercent > 85) {
          this.logger.warn('High heap usage detected', {
            heapUsagePercent: heapUsagePercent.toFixed(1),
            heapUsedMB: (usage.heapUsed / 1024 / 1024).toFixed(2),
            heapLimitMB: (heapLimit / 1024 / 1024).toFixed(2),
          });
        }
      }, 60000); // Check once per minute
    }
  }

  private async analyzeObjectTypes(): Promise<Map<string, number>> {
    const objectCounts = new Map<string, number>();
    
    // This implementation is simplified.
    // In a production application you would use tools such as @memlab/core
    // or native V8 profiling APIs for a more detailed object analysis.
    
    try {
      const heapStats = v8.getHeapStatistics();
      objectCounts.set('total_heap_size', heapStats.total_heap_size);
      objectCounts.set('used_heap_size', heapStats.used_heap_size);
      objectCounts.set('heap_size_limit', heapStats.heap_size_limit);
      
    } catch (error) {
      this.logger.error('Error analyzing object types:', error);
    }
    
    return objectCounts;
  }

  private analyzeHeapTrend(): { trend: 'increasing' | 'decreasing' | 'stable'; rate: number } {
    if (this.memorySnapshots.length < 3) {
      return { trend: 'stable', rate: 0 };
    }
    
    const recent = this.memorySnapshots.slice(-6); // Last 30 minutes, assuming a 5-minute snapshot interval
    const values = recent.map(s => s.heapUsed);
    
    // Simple linear regression (least squares) for the trend analysis
    const n = values.length;
    const sumX = Array.from({ length: n }, (_, i) => i).reduce((a, b) => a + b, 0);
    const sumY = values.reduce((a, b) => a + b, 0);
    const sumXY = values.reduce((sum, y, x) => sum + x * y, 0);
    const sumXX = Array.from({ length: n }, (_, i) => i * i).reduce((a, b) => a + b, 0);
    
    const slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    
    if (Math.abs(slope) < 1000) { // Less than roughly 1 KB per snapshot
      return { trend: 'stable', rate: slope };
    }
    
    return {
      trend: slope > 0 ? 'increasing' : 'decreasing',
      rate: slope,
    };
  }

  private analyzeObjectGrowth(): { rapidlyGrowingObjects: string[] } {
    const rapidlyGrowingObjects: string[] = [];
    
    if (this.memorySnapshots.length < 2) {
      return { rapidlyGrowingObjects };
    }
    
    const current = this.memorySnapshots[this.memorySnapshots.length - 1];
    const previous = this.memorySnapshots[this.memorySnapshots.length - 2];
    
    // Compare object counts between the two most recent snapshots
    for (const [objectType, currentCount] of current.objectCounts) {
      const previousCount = previous.objectCounts.get(objectType) || 0;
      
      if (previousCount > 0 && currentCount > previousCount * 1.5) {
        rapidlyGrowingObjects.push(objectType);
      }
    }
    
    return { rapidlyGrowingObjects };
  }

  private calculateLeakRate(): number {
    if (this.memorySnapshots.length < 2) {
      return 0;
    }
    
    const first = this.memorySnapshots[0];
    const last = this.memorySnapshots[this.memorySnapshots.length - 1];
    
    const timeDiffHours = (last.timestamp.getTime() - first.timestamp.getTime()) / (1000 * 60 * 60);
    const memoryDiffMB = (last.heapUsed - first.heapUsed) / (1024 * 1024);
    
    return timeDiffHours > 0 ? memoryDiffMB / timeDiffHours : 0;
  }

  private async createHeapSnapshot(): Promise<string> {
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
    const filename = `heap-snapshot-${timestamp}.heapsnapshot`;
    const filepath = path.join(process.cwd(), 'logs', filename);
    
    try {
      // Make sure the logs directory exists
      const logsDir = path.dirname(filepath);
      if (!fs.existsSync(logsDir)) {
        fs.mkdirSync(logsDir, { recursive: true });
      }
      
      // Create the heap snapshot and stream it to disk
      const heapSnapshot = v8.getHeapSnapshot();
      const writeStream = fs.createWriteStream(filepath);
      
      // Awaiting here lets the catch block below see stream errors as well
      return await new Promise<string>((resolve, reject) => {
        heapSnapshot.on('error', reject);
        writeStream.on('finish', () => resolve(filepath));
        writeStream.on('error', reject);
        heapSnapshot.pipe(writeStream);
      });
      
    } catch (error) {
      this.logger.error('Failed to create heap snapshot:', error);
      throw error;
    }
  }

  private generateLeakRecommendations(
    heapTrend: any,
    objectGrowth: any,
    leakRate: number
  ): string[] {
    const recommendations: string[] = [];
    
    if (leakRate > 20) {
      recommendations.push('CRITICAL: High memory leak rate detected - immediate investigation required');
    } else if (leakRate > 10) {
      recommendations.push('WARNING: Moderate memory leak detected - monitor closely');
    }
    
    if (heapTrend.trend === 'increasing') {
      recommendations.push('Heap size is continuously growing - check for retained references');
    }
    
    if (objectGrowth.rapidlyGrowingObjects.length > 0) {
      recommendations.push(`Investigate rapidly growing objects: ${objectGrowth.rapidlyGrowingObjects.join(', ')}`);
    }
    
    recommendations.push('Consider implementing explicit resource cleanup in finally blocks');
    recommendations.push('Review EventEmitter usage for proper listener removal');
    recommendations.push('Check for circular references in cached objects');
    
    return recommendations;
  }

  private analyzeMemoryTrends(): any {
    // Simplified trend analysis
    return {
      averageHeapUsage: this.memorySnapshots.reduce((sum, s) => sum + s.heapUsed, 0) / this.memorySnapshots.length,
      peakHeapUsage: Math.max(...this.memorySnapshots.map(s => s.heapUsed)),
      memoryGrowthRate: this.calculateLeakRate(),
    };
  }

  private identifySuspiciousPatterns(): string[] {
    const patterns: string[] = [];
    
    if (this.memorySnapshots.length < 3) {
      return patterns;
    }
    
    // Check for continuous growth across the last five snapshots
    const recentGrowth = this.memorySnapshots.slice(-5).every((snapshot, index, array) => {
      return index === 0 || snapshot.heapUsed > array[index - 1].heapUsed;
    });
    
    if (recentGrowth) {
      patterns.push('Continuous memory growth pattern detected');
    }
    
    // Check for memory spikes (more than twice the average heap usage)
    const averageHeap = this.memorySnapshots.reduce((sum, s) => sum + s.heapUsed, 0) / this.memorySnapshots.length;
    const hasSpikes = this.memorySnapshots.some(s => s.heapUsed > averageHeap * 2);
    
    if (hasSpikes) {
      patterns.push('Memory usage spikes detected');
    }
    
    return patterns;
  }
}
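
The trend check in analyzeHeapTrend() comes down to a least-squares slope over the snapshot index. The following is a minimal standalone sketch of that calculation, assuming equally spaced snapshots. The helper name `heapSlope` and the sample readings are made up for illustration and are not part of the service above.

```typescript
// Least-squares slope of y over x = 0, 1, ..., n-1 (result: bytes per snapshot).
// Hypothetical helper; it uses the same formula as analyzeHeapTrend().
function heapSlope(values: number[]): number {
  const n = values.length;
  if (n < 2) return 0; // a slope needs at least two points
  let sumX = 0;
  let sumY = 0;
  let sumXY = 0;
  let sumXX = 0;
  values.forEach((y, x) => {
    sumX += x;
    sumY += y;
    sumXY += x * y;
    sumXX += x * x;
  });
  return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
}

const MiB = 1024 * 1024;

// Made-up heapUsed readings: a steady climb of 2 MiB per snapshot
const growing = [100, 102, 104, 106, 108, 110].map(mb => mb * MiB);

// Made-up sawtooth as produced by regular garbage collection: no net growth
const sawtooth = [100, 110, 100, 110, 100].map(mb => mb * MiB);

console.log((heapSlope(growing) / MiB).toFixed(2));  // prints "2.00"
console.log((heapSlope(sawtooth) / MiB).toFixed(2)); // prints "0.00"
```

The sawtooth case shows why a fitted slope is more robust than comparing two neighboring snapshots: any single pair differs by 10 MiB, yet the overall trend is flat.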

31.4 Database Query Analysis

Database queries are often the Achilles' heel of web applications. Imagine your database as a library and every query as a librarian who has to find a particular book. An efficient librarian knows exactly where to look and finds the book quickly. An inefficient one searches every shelf at random. Poorly optimized database queries cause similar problems in NestJS applications: they slow down not only the specific operation but can degrade the entire system.

Database query analysis goes beyond simply logging SQL statements. It means understanding how your queries are executed, which resources they consume, and how they behave under different loads. Modern ORMs such as TypeORM abstract away many details, but that abstraction can also let inefficient queries go unnoticed.

// database-analysis/src/services/query-analyzer.service.ts
// A comprehensive system for analyzing database queries
import { Injectable, Logger } from '@nestjs/common';
import { DataSource, QueryRunner } from 'typeorm';
import { performance } from 'perf_hooks';

interface QueryAnalysis {
  queryId: string;
  sql: string;
  parameters: any[];
  executionTime: number;
  timestamp: Date;
  affectedRows?: number;
  executionPlan?: any;
  warnings: string[];
  optimizationSuggestions: string[];
}

interface QueryPerformanceMetrics {
  totalQueries: number;
  averageExecutionTime: number;
  slowestQueries: QueryAnalysis[];
  mostFrequentQueries: Map<string, number>;
  nPlusOneDetections: number;
}

@Injectable()
export class QueryAnalyzerService {
  private readonly logger = new Logger(QueryAnalyzerService.name);
  
  // Collects all query analyses for later evaluation
  private readonly queryHistory: QueryAnalysis[] = [];
  private readonly maxHistorySize = 1000;
  
  // Tracks query frequencies for N+1 detection
  private readonly queryFrequency = new Map<string, number>();
  private readonly suspiciousQueryPatterns = new Set<string>();

  constructor(private readonly dataSource: DataSource) {
    this.setupQueryLogging();
  }

  /**
   * Analyzes a specific database query.
   * This method executes the statement and performs a detailed analysis, including the execution plan.
   */
  async analyzeQuery(sql: string, parameters: any[] = []): Promise<QueryAnalysis> {
    const queryId = this.generateQueryId(sql, parameters);
    const startTime = performance.now();
    
    let queryRunner: QueryRunner | undefined;
    
    try {
      queryRunner = this.dataSource.createQueryRunner();
      await queryRunner.connect();
      
      // Execute the query and collect metrics
      const result = await queryRunner.query(sql, parameters);
      const executionTime = performance.now() - startTime;
      
      // Fetch the execution plan (PostgreSQL-specific, adaptable to other databases)
      const executionPlan = await this.getExecutionPlan(queryRunner, sql, parameters);
      
      // Check the query for potential problems
      const warnings = this.identifyQueryWarnings(sql, executionTime, result);
      const optimizationSuggestions = this.generateOptimizationSuggestions(sql, executionPlan, executionTime);
      
      const analysis: QueryAnalysis = {
        queryId,
        sql: this.normalizeQuery(sql),
        parameters,
        executionTime,
        timestamp: new Date(),
        affectedRows: Array.isArray(result) ? result.length : result?.affectedRows,
        executionPlan,
        warnings,
        optimizationSuggestions,
      };
      
      this.recordQueryAnalysis(analysis);
      
      // Warn on slow queries
      if (executionTime > 100) {
        this.logger.warn(`Slow query detected (${executionTime.toFixed(2)}ms): ${this.normalizeQuery(sql)}`);
      }
      
      return analysis;
      
    } catch (error) {
      const executionTime = performance.now() - startTime;
      // Caught values are not guaranteed to be Error instances
      const message = error instanceof Error ? error.message : String(error);
      
      this.logger.error(`Query analysis failed (${executionTime.toFixed(2)}ms): ${sql}`, error);
      
      return {
        queryId,
        sql: this.normalizeQuery(sql),
        parameters,
        executionTime,
        timestamp: new Date(),
        warnings: [`Query execution failed: ${message}`],
        optimizationSuggestions: [],
      };
      
    } finally {
      if (queryRunner) {
        await queryRunner.release();
      }
    }
  }

  /**
   * Detects N+1 query problems by analyzing query patterns.
   * N+1 problems are among the most common performance killers in ORM-based applications.
   */
  detectNPlusOneProblems(): {
    detectedProblems: Array<{
      pattern: string;
      occurrences: number;
      suggestedSolution: string;
    }>;
    totalNPlusOneQueries: number;
  } {
    const problems: Array<{ pattern: string; occurrences: number; suggestedSolution: string }> = [];
    let totalNPlusOneQueries = 0;
    
    // Analyze recent query patterns (here: the last 100 queries)
    const recentQueries = this.queryHistory.slice(-100);
    const queryPatterns = new Map<string, QueryAnalysis[]>();
    
    // Group similar queries
    for (const query of recentQueries) {
      const pattern = this.extractQueryPattern(query.sql);
      if (!queryPatterns.has(pattern)) {
        queryPatterns.set(pattern, []);
      }
      queryPatterns.get(pattern)!.push(query);
    }
    
    // Look for suspicious patterns
    for (const [pattern, queries] of queryPatterns) {
      // N+1 indicators:
      // 1. Many similar SELECT queries within a short time
      // 2. Queries that differ only in their parameters
      // 3. Queries with WHERE clauses on foreign keys
      
      if (queries.length > 5 && this.isSelectQuery(pattern) && this.hasForeignKeyFilter(pattern)) {
        const timeDifference = queries[queries.length - 1].timestamp.getTime() - queries[0].timestamp.getTime();
        
        // Many similar queries ran in less than 1 second
        if (timeDifference < 1000) {
          problems.push({
            pattern: pattern,
            occurrences: queries.length,
            suggestedSolution: this.suggestNPlusOneSolution(pattern),
          });
          
          totalNPlusOneQueries += queries.length;
        }
      }
    }
    
    return {
      detectedProblems: problems,
      totalNPlusOneQueries,
    };
  }

  /**
   * Generates a comprehensive performance report.
   * Useful for regular performance reviews and optimization work.
   */
  generatePerformanceReport(): QueryPerformanceMetrics {
    const totalQueries = this.queryHistory.length;
    // Avoid NaN when no queries have been recorded yet
    const averageExecutionTime = totalQueries > 0
      ? this.queryHistory.reduce((sum, q) => sum + q.executionTime, 0) / totalQueries
      : 0;
    
    // The 10 slowest queries
    const slowestQueries = this.queryHistory
      .slice() // Copy so the sort does not mutate the history
      .sort((a, b) => b.executionTime - a.executionTime)
      .slice(0, 10);
    
    // Most frequent query patterns
    const patternFrequency = new Map<string, number>();
    for (const query of this.queryHistory) {
      const pattern = this.extractQueryPattern(query.sql);
      patternFrequency.set(pattern, (patternFrequency.get(pattern) || 0) + 1);
    }
    
    const nPlusOneDetection = this.detectNPlusOneProblems();
    
    return {
      totalQueries,
      averageExecutionTime,
      slowestQueries,
      mostFrequentQueries: patternFrequency,
      nPlusOneDetections: nPlusOneDetection.totalNPlusOneQueries,
    };
  }

  /**
   * Monitors query performance in near real time.
   * Logs alerts on critical performance problems.
   */
  async monitorQueryPerformance(): Promise<void> {
    const recentQueries = this.queryHistory.slice(-50); // Last 50 queries
    
    if (recentQueries.length === 0) return;
    
    const averageTime = recentQueries.reduce((sum, q) => sum + q.executionTime, 0) / recentQueries.length;
    const slowQueries = recentQueries.filter(q => q.executionTime > 200);
    
    // Alert on an elevated average query time
    if (averageTime > 100) {
      this.logger.warn(`High average query execution time detected: ${averageTime.toFixed(2)}ms`, {
        recentQueriesCount: recentQueries.length,
        slowQueriesCount: slowQueries.length,
      });
    }
    
    // Alert when more than 20% of the recent queries are slow
    if (slowQueries.length > recentQueries.length * 0.2) {
      this.logger.error(`High percentage of slow queries detected: ${((slowQueries.length / recentQueries.length) * 100).toFixed(1)}%`);
    }
    
    // N+1-Detection
    const nPlusOneProblems = this.detectNPlusOneProblems();
    if (nPlusOneProblems.detectedProblems.length > 0) {
      this.logger.error(`N+1 query problems detected:`, nPlusOneProblems.detectedProblems);
    }
  }

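  private setupQueryLogging(): void {
    // This implementation depends on the ORM in use.
    // For TypeORM, a custom logger is the cleaner alternative to patching query().
    // The wrapper only times the original call: routing it through analyzeQuery()
    // would execute every statement a second time.
    const originalQuery = this.dataSource.query.bind(this.dataSource);
    
    this.dataSource.query = (async (sql: string, parameters?: any[], queryRunner?: QueryRunner): Promise<any> => {
      const startTime = performance.now();
      const result = await originalQuery(sql, parameters, queryRunner);
      const executionTime = performance.now() - startTime;
      
      this.recordQueryAnalysis({
        queryId: this.generateQueryId(sql, parameters ?? []),
        sql: this.normalizeQuery(sql),
        parameters: parameters ?? [],
        executionTime,
        timestamp: new Date(),
        affectedRows: Array.isArray(result) ? result.length : result?.affectedRows,
        warnings: this.identifyQueryWarnings(sql, executionTime, result),
        optimizationSuggestions: this.generateOptimizationSuggestions(sql, null, executionTime),
      });
      
      return result;
    }) as DataSource['query'];
  }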

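  private async getExecutionPlan(queryRunner: QueryRunner, sql: string, parameters: any[]): Promise<any> {
    try {
      // PostgreSQL: EXPLAIN ANALYZE executes the statement, so it is used for SELECTs only.
      // For writes, plain EXPLAIN returns the estimated plan without running the statement again.
      const options = this.isSelectQuery(sql.trim()) ? 'ANALYZE, BUFFERS, FORMAT JSON' : 'FORMAT JSON';
      const explainQuery = `EXPLAIN (${options}) ${sql}`;
      const plan = await queryRunner.query(explainQuery, parameters);
      return plan[0]['QUERY PLAN'][0];
    } catch (error) {
      // Fallback for databases or statements that do not support EXPLAIN
      return null;
    }
  }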

  private identifyQueryWarnings(sql: string, executionTime: number, result: any): string[] {
    const warnings: string[] = [];
    
    // Performance warnings
    if (executionTime > 500) {
      warnings.push(`Very slow query: ${executionTime.toFixed(2)}ms execution time`);
    }
    
    // Query pattern warnings (word boundaries avoid false hits such as "order" for OR)
    const isSelect = /^\s*select\b/i.test(sql);
    
    if (/\bselect\s+\*/i.test(sql)) {
      warnings.push('Using SELECT * - consider specifying specific columns');
    }
    
    if (isSelect && !/\blimit\b/i.test(sql)) {
      warnings.push('No LIMIT clause found - potential for large result sets');
    }
    
    if (/\bor\b/i.test(sql)) {
      warnings.push('OR clause detected - may prevent index usage');
    }
    
    // Large result sets
    if (Array.isArray(result) && result.length > 1000) {
      warnings.push(`Large result set: ${result.length} rows returned`);
    }
    
    return warnings;
  }

  private generateOptimizationSuggestions(sql: string, executionPlan: any, executionTime: number): string[] {
    const suggestions: string[] = [];
    
    if (executionTime > 100) {
      suggestions.push('Consider adding appropriate database indexes');
      suggestions.push('Review WHERE clause for optimization opportunities');
    }
    
    if (executionPlan) {
      // Inspect the execution plan for specific optimizations
      if (this.hasSequentialScan(executionPlan)) {
        suggestions.push('Sequential scan detected - consider adding index');
      }
      
      if (this.hasNestedLoop(executionPlan)) {
        suggestions.push('Nested loop join detected - verify join conditions');
      }
    }
    
    if (/\blike\s+'%/i.test(sql)) {
      suggestions.push('LIKE with leading wildcard prevents index usage - consider full-text search');
    }
    
    return suggestions;
  }

  private recordQueryAnalysis(analysis: QueryAnalysis): void {
    this.queryHistory.push(analysis);
    
    // Keep the history size bounded
    if (this.queryHistory.length > this.maxHistorySize) {
      this.queryHistory.shift();
    }
    
    // Update Query-Frequency-Tracking
    const pattern = this.extractQueryPattern(analysis.sql);
    this.queryFrequency.set(pattern, (this.queryFrequency.get(pattern) || 0) + 1);
  }

  private generateQueryId(sql: string, parameters: any[]): string {
    // The ID combines a timestamp with the first 50 characters of the normalized SQL;
    // the parameters are not part of it
    const normalizedSql = this.normalizeQuery(sql);
    return `query_${Date.now()}_${normalizedSql.substring(0, 50).replace(/\s+/g, '_')}`;
  }

  private normalizeQuery(sql: string): string {
    return sql.replace(/\s+/g, ' ').trim();
  }

  private extractQueryPattern(sql: string): string {
    // Replaces parameter placeholders and literals for pattern detection
    return sql
      .replace(/\$\d+/g, '?') // PostgreSQL parameters
      .replace(/'[^']*'/g, '?') // String literals
      .replace(/\b\d+\b/g, '?') // Numeric literals
      .replace(/\s+/g, ' ')
      .trim();
  }

  private isSelectQuery(pattern: string): boolean {
    return pattern.toLowerCase().startsWith('select');
  }

  private hasForeignKeyFilter(pattern: string): boolean {
    // Simplified heuristic for foreign key filters
    return /where.*_id\s*=\s*\?/i.test(pattern);
  }

  private suggestNPlusOneSolution(pattern: string): string {
    if (/\bjoin\b/i.test(pattern)) {
      return 'Consider eager loading via the relations option or leftJoinAndSelect in TypeORM';
    }
    return 'Use batch loading or eager loading to reduce N+1 queries';
  }

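  private hasSequentialScan(plan: any): boolean {
    return this.planContainsNode(plan?.Plan, 'Seq Scan');
  }

  private hasNestedLoop(plan: any): boolean {
    return this.planContainsNode(plan?.Plan, 'Nested Loop');
  }

  // Walks the whole plan tree; PostgreSQL nests child nodes under "Plans"
  private planContainsNode(node: any, nodeType: string): boolean {
    if (!node) return false;
    if (node['Node Type'] === nodeType) return true;
    return (node.Plans ?? []).some((child: any) => this.planContainsNode(child, nodeType));
  }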
}
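
To make the N+1 heuristic more tangible, here is a small standalone sketch of the two steps it relies on: collapsing queries to a pattern and counting how often each pattern occurs. The function names `toPattern` and `countPatterns` as well as the query log are invented for this illustration. The regular expressions are the same simplified ones used in the service above, so the same limitations apply.

```typescript
// Replace placeholders and literals so that queries differing only in values share one pattern.
function toPattern(sql: string): string {
  return sql
    .replace(/\$\d+/g, '?')   // PostgreSQL placeholders ($1, $2, ...)
    .replace(/'[^']*'/g, '?') // string literals
    .replace(/\b\d+\b/g, '?') // numeric literals
    .replace(/\s+/g, ' ')
    .trim();
}

// Count how often each pattern occurs in a list of SQL statements.
function countPatterns(queries: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  queries.forEach(query => {
    const pattern = toPattern(query);
    counts.set(pattern, (counts.get(pattern) ?? 0) + 1);
  });
  return counts;
}

// Made-up query log: one query for the posts, then one query per post for its comments
const queryLog = [
  'SELECT * FROM posts WHERE author_id = 7',
  ...[1, 2, 3, 4, 5, 6].map(id => `SELECT * FROM comments WHERE post_id = ${id}`),
];

// Same thresholds as the service: more than 5 occurrences and a filter on an *_id column
const suspects = Array.from(countPatterns(queryLog).entries()).filter(
  ([pattern, occurrences]) => occurrences > 5 && /where.*_id\s*=\s*\?/i.test(pattern),
);

console.log(suspects); // one entry: the comments query pattern with 6 occurrences
```

In TypeORM, a pattern like this usually disappears once the relation is loaded together with the parent entities, for example through the `relations` option of `find()` or through `leftJoinAndSelect()` on a query builder.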

31.5 Distributed Tracing

In modern microservice architectures, a single user request is like a relay race: it is handed from one service to the next, and each service does part of the work. Distributed tracing is like a GPS that follows the request along its entire route and shows you where it went, how long it spent at each point, and where problems may have occurred.

Imagine trying to find out why an online order is being processed slowly. The request might pass through an authentication service, a product catalog service, an inventory service, a payment service, and finally a fulfillment service. Without distributed tracing, this would be like looking for a needle in a haystack: you would not know which service is causing the problem.
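
What ties the hops of such a request together is a small piece of context that travels with every call. In HTTP this is commonly the W3C `traceparent` header. The sketch below shows how that header can be parsed and written. It assumes the version `00` layout only, and the names `parseTraceparent` and `formatTraceparent` are chosen for this illustration.

```typescript
interface ParsedTraceparent {
  version: string;
  traceId: string;  // 32 lowercase hex characters, shared by all spans of a trace
  parentId: string; // 16 lowercase hex characters, the span ID of the caller
  sampled: boolean; // lowest bit of the trace flags
}

// Layout: version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): ParsedTraceparent | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version ff and all-zero IDs are invalid
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

function formatTraceparent(traceId: string, spanId: string, sampled: boolean): string {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}

// Incoming request: keep the trace ID, remember the caller's span as parent
const incoming = parseTraceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01');
console.log(incoming?.traceId);  // prints "4bf92f3577b34da6a3ce929d0e0e4736"
console.log(incoming?.parentId); // prints "00f067aa0ba902b7"

// Outgoing request: same trace ID, but the span ID of the current service (made-up value)
if (incoming) {
  console.log(formatTraceparent(incoming.traceId, 'b7ad6b7169203331', incoming.sampled));
}
```

The trace ID stays constant across all services, while each hop replaces the parent ID with its own span ID. That is what allows a tracing backend to rebuild the call tree afterwards.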

// tracing/src/services/distributed-tracer.service.ts
// A comprehensive distributed tracing system for NestJS applications
import { Injectable, Logger } from '@nestjs/common';
import { AsyncLocalStorage } from 'async_hooks';
import { v4 as uuidv4 } from 'uuid';

interface TraceContext {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  baggage?: Record<string, string>;
  startTime: bigint;
  serviceName: string;
  operationName: string;
}

interface Span {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  serviceName: string;
  operationName: string;
  startTime: bigint;
  endTime?: bigint;
  duration?: number;
  tags: Record<string, any>;
  logs: Array<{ timestamp: bigint; fields: Record<string, any> }>;
  status: 'ok' | 'error' | 'timeout';
  errorMessage?: string;
}

@Injectable()
export class DistributedTracerService {
  private readonly logger = new Logger(DistributedTracerService.name);
  
  // AsyncLocalStorage for trace context propagation
  private readonly asyncLocalStorage = new AsyncLocalStorage<TraceContext>();
  
  // Stores completed spans for export
  private readonly completedSpans: Span[] = [];
  private readonly maxSpansInMemory = 10000;
  
  // Name of this service
  private readonly serviceName = process.env.SERVICE_NAME || 'nestjs-app';

  /**
   * Starts a new trace or creates a child span.
   * This method is the entry point for all tracing operations.
   */
  startSpan(operationName: string, parentContext?: TraceContext): TraceContext {
    const spanId = this.generateSpanId();
    const traceId = parentContext?.traceId || this.generateTraceId();
    const parentSpanId = parentContext?.spanId;
    
    const context: TraceContext = {
      traceId,
      spanId,
      parentSpanId,
      baggage: parentContext?.baggage || {},
      startTime: process.hrtime.bigint(),
      serviceName: this.serviceName,
      operationName,
    };
    
    this.logger.debug(`Started span: ${operationName}`, {
      traceId,
      spanId,
      parentSpanId,
    });
    
    return context;
  }

  /**
   * Finishes a span and records the results.
   * Collects all relevant information for later analysis.
   */
  finishSpan(
    context: TraceContext,
    tags: Record<string, any> = {},
    error?: Error
  ): Span {
    const endTime = process.hrtime.bigint();
    const duration = Number(endTime - context.startTime) / 1000000; // Convert nanoseconds to milliseconds
    
    const span: Span = {
      traceId: context.traceId,
      spanId: context.spanId,
      parentSpanId: context.parentSpanId,
      serviceName: context.serviceName,
      operationName: context.operationName,
      startTime: context.startTime,
      endTime,
      duration,
      tags: {
        'service.name': context.serviceName,
        'span.kind': this.inferSpanKind(context.operationName),
        ...tags,
      },
      logs: [],
      status: error ? 'error' : 'ok',
      errorMessage: error?.message,
    };
    
    // Add error-specific tags
    if (error) {
      span.tags['error'] = true;
      span.tags['error.type'] = error.name;
      span.tags['error.message'] = error.message;
      if (error.stack) {
        span.tags['error.stack'] = error.stack;
      }
    }
    
    this.recordSpan(span);
    
    this.logger.debug(`Finished span: ${context.operationName} (${duration.toFixed(2)}ms)`, {
      traceId: context.traceId,
      spanId: context.spanId,
      duration,
      status: span.status,
    });
    
    return span;
  }

  /**
   * Runs an operation with automatic tracing.
   * This utility method simplifies tracing for simple operations.
   */
  async traceOperation<T>(
    operationName: string,
    operation: (context: TraceContext) => Promise<T>,
    tags: Record<string, any> = {}
  ): Promise<T> {
    // Use the active context (if any) as parent so nested calls become child spans
    const context = this.startSpan(operationName, this.getCurrentContext());
    
    try {
      // Run the operation inside the trace context
      const result = await this.asyncLocalStorage.run(context, async () => {
        return await operation(context);
      });
      
      this.finishSpan(context, tags);
      return result;
      
    } catch (error) {
      this.finishSpan(context, tags, error instanceof Error ? error : new Error(String(error)));
      throw error;
    }
  }

  /**
   * Adds structured logs to an active span.
   * Helpful for recording important events while the span is running.
   */
  logToSpan(fields: Record<string, any>, context?: TraceContext): void {
    const spanContext = context || this.getCurrentContext();
    if (!spanContext) {
      this.logger.warn('Attempted to log to span, but no active span found');
      return;
    }
    
    // Find the matching span (if it is still active).
    // A complete implementation would keep track of active spans.
    this.logger.debug(`Span log: ${spanContext.operationName}`, {
      traceId: spanContext.traceId,
      spanId: spanContext.spanId,
      ...fields,
    });
  }

  /**
   * Extracts the trace context from HTTP headers.
   * Follows the header layout of the W3C Trace Context specification.
   */
  extractTraceContext(headers: Record<string, string>): TraceContext | null {
    // Only traceparent carries the full context; x-trace-id holds a bare trace ID
    const traceparent = headers['traceparent'];
    
    if (!traceparent) {
      return null;
    }
    
    // Parse W3C traceparent header: version-trace_id-parent_id-trace_flags
    const parts = traceparent.split('-');
    if (parts.length !== 4) {
      this.logger.warn(`Invalid traceparent header format: ${traceparent}`);
      return null;
    }
    
    // Version and flags are not evaluated here
    const [, traceId, parentSpanId] = parts;
    
    // Extract baggage (if present)
    const baggage: Record<string, string> = {};
    const baggageHeader = headers['baggage'];
    if (baggageHeader) {
      const baggageItems = baggageHeader.split(',');
      for (const item of baggageItems) {
        // Split on the first "=" only, since values may contain further "=" characters
        const separatorIndex = item.indexOf('=');
        if (separatorIndex <= 0) continue;
        const key = item.slice(0, separatorIndex).trim();
        const value = item.slice(separatorIndex + 1).trim();
        if (key && value) {
          baggage[key] = decodeURIComponent(value);
        }
      }
    }
    
    return {
      traceId,
      spanId: this.generateSpanId(), // New span for this service
      parentSpanId,
      baggage,
      startTime: process.hrtime.bigint(),
      serviceName: this.serviceName,
      operationName: 'http_request',
    };
  }

  /**
   * Injects the trace context into HTTP headers for outgoing requests.
   * Ensures context propagation between services.
   */
  injectTraceContext(context: TraceContext): Record<string, string> {
    const headers: Record<string, string> = {};
    
    // W3C Trace Context
    headers['traceparent'] = `00-${context.traceId}-${context.spanId}-01`;
    
    // Propagate baggage
    if (context.baggage && Object.keys(context.baggage).length > 0) {
      const baggageItems = Object.entries(context.baggage)
        .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
        .join(',');
      headers['baggage'] = baggageItems;
    }
    
    // Custom headers for debugging
    headers['x-trace-id'] = context.traceId;
    headers['x-span-id'] = context.spanId;
    headers['x-service-name'] = context.serviceName;
    
    return headers;
  }

  /**
   * Analyzes trace data for performance insights.
   * Identifies bottlenecks and anomalies in service communication.
   */
  analyzeTracePerformance(traceId: string): {
    totalDuration: number;
    criticalPath: Span[];
    bottlenecks: Span[];
    serviceBreakdown: Record<string, number>;
    anomalies: string[];
  } {
    const traceSpans = this.completedSpans.filter(span => span.traceId === traceId);
    
    if (traceSpans.length === 0) {
      throw new Error(`No spans found for trace ${traceId}`);
    }
    
    // Sort spans by start time
    traceSpans.sort((a, b) => Number(a.startTime - b.startTime));
    
    // Assumes the longest span is the root span and covers the whole trace
    const totalDuration = Math.max(...traceSpans.map(span => span.duration || 0));
    
    // Identify the critical path (longest chain of dependent spans)
    const criticalPath = this.findCriticalPath(traceSpans);
    
    // Identify bottlenecks (spans taking more than 20% of the total duration)
    const bottlenecks = traceSpans
      .filter(span => span.duration && span.duration > totalDuration * 0.2)
      .sort((a, b) => (b.duration || 0) - (a.duration || 0));
    
    // Per-service performance breakdown
    const serviceBreakdown: Record<string, number> = {};
    for (const span of traceSpans) {
      if (span.duration) {
        serviceBreakdown[span.serviceName] = (serviceBreakdown[span.serviceName] || 0) + span.duration;
      }
    }
    
    // Identify anomalies
    const anomalies = this.identifyTraceAnomalies(traceSpans, totalDuration);
    
    return {
      totalDuration,
      criticalPath,
      bottlenecks,
      serviceBreakdown,
      anomalies,
    };
  }

  /**
   * Returns the current trace context from AsyncLocalStorage.
   */
  getCurrentContext(): TraceContext | undefined {
    return this.asyncLocalStorage.getStore();
  }

  /**
   * Exports trace data for external analysis tools.
   * Produces simplified span objects in a Jaeger-, Zipkin-, or OTLP-like shape.
   */
  exportTraces(format: 'jaeger' | 'zipkin' | 'otlp' = 'jaeger'): any[] {
    const exports: any[] = [];
    
    for (const span of this.completedSpans) {
      let exportedSpan: any;
      
      switch (format) {
        case 'jaeger':
          exportedSpan = this.convertToJaegerFormat(span);
          break;
        case 'zipkin':
          exportedSpan = this.convertToZipkinFormat(span);
          break;
        case 'otlp':
          exportedSpan = this.convertToOTLPFormat(span);
          break;
      }
      
      exports.push(exportedSpan);
    }
    
    return exports;
  }

  private generateTraceId(): string {
    return uuidv4().replace(/-/g, '');
  }

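  private generateSpanId(): string {
    // Math.random().toString(16) can yield fewer than 16 hex digits, so build
    // the 16-character span ID digit by digit (not cryptographically secure).
    return Array.from({ length: 16 }, () =>
      Math.floor(Math.random() * 16).toString(16)
    ).join('');
  }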

  private inferSpanKind(operationName: string): string {
    if (operationName.includes('http') || operationName.includes('request')) {
      return 'client';
    }
    if (operationName.includes('database') || operationName.includes('query')) {
      return 'client';
    }
    if (operationName.includes('handler') || operationName.includes('controller')) {
      return 'server';
    }
    return 'internal';
  }

  private recordSpan(span: Span): void {
    this.completedSpans.push(span);
    
    // Keep memory usage bounded
    if (this.completedSpans.length > this.maxSpansInMemory) {
      // Remove the oldest 10% of spans
      const toRemove = Math.floor(this.maxSpansInMemory * 0.1);
      this.completedSpans.splice(0, toRemove);
    }
    
    // Optional: export to an external tracing system
    // this.exportSpanToExternalSystem(span);
  }

  private findCriticalPath(spans: Span[]): Span[] {
    // Simplified critical-path analysis.
    // A complete implementation would follow the parent/child dependencies between spans.
    return spans
      .filter(span => span.duration && span.duration > 0)
      .sort((a, b) => (b.duration || 0) - (a.duration || 0))
      .slice(0, 5); // Top 5 longest spans as an approximation
  }

  private identifyTraceAnomalies(spans: Span[], totalDuration: number): string[] {
    const anomalies: string[] = [];
    
    // Large gaps between consecutive span start times (startTime is in nanoseconds)
    for (let i = 1; i < spans.length; i++) {
      const gap = Number(spans[i].startTime - spans[i-1].startTime) / 1000000;
      if (gap > totalDuration * 0.1) {
        anomalies.push(`Large gap detected between spans: ${gap.toFixed(2)}ms`);
      }
    }
    
    // Error rate analysis
    const errorSpans = spans.filter(span => span.status === 'error');
    if (errorSpans.length > spans.length * 0.1) {
      anomalies.push(`High error rate: ${((errorSpans.length / spans.length) * 100).toFixed(1)}%`);
    }
    
    // Unusually slow spans
    const averageDuration = spans.reduce((sum, span) => sum + (span.duration || 0), 0) / spans.length;
    const slowSpans = spans.filter(span => span.duration && span.duration > averageDuration * 3);
    
    if (slowSpans.length > 0) {
      anomalies.push(`${slowSpans.length} spans significantly slower than average`);
    }
    
    return anomalies;
  }

  private convertToJaegerFormat(span: Span): any {
    return {
      traceID: span.traceId,
      spanID: span.spanId,
      parentSpanID: span.parentSpanId,
      operationName: span.operationName,
      startTime: Number(span.startTime) / 1000, // Convert nanoseconds to microseconds
      duration: (span.duration || 0) * 1000, // Convert milliseconds to microseconds
      tags: Object.entries(span.tags).map(([key, value]) => ({
        key,
        type: typeof value === 'string' ? 'string' : 'number',
        value: String(value),
      })),
      process: {
        serviceName: span.serviceName,
        tags: [],
      },
    };
  }

  private convertToZipkinFormat(span: Span): any {
    return {
      traceId: span.traceId,
      id: span.spanId,
      parentId: span.parentSpanId,
      name: span.operationName,
      timestamp: Number(span.startTime) / 1000, // Microseconds
      duration: (span.duration || 0) * 1000, // Microseconds
      localEndpoint: {
        serviceName: span.serviceName,
      },
      tags: span.tags,
    };
  }

  private convertToOTLPFormat(span: Span): any {
    return {
      traceId: span.traceId,
      spanId: span.spanId,
      parentSpanId: span.parentSpanId,
      name: span.operationName,
      startTimeUnixNano: String(span.startTime),
      endTimeUnixNano: span.endTime ? String(span.endTime) : undefined,
      attributes: Object.entries(span.tags).map(([key, value]) => ({
        key,
        value: { stringValue: String(value) },
      })),
      status: {
        code: span.status === 'ok' ? 1 : 2,
        message: span.errorMessage,
      },
    };
  }
}
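
The `findCriticalPath` method above approximates the critical path by taking the five longest spans. As a minimal sketch of the dependency-following variant mentioned in its comment, the following standalone function walks the span tree from the root and always follows the child that finishes last. The `SimpleSpan` shape, its `startMs` and `durationMs` fields, and the sample trace are assumptions made for illustration; they are not the `Span` type used by the service.

```typescript
// Hypothetical, simplified span shape; the service's Span type has more fields.
interface SimpleSpan {
  spanId: string;
  parentSpanId?: string;
  operationName: string;
  startMs: number;    // start offset in milliseconds
  durationMs: number; // duration in milliseconds
}

/**
 * Walks from the root span down the tree and, at each level, follows the
 * child that finishes last, since that child determines when its parent can end.
 */
function findCriticalPath(spans: SimpleSpan[]): SimpleSpan[] {
  const ids = new Set(spans.map(s => s.spanId));
  const childrenByParent = new Map<string, SimpleSpan[]>();
  const roots: SimpleSpan[] = [];

  for (const span of spans) {
    // Spans whose parent is missing from the trace are treated as roots.
    if (span.parentSpanId === undefined || !ids.has(span.parentSpanId)) {
      roots.push(span);
      continue;
    }
    const siblings = childrenByParent.get(span.parentSpanId) ?? [];
    siblings.push(span);
    childrenByParent.set(span.parentSpanId, siblings);
  }

  const endOf = (s: SimpleSpan): number => s.startMs + s.durationMs;
  const latest = (candidates: SimpleSpan[]): SimpleSpan =>
    candidates.reduce((a, b) => (endOf(b) > endOf(a) ? b : a));

  const path: SimpleSpan[] = [];
  const visited = new Set<string>(); // guards against cyclic parent links
  let current: SimpleSpan | undefined = roots.length > 0 ? latest(roots) : undefined;

  while (current && !visited.has(current.spanId)) {
    path.push(current);
    visited.add(current.spanId);
    const children = childrenByParent.get(current.spanId);
    current = children && children.length > 0 ? latest(children) : undefined;
  }
  return path;
}

// Sample trace (invented values)
const trace: SimpleSpan[] = [
  { spanId: 'a', operationName: 'GET /orders', startMs: 0, durationMs: 120 },
  { spanId: 'b', parentSpanId: 'a', operationName: 'auth-check', startMs: 5, durationMs: 10 },
  { spanId: 'c', parentSpanId: 'a', operationName: 'load-orders', startMs: 20, durationMs: 95 },
  { spanId: 'd', parentSpanId: 'c', operationName: 'db-query', startMs: 25, durationMs: 80 },
];
const criticalOperations = findCriticalPath(trace).map(s => s.operationName);
```

For the sample trace, the path is `GET /orders`, `load-orders`, `db-query`; the short `auth-check` span is not on it. This remains a heuristic: spans that run strictly in sequence under one parent would need a more careful walk.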

31.6 Production Debugging

Production debugging is like open-heart surgery: you have to proceed with extreme care, because any mistake can have catastrophic consequences. In production you rarely have the luxury of stopping the application, attaching a debugger, or making experimental changes. You have to work with the information that is available while making sure you do not destabilize the system further.

The key to successful production debugging is preparation. Like a firefighter who readies the equipment before the alarm sounds, you need to set up your debugging infrastructure before problems occur. That means comprehensive logging, monitoring dashboards, health checks, and the ability to disable problematic features quickly.

// production-debugging/src/services/production-debugger.service.ts
// A safety-focused service for production debugging
import { Injectable, Logger } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import { readFileSync, writeFileSync, existsSync } from 'fs';
import * as path from 'path';

interface DebugSession {
  sessionId: string;
  startTime: Date;
  endTime?: Date;
  debugLevel: 'minimal' | 'standard' | 'verbose';
  targetModules: string[];
  collectedData: any[];
  safetyChecks: string[];
  initiatedBy: string;
}

interface ProductionIssue {
  issueId: string;
  severity: 'low' | 'medium' | 'high' | 'critical';
  category: 'performance' | 'error' | 'availability' | 'security';
  description: string;
  affectedComponents: string[];
  detectionTime: Date;
  status: 'detected' | 'investigating' | 'mitigating' | 'resolved';
  debugActions: string[];
  evidence: any[];
}

@Injectable()
export class ProductionDebuggerService {
  private readonly logger = new Logger(ProductionDebuggerService.name);
  
  // Active debug sessions
  private readonly activeSessions = new Map<string, DebugSession>();
  
  // Detected production issues
  private readonly activeIssues = new Map<string, ProductionIssue>();
  
  // Conservative debug configuration
  private readonly debugConfig = {
    maxConcurrentSessions: 2,
    maxSessionDuration: 30 * 60 * 1000, // 30 minutes
    allowedDebugLevels: ['minimal', 'standard'] as const,
    restrictedModules: ['auth', 'payment', 'security'],
    maxDataCollectionSize: 100 * 1024 * 1024, // 100 MB
  };

  constructor(private readonly configService: ConfigService) {
    this.initializeProductionSafeguards();
  }

  /**
   * Starts a safe debug session in production.
   * Enforces strict safety checks and resource limits.
   */
  async startDebugSession(
    targetModules: string[],
    debugLevel: 'minimal' | 'standard' | 'verbose',
    initiatedBy: string,
    duration: number = 15 * 60 * 1000 // Default: 15 minutes
  ): Promise<string> {
    // Run safety checks before starting
    const safetyChecks = await this.performSafetyChecks(targetModules, debugLevel, duration);
    
    if (safetyChecks.length > 0) {
      throw new Error(`Debug session rejected due to safety concerns: ${safetyChecks.join(', ')}`);
    }
    
    const sessionId = this.generateSessionId();
    const session: DebugSession = {
      sessionId,
      startTime: new Date(),
      debugLevel: this.sanitizeDebugLevel(debugLevel),
      targetModules: this.sanitizeTargetModules(targetModules),
      collectedData: [],
      safetyChecks: [],
      initiatedBy,
    };
    
    this.activeSessions.set(sessionId, session);
    
    // Automatic session timeout
    setTimeout(() => {
      this.endDebugSession(sessionId, 'timeout');
    }, Math.min(duration, this.debugConfig.maxSessionDuration));
    
    this.logger.warn(`Production debug session started`, {
      sessionId,
      targetModules: session.targetModules,
      debugLevel: session.debugLevel,
      initiatedBy,
      maxDuration: duration,
    });
    
    return sessionId;
  }

  /**
   * Collects debug information during an active session.
   * Respects safety limits and data protection requirements.
   */
  async collectDebugData(
    sessionId: string,
    dataType: 'logs' | 'metrics' | 'state' | 'traces',
    filters?: Record<string, any>
  ): Promise<any> {
    const session = this.activeSessions.get(sessionId);
    if (!session) {
      throw new Error(`No active debug session found: ${sessionId}`);
    }
    
    let collectedData: any;
    
    switch (dataType) {
      case 'logs':
        collectedData = await this.collectSecureLogs(session, filters);
        break;
      case 'metrics':
        collectedData = await this.collectSystemMetrics(session, filters);
        break;
      case 'state':
        collectedData = await this.collectApplicationState(session, filters);
        break;
      case 'traces':
        collectedData = await this.collectTraceData(session, filters);
        break;
      default:
        throw new Error(`Unsupported data type: ${dataType}`);
    }
    
    // Enforce the data size limit (string length is used as an approximation of bytes)
    const dataSize = JSON.stringify(collectedData).length;
    const totalCollectedSize = session.collectedData.reduce((sum, data) => sum + JSON.stringify(data).length, 0);
    
    if (totalCollectedSize + dataSize > this.debugConfig.maxDataCollectionSize) {
      this.logger.warn(`Debug data collection limit reached for session ${sessionId}`);
      return { error: 'Data collection limit reached' };
    }
    
    // Sanitize sensitive data
    const sanitizedData = this.sanitizeDebugData(collectedData);
    
    session.collectedData.push({
      timestamp: new Date(),
      dataType,
      data: sanitizedData,
      filters,
    });
    
    this.logger.debug(`Debug data collected`, {
      sessionId,
      dataType,
      dataSize,
      totalCollectedSize: totalCollectedSize + dataSize,
    });
    
    return sanitizedData;
  }

  /**
   * Creates a safe debug report for production problems.
   * Removes sensitive data and structures the information for analysis.
   */
  async generateDebugReport(sessionId: string): Promise<{
    sessionSummary: any;
    collectedData: any[];
    analysisResults: any;
    recommendations: string[];
  }> {
    const session = this.activeSessions.get(sessionId);
    if (!session) {
      throw new Error(`Debug session not found: ${sessionId}`);
    }
    
    const sessionSummary = {
      sessionId: session.sessionId,
      duration: session.endTime 
        ? session.endTime.getTime() - session.startTime.getTime()
        : Date.now() - session.startTime.getTime(),
      debugLevel: session.debugLevel,
      targetModules: session.targetModules,
      dataCollectionCount: session.collectedData.length,
      initiatedBy: session.initiatedBy,
    };
    
    // Analyze the collected data
    const analysisResults = this.analyzeCollectedData(session.collectedData);
    
    // Generate recommendations based on the findings
    const recommendations = this.generateDebugRecommendations(analysisResults);
    
    return {
      sessionSummary,
      collectedData: session.collectedData,
      analysisResults,
      recommendations,
    };
  }

  /**
   * Implements a circuit breaker for problematic components.
   * Allows features to be disabled quickly without a deployment.
   */
  async enableCircuitBreaker(
    componentName: string,
    reason: string,
    duration: number = 60 * 60 * 1000 // 1 hour
  ): Promise<void> {
    const circuitBreakerState = {
      componentName,
      disabled: true,
      reason,
      disabledAt: new Date(),
      duration,
      disabledBy: 'production-debugger',
    };
    
    // Persist the circuit breaker state
    await this.persistCircuitBreakerState(componentName, circuitBreakerState);
    
    // Re-enable the component automatically after the duration has elapsed
    setTimeout(async () => {
      await this.disableCircuitBreaker(componentName, 'automatic-timeout');
    }, duration);
    
    this.logger.error(`Circuit breaker activated for ${componentName}`, {
      reason,
      duration,
      disabledAt: circuitBreakerState.disabledAt,
    });
  }

  /**
   * Checks the circuit breaker status for a component.
   * Application logic uses this to decide whether a feature is available.
   */
  async isCircuitBreakerOpen(componentName: string): Promise<boolean> {
    try {
      const stateFilePath = path.join(process.cwd(), 'circuit-breakers', `${componentName}.json`);
      
      if (!existsSync(stateFilePath)) {
        return false;
      }
      
      const stateData = readFileSync(stateFilePath, 'utf8');
      const state = JSON.parse(stateData);
      
      // Check whether the circuit breaker is still active
      if (state.disabled) {
        const now = new Date();
        // JSON.parse returns disabledAt as a string, so convert it back to a Date first
        const disabledUntil = new Date(new Date(state.disabledAt).getTime() + state.duration);
        
        if (now < disabledUntil) {
          return true;
        } else {
          // Deactivate automatically once the time has elapsed
          await this.disableCircuitBreaker(componentName, 'automatic-timeout');
          return false;
        }
      }
      
      return false;
      
    } catch (error) {
      this.logger.error(`Error checking circuit breaker for ${componentName}:`, error);
      return false; // Fail open to favor availability
    }
  }

  /**
   * Emergency debugging mode for critical production problems.
   * Enables extended logging and monitoring while keeping the performance impact low.
   */
  async activateEmergencyMode(
    reason: string,
    duration: number = 30 * 60 * 1000 // 30 minutes
  ): Promise<void> {
    const emergencySession = await this.startDebugSession(
      ['*'], // All modules
      'standard',
      'emergency-system',
      duration
    );
    
    // Log the activation (this call does not change the configured log level)
    this.logger.debug('Emergency debugging mode activated', {
      reason,
      duration,
      sessionId: emergencySession,
    });
    
    // Enable extended metrics collection
    this.enableEnhancedMetrics(duration);
    
    // Notify the operations team
    await this.notifyOperationsTeam({
      type: 'emergency-debug-mode',
      reason,
      sessionId: emergencySession,
      duration,
    });
  }

  private async performSafetyChecks(
    targetModules: string[],
    debugLevel: string,
    duration: number
  ): Promise<string[]> {
    const issues: string[] = [];
    
    // Check the maximum number of concurrent sessions
    if (this.activeSessions.size >= this.debugConfig.maxConcurrentSessions) {
      issues.push('Maximum concurrent debug sessions reached');
    }
    
    // Check whether the debug level is permitted
    if (!this.debugConfig.allowedDebugLevels.includes(debugLevel as any)) {
      issues.push(`Debug level '${debugLevel}' not allowed in production`);
    }
    
    // Check for restricted modules
    const restrictedModulesRequested = targetModules.filter(module => 
      this.debugConfig.restrictedModules.includes(module)
    );
    if (restrictedModulesRequested.length > 0) {
      issues.push(`Restricted modules requested: ${restrictedModulesRequested.join(', ')}`);
    }
    
    // Check the system load
    const currentLoad = await this.getSystemLoad();
    if (currentLoad > 0.8) {
      issues.push(`High system load detected: ${(currentLoad * 100).toFixed(1)}%`);
    }
    
    // Check the maximum session duration
    if (duration > this.debugConfig.maxSessionDuration) {
      issues.push(`Requested duration exceeds maximum allowed: ${duration}ms > ${this.debugConfig.maxSessionDuration}ms`);
    }
    
    return issues;
  }

  private sanitizeDebugLevel(debugLevel: DebugSession['debugLevel']): DebugSession['debugLevel'] {
    if (debugLevel === 'verbose' && !this.configService.get<boolean>('ALLOW_VERBOSE_DEBUG', false)) {
      this.logger.warn('Verbose debug level requested but not allowed, downgrading to standard');
      return 'standard';
    }
    // May still be 'verbose' when ALLOW_VERBOSE_DEBUG is set, so the return type includes it
    return debugLevel;
  }

  private sanitizeTargetModules(modules: string[]): string[] {
    return modules.filter(module => 
      !this.debugConfig.restrictedModules.includes(module)
    );
  }

  private sanitizeDebugData(data: any): any {
    // Recursively redact sensitive fields (substring match on the lowercased key)
    const sensitiveFields = [
      'password', 'token', 'secret', 'key', 'auth', 'credential',
      'ssn', 'creditcard', 'cvv', 'pin', 'private'
    ];
    
    const sanitize = (obj: any): any => {
      if (typeof obj !== 'object' || obj === null) {
        return obj;
      }
      
      if (Array.isArray(obj)) {
        return obj.map(sanitize);
      }
      
      const sanitized: any = {};
      for (const [key, value] of Object.entries(obj)) {
        const lowerKey = key.toLowerCase();
        const isSensitive = sensitiveFields.some(field => lowerKey.includes(field));
        
        if (isSensitive) {
          sanitized[key] = '[REDACTED]';
        } else {
          sanitized[key] = sanitize(value);
        }
      }
      
      return sanitized;
    };
    
    return sanitize(data);
  }

  private generateSessionId(): string {
    return `debug_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
  }

  private async getSystemLoad(): Promise<number> {
    // Simplified load estimate: heap utilization as a stand-in for system load
    const memoryUsage = process.memoryUsage();
    return memoryUsage.heapUsed / memoryUsage.heapTotal;
  }

  private initializeProductionSafeguards(): void {
    // Log a warning when the service starts in a production environment
    if (this.configService.get<string>('NODE_ENV') === 'production') {
      this.logger.warn('Production debugging service initialized with safety constraints');
    }
    
    // Schedule the periodic cleanup of expired debug sessions
    this.cleanupExpiredSessions();
  }

  private cleanupExpiredSessions(): void {
    setInterval(() => {
      const now = Date.now();
      for (const [sessionId, session] of this.activeSessions) {
        const sessionAge = now - session.startTime.getTime();
        if (sessionAge > this.debugConfig.maxSessionDuration) {
          this.endDebugSession(sessionId, 'expired');
        }
      }
    }, 60000); // Check every minute
  }

  private async endDebugSession(sessionId: string, reason: string): Promise<void> {
    const session = this.activeSessions.get(sessionId);
    if (session) {
      session.endTime = new Date();
      this.activeSessions.delete(sessionId);
      
      this.logger.warn(`Debug session ended`, {
        sessionId,
        reason,
        duration: session.endTime.getTime() - session.startTime.getTime(),
      });
    }
  }

  // Further private methods for specific debug data collection...
  private async collectSecureLogs(session: DebugSession, filters?: any): Promise<any> {
    // Placeholder for secure log collection
    return { logs: 'Implementation pending' };
  }

  private async collectSystemMetrics(session: DebugSession, filters?: any): Promise<any> {
    // Placeholder for system metrics
    return { metrics: 'Implementation pending' };
  }

  private async collectApplicationState(session: DebugSession, filters?: any): Promise<any> {
    // Placeholder for application state
    return { state: 'Implementation pending' };
  }

  private async collectTraceData(session: DebugSession, filters?: any): Promise<any> {
    // Placeholder for trace data
    return { traces: 'Implementation pending' };
  }

  private analyzeCollectedData(collectedData: any[]): any {
    // Placeholder for data analysis
    return { analysis: 'Implementation pending' };
  }

  private generateDebugRecommendations(analysisResults: any): string[] {
    // Placeholder for recommendations
    return ['Implementation pending'];
  }

  private async persistCircuitBreakerState(componentName: string, state: any): Promise<void> {
    // Placeholder for persistent storage
  }

  private async disableCircuitBreaker(componentName: string, reason: string): Promise<void> {
    // Placeholder for circuit breaker deactivation
  }

  private enableEnhancedMetrics(duration: number): void {
    // Placeholder for extended metrics
  }

  private async notifyOperationsTeam(notification: any): Promise<void> {
    // Placeholder for team notifications
  }
}
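
`isCircuitBreakerOpen` reads state that was serialized to JSON, and JSON has no date type: a `Date` written with `JSON.stringify` comes back as an ISO string. The following sketch shows one way to handle that round trip with pure functions. The `BreakerState` shape and the function names are hypothetical and only loosely mirror the object built in `enableCircuitBreaker`; file access is left out on purpose.

```typescript
// Hypothetical state shape, loosely mirroring the fields set in enableCircuitBreaker.
interface BreakerState {
  componentName: string;
  disabled: boolean;
  reason: string;
  disabledAt: string; // ISO 8601 string, because JSON has no Date type
  durationMs: number;
}

/** Serializes the state; the Date is stored explicitly as an ISO string. */
function serializeBreakerState(
  componentName: string,
  reason: string,
  disabledAt: Date,
  durationMs: number,
): string {
  const state: BreakerState = {
    componentName,
    disabled: true,
    reason,
    disabledAt: disabledAt.toISOString(),
    durationMs,
  };
  return JSON.stringify(state);
}

/**
 * Returns true while the breaker is open (the component is disabled).
 * Malformed input yields false, matching the fail-open choice in the listing.
 */
function isBreakerOpen(rawJson: string, now: Date): boolean {
  let state: Partial<BreakerState> | null = null;
  try {
    state = JSON.parse(rawJson);
  } catch {
    return false;
  }
  if (!state || state.disabled !== true) return false;
  if (typeof state.disabledAt !== 'string' || typeof state.durationMs !== 'number') return false;

  const disabledAtMs = Date.parse(state.disabledAt); // NaN for invalid dates
  if (Number.isNaN(disabledAtMs)) return false;
  return now.getTime() < disabledAtMs + state.durationMs;
}

// Invented example: a one-hour breaker activated at 12:00 UTC
const rawState = serializeBreakerState(
  'recommendations',
  'error spike',
  new Date('2024-01-01T12:00:00Z'),
  60 * 60 * 1000,
);
const openAtHalfPastTwelve = isBreakerOpen(rawState, new Date('2024-01-01T12:30:00Z'));
const openAtHalfPastOne = isBreakerOpen(rawState, new Date('2024-01-01T13:30:00Z'));
```

Passing `now` as a parameter keeps the check deterministic and easy to test. Whether a corrupt state file should fail open, as here, or fail closed depends on how risky the guarded component is.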

This survey of debugging and troubleshooting shows that effective problem solving in NestJS applications requires a combination of systematic strategies, the right tools, and a deep understanding of the underlying systems. Like an experienced detective, you gather clues, analyze patterns, and draw logical conclusions to solve even the most difficult problems.

From strategic debugging and performance profiling to memory leak detection and production debugging, each technique has its place in your debugging toolkit. Database query analysis helps you identify one of the most common causes of performance problems, while distributed tracing is essential in complex microservice environments.

The key is to choose the right technique for the problem at hand while always considering the impact on the running system. Remember that debugging is not only about fixing problems but also about understanding your application at a deeper level. Every problem you solve makes you a better developer and contributes to the long-term stability and performance of your application.

In production, the principle of “do no harm” applies: your debugging efforts must never destabilize the system further. With the safety-focused strategies and tools presented here, you can tackle even complex production problems effectively without compromising stability or security.
