Cortif.ai

Drift Detection

Monitor and detect data drift and model drift in your AI systems

Drift detection is a critical component of AI monitoring that identifies when your model's performance degrades due to changes in data distribution or model behavior over time.

What is Drift?

Drift occurs when the statistical properties of your model's inputs or outputs change over time, potentially leading to degraded performance.

Types of Drift

graph TB
    A[Drift Types] --> B[Data Drift]
    A --> C[Concept Drift]
    A --> D[Prediction Drift]
    B --> E[Feature Distribution Change]
    C --> F[Target Relationship Change]
    D --> G[Output Distribution Change]

1. Data Drift (Covariate Shift)

Data drift occurs when the distribution of input features changes over time.

Examples:

  • A recommendation model trained on user behavior from 2023 sees different patterns in 2024
  • A fraud detection model encounters new types of transactions
  • An image classifier sees images with different lighting conditions

Detection Methods:

  • Population Stability Index (PSI)
  • Kolmogorov-Smirnov Test
  • Chi-Square Test
  • Jensen-Shannon Divergence
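
For categorical features, the chi-square test listed above can be sketched as a Pearson goodness-of-fit statistic: the baseline category shares supply the expected counts for the current sample. This is a minimal illustration, not the service's implementation; the function name `chi_square_statistic` is hypothetical, and it treats the baseline shares as the reference distribution rather than running a two-sample homogeneity test.

```python
from typing import Sequence


def chi_square_statistic(
    baseline_counts: Sequence[float], current_counts: Sequence[float]
) -> float:
    """Pearson chi-square statistic of current counts against baseline shares."""
    if len(baseline_counts) != len(current_counts):
        raise ValueError("both inputs must list the same categories in the same order")
    baseline_total = sum(baseline_counts)
    current_total = sum(current_counts)
    statistic = 0.0
    for base, observed in zip(baseline_counts, current_counts):
        if base == 0:
            raise ValueError("a category with zero baseline count has no expected frequency")
        # Expected count if the current sample followed the baseline shares.
        expected = current_total * base / baseline_total
        statistic += (observed - expected) ** 2 / expected
    return statistic
```

A larger statistic means the current category mix departs further from the baseline; turning it into a p-value requires the chi-square distribution with (number of categories - 1) degrees of freedom.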

2. Concept Drift

Concept drift happens when the relationship between the input features and the target changes, so the same inputs now map to different correct outputs.

Examples:

  • Customer preferences change over time
  • Market conditions shift
  • User behavior evolves

Detection Methods:

  • Performance degradation monitoring
  • A/B testing with baseline
  • Statistical process control
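
One way to combine performance monitoring with statistical process control is a p-chart on accuracy: a window of labeled predictions is flagged when its accuracy falls below the baseline accuracy minus a few standard errors. The sketch below assumes ground-truth labels are available for each window; the function names and the 3-sigma default are illustrative choices, not part of the Cortif API.

```python
import math


def accuracy_lower_control_limit(
    baseline_accuracy: float, window_size: int, sigmas: float = 3.0
) -> float:
    """Lower control limit of a binomial p-chart for windowed accuracy."""
    if not 0.0 <= baseline_accuracy <= 1.0:
        raise ValueError("baseline_accuracy must be in [0, 1]")
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    standard_error = math.sqrt(
        baseline_accuracy * (1.0 - baseline_accuracy) / window_size
    )
    return max(0.0, baseline_accuracy - sigmas * standard_error)


def is_degraded(
    correct: int, window_size: int, baseline_accuracy: float, sigmas: float = 3.0
) -> bool:
    """True when the window's accuracy falls below the lower control limit."""
    limit = accuracy_lower_control_limit(baseline_accuracy, window_size, sigmas)
    return correct / window_size < limit
```

With a baseline accuracy of 0.9 and windows of 400 predictions, the limit is 0.855, so a window with 330 correct predictions (0.825) would be flagged while one with 350 (0.875) would not.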

3. Prediction Drift

Prediction drift occurs when the distribution of model predictions changes.

Examples:

  • Model suddenly predicts more positive cases
  • Output confidence scores decrease
  • Classification distribution shifts
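
A shift in the classification distribution can be quantified without ground-truth labels by comparing predicted class shares between two periods. The sketch below uses total variation distance (half the sum of absolute share differences); this metric and the function names are assumptions for illustration and do not appear in the API above.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def class_shares(predictions: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Fraction of predictions assigned to each class."""
    if len(predictions) == 0:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    return {label: n / len(predictions) for label, n in counts.items()}


def total_variation_distance(
    baseline_predictions: Sequence[Hashable],
    current_predictions: Sequence[Hashable],
) -> float:
    """0.0 for identical class mixes, 1.0 for completely disjoint ones."""
    p = class_shares(baseline_predictions)
    q = class_shares(current_predictions)
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(lb, 0.0) - q.get(lb, 0.0)) for lb in labels)
```

For example, if the positive rate rises from 20% to 35% in a binary classifier, the distance is 0.15.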

Drift Detection Object

interface DriftDetection {
  id: string;
  projectId: string;
  runId: string;
  driftType: 'data' | 'concept' | 'prediction' | 'target';
  status: 'no_drift' | 'warning' | 'critical';
  severity: number; // 0-100
  detectedAt: string;
  metrics: {
    psi?: number; // Population Stability Index
    ks?: number; // Kolmogorov-Smirnov statistic
    chiSquare?: number;
    jsDivergence?: number; // Jensen-Shannon Divergence
    wasserstein?: number;
  };
  features: {
    featureName: string;
    baselineDistribution: Distribution;
    currentDistribution: Distribution;
    driftScore: number;
    isDrifted: boolean;
  }[];
  threshold: {
    warning: number;
    critical: number;
  };
  metadata?: {
    baselineStartDate: string;
    baselineEndDate: string;
    comparisonStartDate: string;
    comparisonEndDate: string;
    samplesBaseline: number;
    samplesCurrent: number;
  };
}

interface Distribution {
  mean?: number;
  std?: number;
  min?: number;
  max?: number;
  quantiles?: number[];
  histogram?: {
    bins: number[];
    counts: number[];
  };
}
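
To make the `Distribution` shape concrete, here is a sketch that builds one from raw numeric samples using only the standard library. Several details are assumptions, since the interface does not specify them: `quantiles` is filled with quartile cut points, `std` is the population standard deviation, and `histogram.bins` holds equal-width bin edges (one more entry than `counts`).

```python
import statistics
from typing import Sequence


def summarize(values: Sequence[float], bins: int = 10) -> dict:
    """Build a Distribution-shaped dict from numeric samples."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    if bins <= 0:
        raise ValueError("bins must be positive")
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every sample has the same value.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # assumption: population std
        "min": lo,
        "max": hi,
        "quantiles": statistics.quantiles(values, n=4),  # assumption: quartiles
        "histogram": {
            "bins": [lo + i * width for i in range(bins + 1)],  # assumption: edges
            "counts": counts,
        },
    }
```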

Detect Drift

Analyze drift between baseline and current data distributions.

POST /api/drift/detect
Content-Type: application/json
Cookie: session-token=your_session_token

{
  "projectId": "proj_abc123",
  "runId": "run_xyz789",
  "baselineWindow": {
    "startDate": "2024-01-01T00:00:00Z",
    "endDate": "2024-01-31T23:59:59Z"
  },
  "comparisonWindow": {
    "startDate": "2024-09-01T00:00:00Z",
    "endDate": "2024-09-30T23:59:59Z"
  },
  "features": ["feature1", "feature2", "feature3"],
  "methods": ["psi", "ks", "js_divergence"],
  "thresholds": {
    "warning": 0.1,
    "critical": 0.25
  }
}

Request Body:

  • projectId (required): Project identifier
  • runId (optional): Specific run to analyze
  • baselineWindow (required): Reference time period
  • comparisonWindow (required): Current time period to compare
  • features (optional): Specific features to analyze (all if not specified)
  • methods (optional): Detection methods to use
  • thresholds (optional): Custom warning and critical thresholds

Response:

{
  "success": true,
  "data": {
    "id": "drift_def456",
    "projectId": "proj_abc123",
    "runId": "run_xyz789",
    "overallStatus": "warning",
    "overallSeverity": 45,
    "detectedAt": "2024-09-30T15:30:00Z",
    "summary": {
      "totalFeatures": 15,
      "driftedFeatures": 3,
      "noDriftFeatures": 12,
      "warningCount": 2,
      "criticalCount": 1
    },
    "features": [
      {
        "featureName": "user_age",
        "driftScore": 0.28,
        "status": "critical",
        "isDrifted": true,
        "metrics": {
          "psi": 0.28,
          "ks": 0.22,
          "jsDivergence": 0.19
        },
        "baselineDistribution": {
          "mean": 34.5,
          "std": 12.3,
          "min": 18,
          "max": 75,
          "quantiles": [25, 32, 38, 45]
        },
        "currentDistribution": {
          "mean": 29.8,
          "std": 10.5,
          "min": 18,
          "max": 68,
          "quantiles": [22, 28, 34, 40]
        }
      },
      {
        "featureName": "transaction_amount",
        "driftScore": 0.15,
        "status": "warning",
        "isDrifted": true,
        "metrics": {
          "psi": 0.15,
          "ks": 0.12,
          "jsDivergence": 0.08
        },
        "baselineDistribution": {
          "mean": 125.50,
          "std": 45.20
        },
        "currentDistribution": {
          "mean": 98.30,
          "std": 38.10
        }
      }
    ],
    "metadata": {
      "baselineStartDate": "2024-01-01T00:00:00Z",
      "baselineEndDate": "2024-01-31T23:59:59Z",
      "comparisonStartDate": "2024-09-01T00:00:00Z",
      "comparisonEndDate": "2024-09-30T23:59:59Z",
      "samplesBaseline": 15000,
      "samplesCurrent": 18000
    }
  }
}

Example Request:

curl -X POST https://api.cortif.ai/api/drift/detect \
  -H "Content-Type: application/json" \
  -H "Cookie: session-token=your_session_token" \
  -d '{
    "projectId": "proj_abc123",
    "baselineWindow": {
      "startDate": "2024-01-01T00:00:00Z",
      "endDate": "2024-01-31T23:59:59Z"
    },
    "comparisonWindow": {
      "startDate": "2024-09-01T00:00:00Z",
      "endDate": "2024-09-30T23:59:59Z"
    },
    "methods": ["psi", "ks"],
    "thresholds": {
      "warning": 0.1,
      "critical": 0.25
    }
  }'
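
If you assemble the request body in code, it helps to validate the windows before sending, since an invalid range is rejected with `INVALID_WINDOW`. The following sketch builds the JSON body for `POST /api/drift/detect` from timezone-aware datetimes. The helper names `build_detect_payload` and `_iso_utc` are hypothetical and not part of any Cortif SDK; sending the request is left to your HTTP client.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence, Tuple

Window = Tuple[datetime, datetime]


def _iso_utc(moment: datetime) -> str:
    """Format an aware datetime as a UTC timestamp with a 'Z' suffix."""
    if moment.tzinfo is None:
        raise ValueError("datetimes must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_detect_payload(
    project_id: str,
    baseline: Window,
    comparison: Window,
    methods: Optional[Sequence[str]] = None,
    thresholds: Optional[dict] = None,
    features: Optional[Sequence[str]] = None,
) -> dict:
    """Return the request body, omitting optional fields that were not given."""
    payload: dict = {"projectId": project_id}
    for key, (start, end) in (
        ("baselineWindow", baseline),
        ("comparisonWindow", comparison),
    ):
        if start >= end:
            raise ValueError(f"{key}: startDate must be before endDate")
        payload[key] = {"startDate": _iso_utc(start), "endDate": _iso_utc(end)}
    if features:
        payload["features"] = list(features)
    if methods:
        payload["methods"] = list(methods)
    if thresholds:
        payload["thresholds"] = dict(thresholds)
    return payload
```

Serialize the result with `json.dumps` and send it with the same headers as the cURL example.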

Get Drift History

Retrieve historical drift detection results for a project.

GET /api/drift/history?projectId=proj_abc123&page=1&limit=20&status=warning
Cookie: session-token=your_session_token

Query Parameters:

  • projectId (required): Project identifier
  • page (optional): Page number (default: 1)
  • limit (optional): Items per page (default: 20)
  • status (optional): Filter by status (no_drift, warning, critical)
  • startDate (optional): Filter from date
  • endDate (optional): Filter to date
  • driftType (optional): Filter by drift type

Response:

{
  "success": true,
  "data": [
    {
      "id": "drift_def456",
      "projectId": "proj_abc123",
      "overallStatus": "warning",
      "overallSeverity": 45,
      "detectedAt": "2024-09-30T15:30:00Z",
      "driftedFeatures": 3,
      "totalFeatures": 15
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total": 45,
    "totalPages": 3
  }
}

Example Request:

curl -X GET "https://api.cortif.ai/api/drift/history?projectId=proj_abc123&status=warning" \
  -H "Cookie: session-token=your_session_token"
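
Because history results are paginated, collecting every record means requesting pages until `pagination.totalPages` is reached. The generator below sketches that loop. It takes a `fetch_page` callable (hypothetical) that returns the parsed JSON body for a given page number, so the HTTP layer stays out of the example; `_demo_fetch` is a stand-in that fakes two pages.

```python
from typing import Callable, Iterator


def iter_history(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every drift record, following the pagination block page by page."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]
        if page >= body["pagination"]["totalPages"]:
            break
        page += 1


def _demo_fetch(page: int) -> dict:
    """Stand-in for a real HTTP call: two pages with three records in total."""
    pages = {
        1: [{"id": "drift_1"}, {"id": "drift_2"}],
        2: [{"id": "drift_3"}],
    }
    return {
        "success": True,
        "data": pages[page],
        "pagination": {"page": page, "limit": 2, "total": 3, "totalPages": 2},
    }


record_ids = [record["id"] for record in iter_history(_demo_fetch)]
```

In real use, `fetch_page` would issue `GET /api/drift/history` with the `page` query parameter and return the decoded response.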

Configure Drift Monitoring

Set up automatic drift monitoring for a project.

POST /api/drift/configure
Content-Type: application/json
Cookie: session-token=your_session_token

{
  "projectId": "proj_abc123",
  "enabled": true,
  "schedule": "daily",
  "baselineStrategy": "rolling_window",
  "baselineWindowDays": 30,
  "comparisonWindowDays": 7,
  "features": ["feature1", "feature2"],
  "methods": ["psi", "ks"],
  "thresholds": {
    "warning": 0.1,
    "critical": 0.25
  },
  "alerts": {
    "onWarning": true,
    "onCritical": true,
    "channels": ["email", "webhook"]
  },
  "actions": {
    "autoRetrain": false,
    "pausePredictions": false,
    "notifyTeam": true
  }
}

Request Body:

  • projectId (required): Project identifier
  • enabled (required): Enable/disable monitoring
  • schedule (required): hourly, daily, weekly
  • baselineStrategy (required): fixed or rolling_window
  • baselineWindowDays (required): Days for baseline
  • comparisonWindowDays (required): Days for comparison
  • features (optional): Features to monitor
  • methods (required): Detection methods
  • thresholds (required): Warning and critical thresholds
  • alerts (required): Alert configuration
  • actions (optional): Automated actions

Response:

{
  "success": true,
  "data": {
    "id": "config_ghi789",
    "projectId": "proj_abc123",
    "enabled": true,
    "schedule": "daily",
    "nextRunAt": "2024-10-02T00:00:00Z",
    "createdAt": "2024-10-01T10:00:00Z"
  }
}

Example Request:

curl -X POST https://api.cortif.ai/api/drift/configure \
  -H "Content-Type: application/json" \
  -H "Cookie: session-token=your_session_token" \
  -d '{
    "projectId": "proj_abc123",
    "enabled": true,
    "schedule": "daily",
    "baselineStrategy": "rolling_window",
    "baselineWindowDays": 30,
    "comparisonWindowDays": 7,
    "methods": ["psi", "ks"],
    "thresholds": {
      "warning": 0.1,
      "critical": 0.25
    },
    "alerts": {
      "onWarning": true,
      "onCritical": true,
      "channels": ["email", "webhook"]
    }
  }'

Drift Metrics Explained

Population Stability Index (PSI)

PSI measures the shift in a variable's distribution between two samples.

Formula:

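PSI = Σ (Actual% - Expected%) × ln(Actual% / Expected%)

The sum runs over all bins. Expected% is the share of baseline observations in a bin, and Actual% is the share of current observations in the same bin.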

Interpretation:

  • PSI < 0.1: No significant change
  • 0.1 ≤ PSI < 0.25: Moderate change, investigation recommended
  • PSI ≥ 0.25: Significant change, action required
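
The PSI formula can be sketched directly over pre-binned counts. This is an illustration under two assumptions: both samples were binned with the same edges, and empty bins are floored at a small `eps` so the logarithm stays defined. The service's own binning and smoothing choices are not documented here and may differ.

```python
import math
from typing import Sequence


def psi(
    expected_counts: Sequence[float],
    actual_counts: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """Population Stability Index over matching bins."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("bin counts must have the same length")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("each sample needs at least one observation")
    total = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor each share at eps so empty bins do not produce log(0).
        expected_share = max(expected / expected_total, eps)
        actual_share = max(actual / actual_total, eps)
        total += (actual_share - expected_share) * math.log(
            actual_share / expected_share
        )
    return total
```

For two bins that move from a 50/50 split to 60/40, this gives roughly 0.04, which falls in the "no significant change" band.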

Kolmogorov-Smirnov (KS) Statistic

The KS statistic is the maximum absolute difference between the empirical cumulative distribution functions of two samples. It ranges from 0 (identical) to 1 (no overlap).

Interpretation:

  • KS < 0.1: Distributions are similar
  • 0.1 ≤ KS < 0.3: Moderate difference
  • KS ≥ 0.3: Significant difference
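
A two-sample KS statistic can be computed from raw values by evaluating both empirical CDFs at every observed point and taking the largest gap. The sketch below uses only the standard library and returns the statistic alone, with no p-value; the function name is illustrative.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Maximum absolute gap between the two empirical CDFs."""
    if len(baseline) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")
    a = sorted(baseline)
    b = sorted(current)
    max_gap = 0.0
    # Both CDFs are step functions that only change at observed values,
    # so checking those points is enough to find the maximum gap.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```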

Jensen-Shannon Divergence

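Jensen-Shannon divergence is a symmetric measure of the difference between two probability distributions. With base-2 logarithms it is bounded between 0 and 1.

Interpretation:

  • JS = 0: Identical distributions
  • JS = 1: Completely different distributions (no overlap), assuming base-2 logarithms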
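
As a sketch, Jensen-Shannon divergence averages the Kullback-Leibler divergence of each distribution from their midpoint. The version below assumes base-2 logarithms, which is what keeps the result in the 0 to 1 range, and normalizes raw counts or weights into probabilities first; whether the API uses this base is an assumption.

```python
import math
from typing import List, Sequence


def _normalize(weights: Sequence[float]) -> List[float]:
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def js_divergence(p_weights: Sequence[float], q_weights: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    if len(p_weights) != len(q_weights):
        raise ValueError("distributions must have the same number of bins")
    p = _normalize(p_weights)
    q = _normalize(q_weights)
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, midpoint) + 0.5 * kl(q, midpoint)
```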

Best Practices

1. Choose the Right Baseline

  • Fixed Baseline: Use training data or a known good period
  • Rolling Window: Adapt to gradual changes over time
  • Seasonal Baseline: Account for cyclical patterns
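
To see how a rolling baseline moves with time, the sketch below derives both windows from a reference time and the `baselineWindowDays` and `comparisonWindowDays` settings. It assumes the baseline window ends exactly where the comparison window begins; the documentation does not state how the service positions rolling windows, so treat that layout and the function name `rolling_windows` as illustrative only.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

Window = Tuple[datetime, datetime]


def rolling_windows(
    now: datetime, baseline_days: int = 30, comparison_days: int = 7
) -> Tuple[Window, Window]:
    """Return (baseline, comparison) windows ending at `now`."""
    if baseline_days <= 0 or comparison_days <= 0:
        raise ValueError("window lengths must be positive")
    comparison_start = now - timedelta(days=comparison_days)
    # Assumption: the baseline immediately precedes the comparison window.
    baseline_start = comparison_start - timedelta(days=baseline_days)
    return (baseline_start, comparison_start), (comparison_start, now)


baseline, comparison = rolling_windows(datetime(2024, 10, 1, tzinfo=timezone.utc))
```

With the defaults and a reference time of 2024-10-01, the comparison window starts on 2024-09-24 and the baseline window starts on 2024-08-25. A fixed baseline would instead keep the first window constant and move only the second.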

2. Set Appropriate Thresholds

const thresholds = {
  // Conservative approach
  conservative: {
    warning: 0.05,
    critical: 0.15
  },
  // Balanced approach
  balanced: {
    warning: 0.1,
    critical: 0.25
  },
  // Lenient approach
  lenient: {
    warning: 0.15,
    critical: 0.35
  }
};
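
The way a threshold pair maps a drift score to a status can be sketched as follows. The inclusive lower bounds mirror the PSI interpretation bands (a score equal to the warning threshold counts as a warning); whether the service applies exactly this comparison is an assumption, and `classify_drift` is a hypothetical helper.

```python
def classify_drift(score: float, warning: float = 0.1, critical: float = 0.25) -> str:
    """Map a drift score to 'no_drift', 'warning', or 'critical'."""
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if score >= critical:
        return "critical"
    if score >= warning:
        return "warning"
    return "no_drift"
```

With the balanced defaults, a score of 0.28 is critical and 0.15 is a warning; under the conservative pair (0.05, 0.15), the same 0.15 becomes critical.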

3. Monitor Critical Features

Prioritize features that:

  • Have the most impact on predictions
  • Are known to be volatile
  • Are business-critical

4. Automate Response

{
  "actions": {
    "autoRetrain": true,
    "retrainThreshold": 0.25,
    "pausePredictions": false,
    "notifyTeam": true,
    "createTicket": true
  }
}

5. Regular Review

  • Weekly: Review drift reports
  • Monthly: Analyze trends
  • Quarterly: Update baselines and thresholds

Common Drift Scenarios

E-commerce Recommendation System

{
  "scenario": "Seasonal shopping patterns",
  "solution": {
    "baselineStrategy": "seasonal",
    "compareToSamePeriodLastYear": true,
    "features": ["purchase_amount", "category", "time_of_day"]
  }
}

Fraud Detection Model

{
  "scenario": "New fraud patterns emerge",
  "solution": {
    "schedule": "hourly",
    "baselineStrategy": "rolling_window",
    "baselineWindowDays": 7,
    "alerts": {
      "onCritical": true,
      "realtime": true
    }
  }
}

Credit Scoring Model

{
  "scenario": "Economic conditions change",
  "solution": {
    "baselineStrategy": "fixed",
    "compareToTrainingData": true,
    "features": ["income", "debt_ratio", "employment_status"],
    "thresholds": {
      "warning": 0.05,
      "critical": 0.15
    }
  }
}

Handling Drift

When drift is detected:

  1. Investigate: Analyze which features are drifting
  2. Validate: Confirm drift is real, not data quality issues
  3. Assess Impact: Measure impact on model performance
  4. Take Action:
    • Retrain model with recent data
    • Adjust feature engineering
    • Update model architecture
    • Implement domain adaptation techniques
  5. Monitor: Continue monitoring after remediation
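
For the investigation step, a quick first pass is to rank the drifted features in a detection response by score. The sketch below reads the `features` array of a response shaped like the Detect Drift example; the function name `rank_drifted_features` is hypothetical.

```python
from typing import List, Tuple


def rank_drifted_features(response: dict) -> List[Tuple[str, float, str]]:
    """Return (featureName, driftScore, status) for drifted features, worst first."""
    drifted = [
        feature
        for feature in response["data"]["features"]
        if feature.get("isDrifted")
    ]
    drifted.sort(key=lambda feature: feature["driftScore"], reverse=True)
    return [
        (feature["featureName"], feature["driftScore"], feature.get("status", ""))
        for feature in drifted
    ]


example_response = {
    "success": True,
    "data": {
        "features": [
            {"featureName": "transaction_amount", "driftScore": 0.15,
             "status": "warning", "isDrifted": True},
            {"featureName": "user_age", "driftScore": 0.28,
             "status": "critical", "isDrifted": True},
            {"featureName": "country", "driftScore": 0.02,
             "status": "no_drift", "isDrifted": False},
        ]
    },
}
ranking = rank_drifted_features(example_response)
```

Here `ranking` lists `user_age` first and omits `country`, which did not drift (the `country` entry is made up for the example).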

Error Handling

Common Errors

  • INSUFFICIENT_BASELINE_DATA: Not enough data in baseline window
  • INSUFFICIENT_COMPARISON_DATA: Not enough data in comparison window
  • FEATURE_NOT_FOUND: Specified feature does not exist
  • INVALID_WINDOW: Invalid date range specified

Error Response:

{
  "success": false,
  "error": "INSUFFICIENT_BASELINE_DATA",
  "message": "Baseline window requires at least 100 samples, found 45",
  "details": {
    "requiredSamples": 100,
    "foundSamples": 45,
    "baselineWindow": {
      "startDate": "2024-01-01T00:00:00Z",
      "endDate": "2024-01-05T23:59:59Z"
    }
  }
}
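
On the client side, the `success` flag can be turned into an exception that carries the error code and details, so callers can branch on codes such as `INSUFFICIENT_BASELINE_DATA`. This is a sketch over the response shape shown above; `DriftApiError` and `unwrap` are hypothetical names, not SDK classes.

```python
class DriftApiError(Exception):
    """Raised when a response body has success == false."""

    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


def unwrap(body: dict) -> dict:
    """Return the data payload, or raise DriftApiError for an error body."""
    if body.get("success"):
        return body["data"]
    raise DriftApiError(
        body.get("error", "UNKNOWN"),
        body.get("message", ""),
        body.get("details", {}),
    )
```

A caller could catch `DriftApiError`, check `err.code`, and widen the baseline window when `err.details` reports fewer samples than required.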

SDK Examples

JavaScript/TypeScript

import { CortifClient } from '@cortif/sdk';

const client = new CortifClient({ apiKey: 'your_api_key' });

// Detect drift
const driftResult = await client.drift.detect({
  projectId: 'proj_abc123',
  baselineWindow: {
    startDate: '2024-01-01T00:00:00Z',
    endDate: '2024-01-31T23:59:59Z',
  },
  comparisonWindow: {
    startDate: '2024-09-01T00:00:00Z',
    endDate: '2024-09-30T23:59:59Z',
  },
  methods: ['psi', 'ks'],
});

if (driftResult.overallStatus === 'critical') {
  console.log('Critical drift detected!');
  // Trigger retraining
  await client.models.retrain({
    projectId: 'proj_abc123',
    reason: 'drift_detected',
  });
}

Python

from cortif import CortifClient

client = CortifClient(api_key='your_api_key')

# Configure automatic drift monitoring
config = client.drift.configure(
    project_id='proj_abc123',
    enabled=True,
    schedule='daily',
    baseline_strategy='rolling_window',
    baseline_window_days=30,
    comparison_window_days=7,
    methods=['psi', 'ks', 'js_divergence'],
    thresholds={
        'warning': 0.1,
        'critical': 0.25
    },
    alerts={
        'on_warning': True,
        'on_critical': True,
        'channels': ['email', 'slack']
    }
)

print(f"Drift monitoring configured: {config.id}")

Rate Limits

  • Drift Detection: 100 requests per hour
  • History Queries: 1,000 requests per hour
  • Configuration: 20 updates per hour
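
To stay under these limits, a client can track its own request times and hold back once the budget for the hour is used. The sliding-window limiter below is a sketch; how the server actually counts requests (fixed or sliding window) is not documented here, so the class and its behavior are assumptions.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any period_seconds span."""

    def __init__(
        self,
        max_calls: int,
        period_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._max_calls = max_calls
        self._period = period_seconds
        self._clock = clock
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the budget."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._period:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False


# Budget matching the drift detection limit of 100 requests per hour.
detect_limiter = SlidingWindowLimiter(max_calls=100, period_seconds=3600)
```

Call `detect_limiter.try_acquire()` before each detection request and delay or skip the request when it returns False.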

For higher limits, contact support@cortif.ai