Drift Detection
Monitor and detect data drift and model drift in your AI systems
Drift detection is a critical component of AI monitoring that identifies when your model's performance degrades due to changes in data distribution or model behavior over time.
What is Drift?
Drift occurs when the statistical properties of your model's inputs or outputs change over time, potentially leading to degraded performance.
Types of Drift
graph TB
    A[Drift Types] --> B[Data Drift]
    A --> C[Concept Drift]
    A --> D[Prediction Drift]
    B --> E[Feature Distribution Change]
    C --> F[Target Relationship Change]
    D --> G[Output Distribution Change]
1. Data Drift (Covariate Shift)
Data drift occurs when the distribution of input features changes over time.
Example:
- A recommendation model trained on user behavior from 2023 sees different patterns in 2024
- A fraud detection model encounters new types of transactions
- An image classifier sees images with different lighting conditions
Detection Methods:
- Population Stability Index (PSI)
- Kolmogorov-Smirnov Test
- Chi-Square Test
- Jensen-Shannon Divergence
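As a rough illustration of one of these methods, the sketch below computes a chi-square homogeneity statistic for a categorical feature from a baseline sample and a current sample. It uses only the Python standard library. The function name `chi_square_statistic` is illustrative and not part of the Cortif API, and this is not necessarily how the service computes its `chiSquare` metric.

```python
from collections import Counter


def chi_square_statistic(baseline, current):
    """Return (statistic, degrees_of_freedom) for two categorical samples.

    Treats the data as a 2 x k contingency table (sample x category) and
    compares observed counts with the counts expected if both samples
    shared the same category distribution.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    base_counts = Counter(baseline)
    curr_counts = Counter(current)
    categories = set(base_counts) | set(curr_counts)
    n_base, n_curr = len(baseline), len(current)
    total = n_base + n_curr

    statistic = 0.0
    for category in categories:
        column_total = base_counts[category] + curr_counts[category]
        for observed, row_total in (
            (base_counts[category], n_base),
            (curr_counts[category], n_curr),
        ):
            expected = row_total * column_total / total
            statistic += (observed - expected) ** 2 / expected

    return statistic, len(categories) - 1


# Hypothetical example: the share of category "a" rises from 50% to 80%.
stat, dof = chi_square_statistic(["a"] * 50 + ["b"] * 50, ["a"] * 80 + ["b"] * 20)
print(f"chi-square={stat:.2f}, dof={dof}")
```

With one degree of freedom, a statistic above 3.841 corresponds to p < 0.05, so the shift in this example would be flagged.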
2. Concept Drift
Concept drift happens when the relationship between inputs and outputs changes.
Example:
- Customer preferences change over time
- Market conditions shift
- User behavior evolves
Detection Methods:
- Performance degradation monitoring
- A/B testing with baseline
- Statistical process control
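Statistical process control can be sketched as a simple p-chart over the model's error rate. The example below assumes ground-truth labels are available so that an error rate can be computed per window, that all windows have the same size, and that conventional three-sigma limits are used. The function names are illustrative and not part of the Cortif API.

```python
import math


def p_chart_limits(baseline_error_rate, window_size, sigmas=3.0):
    """Return (lower, upper) control limits for a window's error rate."""
    if not 0.0 <= baseline_error_rate <= 1.0:
        raise ValueError("baseline_error_rate must be between 0 and 1")
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    standard_error = math.sqrt(
        baseline_error_rate * (1.0 - baseline_error_rate) / window_size
    )
    lower = max(0.0, baseline_error_rate - sigmas * standard_error)
    upper = min(1.0, baseline_error_rate + sigmas * standard_error)
    return lower, upper


def out_of_control_windows(window_error_rates, baseline_error_rate, window_size):
    """Return indexes of windows whose error rate exceeds the upper limit."""
    _, upper = p_chart_limits(baseline_error_rate, window_size)
    return [i for i, rate in enumerate(window_error_rates) if rate > upper]


# Hypothetical weekly error rates, 400 labeled predictions per week,
# against a 10% baseline error rate.
print(out_of_control_windows([0.09, 0.11, 0.16, 0.12], 0.10, 400))
```

Only the upper limit is checked here because the goal is to catch degradation; a window below the lower limit would indicate an improvement rather than a problem.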
3. Prediction Drift
Prediction drift occurs when the distribution of model predictions changes.
Example:
- Model suddenly predicts more positive cases
- Output confidence scores decrease
- Classification distribution shifts
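One simple way to quantify a shift in the classification distribution is the total variation distance between the predicted-class shares of two periods. This is an illustrative choice and not a metric the API above is documented to return. The helper names are hypothetical.

```python
from collections import Counter


def class_shares(predictions):
    """Return the share of each predicted label in a list of predictions."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    counts = Counter(predictions)
    total = len(predictions)
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(baseline_predictions, current_predictions):
    """Return half the sum of absolute differences between class shares.

    The result is 0 when both periods predict each class equally often
    and 1 when the predicted classes do not overlap at all.
    """
    baseline = class_shares(baseline_predictions)
    current = class_shares(current_predictions)
    labels = set(baseline) | set(current)
    return 0.5 * sum(
        abs(baseline.get(label, 0.0) - current.get(label, 0.0)) for label in labels
    )


# Hypothetical example: positive predictions rise from 10% to 30%.
before = ["negative"] * 90 + ["positive"] * 10
after = ["negative"] * 70 + ["positive"] * 30
print(round(total_variation_distance(before, after), 3))
```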
Drift Detection Object
interface DriftDetection {
  id: string;
  projectId: string;
  runId: string;
  driftType: 'data' | 'concept' | 'prediction' | 'target';
  status: 'no_drift' | 'warning' | 'critical';
  severity: number; // 0-100
  detectedAt: string;
  metrics: {
    psi?: number; // Population Stability Index
    ks?: number; // Kolmogorov-Smirnov statistic
    chiSquare?: number;
    jsDivergence?: number; // Jensen-Shannon Divergence
    wasserstein?: number;
  };
  features: {
    featureName: string;
    baselineDistribution: Distribution;
    currentDistribution: Distribution;
    driftScore: number;
    isDrifted: boolean;
  }[];
  threshold: {
    warning: number;
    critical: number;
  };
  metadata?: {
    baselineStartDate: string;
    baselineEndDate: string;
    comparisonStartDate: string;
    comparisonEndDate: string;
    samplesBaseline: number;
    samplesCurrent: number;
  };
}
interface Distribution {
  mean?: number;
  std?: number;
  min?: number;
  max?: number;
  quantiles?: number[];
  histogram?: {
    bins: number[];
    counts: number[];
  };
}
Detect Drift
Analyze drift between baseline and current data distributions.
POST /api/drift/detect
Content-Type: application/json
Cookie: session-token=your_session_token
{
  "projectId": "proj_abc123",
  "runId": "run_xyz789",
  "baselineWindow": {
    "startDate": "2024-01-01T00:00:00Z",
    "endDate": "2024-01-31T23:59:59Z"
  },
  "comparisonWindow": {
    "startDate": "2024-09-01T00:00:00Z",
    "endDate": "2024-09-30T23:59:59Z"
  },
  "features": ["feature1", "feature2", "feature3"],
  "methods": ["psi", "ks", "js_divergence"],
  "thresholds": {
    "warning": 0.1,
    "critical": 0.25
  }
}
Request Body:
- projectId (required): Project identifier
- runId (optional): Specific run to analyze
- baselineWindow (required): Reference time period
- comparisonWindow (required): Current time period to compare
- features (optional): Specific features to analyze (all if not specified)
- methods (optional): Detection methods to use
- thresholds (optional): Custom warning and critical thresholds
Response:
{
  "success": true,
  "data": {
    "id": "drift_def456",
    "projectId": "proj_abc123",
    "runId": "run_xyz789",
    "overallStatus": "warning",
    "overallSeverity": 45,
    "detectedAt": "2024-09-30T15:30:00Z",
    "summary": {
      "totalFeatures": 15,
      "driftedFeatures": 3,
      "noDriftFeatures": 12,
      "warningCount": 2,
      "criticalCount": 1
    },
    "features": [
      {
        "featureName": "user_age",
        "driftScore": 0.28,
        "status": "critical",
        "isDrifted": true,
        "metrics": {
          "psi": 0.28,
          "ks": 0.22,
          "jsDivergence": 0.19
        },
        "baselineDistribution": {
          "mean": 34.5,
          "std": 12.3,
          "min": 18,
          "max": 75,
          "quantiles": [25, 32, 38, 45]
        },
        "currentDistribution": {
          "mean": 29.8,
          "std": 10.5,
          "min": 18,
          "max": 68,
          "quantiles": [22, 28, 34, 40]
        }
      },
      {
        "featureName": "transaction_amount",
        "driftScore": 0.15,
        "status": "warning",
        "isDrifted": true,
        "metrics": {
          "psi": 0.15,
          "ks": 0.12,
          "jsDivergence": 0.08
        },
        "baselineDistribution": {
          "mean": 125.50,
          "std": 45.20
        },
        "currentDistribution": {
          "mean": 98.30,
          "std": 38.10
        }
      }
    ],
    "metadata": {
      "baselineStartDate": "2024-01-01T00:00:00Z",
      "baselineEndDate": "2024-01-31T23:59:59Z",
      "comparisonStartDate": "2024-09-01T00:00:00Z",
      "comparisonEndDate": "2024-09-30T23:59:59Z",
      "samplesBaseline": 15000,
      "samplesCurrent": 18000
    }
  }
}
curl -X POST https://api.cortif.ai/api/drift/detect \
  -H "Content-Type: application/json" \
  -H "Cookie: session-token=your_session_token" \
  -d '{
    "projectId": "proj_abc123",
    "baselineWindow": {
      "startDate": "2024-01-01T00:00:00Z",
      "endDate": "2024-01-31T23:59:59Z"
    },
    "comparisonWindow": {
      "startDate": "2024-09-01T00:00:00Z",
      "endDate": "2024-09-30T23:59:59Z"
    },
    "methods": ["psi", "ks"],
    "thresholds": {
      "warning": 0.1,
      "critical": 0.25
    }
  }'
Get Drift History
Retrieve historical drift detection results for a project.
GET /api/drift/history?projectId=proj_abc123&page=1&limit=20&status=warning
Cookie: session-token=your_session_token
Query Parameters:
- projectId (required): Project identifier
- page (optional): Page number (default: 1)
- limit (optional): Items per page (default: 20)
- status (optional): Filter by status (no_drift, warning, critical)
- startDate (optional): Filter from date
- endDate (optional): Filter to date
- driftType (optional): Filter by drift type
Response:
{
  "success": true,
  "data": [
    {
      "id": "drift_def456",
      "projectId": "proj_abc123",
      "overallStatus": "warning",
      "overallSeverity": 45,
      "detectedAt": "2024-09-30T15:30:00Z",
      "driftedFeatures": 3,
      "totalFeatures": 15
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total": 45,
    "totalPages": 3
  }
}
curl -X GET "https://api.cortif.ai/api/drift/history?projectId=proj_abc123&status=warning" \
  -H "Cookie: session-token=your_session_token"
Configure Drift Monitoring
Set up automatic drift monitoring for a project.
POST /api/drift/configure
Content-Type: application/json
Cookie: session-token=your_session_token
{
  "projectId": "proj_abc123",
  "enabled": true,
  "schedule": "daily",
  "baselineStrategy": "rolling_window",
  "baselineWindowDays": 30,
  "comparisonWindowDays": 7,
  "features": ["feature1", "feature2"],
  "methods": ["psi", "ks"],
  "thresholds": {
    "warning": 0.1,
    "critical": 0.25
  },
  "alerts": {
    "onWarning": true,
    "onCritical": true,
    "channels": ["email", "webhook"]
  },
  "actions": {
    "autoRetrain": false,
    "pausePredictions": false,
    "notifyTeam": true
  }
}
Request Body:
- projectId (required): Project identifier
- enabled (required): Enable or disable monitoring
- schedule (required): hourly, daily, or weekly
- baselineStrategy (required): fixed or rolling_window
- baselineWindowDays (required): Days for baseline
- comparisonWindowDays (required): Days for comparison
- features (optional): Features to monitor
- methods (required): Detection methods
- thresholds (required): Warning and critical thresholds
- alerts (required): Alert configuration
- actions (optional): Automated actions
Response:
{
  "success": true,
  "data": {
    "id": "config_ghi789",
    "projectId": "proj_abc123",
    "enabled": true,
    "schedule": "daily",
    "nextRunAt": "2024-10-02T00:00:00Z",
    "createdAt": "2024-10-01T10:00:00Z"
  }
}
curl -X POST https://api.cortif.ai/api/drift/configure \
  -H "Content-Type: application/json" \
  -H "Cookie: session-token=your_session_token" \
  -d '{
    "projectId": "proj_abc123",
    "enabled": true,
    "schedule": "daily",
    "baselineStrategy": "rolling_window",
    "baselineWindowDays": 30,
    "comparisonWindowDays": 7,
    "methods": ["psi", "ks"],
    "thresholds": {
      "warning": 0.1,
      "critical": 0.25
    }
  }'
Drift Metrics Explained
Population Stability Index (PSI)
PSI measures the shift in a variable's distribution between two samples.
Formula:
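PSI = Σ (Actual% - Expected%) × ln(Actual% / Expected%)
The sum runs over the bins of the variable. Expected% is the share of baseline samples that fall in a bin, and Actual% is the share of current samples in the same bin.
Interpretation: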
- PSI < 0.1: No significant change
- 0.1 ≤ PSI < 0.25: Moderate change, investigation recommended
- PSI ≥ 0.25: Significant change, action required
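The formula and bands above can be sketched in a few lines of Python. This is a minimal illustration, assuming the two samples have already been binned with the same bin edges. The function names and the small `eps` floor (used to avoid taking the logarithm of zero for empty bins) are choices made for this sketch and may differ from how the service computes its `psi` metric.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index from per-bin counts of two samples."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both samples must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("both samples must contain data")

    value = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor each share at eps so an empty bin does not produce log(0).
        expected_pct = max(expected / expected_total, eps)
        actual_pct = max(actual / actual_total, eps)
        value += (actual_pct - expected_pct) * math.log(actual_pct / expected_pct)
    return value


def psi_status(value, warning=0.1, critical=0.25):
    """Map a PSI value to the status names used in this document."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "no_drift"


# Hypothetical bin counts: a uniform baseline against a skewed current sample.
score = psi([25, 25, 25, 25], [10, 20, 30, 40])
print(round(score, 3), psi_status(score))
```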
Kolmogorov-Smirnov (KS) Statistic
The KS statistic is the maximum absolute difference between the empirical cumulative distribution functions of two samples.
Interpretation:
- KS < 0.1: Distributions are similar
- 0.1 ≤ KS < 0.3: Moderate difference
- KS ≥ 0.3: Significant difference
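To make the definition concrete, the sketch below computes the two-sample KS statistic directly from raw values using only the standard library. The function name is illustrative; in practice a statistics library would usually be used, and this sketch does not compute a p-value.

```python
def ks_statistic(baseline, current):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    a = sorted(baseline)
    b = sorted(current)
    n, m = len(a), len(b)
    i = j = 0
    largest_gap = 0.0

    # Walk both sorted samples in order of value. After each step, i / n and
    # j / m are the empirical CDFs of the two samples at the current value.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / n - j / m))
    return largest_gap


# Hypothetical example: the current sample is shifted upward by two units.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

Ties are handled by advancing past every value equal to `x` in both samples before comparing the CDFs.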
Jensen-Shannon Divergence
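Jensen-Shannon divergence is a symmetric measure of the difference between two probability distributions.
Interpretation:
- JS = 0: Identical distributions
- JS = 1: Completely different distributions with no overlap (this upper bound assumes base-2 logarithms; with natural logarithms the maximum is ln 2 ≈ 0.693)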
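A minimal sketch of the computation follows, assuming both inputs are discrete probability vectors over the same bins (each summing to 1). Base-2 logarithms are used so the result falls between 0 and 1. The function names are illustrative and not part of the Cortif API.

```python
import math


def kl_divergence(p, q, base=2.0):
    """Kullback-Leibler divergence of p from q, skipping bins where p is zero."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # The mixture m is positive wherever p or q is positive, so the
    # divisions inside kl_divergence never hit zero.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m, base) + 0.5 * kl_divergence(q, m, base)


print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # no overlap
```

Some libraries report the Jensen-Shannon distance, which is the square root of this divergence, so it is worth checking which quantity a given tool returns before comparing values against thresholds.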
Best Practices
1. Choose the Right Baseline
- Fixed Baseline: Use training data or a known good period
- Rolling Window: Adapt to gradual changes over time
- Seasonal Baseline: Account for cyclical patterns
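For a rolling window, the two periods have to be recomputed on every run. The sketch below derives `baselineWindow` and `comparisonWindow` values in the shape used by the detect request. It assumes the baseline ends exactly where the comparison window begins, which is one reasonable layout but not necessarily how the service positions its windows. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rolling_windows(now, baseline_days, comparison_days):
    """Build baseline and comparison windows that end at `now`.

    `now` must be a timezone-aware datetime. Timestamps are emitted in UTC
    using the ISO 8601 format shown in the request examples.
    """
    if baseline_days <= 0 or comparison_days <= 0:
        raise ValueError("window lengths must be positive")

    def iso(moment):
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    comparison_end = now
    comparison_start = comparison_end - timedelta(days=comparison_days)
    # Assumption: the baseline is the period immediately before the comparison.
    baseline_end = comparison_start
    baseline_start = baseline_end - timedelta(days=baseline_days)

    return {
        "baselineWindow": {
            "startDate": iso(baseline_start),
            "endDate": iso(baseline_end),
        },
        "comparisonWindow": {
            "startDate": iso(comparison_start),
            "endDate": iso(comparison_end),
        },
    }


windows = rolling_windows(datetime(2024, 10, 1, tzinfo=timezone.utc), 30, 7)
print(windows["baselineWindow"], windows["comparisonWindow"])
```

With a fixed baseline, only the comparison window would move between runs while the baseline dates stay constant.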
2. Set Appropriate Thresholds
const thresholds = {
  // Conservative approach
  conservative: {
    warning: 0.05,
    critical: 0.15
  },
  // Balanced approach
  balanced: {
    warning: 0.1,
    critical: 0.25
  },
  // Lenient approach
  lenient: {
    warning: 0.15,
    critical: 0.35
  }
};
3. Monitor Critical Features
Prioritize features that:
- Have the most impact on predictions
- Are known to be volatile
- Are business-critical
4. Automate Response
{
  "actions": {
    "autoRetrain": true,
    "retrainThreshold": 0.25,
    "pausePredictions": false,
    "notifyTeam": true,
    "createTicket": true
  }
}
5. Regular Review
- Weekly: Review drift reports
- Monthly: Analyze trends
- Quarterly: Update baselines and thresholds
Common Drift Scenarios
E-commerce Recommendation System
{
  "scenario": "Seasonal shopping patterns",
  "solution": {
    "baselineStrategy": "seasonal",
    "compareToSamePeriodLastYear": true,
    "features": ["purchase_amount", "category", "time_of_day"]
  }
}
Fraud Detection Model
{
  "scenario": "New fraud patterns emerge",
  "solution": {
    "schedule": "hourly",
    "baselineStrategy": "rolling_window",
    "baselineWindowDays": 7,
    "alerts": {
      "onCritical": true,
      "realtime": true
    }
  }
}
Credit Scoring Model
{
  "scenario": "Economic conditions change",
  "solution": {
    "baselineStrategy": "fixed",
    "compareToTrainingData": true,
    "features": ["income", "debt_ratio", "employment_status"],
    "thresholds": {
      "warning": 0.05,
      "critical": 0.15
    }
  }
}
Handling Drift
When drift is detected:
- Investigate: Analyze which features are drifting
- Validate: Confirm drift is real, not data quality issues
- Assess Impact: Measure impact on model performance
- Take Action:
  - Retrain model with recent data
  - Adjust feature engineering
  - Update model architecture
  - Implement domain adaptation techniques
- Monitor: Continue monitoring after remediation
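The investigation step can start from the detect response itself. The sketch below assumes a response already parsed into a Python dict with the shape shown in the Detect Drift example (`data.features[]` with `featureName`, `driftScore`, `status`, and `isDrifted`) and lists drifted features from most to least severe. The function name is hypothetical.

```python
def drifted_features_by_severity(detect_response):
    """Return drifted features sorted by driftScore, highest first."""
    features = detect_response.get("data", {}).get("features", [])
    drifted = [feature for feature in features if feature.get("isDrifted")]
    return sorted(
        drifted, key=lambda feature: feature.get("driftScore", 0.0), reverse=True
    )


# Trimmed-down copy of the example response from this page, plus one
# hypothetical feature that has not drifted.
response = {
    "success": True,
    "data": {
        "features": [
            {"featureName": "transaction_amount", "driftScore": 0.15,
             "status": "warning", "isDrifted": True},
            {"featureName": "user_age", "driftScore": 0.28,
             "status": "critical", "isDrifted": True},
            {"featureName": "country", "driftScore": 0.02,
             "status": "no_drift", "isDrifted": False},
        ]
    },
}

for feature in drifted_features_by_severity(response):
    print(feature["featureName"], feature["status"], feature["driftScore"])
```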
Error Handling
Common Errors
- INSUFFICIENT_BASELINE_DATA: Not enough data in baseline window
- INSUFFICIENT_COMPARISON_DATA: Not enough data in comparison window
- FEATURE_NOT_FOUND: Specified feature does not exist
- INVALID_WINDOW: Invalid date range specified
{
  "success": false,
  "error": "INSUFFICIENT_BASELINE_DATA",
  "message": "Baseline window requires at least 100 samples, found 45",
  "details": {
    "requiredSamples": 100,
    "foundSamples": 45,
    "baselineWindow": {
      "startDate": "2024-01-01T00:00:00Z",
      "endDate": "2024-01-05T23:59:59Z"
    }
  }
}
SDK Examples
JavaScript/TypeScript
import { CortifClient } from '@cortif/sdk';
const client = new CortifClient({ apiKey: 'your_api_key' });
// Detect drift
const driftResult = await client.drift.detect({
  projectId: 'proj_abc123',
  baselineWindow: {
    startDate: '2024-01-01T00:00:00Z',
    endDate: '2024-01-31T23:59:59Z',
  },
  comparisonWindow: {
    startDate: '2024-09-01T00:00:00Z',
    endDate: '2024-09-30T23:59:59Z',
  },
  methods: ['psi', 'ks'],
});
if (driftResult.overallStatus === 'critical') {
  console.log('Critical drift detected!');
  // Trigger retraining
  await client.models.retrain({
    projectId: 'proj_abc123',
    reason: 'drift_detected',
  });
}
Python
from cortif import CortifClient
client = CortifClient(api_key='your_api_key')
# Configure automatic drift monitoring
config = client.drift.configure(
    project_id='proj_abc123',
    enabled=True,
    schedule='daily',
    baseline_strategy='rolling_window',
    baseline_window_days=30,
    comparison_window_days=7,
    methods=['psi', 'ks', 'js_divergence'],
    thresholds={
        'warning': 0.1,
        'critical': 0.25
    },
    alerts={
        'on_warning': True,
        'on_critical': True,
        'channels': ['email', 'slack']
    }
)
print(f"Drift monitoring configured: {config.id}")
Rate Limits
- Drift Detection: 100 requests per hour
- History Queries: 1,000 requests per hour
- Configuration: 20 updates per hour
For higher limits, contact support@cortif.ai
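A client can avoid hitting these limits by throttling itself before sending requests. The sketch below is a generic sliding-window limiter and not part of the Cortif SDK. It assumes the limits apply over a sliding one-hour window, which this page does not state, so treat the window model as an assumption.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds):
        if max_requests <= 0 or window_seconds <= 0:
            raise ValueError("max_requests and window_seconds must be positive")
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of accepted requests, oldest first

    def try_acquire(self, now):
        """Record a request at time `now` (seconds) if the limit allows it."""
        cutoff = now - self.window_seconds
        # Forget requests that have left the window.
        while self._sent and self._sent[0] <= cutoff:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True


# Drift detection is limited to 100 requests per hour according to the list above.
detect_limiter = SlidingWindowLimiter(max_requests=100, window_seconds=3600)
print(detect_limiter.try_acquire(now=0.0))
```

Passing the current time explicitly (for example from `time.monotonic()`) keeps the limiter easy to test.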