
Data drift monitoring

Basic concepts: 

  • Baseline: Users can define the baseline based on a ‘Tag’ or on a segment of data selected by ‘date’.
  • Frequency: Users can define how frequently the monitoring metrics are calculated.
  • Alerts frequency: Users can configure how frequently they are notified about alerts.

Through GUI

You can monitor your models for data drift using AryaXAI ML monitoring. To create a new dashboard, define the Baseline and Current data, pick the statistical method used to calculate drift, and customize the thresholds if needed. The dashboard is then generated. You can create or modify dashboards any number of times.

Drift Metrics: The following statistical tests are available to analyze data drift: the Chi-square test, Jensen-Shannon distance, Kolmogorov-Smirnov (K-S) test, Kullback-Leibler divergence, Population Stability Index (PSI), Wasserstein distance and Z-test.

You can learn more about these tests in our wiki section.
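As an illustration of what one of these metrics measures, here is a minimal, standard-library sketch of the Population Stability Index for a single numeric feature. It is not AryaXAI's implementation: the function name `psi`, the equal-width binning taken from the baseline range, the bin count and the `eps` smoothing value are all assumptions made for this example.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative)."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width bins over the baseline range; fall back to 1.0 if it is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    baseline = [i / 100 for i in range(1000)]
    shifted = [v + 3 for v in baseline]
    print(psi(baseline, baseline))  # identical samples give 0.0
    print(psi(baseline, shifted))   # a shifted sample gives a positive value
```

A larger value means the current distribution has moved further from the baseline. Which value counts as drift depends on the threshold you configure.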

Selecting dates: If you are selecting the dates, then the entire data under that tag will be used for calculating the drift. When you are selecting the date variable, ensure that there is data within these dates.

Mixing multiple tags: To merge data from different tags, select multiple tags in the relevant segment (baseline or current).

Tip: If you want to see the drift in only one feature, you can simply select that feature under 'Features to select' and calculate the drift.

Alerts and Monitors

From here you can create and view customized alerts for data drift, target drift and model performance through the alerts dashboard. To do this, select 'Create alerts' in the 'Monitors' tab, define the baseline and current data parameters as described above, and set the frequency of alerts, which can be daily, weekly, monthly, quarterly or yearly.

To create new alerts, go to:

 ML Monitoring (Main menu on left) > select ‘Monitoring’ (from the sub-tabs) > click ‘Create Alerts’

All the newly created and existing alerts are displayed on this dashboard, along with details of trigger creator, name, type and options. 

Data Drift Monitors:

You can not only track drift but also get notified when drift is identified in your data. To set up a data drift monitor, select 'Data drift' under 'Monitor type', after which you can enter the specific details.

Select drift calculation metrics: You can choose drift calculation metrics from the list provided, and set the thresholds for data drift and for dataset drift (when the dataset itself is drifting).
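The sketch below shows one common way to combine two such thresholds. It assumes, purely for illustration, that a feature counts as drifted when its score exceeds a per-feature threshold and that the dataset counts as drifted when the share of drifted features reaches a second threshold. The function name, the threshold values and this rule are hypothetical and may differ from how AryaXAI evaluates dataset drift.

```python
def evaluate_drift(scores, feature_threshold=0.2, dataset_threshold=0.5):
    """Flag drifted features and overall dataset drift (illustrative rule)."""
    # A feature is treated as drifted when its score exceeds the feature threshold.
    drifted = sorted(name for name, score in scores.items() if score > feature_threshold)
    # Share of features that drifted; 0.0 when no scores were supplied.
    share = len(drifted) / len(scores) if scores else 0.0
    return {
        "drifted_features": drifted,
        "drift_share": share,
        # Assumed rule: the dataset drifts when enough features drift.
        "dataset_drift": share >= dataset_threshold,
    }


if __name__ == "__main__":
    # Hypothetical per-feature drift scores.
    scores = {"age": 0.31, "income": 0.05, "tenure": 0.27, "region": 0.02}
    print(evaluate_drift(scores))
```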

Select the baseline and current: Use the tags to define the baseline and current. 'Current' is your production data if you are tracking drift in your production data.

Select features: You can either create one monitor to track all the features in your data, or select the specific features for which you want to track drift.

Segmenting the baseline or current: You can use date features to further segment your baseline. You can also use 'Time period in days' to dynamically select the most recent 'n' days as the current data. If you set 'Time period in days', that value is used as the length of the time period, and the day on which the drift is calculated is used as the end date.
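The rolling window described above can be sketched with the standard library as follows. The function name `current_window` and its parameters are hypothetical, and whether the platform treats the start and end dates as inclusive is not stated here, so treat the boundaries as an assumption.

```python
from datetime import date, timedelta


def current_window(time_period_in_days, run_date=None):
    """Return (start, end) dates for the most recent n days (illustrative)."""
    # The end date is the day the drift is calculated; defaults to today.
    end = run_date or date.today()
    # The start date lies n days before the end date.
    start = end - timedelta(days=time_period_in_days)
    return start, end


if __name__ == "__main__":
    start, end = current_window(7, run_date=date(2024, 3, 15))
    print(start, end)  # 2024-03-08 2024-03-15
```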

Alert Report

The ‘Alert’ tab (beside the Monitoring sub-tab) displays the list of alerts that have been triggered. Clicking ‘View trigger info’ displays the Trigger details, such as the current data size, data drift triggered, drift percentage, etc.

Notifications

If drift is identified, you will be alerted both in the web app and by email, at the specified frequency.

Web app alerts: Any alert triggered will be displayed as a notification on the top right corner. You can view all notifications from the tab and clear them. 


Email Alerts: The admin of the workspace will get an email if there is an identified drift.

Through SDK

You can also set up data drift monitoring and diagnosis through the AryaXAI Python SDK. To fetch the default dashboard, you don't need to pass any payload. To create a new one, pass the following parameters:

Default dashboard:


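# Fetch the default dashboard (no payload needed)
project.get_data_drift_dashboard()

# Create a new dashboard by passing the baseline tags, current tags and statistical test
project.get_data_drift_dashboard({
    "base_line_tag": ["eda"],
    "current_tag": ["XGBoost_default_data"],
    "stat_test_name": "psi"
})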

You can also use the help function to get all parameters and payloads:


help(project.get_data_drift_dashboard) 
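If you build payloads in several places, a small helper can keep the keys consistent. The sketch below is a plain Python function, not part of the SDK: `build_drift_payload` is a hypothetical name, and only the three keys shown in the example above are used. Passing more than one tag per list is assumed to behave like selecting multiple tags in the GUI; confirm this with the help output.

```python
def build_drift_payload(baseline_tags, current_tags, stat_test_name):
    """Build a payload dict using the keys from the dashboard example (illustrative)."""
    if not baseline_tags or not current_tags:
        raise ValueError("Both baseline and current need at least one tag")
    return {
        # Lists are copied so later changes to the inputs do not alter the payload.
        "base_line_tag": list(baseline_tags),
        "current_tag": list(current_tags),
        "stat_test_name": stat_test_name,
    }


if __name__ == "__main__":
    payload = build_drift_payload(["eda"], ["XGBoost_default_data"], "psi")
    print(payload)
```

The resulting dictionary can then be passed to `project.get_data_drift_dashboard(payload)` in the same way as the inline payload shown earlier.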

Available Stat. tests:
