Data drift monitoring

Basic concepts:

‍Baseline: Users can define the baseline basis on ‘Tag’ or segment of data based on ‘date’.
‍Frequency: Users can define how frequently they want to calculate the monitoring metrics
‍Alerts frequency: Users can configure how frequently they want to be notified about the alerts

‍

Through GUI

You can monitor your models for data drift by using AryaXAI ML monitoring. You can create a new dashboard by defining the Baseline & Current metrics, pick the statistical method to calculate drift and customize thresholds if needed. Your dashboard will get generated. You can create or modify the dashboard any number of times.

Drift Metrics: These statistical tests are available to analyze data drifts, namely the Chi-square test, Jensen-Shannon distance, Kolmogorov-Smirnov (K-S) test, Kullback-Leibler Divergence, Population Stability index (Psi), Wasserstein distance and Z-test.

You can learn more about these tests in our wiki section.

Selecting dates: If you are selecting the dates, then the entire data under that tag will be used for calculating the drift. When you are selecting the date variable, ensure that there is data within these dates.

Mixing multiple tags: If you want to merge data from different tags, you can simply select multiple tags in the segment(baseline/current).

‍

Tip: If you want to see the drift in only one feature, you can simply select that feature under 'Features to select' and calculate the drift.

Alerts and Monitors

From here you can easily create and view customized alerts for Data drift, target drift and model performance through the alerts dashboard. For this, select ‘Create alerts' in the 'Monitors' tab and define the baseline and current data parameters like we did above and set the frequency of alerts, which can be daily, weekly, monthly, quarterly or yearly.

To create new alerts, go to:

ML Monitoring (Main menu on left) > select ‘Monitoring’ (from the sub-tabs) > click ‘Create Alerts’

All the newly created and existing alerts are displayed on this dashboard, along with details of trigger creator, name, type and options.

Data Drift Monitors:

You can not only track drift but can get notified if there is an identified drift in your data. To set up a 'Data drift', select 'Data drift' under 'Monitor type', post which you can the specific details.

Select drift calculation metrics: You can choose drift calculation metrics from the list provided, and set the threshold for data drift and dataset drift (when the dataset itself was drifting).

Select the baseline and current: Use the tags to define the baseline and current. 'Current' is your production data if you are tracking drift in your production data.

Select features: You can either create 1 monitor to track all the features in your data or you can select the specific feature for which you are tracking the drift.

Segmenting the baseline or current: You can use date features to further segment your baseline. You can also 'Time period in days' to dynamically select the recent 'n' days as the current data. If you have added 'Time period in days', it'll use that value as the time period the day it calculated the drift as the end date.

Alert Report

The ‘Alert’ tab (beside the Monitoring sub-tab) displays the list of alerts that have been triggered. Clicking ‘View trigger info’ displays the Trigger details, such as the current data size, data drift triggered, drift percentage, etc.

Notifications

If there is an identified drift, you'll get the alert for the same in both the web app and email at the specified frequency.

Web app alerts: Any alert triggered will be displayed as a notification on the top right corner. You can view all notifications from the tab and clear them.

Email Alerts: The admin of the workspace will get an email if there is an identified drift.

Through SDK

You can also set up Data drift monitoring and diagnosis through AryaXAI Python SDK. To fetch the default dashboard, you don't need to pass any payload, but to create a new one, you need to pass the following parameters:

Default dashboard:


#default 
project.get_data_drift_dashboard()

#make new dashboard
project.get_data_drift_dashboard({
    "base_line_tag": ["eda"],
    "current_tag": ["XGBoost_default_data"],
    "stat_test_name": "psi"
})

You can also use the help function to get all parameters and payloads:


help(project.get_data_drift_dashboard)

‍

Available Stat. tests:

Copy Link

ON THIS PAGE

Data drift monitoring

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Synthetic AI

Synthetic data is one of the techniques to align models. However, the efficacy highly depends on the ability to create high-quality synthetic data sets. AryaXAI offers advanced 'Synthetic AI' techniques like GPT-2 and CTGAN to create high-quality synthetic data sets.

Train Model

Through GUI

To generate synthetic data in AryaXAI, the initial step involves training a 'Synthetic model.' This model creates initial data based on the uploaded training data. Once the quality of this generated data is assessed and approved by the user, additional synthetic data can be produced.

To begin, the model needs training. Follow these steps:

Access the 'Synthetic AI' tab in the Main menu (on the left).
Switch to the 'Train Model' tab at the top.
Choose your preferred model and click 'Train' to begin the model training process. This action will redirect you to 'Data Configuration' for customization, where you set the 'Initial Configuration' and 'Model Parameters'

In the 'Data Configuration' section:

- Under 'Initial Configuration,' select the relevant data tag from the dropdown for creating synthetic data. Exclude specific features if needed and click 'Save initial configuration.'

- Proceed to 'Model Parameters' and input details such as Batch size, Early stopping patience, Early stopping threshold, Epochs, Model type, Random state, and Tabular config.

After completing the above steps, await the 'Model Training complete' notification.

Through SDK

Help function method train_synthetic_model:


help(project.train_synthetic_model)

‍

Define parameters for your synthetic model:


data_config = {
    "tags": ["Training"],
    "feature_include": feature_include # data used for training/generating synthetic data
}
hyper_params = {
    "epochs": 2,          # epochs are no of iteration of data into model (more the better, but longer) # Max 100 supported
    "test_ratio": 0.2     # Data used for training/generating synthetic data. how much to keep aside for testing
}

project.train_synthetic_model(
    model_name='CTGAN',                    # CTGAN / GPT2 , models are avaialable
    data_config=data_config,
    hyper_params=hyper_params
)

‍

To fetch trained models:


project.synthetic_models()

Synthetic Models

Through GUI

After the model training, the 'Synthetic Models' tab showcases the trained models and their status. This comprehensive list includes key details such as the model's Name, creator, creation date, overall quality score, Column shapes, and Column pair trends.

In the 'Options' column within the list, selecting 'Show' unveils additional model details, including:

- Synthetic Data Quality

- Training: Detailed training logs and associated data tags. If the model training fails, the log will provide reasons for the failure.

- Synthetic Data generation

- Anonymity test

Clicking on any of these sections reveals further details. Using the saved model, you can generate additional synthetic data.

Through SDK

To generate data and analyze the synthetic model quality via SDK:


# select model you want
project.synthetic_model(model_name='CTGAN_v14') 
model = project.synthetic_model(model_name='CTGAN_v1')

model.get_data_quality()

‍

Synthetic Data

Through GUI

In the 'Synthetic Data' tab, you'll find the initial data generated post-model training. This list showcases the data's creation date and time, along with the following details:

   - Overall quality score: Represents the mean of Column Shapes and Column pair trends, providing an overview of data quality.
   - Column Shapes: Indicates the similarity between uploaded and synthetic data for individual columns. A higher score implies closer resemblance. A score of 1 signifies significant       divergence, while a score between 5-7 suggests considerable similarity.
   - Column pair trends: Reflects similarity between uploaded and synthetic data for pairs of columns.
   - The PSI plot graph visualizes data distribution congruence, followed by the count of rows and features used in generating the synthetic data.

Synthetic Data generation

To generate additional synthetic data rows:

Visit the 'Synthetic Models' tab and choose 'Show' for your preferred model.
Scroll down to locate 'Synthetic Data generation' under 'Training.'
Specify the number of 'Synthetic Rows' required and click 'Generate.'

AryaXAI will store the newly generated data in the 'Synthetic Data' tab, identified under the same naming convention with '_1.'

Anonymity test

Anonymeter is a comprehensive statistical system that assesses privacy risks in synthetic tabular datasets. It includes evaluators that gauge the likelihood of identifying individuals, linking data, and making inferences, all of which could pose risks to data donors after publishing a synthetic dataset.

To perform an Anonymity test on your data:

Navigate to the 'Synthetic Models' tab and opt for 'Show' for your intended model.
Scroll down to locate 'Anonymity test' under 'Synthetic Data generation.'
In the 'Aux Columns' dropdown, select the Auxiliary columns for comparing data values. Choose tags from the 'Control tags' dropdown, which were not utilized during training, and click 'Submit.'

Upon successful execution, the screen displays the metric values associated with Privacy Evaluation. AryaXAI measures this on four metrics:

- Univariate: Looks at individual variables in isolation

- Multivariate: Considers the combined effect or correlation among various attributes

- Linkability: Focuses on assessing the risk of connecting or linking sensitive information across different datasets or sources

- Inference: Involves deducing or predicting sensitive details by analyzing patterns, correlations, or statistical relationships present within the data

Through SDK

To generate Synthetic Data via SDK:


model.generate_synthetic_datapoints(1000)

‍

To get the Population Stability Index (psi) plot for synthetic model data via SDK:


model.plot_psi()

‍

To fetch existing anonymity scores for model synthetic data:


model.anonymity_score()

‍

Create new anonymity scores for model via SDK:


model.generate_anonymity_score(
    aux_columns=["Alley","3SsnPorch"],
    control_tag='Training'
)

‍

Prompting

Following the model's training phase, before synthetic data generation, AryaXAI offers a feature called 'Prompting' that allows you to establish specific conditions for the data generation process.

Through GUI

To create a new prompt, go to the 'Prompting' tab in Synthetic AI. Select the 'Create Prompt' button on the right.

Navigate to the 'Prompting' tab within Synthetic AI and click the 'Create Prompt' button located on the right.
Fill in the Prompt name and specify features while setting conditional operators.
Add the Feature value as required, then save the prompt.

The created prompt will appear in the Prompting tab, displaying its name, creation and update details, and status. Here, you can deactivate or delete the prompt. An 'Active' status indicates that the conditions specified in the prompt will be applied during the generation of new synthetic data.

Through SDK

List existing prompts


project.get_synthetic_prompts()

‍

Create Synthetic Promopts


project.create_synthetic_prompt(
    name='Grade A synths',
    expression='(grade = A)'
)

‍

Additional functions:


# project.get_synthetic_model_params()
# project.get_synthetic_models()
# project.get_synthetic_model(model_name='CTGAN_v17')
# model = project.get_synthetic_model(model_name='CTGAN_v17')
# model.get_model_type()
# model.get_data_quality()
# model.plot_psi()
# model.generate_datapoints(1000)
# model.generate_anonymity_score(
#     aux_columns=["Alley","3SsnPorch"],
#     control_tag='CTGAN_v17_SyntheticData1'
# )
# model.get_anonymity_score()
# model.delete()
# model.get_tags()
# tag = model.get_tag('CTGAN_v17_SyntheticData1')
# tag.get_model_name()
# tag.view_metadata()
# tag.get_datapoints()
# tag.delete()
# project.get_synthetic_prompts()
# prompt = project.get_synthetic_prompt(prompt_id='6a5a494156d6')
# prompt.get_expression()
# prompt.get_config()
# prompt.activate()
# prompt.deactivate()
# project.get_observation_params()
# project.create_synthetic_prompt(
#     name='prompt sdk 5',
#     expression='GarageCond == Ex'
# )

AutoML

If you don’t have your own model, you can create a new one through Auto ML. .

Whenever a user creates a new project, a default AutoML model, ‘XGBoost_default, gets trained for default prediction and default explainability. In case a user wants to use his own model for explainability, the same can be done through:

Uploading own model OR
Training the model, for which inbuilt modelling techniques - Tree-based, probabilistic and linear, are present in AryaXAI, namely:

- XGBoost

- LGBoost

- CatBoost

- RandomForest

- SGD

- Logistic Regression

- GaussianNaiveBayes

You can fine tune these models and tune the hyper parameters.

Through GUI

To upload a model:

Note: You can only upload a new model within an existing project where the initial data iteration has been uploaded.

‍

Once the above criteria are met:

Access the 'Settings' section and navigate to the 'Model Upload' tab
Complete the necessary fields, including:

- Model Name

- Model architecture (supporting Machine Learning and Deep Learning)

- Model Data tags (representing the data used to train the model)

- Model file upload (.pkl and .h file types are supported)

Once the above steps are executed, the newly uploaded model becomes visible in the AutoML section.

To activate the new model for use, you must manually select it under 'Options' within the 'Model Versions' tab.

Model Performance

The default performance of the model trained is tracked under the ‘Model Performance’ tab.

Train Model

To train a model:

- Upon selecting a model type, you can configure the Data Configuration, which mirrors the settings used during the initial data upload.

- Choose 'Save initial configuration' and 'Save Feature Encoding' for consistency and accuracy in the model training process.

- Customize the model parameters to tailor the training process according to specific requirements.

- After setting up the data configuration and model parameters, selecting 'Update' initiates the model training process.

Once the model is successfully trained, a comprehensive list of all versions is accessible and listed in the 'Model Versions' tab.

Note: A maximum of 10 models can be trained within a workspace. Within a project, only 2 models can be trained. (Considering workspace limitations, a maximum of 5 projects can be created.)

‍

You need to activate the new model manually under ‘Options’ in the ‘Model Versions’ tab.

Upon activating a model, detailed information becomes available within the 'Model Info' section, providing a comprehensive overview of the model, which includes:

- Model name

- Model Type

- Model Params

- Data tags

- Modelling info, which shows the details used for training the model

‍

Inferencing

For the (activated) model you have trained, if you want to derive inferences for any tag, AryaXAI offers the ‘Inferencing’ section. In this section, you can run predictions using the activated model on specific files or tags.

Post-inference execution, the results are stored as tags, which are listed in the 'Inferencing Files' section. The list displays essential details such as the model name, the creation date of the inference, and performance metrics like accuracy, recall, and precision, among others. You also have the option to download this data if needed.

The same tags generated from the inference process are accessible within the ML Monitoring section, providing a unified view of the inferences made by the model.

Through SDK

Additionally, you can also use the AryaXAI python package for tasks like training models, activating models for a project, model inferencing, or case info on projects with just a few lines of code.

‍

Upload a Model:


project.upload_model()

‍

Help function to upload a model:


help(project.upload_model)

‍

To check all the models currently active or staged in your project via SDK, you can use the below command. 'Status' column will give the current status info.


# train model
project.train_model(model_type='RandomForest')

# all trained models
project.models()

‍

The below function performs predictions on testing data using the XGBoost default model. You have the flexibility to pass any model you prefer, or leave it blank to use the default model.


testresults = project.model_inference(tag="Testing",model_name="XGBoost_v1")

‍

Additional Functions:


# Train model. This Trains Another version Model with Current config (If config not passed)
project.train_model()

# get Active model details
modelinfo = project.model_summary()

# available models to train 
project.available_models()

# set model active for project
project.activate_model('model_name')

# get current data config details. If you don't modify the data settings, any future fine-tuning will use the same data settings
modelinfo.data_config()

# remove model
project.remove_model('model_name')

# model inference
project.model_inference(tag="Training",model_name="XGBoost_default")
# model_name optional default to active model for the project

#The inferencing results are stored as 'Testing_XGBoost_v1_Inference' tag.
project.all_tags()

# project Cases 
project.cases()  # #last 20 case list
project.cases(unique_identifier='A11')

# project Case Info 
case=project.case_info(unique_identifier='A11',tag='training')

‍

To get the Help functions:


# Help on method model_inference
help(project.model_inference)

# Help on method train_model
help(project.train_model)

Policies and Policy Trail

Policies

Post doing stress testing, users can identify multiple areas where the models fail and are often contribute very high business continuity risks. Also, each business would have definitive guidelines that they would like to impose on models too. All these can be defined as 'Policy' in AryaXAI.

Essentially, Policies are the “rules/Guidelines” you can write to override a model prediction. The framework will impose the 'policies' on the models and follow the instructions as provided by the user.

Through GUI

Policies (Main menu on the left) > Create Policy

Define the policy and define the feature (data point on which you want to write the policy on). Select the conditional operators (Viz. not equal to, equal to, greater than, less than) and current expression.

Add the policy statement, select the input under ‘Decision’ and mention the decision value you want in the final prediction, and select ‘Save’.

All policies are displayed on the Policy dashboard. You can easily Activate/ Deactivate, edit or delete the policies from here.

When viewing cases, the ‘Policies’ tab (ML Explainability > View cases > ‘view’ under the Options column) will display the policy details for the particular case.

Here, ‘Model Prediction’ is the original model prediction and ‘Final prediction’ is the overridden prediction based on the custom rules defined.

Through SDK


project.create_policy()

‍

Help function to create a new policy:


help(project.create_policy)

Additional functions:


#View policies for project
project.policies()

#Delete Policy
project.delete_policy()

Policy Trail

Similar to the "Observations Trail" the "Policy Tail" functions in a comparable manner but specifically for policies. It could log events related to policy creation, modifications, updates, or any other relevant actions taken within the policy management system.

This feature assists in tracking the evolution of policies, understanding the sequence of modifications, and identifying who made which changes and when. It can be valuable for compliance, auditing, troubleshooting, and ensuring transparency and accountability.

XAI: Similar Cases

Introduction:

'Similar cases' aka reference as explanations is a parallel method of 'citing references' to a prediction. 'Similar cases' extracts the top 15 most similar cases in the training data as compared to the 'prediction case'. The similarity algorithm varies depending on the plan. For the AryaXAI Developer version, we use the 'prediction probability' similarity method whereas for AryaXAI Enterprise, other methods like 'Feature Importance Similarity', and 'Data Similarity' is also available.

View Similar Cases:

Through GUI

This tab displays similar cases from the previous data where the prediction was similar or almost similar. The features are plotted in a graph where you can filter based on the data labels. Below the graph, all similar cases are listed, which can be filtered based on the Feature name. You can also view details of any of the similar cases listed from the ‘view’ option.

‍

Through SDK

To list all Similar Cases wrt a particular case, and get data of the similar cases via SDK:


# List of Similar Cases wrt to a Case
case_info.similar_cases()

# Data of Similar Cases
case_info.explainability_similar_cases()

XAI: Observations

Introduction:

'Observations' provides the easiest and most effective way of estimating the correlation of industry knowledge vs model functioning. It allows the subject matter experts to be part of the explainability framework and provides easily understandable explainability notes to all stakeholders.

The ‘Observations’ section explains the reasoning behind the predictions made. If you want to see a causation correlation with the model prediction, you can easily define the conditions/ causes as ‘observations’.

Creating/Editing Observations:

Through GUI

Go to ML Explainability > Observations.

To create a new observation, select the ‘Create observation’ button on the right.

Next, define the observation and define the feature (data point on which you want to write the observation). Select the conditional operators (Viz. not equal to, equal to, greater than, less than) and current expression to add multiple IFTT conditions.

Once the operation is written, link them to engineered features (actual features that are going into the model). You can select multiple features here and write an observation statement. You can call for the data in the observation statement using curly brackets, i.e. {

‍

View observations:

Once saved, if any of the observations hold true for a case, it will be displayed below the case. This can be viewed at ML explainability > View cases > ‘View’ under the ‘Options’ column in the summary table.

Selecting the ‘Advanced view’ option provides additional details on the observations. The ‘Success’ column here displays whether the particular observation is running on the. ‘Triggered’ will show if the observation is relevant to the current case.

AryaXAI: Case-wise observations - Advanced view

Observations score:

observation score is the sum of feature importance of linked features.

Through SDK:

To create an observation


project.create_observation()

‍

To view observations executed for Case:


case_info.explainability_observations()

‍

To make Observation Active, Inactive and change params


project.update_observation(observation_id,observation_name,status)

‍

To delete an observation:


project.delete_observation(observation_id,observation_name)

‍

For help function to create and update an observation:


help(project.create_observation)

help(project.update_observation)

‍

Observations Trail

When an observation is created and subsequently modified, all changes are systematically logged and accessible in the "Observations Trail" section.

Observations section presents a tabulated format showcasing crucial details, including the initial creation date, any updates made, their respective dates and times, and the current status of the observation.

Moreover, within the table, the 'Options' section offers a 'Show' feature that grants access to both the Current and Old Config data. This feature reveals comprehensive information about the modifications, including the user responsible for the update, the specific statement that underwent changes, linked features affected by the modification, and the exact expression or alterations made. This thorough display ensures a comprehensive overview of the modification history and allows for a detailed examination of each update.

Through SDK

To check history of updates in observations via teh SDK:


project.observation_trail()

XAI: Feature Importance and Prediction Path

Feature importance

Feature importance is one of the standard ways to explain a model. It provides a very high-level overview of how the features are used by the model to arrive at the prediction output and also debug the issues in the model.

AryaXAI provides feature importance on two levels:

- Global Feature importance: Refers to the assessment of the significance of each feature across an entire dataset or project. It provides an overarching understanding of how different features contribute to the model's predictions or outcomes on a broader scale.

- Local explanations (At case-level): Focuses on evaluating the significance of features for individual predictions or instances within the dataset. It provides insights into how specific features contribute to the model's decision-making for each case or prediction outcome. This analysis is more granular, identifying the relative importance of features for particular instances, allowing for the understanding of how the model utilizes different features to arrive at predictions on a case-by-case basis.

AryaXAI uses the feature defined in the data settings and creates an XAI model to derive feature importance. In developer version of the product, it uses the first file that is uploaded as the training data to build the XAI model. In the AryaXAI enterprise version, it can use the model directly to build the XAI model.

‍


project.models() # list all available model

project.activate_model('model_name')  # make any model active

Data settings:

When you are defining the data settings, ensure that the features are matching with the features you actually used in your model for higher accuracy of explainability. These final features are mandatory in any new file uploaded in that project.

- Model Type: Select the model type classification/regression‍

- UID: define the variable to be used as UID. This will be used to identify duplicate cases‍

- True value: Select the true value variable in the data‍

- Predicted Value: Select the predicted value variable in the data. If the true value is missing, AryaXAI will use the 'true' value to build the XAI model‍

- Feature to exclude: Exclude all the features that are not used in your modelling or are irrelevant to your model‍

- Exclude other UID: If there is any other UID, you can exclude them by checking this option

AryaXAI will deploy AutoML and builds an XAI model based on these settings, which will be used for deriving feature importance.

Here, you can view the Global explainability, observations and Case view.

Note: When defining data features, specifically the data settings, it should be noted that these become the base for explainable model training. The feature selection that is done here should align with the final features that have been used in the model.

Global Feature importance

Through GUI

The global feature importance dashboard displays the aggregation of features and feature importance across all the baseline data.

AryaXAI: Global feature importance dashboard

Through SDK:

Note: By Default the active model is 'XGBoost_default' which is the AryaXai Surrogate Model.

‍

To view all available models and set a different model as active, use the commands mentioned below.


project.models() # list all available model

project.activate_model('model_name')  # make any model active

‍

To get the Global Feature Importance of Current active Model via SDK:


modelinfo.feature_importance()

‍

Few additional commands:


# Get Information of active model
modelinfo = project.model_summary()
modelinfo.info()

# The Tree Prediction Path the Model has using for Prediction (Only for Tree Based Models)
modelinfo.predication_path()

‍

Local explanations: Case-wise

The ‘View cases’ tab displays all your data points. You can also filter among the data points using a Unique identifier, the data upload dates or the data tag. For in-depth insights into a particular case, click on ‘View’ under the ‘Options’ column in the cases summary table. This will lead you to the case view dashboard.

Through SDK

Using the below command displays a list of cases. The list allows you to apply filters using tags and search for a particular case by using its unique identifier.


project.cases(tag='Training')

‍

Use the below command for fetching Explainability for a case. This will use the current 'active' model.


case_info = project.case_info('unique_identifer','tag')

# Case Decision
case_info.explainability_decision()

‍

Note: If you change the active model, then the prediction and explanability will change as well.

‍

Feature Importance

Through GUI

Selecting the ‘View’ option for a particular case provides a complete overview of the parameters AryaXAI is using for explainability. You can view the local features and the feature importance plot.

The feature importance plot displays the top 20 features, and you can select the ‘Show more’ tab to view all the features in your data that positively and negatively impact the prediction.

Note: The surrogate XAI model (parallel model) uses the features (or variables) configured in data settings to build the explainability model.

‍

Through SDK

For case-wise feature importance via SDK:


# feature importance plot
case.explainability_feature_importance()

‍

You can filter the list of cases by utilizing tags and search for specific cases using their unique identifiers.


project.cases(tag='Training')

‍

Additional commands:


# Case Decision
case_info.explainability_decision()

#Case Feature Importance
case_info.explainability_feature_importance()

‍

Raw data

The ‘Raw data’ tab displays the details of the data uploaded, where you can verify if the data upload was done correctly.

Through SDK

To fetch Raw Data of all features for a particular case via SDK, use the following prompt:


# raw data
case.explainability_raw_data()

Prediction Path

The Prediction path tab displays the path followed by the Tree-based models like XGBoost or LGBoost for a particular prediction. It represents the route taken through the decision trees, showcasing which features were evaluated and the decisions made at each node until the sample reaches a leaf and a prediction is generated.

To get the prediction path for a particular case via the SDK:


case_info.explainability_prediction_path()

‍

Retraining the XAI model

To retrain the explainability model, you can simply modify the data settings. This can be done by selecting the ‘Update config’ option present in ‘Data settings’. Whenever the settings are modified, the explainability model is retrained again. XAI model can be retrained as many times as needed to achieve the best correlation between model prediction and the model functioning.

‍

ML Explainability

Model performance monitoring

Basic concepts:

‍Baseline: Users can define the baseline basis on ‘Tag’ or segment of data based on ‘date’
‍Frequency: Users can define how frequently they want to calculate the monitoring metrics
‍Alerts frequency: Users can configure how frequently they want to be notified about the alerts

Model performance dashboard

The model performance dashboard lets you can analyze your model's performance over time or between model versions. This analysis is displayed across various parameters for the predicted and actual performance.

Through GUI

The model performance report will display various metrics like accuracy, precision, recall and also quality metrics.

Alerts and Monitors

To create new alerts, go to:

ML Monitoring (Main menu on left) > select ‘Monitoring’ (from the sub-tabs) > click ‘Create Alerts’

All the newly created and existing alerts are displayed on this dashboard, along with details of trigger creator, name, type and options.

Through SDK

To access the Model performance dashboard through SDK:


project.get_model_performance_dashboard()

‍

You can use the help function to get all parameters and payloads for the Model performance dashboard


help(project.get_model_performance_dashboard)

‍

To get the Model Performance of 'Active' Model through the AryaXAI SDK:


project.get_model_performance()

Model Performance Monitor

Through GUI

Be informed about your model performance proactively using 'monitors'.

AryaXAI - Setting Model performance monitors

‍Select model type: Classification/Regression. ‍
Select model performance metrics: You can define any of the following performance metrics - accuracy, f1, Auc-roc, precision and recall. ‍
Select the baseline and current: Use the tags to define the baseline and current. 'Current' is your production data if you are tracking drift in your production data. ‍
Select predicted &true label: Map the appropriate feature for 'Baseline predicted/true label' & 'Current predicted/true label'. ‍
Segmenting the baseline or current: You can use date features to further segment your baseline. You can also use 'Time period in days' to dynamically select the recent 'n' days as the current data. If you have added 'Time period in days', it'll use that value as the time period the day it calculated the drift as the end date.

‍

Tip: If you deployed multiple model versions, you can append the model prediction in the same dataset as new features instead of creating duplicate dataset copies. You can use these to track the model performance.

Alert Report

Notifications

If there is an identified drift, you'll get the alert for the same in both the web app and email at the specified frequency.

Web app alerts: Any alert triggered will be displayed as a notification on the top right corner. You can view all notifications from the tab and clear them.

Email Alerts: The admin of the workspace will get an email if there is an identified drift.

Through SDK

You can also use the SDK to create and manage monitoring triggers. To install the package, you can use the following functions:

To list all monitoring Triggers created:


# list monitoring triggers
project.monitoring_triggers()

‍

You can also use the help function to create monitoring triggers for Data Drift, Target Drift, and Model Performance using payload


help(project.create_monitoring_trigger)

‍

Additional functions:


# delete monitoring trigger
project.delete_monitoring_trigger('test trigger 5')

# Fetch details of executed Triggers
project.alerts()

Bias Monitoring

Bias monitoring plays a critical role in addressing and mitigating biases, ensuring that the system makes equitable and fair decisions across diverse groups of users or subjects.

Through GUI

To monitor your model for bias:

Select the Baseline tag, and the Baseline true and predicted labels
Select the feature to use from the dropdown
Select the Model type - Classification or Regression

Through SDK

You can also monitor bias in your models through the AryaXAI python package:


# bias monitoring dashboard
project.get_bias_monitoring_dashboard({
    "base_line_tag": ["eda"],
    "baseline_true_label": "charges",
    "baseline_pred_label": "charges",
    "model_type": "classification"
})

‍

Help function to get Bias monitoring dashboard:


help(project.get_bias_monitoring_dashboard)

Target drift monitoring

Basic concepts:

‍Baseline: Users can define the baseline basis on ‘Tag’ or segment of data based on ‘date’.
‍Frequency: Users can define how frequently they want to calculate the monitoring metrics
‍Alerts frequency: Users can configure how frequently they want to be notified about the alerts

‍

Target Drift has similar inputs to 'Data Drift'. But unlike 'Data drift', you need to define the Baseline and Current data parameters, along with the True label, predicted label and the model type.

Through GUI

The dashboard report provides a detailed analysis of the target distribution by feature.

Drift Metrics: These statistical tests are available to analyze data drifts, namely the Chi-square test, Jensen-Shannon distance, Kolmogorov-Smirnov (K-S) test, Population Stability index (Psi), and Z-test.

You can learn more about these tests in our wiki section.

Mixing multiple tags: If you want to merge data from different tags, you can simply select multiple tags in the segment(baseline/current).

Alerts and Monitors

To create new alerts, go to:

ML Monitoring (Main menu on left) > select ‘Monitoring’ (from the sub-tabs) > click ‘Create Alerts’

All the newly created and existing alerts are displayed on this dashboard, along with details of trigger creator, name, type and options.

Target Drift Monitor

You can not only track target drift but can get notified if there is an identified drift in your data. To set up a 'Target drift', select 'Target drift' under 'Monitor type', post which you can the specific details.

Select model type: Classification/Regression.
Select drift calculation metrics: You can choose drift calculation metrics from the list provided, and set the threshold for data drift and dataset drift (when the dataset itself was drifting).
Select the baseline and current: Use the tags to define the baseline and current. 'Current' is your production data if you are tracking drift in your production data.
Select Baseline/Current true label: Map the appropriate feature for 'Baseline true label' & 'Current true label'.
Segmenting the baseline or current: You can use date features to further segment your baseline. You can also 'Time period in days' to dynamically select the recent 'n' days as the current data. If you have added 'Time period in days', it'll use that value as the time period the day it calculated the drift as the end date.

‍

Alert Report

Notifications

If there is an identified drift, you'll get the alert for the same in both the web app and email at the specified frequency.

Web app alerts: Any alert triggered will be displayed as a notification on the top right corner. You can view all notifications from the tab and clear them.

Email Alerts: The admin of the workspace will get an email if there is an identified drift.

Through SDK

To fetch the default target drift dashboard, use the following command:


project.get_target_drift_dashboard()

‍

If you need to create a new dashboard:


project.get_target_drift_dashboard(payload)

You can use the help function to get all parameters and payloads


help(project.get_target_drift_dashboard)

Available Stat tests:

Data drift monitoring

Basic concepts:

‍Baseline: Users can define the baseline basis on ‘Tag’ or segment of data based on ‘date’.
‍Frequency: Users can define how frequently they want to calculate the monitoring metrics
‍Alerts frequency: Users can configure how frequently they want to be notified about the alerts

‍

Through GUI

You can learn more about these tests in our wiki section.

Mixing multiple tags: If you want to merge data from different tags, you can simply select multiple tags in the segment(baseline/current).

‍

Tip: If you want to see the drift in only one feature, you can simply select that feature under 'Features to select' and calculate the drift.

Alerts and Monitors

To create new alerts, go to:

ML Monitoring (Main menu on left) > select ‘Monitoring’ (from the sub-tabs) > click ‘Create Alerts’

All the newly created and existing alerts are displayed on this dashboard, along with details of trigger creator, name, type and options.

Data Drift Monitors:

Select drift calculation metrics: You can choose drift calculation metrics from the list provided, and set the threshold for data drift and dataset drift (when the dataset itself was drifting).

Select the baseline and current: Use the tags to define the baseline and current. 'Current' is your production data if you are tracking drift in your production data.

Select features: You can either create 1 monitor to track all the features in your data or you can select the specific feature for which you are tracking the drift.

Alert Report

Notifications

If there is an identified drift, you'll get the alert for the same in both the web app and email at the specified frequency.

Web app alerts: Any alert triggered will be displayed as a notification on the top right corner. You can view all notifications from the tab and clear them.

Email Alerts: The admin of the workspace will get an email if there is an identified drift.

Through SDK

Default dashboard:


#default 
project.get_data_drift_dashboard()

#make new dashboard
project.get_data_drift_dashboard({
    "base_line_tag": ["eda"],
    "current_tag": ["XGBoost_default_data"],
    "stat_test_name": "psi"
})

You can also use the help function to get all parameters and payloads:


help(project.get_data_drift_dashboard)

‍

Available Stat. tests:

ML Monitoring

Data addition & settings

Data addition from GUI

Upon accessing the project dashboard, the first thing you need to do is to upload the data. This can be the data used for training, testing, validation, production data or any other data that you used in your project.

While uploading the data for the first time (even if you want to use an API), you must first upload at least one sample data from the dashboard and define the data settings. The rest of the data can also be uploaded through the API.

‍

To start with this, select ‘Upload Data File’, which directs you to the data settings page. Select ‘Upload file’.

Here, to classify your data, you will see the ‘Upload Type’ dropdown, where you can specify the data type to 'Data' or 'Data description'. Next, you will see the ‘Upload Tag’ dropdown, where you can specify the data type - Training, Testing, Validation, or you can choose to add a custom tag as well.

Select the file to be uploaded.

‍Note: You can only upload one file at a time, and the file can only be in CSV format.

‍

Once the upload is complete, you will be directed to ‘Project Config’ to configure the details.

Project Config.

Project Config. is necessarily the high level details of the project. This configuration will be used for all further operations and cannot be changed once set.

Select the Project type - which can be a classification or regression problem.
Define the ‘Unique identifier’ - The identifier for every unique data point
Select the true label - The variable target you are trying to predict (Eg. for a data set from Real estate industry, the true label can be ‘Sale price’ of a house)
If your data has a predicted label, choose the label from the dropdown (This applies when you already have a model and you only want to evaluate the predictions of your model)
Select the features (data points) to be excluded such the XAI model uses the same features used in your model.

There might be multiple features within your project. You can exclude the features that might not be relevant to your project from the ‘Features exclude’ option. You can see all the features included and excluded on the right.

Note: Your data can have some duplicate unique identifiers, which can be dropped by selecting the checkbox.

‍

True and Predicated label

The predicted label is required if you want the XAI model to explain model the predictions. If the predicted label is not defined, AryaXAI will pick the true label to build the XAI model.

Once the above steps are completed, select ‘Submit’.

At this point, you can see the overview of the data submitted. The total data volume, Unique features (data points) and alerts are displayed.

Note: When defining data features, specifically the data settings, it should be noted that these become the base for explainable model training. The feature selection that is done here should align with the final features that have been used in the model.

‍

The ‘Features’ section displays the data type. The platform starts analyzing the data and creates an explainability model for you.

Note: Until the XAI model is not trained, the explanations (Feature importance) will show ‘nan’ as the values, you can upload any new file or open case view pages. Once the model is trained, the XAI model results can be seen in all these pages.

‍

Once you submit the Project Config, you will be directed to the 'Project Summary' page. This page displays 3 tabs - Summary, Data Diagnostics and Model Diagnostics.

‍

Data addition from API

First, Get the API Token for the Project. This is accessible at Workspace > Projects > Documentation.

The project token (and Client Id) is accessible only to the Admins of the particular project. You can refresh your API token through the ‘Refresh token’ option provided beside the Client Id.

Below this, the project URL for uploading the data is displayed. The Python script is present, which can be used directly in your compiler.

The header XAI token needs to be defined, whereas the Client Id and project name are automatically defined.

 
headers = 
{    
"x-api-token": your project access secret token 
} 
base_format = 
{
"client_id": test_user_arya,
"project_name": Risk-monitoring_FW6FSKJQRE
}

Next, prepare Data in Format of Dictionary (you can upload multiple data points in the list of dictionary format)

Define the unique identifier for the data:

 
 "unique_identifier":

‍

A single data set can be passed in string format. If multiple data sets are uploaded, you need to pass a list of unique identifiers.

Similarly, a single data point (with one unique identifier and 3-4 columns) can be directly passed through the API. However, for multiple data points, a list of unique identifiers and column needs to be created.

Finally, you will need to give the post request:

 
resp = requests.post(url,headers= headers json=base_format)

For every post request, data successful responses and acknowledgements are provided, so you are updated on the status.

‍

Data addition from SDK

To upload data, we need to pass the file path and Tag.

Note: If you are uploading data for the first time, you need to pass Config as well.

‍

Data can be uploaded to the project either directly with a file or by passing the Pandas DataFrame.

To configure the details in ‘Project config’ and upload data through our SDK, you can use the following commands:


config = {
            "project_type": "classification",  # The Prediction Type of your project (classification / regression)
            "unique_identifier": "Id", # unique identifier for your project
            "true_label": "SaleCondition", # Target label
            "pred_label": "", # Predicted value in case you have it
            "feature_exclude": [],  # feature you don't want Arya Xai surrogate model to use for modelling
        }

# Data is diffrentiated using Tag
Tag = 'Training'  # Data is diffrentiated using Tag

#To upload the data into the project. This will also build the initial ML model.
project.upload_data('file_path','tag', config)

‍

Once the data is uploaded, you can also view the files, and file info through SDK.


#Check the files that are uploaded in the project.
project.files()

‍

Additional functions:


#You can get the summary for the specific file: Missing values, Max/Min, Data type.
project.file_summary()

#To know all the settings: Data, Data Encoding & Model params
project.config()

project.all_tags()

‍

Additionally, you can also delete the uploaded file:


#project.delete_file('file_name')

‍

Project Summary:

As mentioned earlier, once the Project Config is complete, you will be directed to the 'Project Summary' page. This page displays 3 tabs - Summary, Data Diagnostics and Model Diagnostics.

Summary

The Summary tab displays:

- An Overview of total data volume and Unique features

- A data summary (Features) table, and

- Volume

Volume

This section displays the volume graph, which provides an overview of the data upload activity over time. You can investigate different parameters and plot the data activity, based on data label, feature name, date (of the feature name or date of creation), range and plot type, for writing codes with ease.

Data Summary table

Through GUI

The section provides you with a summary of data features and displays the data type. You can easily navigate to the different data tags by selecting them from the dropdown list on the right.

Note: If the feature table displays ‘NA’ under ‘Feature importance’, it means that the particular feature is not used in the explainability model. (This setting comes from project config. Mentioned in data settings).

Note: Using the ‘Refresh Data’ option will provide the most latest view of the data. The loading time will differ based on the data volume.

‍

Through SDK

To view the summary of data features and the data types through SDK:


# data summary
project.data_summary('tag')

# data diagnosis
project.data_diagnosis('tag')

‍

Data Diagnostics

Data is at core for building a good ML model. In this section, AryaXAI runs a full profile of all the datasets added. You can see these warnings and decide to include or not include any features.

The Data summary table in the Data Diagnostics tab displays an overview of the total data volume, unique features and warnings for any inconsistencies in the analytical data uploaded. These warnings range from missing data to high feature correlation, high cardinality, etc.

The details for Data observations and the warnings can be seen in the tables provided below ‘Overview’.

Additionally, for data drift diagnosis through SDK:


# data drift diagnosis
project.data_drift_diagnosis(['tag'],['tag'])

‍

Model Diagnostics

The ‘Model stability’ table in the Model Diagnostic tab displays the same model details as seen in the AutoML section.

‘Data Stability’, allows you to compare the data between two models for data drift overview.

Select the Baseline and Current tags for a detailed comparison, which displays the feature, drift detected, method, feature type, drift score, etc.

Data Setting

Here, the data upload details are displayed, from where you can upload data or delete uploaded data files.

Note: It is important to define the first file that is uploaded. This file which is uploaded in any of the categories (viz. training, testing, validation or custom), becomes the training data for the explainability model.

When defining data features, specifically the data settings, it should be noted that these become the base for explainable model training. The feature selection that is done here should align with the final features that have been used in the model.

Modify Data Settings

To modify the data settings, select ‘Update config’ option present in ‘Data settings’. Whenever the settings are modified, the explainability model is retrained again.

Whenever these data settings are updated, it triggers training for the XAI model.

Creating Workspace and project

To get started with AryaXAI, you need to create a workspace and a project in that work space. All user access controls are maintained at both Workspace and Project level.

Workspace

On accessing the platform, you can either set up a new workspace or access an existing workspace.

Creating a workspace

Through GUI

To set up a workspace, select ‘Add workspace’. Define a name for the workspace and submit. The workspace will be visible on the dashboard with details of the workspace owner, creation date and time.

You can also invite/add users in the workspace through the ‘Add User’ option and define their role for the workspace. The role can either be of an Owner, User or Manager. Each role has specific use access criteria as defined below. You can also revoke access through the workspace settings or modify the user role.

User role wise accessibility criteria:

Accessing an existing workspace

To access an already created workspace, the workspace owner has to send an invite through the ‘Add User’ option. This invitation to the workspace is shared to the invitee's email address provided.

The ‘Settings’ option under Actions column displays the workspace details. Here, the 'User list' tab displays the user's list for the particular workspace. This list shows all the users you have invited to the workspace and the status of their invitations. The 'Usage' tab shows the usage details, such as, User subscription details, number of projects, users, data points, etc. and the usage cap on each.

Start using workspace

To go into the workspace, select ‘Show’ under the ‘Actions’ column. You can create multiple projects within a workspace.

Project

As mentioned earlier, your workspace can have multiple projects. For example, if you created a fraud detection model, that can be one project.

Creating a project

To create a new project, select the ‘Add project’ option within the desired workspace. Define a name for the project and submit it. The project details will be visible on the dashboard with details of the project owner, creation date and time.

‍

Note: If you want a user to have access to a particular project and not the entire workspace, you can select the ‘Add user’ option under ‘Actions’.

‍

You can define the roles of new users as an Owner/ Admin, User or Manager.

Note: Workspace access overrides the project access. For e.g. If you have given a user ‘Owner’ access at the Workspace level and a ‘User’ access at the project level, the Owner access overrides the User access.

‍

User role wise accessibility criteria:

‍

Through SDK

Workspace

The XAI object instance has all the necessary methods to interact with your workspaces.


# creating a new workspapce name
workspace = aryaxai.create_workspace('workspace name')

# list all workspaces available
aryaxai.workspaces()

# select a workspace name
workspace = xai.workspace('sdk-testing')

The AryaXAI Python SDK also provides users with functionalities to edit, manage or delete workspaces


# rename workspace name
workspace.rename_workspace('new_workspace_name')

# Deleting Workspace
#workspace.delete_workspace()

# add user to workspace
workspace.add_user_to_workspace('email', 'role')

# remove user from workspace
workspace.remove_user_from_workspace('email')

# update user access for workspace
workspace.update_user_access_for_workspace('email','role')

‍

Project

With the AryaXAI Python SDK, users can efficiently handle project management tasks and carry out essential functions within those projects.



# create project name
project = workspace.create_project('project_name')

# list of projects
projects = workspace.projects()

# select a project name
project = workspace.project('project_name')

# delete project
project.delete_project()

# add user to project
project.add_user_to_project('email', 'role')

# remove user from project
project.remove_user_from_project('email')

# update user access for project
project.update_user_access_for_project('email','role')

# rename project
project.rename_project('new_project_name')

Introduction

Welcome to AryaXAI!

The ML Observability platform for mission-critical AI solutions

Introducing AryaXAI

With AryaXAI, Data science and ML teams can monitor their models in production as well as gain reliable & accurate explainability,

AryaXAI offers reliable & accurate explainability, offering evidence that can support regulatory diligence, managing AI uncertainty by providing advanced policy controls and ensuring consistency in production by monitoring data or model drift and alerting users with root cause analysis.

AryaXAI also acts as a common workflow and provides insights acceptable by all stakeholders - Data Science, IT, Risk, Operations and compliance teams, making the rollout and maintenance of AI/ML models seamless and clutter-free.

How to signup for AryaXAI

In just a few easy steps, users can sign up for AryaXAI with an invitation.

To get started, users who already have access to AryaXAI can invite others to sign up for the platform. This invitation is sent to the email address of the invitee.

From here, you can set up your account in 2 easy steps:

1. Setting your basic profile details and password
2. Setting your work profile and industry

Once these steps are completed, select ‘Finish’ to complete the account verification. A confirmation of account verification is sent to your inbox, through which your workspace can be accessed.

Through SDK

With our SDK, you can perform nearly every action available in the AryaXAI GUI.

Prerequisite:

Sign up and log in to AryaXAI using the steps mentioned above.
After logging in, generate an Access Token for your user account.
Set the environment variables ‘XAI_ACCESS_TOKEN’ with the generated value.

Once you've completed these steps, you're all set! Now, you can easily log in and start using the AryaXAI SDK:

Log in by importing the "xai" object instance from the "arya_xai" package.
Call the "login" method. This method automatically takes the access token value from the "XAI_ACCESS_TOKEN" environment variable and stores the JWT in the object instance. This means that all your future SDK operations will be authorized automatically, making it simple and hassle-free!

NOTE: You can create a maximum of 5 tokens. Each token can only be used in one place at a time.


from aryaxai import xai as aryaxai

## login() function authenticates user using token that can be generated in app.aryaxai.com/sdk

aryaxai.login()

Enter your Arya XAI Access Token: ··········
Authenticated successfully.

# See Notification you get across all workspaces and projects
aryaxai.get_notifications()

About AryaXAI