In Vertex AI Feature Store (Legacy), you can monitor and set alerts on featurestores and features. For example, an operations team might monitor a featurestore to track its CPU utilization. Feature owners, such as data scientists, might monitor feature values to detect drift over time.
The methods for monitoring featurestores and features are described in the following sections:
Featurestore monitoring
Vertex AI Feature Store (Legacy) reports metrics about your featurestore to Cloud Monitoring such as the CPU load, storage capacity, and request latencies. Vertex AI collects and reports these metrics for you. You do not need to configure or enable featurestore monitoring.
To configure thresholds and notifications, use Cloud Monitoring. For example, you can set an alert if the average CPU load exceeds 70%, which might require you to increase the number of featurestore nodes.
You can also view featurestore metrics in the Vertex AI section of the Google Cloud console to see trends over time. For some charts, the console shows aggregated or calculated values to make the information easier to consume. You can always view the raw data in Cloud Monitoring.
For more information, see Vertex AI Feature Store (Legacy) monitoring metrics on the Vertex AI Cloud Monitoring page.
Feature value monitoring
Feature value monitoring enables you to track how much a feature's value distribution changes in a featurestore. The following types of feature value monitoring are supported:
Snapshot Analysis: Vertex AI Feature Store (Legacy) takes periodic snapshots of your feature values. Over time, as you ingest more data, you might notice the distribution of your feature values change. This change indicates that any models using those features might need to be retrained. You can specify a threshold so that anomalies are logged in the Cloud Logging console whenever the distribution deviation crosses the threshold.
For datasets exceeding 5 million entity IDs, Vertex AI Feature Store (Legacy) generates snapshots based on 5 million randomly selected entity IDs within the time window that you specified as the number of staleness days.
Import Feature Analysis: Each ImportFeatureValues operation generates distribution statistics for the values ingested into Vertex AI Feature Store (Legacy). You can choose to detect anomalies by comparing your distribution statistics with the previously imported feature value distribution or, if enabled, the snapshot distribution.
For datasets exceeding 5 million instances, Vertex AI Feature Store (Legacy) generates snapshots based on randomly selected data, as follows:
- If the number of instances within the ingested dataset exceeds 5 million but does not exceed 50 million, then the snapshot is generated based on 5 million randomly selected instances.
- If the number of instances within the ingested dataset exceeds 50 million, then the snapshot is generated based on 10% of the instances, selected randomly.
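The sampling rules above can be sketched as two small functions. This is only an illustration of the documented thresholds, not the service's implementation; the function names are hypothetical, and rounding the 10% sample down to a whole number is an assumption.

```python
FIVE_MILLION = 5_000_000
FIFTY_MILLION = 50_000_000


def snapshot_sample_size(entity_id_count: int) -> int:
    """Snapshot analysis: at most 5 million randomly selected entity IDs."""
    return min(entity_id_count, FIVE_MILLION)


def import_sample_size(instance_count: int) -> int:
    """Import feature analysis: sample size for an ingested dataset."""
    if instance_count <= FIVE_MILLION:
        # Small datasets are analyzed in full.
        return instance_count
    if instance_count <= FIFTY_MILLION:
        # Between 5 and 50 million instances: 5 million random instances.
        return FIVE_MILLION
    # More than 50 million instances: 10% of the instances
    # (rounding down is an assumption made for this sketch).
    return instance_count // 10


if __name__ == "__main__":
    for count in (1_000_000, 20_000_000, 200_000_000):
        print(count, "->", import_sample_size(count))
```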
For example, consider a feature that collects prices of recently sold homes and then feeds the values into a model for predicting the price of a house. The prices of recently sold homes might drift significantly over time, or the batch of imported values might contain data that deviates significantly from the training data. Vertex AI Feature Store (Legacy) alerts you of this change. You can then retrain your model to use the latest information.
Set a monitoring configuration
To start monitoring, you can define a monitoring configuration on an entity type, which enables monitoring for all features of the following types:
BOOL
STRING
DOUBLE
INT64
You can set the monitoring configuration when you create an entity type. You can also choose to opt out of monitoring for specific features by setting the disableMonitoring property.
The entity type monitoring configuration specifies the following:
- Whether to enable monitoring. Monitoring is disabled by default.
- Thresholds used to detect anomalies. Default threshold is 0.3.
- Interval between snapshots and lookback window (for snapshot analysis). The default lookback window is 21 days.
- Whether to enable import feature analysis. Default is disabled.
For more information, see the FeaturestoreMonitoringConfig type in the API reference.
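As a rough sketch, the defaults listed above can be captured in a small Python data class that produces a request body with the same field names as the REST example on this page. The class and attribute names are hypothetical, applying the 0.3 default to both thresholds is an assumption, and the accepted strings for the import feature analysis state and baseline should be taken from the API reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitoringSettings:
    """Hypothetical holder for entity type monitoring settings."""

    monitoring_interval_days: int             # days between snapshots
    staleness_days: int = 21                  # documented default lookback window
    numerical_threshold: float = 0.3          # documented default threshold
    categorical_threshold: float = 0.3        # assumed to share the same default
    import_analysis_state: Optional[str] = None     # None: leave disabled (default)
    import_analysis_baseline: Optional[str] = None

    def to_request_body(self) -> dict:
        """Build a body shaped like the REST example on this page."""
        body = {
            "monitoringConfig": {
                "snapshotAnalysis": {
                    "monitoringIntervalDays": self.monitoring_interval_days,
                    "stalenessDays": self.staleness_days,
                }
            },
            "numericalThresholdConfig": {"value": self.numerical_threshold},
            "categoricalThresholdConfig": {"value": self.categorical_threshold},
        }
        # Only include import feature analysis when a state was provided.
        if self.import_analysis_state is not None:
            analysis = {"state": self.import_analysis_state}
            if self.import_analysis_baseline is not None:
                analysis["anomalyDetectionBaseline"] = self.import_analysis_baseline
            body["importFeatureAnalysis"] = analysis
        return body
```

For instance, `MonitoringSettings(monitoring_interval_days=1).to_request_body()` yields a body that uses the default lookback window and thresholds and omits import feature analysis.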
Create an entity type with monitoring enabled
The following example creates an entity type, where feature monitoring is enabled:
Web UI
Only snapshot analysis is supported from the UI.
- In the Vertex AI section of the Google Cloud console, go to the Features page.
- Select a region from the Region drop-down list.
- Click Create Entity Type.
- Flip the Feature monitoring section to Enabled.
- Enter the number of days between snapshots in the Monitoring time interval field.
The monitoring job for an entity type or feature runs at the nearest round hour following the time of the day when you enable monitoring for the entity type or feature. For example, if you enable monitoring at 10:30 PM on Monday and specify two days as the monitoring time interval, the first monitoring job runs at 11 PM on Wednesday.
- Enter the number of days to look back for each snapshot in the Monitoring lookback window field.
- Enter the number for the threshold used for detecting anomalies for numerical features in the Numerical alerting threshold field.
- Enter the number for the threshold used for detecting anomalies for categorical features in this EntityType in the Categorical alerting threshold field.
For more information about detecting feature value anomalies, see View feature value anomalies.
- Click Create.
- In the features table, click an entity type.
- To add new features to the entity, click Add Features.
- To opt out of monitoring for a specific feature, toggle off Enable monitoring.
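The scheduling rule described in the steps above (round up to the next full hour, then wait one monitoring interval) can be sketched in Python. The function name is hypothetical, and treating a time that is already on the hour as unchanged is an assumption; the documentation only describes the 10:30 PM case.

```python
from datetime import datetime, timedelta


def first_monitoring_run(enabled_at: datetime, interval_days: int) -> datetime:
    """Estimate when the first monitoring job runs."""
    # Round up to the next round hour. A time that is already on the hour
    # is left unchanged (assumption for this sketch).
    rounded = enabled_at.replace(minute=0, second=0, microsecond=0)
    if rounded < enabled_at:
        rounded += timedelta(hours=1)
    # The first job runs one monitoring interval after that hour.
    return rounded + timedelta(days=interval_days)


if __name__ == "__main__":
    # Monday, January 1, 2024 at 10:30 PM with a two-day interval.
    run = first_monitoring_run(datetime(2024, 1, 1, 22, 30), interval_days=2)
    print(run.isoformat())
```

With these inputs the result is Wednesday, January 3, 2024 at 11 PM, which matches the example in the steps above.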
REST
To create an entity type, send a POST request by using the entityTypes.create method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- DURATION: The interval duration between snapshots in days.
- STALENESS_DAYS: The number of days to look back when taking snapshots.
- NUMERICAL_THRESHOLD_VALUE: The threshold to detect anomalies for numerical features under this entity type. Statistics deviation is calculated by the Jensen-Shannon divergence.
- CATEGORICAL_THRESHOLD_VALUE: The threshold to detect anomalies for categorical features under this entity type. Statistics deviation is calculated by the L-Infinity distance.
- IMPORT_FEATURE_ANALYSIS_STATE: The state indicating whether to enable import feature analysis.
- IMPORT_FEATURE_ANALYSIS_BASELINE: The baseline for import feature analysis if enabled.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes?entityTypeId=ENTITY_TYPE_ID
Request JSON body:
{
  "monitoringConfig": {
    "snapshotAnalysis": {
      "monitoringIntervalDays": "DURATION",
      "stalenessDays": "STALENESS_DAYS"
    }
  },
  "numericalThresholdConfig": {
    "value": "NUMERICAL_THRESHOLD_VALUE"
  },
  "categoricalThresholdConfig": {
    "value": "CATEGORICAL_THRESHOLD_VALUE"
  },
  "importFeatureAnalysis": {
    "state": "IMPORT_FEATURE_ANALYSIS_STATE",
    "anomalyDetectionBaseline": "IMPORT_FEATURE_ANALYSIS_BASELINE"
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes?entityTypeId=ENTITY_TYPE_ID"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes?entityTypeId=ENTITY_TYPE_ID" | Select-Object -Expand Content
You should see output similar to the following. You can use the OPERATION_ID in the response to get the status of the operation.
{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.ui.CreateEntityTypeOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-04-29T20:29:05.206525Z",
      "updateTime": "2022-04-29T20:29:05.206525Z"
    }
  }
}
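If you prefer to script the same call, the following minimal sketch assembles the POST request with Python's standard library. It only builds the request object; the helper name is hypothetical, and the access token is assumed to come from `gcloud auth print-access-token`, as in the curl example.

```python
import json
import urllib.request


def build_create_entity_type_request(
    location_id: str,
    project_id: str,
    featurestore_id: str,
    entity_type_id: str,
    body: dict,
    access_token: str,
) -> urllib.request.Request:
    """Assemble (but do not send) the entityTypes.create request."""
    url = (
        f"https://{location_id}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location_id}"
        f"/featurestores/{featurestore_id}/entityTypes"
        f"?entityTypeId={entity_type_id}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body is the operation JSON shown above.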
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Additional languages
To learn how to install and use the Vertex AI SDK for Python, see Use the Vertex AI SDK for Python. For more information, see the Vertex AI SDK for Python API reference documentation.
Opt out of monitoring for a new feature
The following example creates a new feature with monitoring turned off:
REST
To create a feature, send a POST request by using the features.create method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- FEATURE_ID: ID of the feature.
- VALUE_TYPE: The value type of the feature.
- DISABLE_MONITORING: Set to true to explicitly opt out of monitoring.
HTTP method and URL:
POST https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features?featureId=FEATURE_ID
Request JSON body:
{ "disableMonitoring": "DISABLE_MONITORING", "valueType": "VALUE_TYPE" }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features?featureId=FEATURE_ID"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features?featureId=FEATURE_ID" | Select-Object -Expand Content
You should see output similar to the following. You can use the OPERATION_ID in the response to get the status of the operation.
{
  "name": "projects/PROJECT_ID/locations/LOCATION_ID/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.aiplatform.ui.CreateFeatureOperationMetadata",
    "genericMetadata": {
      "createTime": "2022-04-29T20:29:05.206525Z",
      "updateTime": "2022-04-29T20:29:05.206525Z"
    }
  }
}
Update monitoring configuration
You can set the monitoring configuration when updating an entity type. You can also choose to opt out of monitoring for specific features by setting the disableMonitoring property.
Update monitoring configuration for entity type and features
The following example updates the monitoring configuration for an existing entity type and specific features for that entity type:
Web UI
Only snapshot analysis is supported from the UI.
- In the Vertex AI section of the Google Cloud console, go to the Features page.
- Select a region from the Region drop-down list.
- In the features table, view the Entity type column to find the entity type to update.
- Click the name of the entity type to view the entity details page.
- From the action bar, click Edit Info.
- In Monitoring time interval, enter the number of days between snapshots.
The monitoring job for an entity type or feature runs at the nearest round hour following the time of the day when you enable monitoring for the entity type or feature. For example, if you enable monitoring at 10:30 PM on Monday and specify two days as the monitoring time interval, the first monitoring job runs at 11 PM on Wednesday.
- Click Update.
- Similarly, in the features table, view the Features column to find the feature to update.
- Click the feature name to view the details page.
- From the action bar, click Edit Info.
- To opt out of monitoring for a specific feature, toggle off Monitoring enabled.
REST
To update an entity type, send a PATCH request by using the entityTypes.patch method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- DURATION_IN_DAYS: The interval duration between snapshots in days.
- STALENESS_DAYS: The number of days to look back when taking snapshots.
- NUMERICAL_THRESHOLD_VALUE: The threshold to detect anomalies for numerical features under this entity type. Statistics deviation is calculated by the Jensen-Shannon divergence.
- CATEGORICAL_THRESHOLD_VALUE: The threshold to detect anomalies for categorical features under this entity type. Statistics deviation is calculated by the L-Infinity distance.
- IMPORT_FEATURE_ANALYSIS_STATE: The state indicating whether to enable import feature analysis.
- IMPORT_FEATURE_ANALYSIS_BASELINE: The baseline for import feature analysis if enabled.
HTTP method and URL:
PATCH https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID
Request JSON body:
{
  "monitoringConfig": {
    "snapshotAnalysis": {
      "monitoringIntervalDays": "DURATION_IN_DAYS",
      "stalenessDays": "STALENESS_DAYS"
    }
  },
  "numericalThresholdConfig": {
    "value": "NUMERICAL_THRESHOLD_VALUE"
  },
  "categoricalThresholdConfig": {
    "value": "CATEGORICAL_THRESHOLD_VALUE"
  },
  "importFeatureAnalysis": {
    "state": "IMPORT_FEATURE_ANALYSIS_STATE",
    "anomalyDetectionBaseline": "IMPORT_FEATURE_ANALYSIS_BASELINE"
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID",
  "createTime": "2021-07-22T23:18:31.339972Z",
  "updateTime": "2021-07-29T22:24:40.221821Z",
  "etag": "AMEw9yPGDpwUwHx39gIDIg5mTQz65GMhnYHRzRslVPonm1g8xTnsTC5YUibmWo2MIuI=",
  "monitoringConfig": {
    "snapshotAnalysis": {
      "monitoringIntervalDays": "DURATION_IN_DAYS",
      "stalenessDays": "STALENESS_DAYS"
    }
  },
  "numericalThresholdConfig": {
    "value": "NUMERICAL_THRESHOLD_VALUE"
  },
  "categoricalThresholdConfig": {
    "value": "CATEGORICAL_THRESHOLD_VALUE"
  },
  "importFeatureAnalysis": {
    "state": "IMPORT_FEATURE_ANALYSIS_STATE",
    "anomalyDetectionBaseline": "IMPORT_FEATURE_ANALYSIS_BASELINE"
  }
}
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Opt out of monitoring for a feature
The following example turns off monitoring for an existing feature:
REST
To update a feature, send a PATCH request by using the features.patch method.
Before using any of the request data, make the following replacements:
- LOCATION_ID: Region where the featurestore is located, such as us-central1.
- PROJECT_ID: Your project ID.
- FEATURESTORE_ID: ID of the featurestore.
- ENTITY_TYPE_ID: ID of the entity type.
- FEATURE_ID: ID of the feature to update.
- DISABLE_MONITORING: Set to true to explicitly opt out of monitoring.
HTTP method and URL:
PATCH https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features/FEATURE_ID
Request JSON body:
{ "disableMonitoring": "DISABLE_MONITORING" }
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features/FEATURE_ID"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features/FEATURE_ID" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_NUMBER/locations/LOCATION_ID/featurestores/FEATURESTORE_ID/entityTypes/ENTITY_TYPE_ID/features/FEATURE_ID",
  "valueType": "FEATURE_VALUE_TYPE",
  "createTime": "2021-07-22T23:18:31.339972Z",
  "updateTime": "2021-07-29T22:24:40.221821Z",
  "etag": "AMEw9yPGDpwUwHx39gIDIg5mTQz65GMhnYHRzRslVPonm1g8xTnsTC5YUibmWo2MIuI=",
  "disableMonitoring": "DISABLE_MONITORING"
}
View feature value distributions
Use the Google Cloud console to view the distribution of feature values over time.
Web UI
In the Vertex AI section of the Google Cloud console, go to the Features page.
Select a region from the Region drop-down list.
To view the feature value distributions for all features of an entity type, in the Entity type column, click the entity type.
To view feature value distribution metrics for a feature:
In the Feature column, click the feature.
Click the Metrics tab to view the feature value distribution metrics.
View feature value anomalies
If the feature value distribution deviates beyond the specified threshold in a monitoring pipeline, it's considered an anomaly. There are two types of anomalies—training-serving skew and drift. To calculate the deviation, Vertex AI compares the latest feature values in production with a baseline.
To detect training-serving skew, Vertex AI compares the latest feature values in production with the statistical distribution of feature values in the training data. In this case, the statistical distribution of feature values in the training data is considered as the baseline distribution. Learn more about training-serving skew.
To detect drift, Vertex AI compares the latest feature values in production with the statistical distribution of feature values from the most recent monitoring run that occurred at least one hour ago. In this case, the statistical distribution of feature values from the most recent monitoring run is considered as the baseline distribution. Learn more about drift.
In both cases, the baseline distribution is compared to the latest feature values in production to calculate a distance score.
For categorical features, the distance score is calculated using the L-infinity distance. In this case, if the distance score exceeds the threshold you specify in the Categorical alerting threshold field, it's identified as an anomaly.
For numerical features, the distance score is calculated using the Jensen-Shannon divergence. In this case, if the distance score exceeds the threshold you specify in the Numerical alerting threshold field, it's identified as an anomaly.
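To make the two distance scores concrete, here is a minimal Python sketch of both measures over already-normalized distributions. It illustrates the formulas only and is not the service's implementation: how Vertex AI bins numerical values is not described here, so the numerical inputs are assumed to be pre-binned histograms, and the base-2 logarithm (which bounds the divergence between 0 and 1) is also an assumption. All function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def l_infinity_distance(baseline: Dict[str, float], current: Dict[str, float]) -> float:
    """Largest absolute difference in probability across all categories."""
    categories = set(baseline) | set(current)
    return max(
        (abs(baseline.get(c, 0.0) - current.get(c, 0.0)) for c in categories),
        default=0.0,
    )


def jensen_shannon_divergence(baseline: Sequence[float], current: Sequence[float]) -> float:
    """Jensen-Shannon divergence between two histograms over the same bins."""
    if len(baseline) != len(current):
        raise ValueError("Both histograms must use the same bins.")

    def kl(p: Sequence[float], m: Sequence[float]) -> float:
        # Kullback-Leibler divergence; bins with zero probability contribute 0.
        return sum(pi * math.log2(pi / mi) for pi, mi in zip(p, m) if pi > 0)

    mixture = [(p + q) / 2 for p, q in zip(baseline, current)]
    return 0.5 * kl(baseline, mixture) + 0.5 * kl(current, mixture)


def is_anomaly(distance_score: float, threshold: float) -> bool:
    """A score above the alerting threshold is treated as an anomaly."""
    return distance_score > threshold


if __name__ == "__main__":
    categorical = l_infinity_distance({"a": 0.5, "b": 0.5}, {"a": 0.8, "b": 0.2})
    numerical = jensen_shannon_divergence([0.25, 0.75], [0.75, 0.25])
    print(categorical, is_anomaly(categorical, 0.3))
    print(numerical, is_anomaly(numerical, 0.3))
```

Identical distributions score 0 under both measures, and two histograms with no overlapping bins reach the base-2 maximum of 1.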
In either case, the anomaly might be a training-serving skew or a drift, depending on the baseline distribution used to calculate the distance score. An anomaly log is written to Cloud Logging with the log name featurestore_log. You can sync the logs to any downstream service Cloud Logging supports, such as Pub/Sub.
For more information about setting the alert thresholds, see Create an entity type with monitoring enabled.
Example query for all anomalies generated for a particular featurestore
logName="projects/model-monitoring-demo/logs/aiplatform.googleapis.com%2Ffeaturestore_log"
resource.labels.resource_container=<project_number>
resource.labels.featurestore_id=<featurestore_id>
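If you generate this query for several featurestores, a small helper such as the following can fill in the placeholders. It is a sketch that only formats the three filter lines, using the log name as it appears in the anomaly log entry; the function name is hypothetical, and quoting the label values is a stylistic choice made here.

```python
def build_anomaly_log_filter(project_id: str, project_number: str, featurestore_id: str) -> str:
    """Return a Cloud Logging filter for one featurestore's anomaly logs."""
    return "\n".join(
        [
            # The slash in the log ID is URL-encoded as %2F.
            f'logName="projects/{project_id}/logs/aiplatform.googleapis.com%2Ffeaturestore_log"',
            f'resource.labels.resource_container="{project_number}"',
            f'resource.labels.featurestore_id="{featurestore_id}"',
        ]
    )


if __name__ == "__main__":
    print(build_anomaly_log_filter("model-monitoring-demo", "123456789", "my_featurestore"))
```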
Example of an anomaly log entry
{
"insertId": "ktbx5jf7vdn7b",
"jsonPayload": {
"threshold": 0.001,
"featureName": "projects/<project_number>/locations/us-central1/featurestores/<featurestore_id>/entityTypes/<entity_type_id>/features/<feature_id>",
"deviation": 1,
"@type": "type.googleapis.com/google.cloud.aiplatform.logging.FeatureAnomalyLogEntry",
"objective": "Featurestore Monitoring Snapshot Drift Anomaly"
},
"resource": {
"type": "aiplatform.googleapis.com/Featurestore",
"labels": {
"resource_container": "<project_number>",
"location": "us-central1",
"featurestore_id": "<featurestore_id>"
}
},
"timestamp": "2022-02-06T00:54:06.455501Z",
"severity": "WARNING",
"logName": "projects/model-monitoring-demo/logs/aiplatform.googleapis.com%2Ffeaturestore_log",
"receiveTimestamp": "2022-02-06T00:54:06.476107155Z"
}
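When anomaly entries are exported to a downstream service, you might want to reduce each one to the fields you act on. The sketch below assumes the entry has already been parsed into a Python dictionary with the same keys as the example above; the function name and the shape of the returned summary are hypothetical.

```python
def summarize_anomaly(entry: dict) -> dict:
    """Extract the key fields from a parsed anomaly log entry."""
    payload = entry["jsonPayload"]
    feature_name = payload["featureName"]
    return {
        # The feature ID is the last segment of the feature resource name.
        "feature_id": feature_name.rsplit("/", 1)[-1],
        "objective": payload["objective"],
        "deviation": payload["deviation"],
        "threshold": payload["threshold"],
        "exceeds_threshold": payload["deviation"] > payload["threshold"],
        "timestamp": entry["timestamp"],
    }


if __name__ == "__main__":
    # Illustrative entry with made-up IDs, shaped like the example above.
    sample = {
        "jsonPayload": {
            "threshold": 0.001,
            "featureName": "projects/123/locations/us-central1/featurestores/fs/entityTypes/users/features/age",
            "deviation": 1,
            "objective": "Featurestore Monitoring Snapshot Drift Anomaly",
        },
        "timestamp": "2022-02-06T00:54:06.455501Z",
    }
    print(summarize_anomaly(sample))
```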
Monitor offline storage write errors for streaming ingestion
Use the Google Cloud console to monitor write errors to the offline storage during streaming ingestion.
View metrics for streaming ingestion to offline storage
You can monitor the Offline storage write for streaming write metric for Vertex AI Feature Store (Legacy) in the Metrics Explorer.
Web UI
In the Google Cloud console, go to the Metrics Explorer:
In the toolbar, select the Explorer tab.
In the Configuration tab, specify the data to appear on the chart:
Resource & Metric: Select the metric Vertex AI Feature Store - Offline storage write for streaming write.
Group by: Select error_code.
Minimum alignment period: Specifies the minimum time interval for aligning the data in the chart.
After you update these fields, the chart displays the offline storage write errors for the various error codes.
After you generate the chart, you can add it to your custom dashboard. For more information, see Save a chart for future reference.
View Vertex AI Feature Store (Legacy) logs
You can view the log entries for your featurestore, including logs generated during offline store write errors, in the Logs Explorer.
Web UI
In the Google Cloud console, go to the Logs Explorer:
In the Query builder, add the following query parameters and then click Run query:
- Resource: Select Vertex AI Feature Store.
- Log name: Under Vertex AI API, select aiplatform.googleapis.com/featurestore_log.