This document describes how to create and run a Batch job that automatically installs the Ops Agent. Install the Ops Agent to provide additional metrics in Cloud Monitoring about the performance of a job's resources. To learn more about using resource performance metrics for a job, see Monitor and optimize job resources by viewing metrics.
Before you begin
- If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
- If you haven't already done so for your project, enable the Cloud Monitoring and Cloud Logging APIs.
- To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:
  - To create a job:
    - Batch Job Editor (roles/batch.jobsEditor) on the project
    - Service Account User (roles/iam.serviceAccountUser) on the job's service account, which by default is the default Compute Engine service account
  - To view logs: Logs Viewer (roles/logging.viewer) on the project

  For more information about granting roles, see Manage access to projects, folders, and organizations.
  You might also be able to get the required permissions through custom roles or other predefined roles.
- Unless you are using the default configuration for the job's service account, ensure that it has the necessary permissions. To ensure that the job's service account has the necessary permissions to write Ops Agent metrics to Monitoring, ask your administrator to grant the job's service account the following IAM roles:
  - Monitoring Metric Writer (roles/monitoring.metricWriter) on the project
  - Logs Writer (roles/logging.logWriter) on the project
- Ensure that your planned job configuration meets the Ops Agent requirements.
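The service account roles listed above can be checked with a short Python sketch. It is an illustration, not an official tool: it assumes the project's IAM policy has been exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`), and the function name `missing_ops_agent_roles`, the sample policy, and the sample email address are all hypothetical. It only looks at direct project-level bindings, so roles inherited from a folder or organization, granted through a group, or covered by a custom role would be reported as missing.

```python
# Roles that the job's service account needs to write Ops Agent data,
# taken from the list above.
REQUIRED_ROLES = {
    "roles/monitoring.metricWriter",
    "roles/logging.logWriter",
}


def missing_ops_agent_roles(policy: dict, service_account_email: str) -> set:
    """Return the required roles not bound directly to the service account."""
    member = f"serviceAccount:{service_account_email}"
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return REQUIRED_ROLES - granted


# Hypothetical policy in the shape of an exported project IAM policy.
sample_policy = {
    "bindings": [
        {
            "role": "roles/monitoring.metricWriter",
            "members": [
                "serviceAccount:batch-sa@example-project.iam.gserviceaccount.com"
            ],
        }
    ]
}

print(sorted(missing_ops_agent_roles(
    sample_policy, "batch-sa@example-project.iam.gserviceaccount.com"
)))
# prints ['roles/logging.logWriter']
```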
Ops Agent requirements
To create and run a job that uses the Ops Agent, your job must comply with all of the following requirements:
- Ensure that the job's VMs use an operating system (OS) that the Ops Agent supports. For more information about the VM OS image for a job, see Overview of the OS environment for a job's VMs.
- If your job uses a non-default networking configuration or uses VPC Service Controls, ensure that the job meets the access requirements for the Ops Agent. For more information, see VMs without remote package access in the Google Cloud Observability documentation.
- Ensure that the job doesn't install a legacy Cloud Logging agent or Cloud Monitoring agent, for example through a custom image or instance template.
For more information about the features and requirements of the Ops Agent, see Ops Agent overview in the Google Cloud Observability documentation.
Create a job that automatically installs the Ops Agent
Use the Google Cloud CLI or the REST API to create a job whose JSON configuration file sets the installOpsAgent field to true in the allocationPolicy.instances field in the main body of the file:
"allocationPolicy": {
  "instances": [
    {
      "installOpsAgent": true
    }
  ]
}
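If you generate job configurations programmatically, the field placement described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the helper name `enable_ops_agent` and the sample configuration are hypothetical, and the helper simply sets the field on every entry of allocationPolicy.instances, adding an entry when none exists.

```python
import copy


def enable_ops_agent(job_config: dict) -> dict:
    """Return a copy of job_config with installOpsAgent set to true
    on every entry of allocationPolicy.instances."""
    updated = copy.deepcopy(job_config)
    allocation_policy = updated.setdefault("allocationPolicy", {})
    instances = allocation_policy.setdefault("instances", [])
    if not instances:
        # No instances entry yet: add one that only carries the field.
        instances.append({})
    for entry in instances:
        entry["installOpsAgent"] = True
    return updated


# Hypothetical configuration that has no allocationPolicy yet.
base_config = {"taskGroups": [{"taskCount": 1}]}
result = enable_ops_agent(base_config)
print(result["allocationPolicy"])
# prints {'instances': [{'installOpsAgent': True}]}
```

The helper returns a copy rather than modifying its argument, so the original configuration can still be used for jobs that don't need the Ops Agent.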
For example, a job that automatically installs the Ops Agent can have a JSON configuration file that is similar to the following:
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "echo Hello World! This is task $BATCH_TASK_INDEX."
            }
          }
        ]
      },
      "taskCount": 3
    }
  ],
  "allocationPolicy": {
    "instances": [
      {
        "installOpsAgent": true
      }
    ]
  },
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  }
}
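As one way to use a configuration like this, the following Python sketch writes it to a file, reloads it to confirm that it parses as strict JSON, and assembles a `gcloud batch jobs submit` command for it. The job name `example-job`, the location `us-central1`, the file name, and the helper `submit_command` are assumptions made for illustration; the sketch only prints the command and doesn't run it.

```python
import json
import tempfile
from pathlib import Path

# The example configuration from above, expressed as a Python dict.
job_config = {
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello World! This is task $BATCH_TASK_INDEX."
                        }
                    }
                ]
            },
            "taskCount": 3,
        }
    ],
    "allocationPolicy": {"instances": [{"installOpsAgent": True}]},
    "logsPolicy": {"destination": "CLOUD_LOGGING"},
}


def submit_command(job_name: str, location: str, config_path: Path) -> list:
    """Build the argument list for submitting the job with the gcloud CLI."""
    return [
        "gcloud", "batch", "jobs", "submit", job_name,
        "--location", location,
        "--config", str(config_path),
    ]


# Write the configuration to a temporary file (hypothetical file name).
config_path = Path(tempfile.mkdtemp()) / "hello-world-ops-agent.json"
config_path.write_text(json.dumps(job_config, indent=2))

# Reload the file; json.loads rejects errors such as trailing commas.
reloaded = json.loads(config_path.read_text())

command = submit_command("example-job", "us-central1", config_path)
print(" ".join(command))
```

To actually submit the job, you could pass the printed command to a shell or run the argument list with `subprocess.run`, provided that the gcloud CLI is installed and authenticated.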
After the job's VMs start running, you can view the Ops Agent metrics in the same way as any other resource metrics. For more information, see Monitor and optimize job resources by viewing metrics.
What's next
- If you have issues creating or running a job, see Troubleshooting.
- View jobs and tasks.
- Learn about more job creation options.