This page shows you how to improve workload startup latency by using secondary boot disks in Google Kubernetes Engine (GKE) to preload data or container images on new nodes. This enables workloads to achieve a fast cold start and to improve the overall utilization of provisioned resources.
Before reading this page, ensure that you're familiar with Google Cloud, Kubernetes, containers, YAML, containerd runtime, and the Google Cloud CLI.
Overview
Starting in GKE version 1.28.3-gke.1067000 in Standard clusters and in GKE version 1.30.1-gke.1329000 in Autopilot clusters, you can configure the node pool with secondary boot disks. You can tell GKE to provision the nodes and preload them with data, such as a machine learning (ML) model, or a container image. Using preloaded container images or data in a secondary disk has the following benefits for your workloads:
- Reduced latency when pulling large container images, or downloading data
- Faster autoscaling
- Quicker recovery from disruptions like maintenance events and system errors
The following sections describe how to configure the secondary boot disk in GKE Autopilot and Standard clusters.
How secondary boot disks work
Your workload can start more quickly by using the preloaded container image or data on secondary boot disks. Secondary boot disks have the following characteristics:
- Secondary boot disks are Persistent Disks which are backed by distributed block storage.
- The Persistent Disk is instantiated from disk images that you create ahead of time.
- For scalability reasons, each node gets its own Persistent Disk instance created from the disk image. These Persistent Disk instances are deleted when the node is deleted.
- If the disk image is already in use in the zone, the creation time of all subsequent disks created from the same disk image will be lower.
- The secondary boot disk uses the same disk type as the node boot disk.
- The size of the secondary boot disk is determined by the size of the disk image.
Adding secondary boot disks to your node pools does not normally increase the node provisioning time. GKE provisions secondary boot disks from the disk image in parallel with the node provisioning process.
To support preloaded container images, GKE extends the containerd runtime with plugins that read the container images from secondary boot disks. Container images that share base layers reuse the preloaded copies of those layers.
Preload large base layers into the secondary boot disk, while the small upper layers can be pulled from the container registry.
Before you begin
Before you start, make sure you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
- Enable the Container File System API.
Requirements
The following requirements apply to using a secondary boot disk:
- Your clusters are running GKE version 1.28.3-gke.1067000 or later in GKE Standard, or version 1.30.1-gke.1329000 or later in GKE Autopilot.
- When you modify the disk image, you must create a new node pool. Updating the disk image on existing nodes is not supported.
- Image streaming must be enabled to use the secondary boot disk feature.
- Use the Container-Optimized OS with a containerd node image. Autopilot nodes use this node image by default.
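The version requirement above can be checked programmatically before you create a node pool. The following is a minimal sketch, not an official tool: it assumes that GKE version strings always have the form MAJOR.MINOR.PATCH-gke.BUILD and that comparing the four numbers in order reflects release order. The function names are hypothetical.

```python
import re

# Minimum versions stated in the requirements on this page.
MIN_STANDARD = "1.28.3-gke.1067000"
MIN_AUTOPILOT = "1.30.1-gke.1329000"

# Assumption: versions look like MAJOR.MINOR.PATCH-gke.BUILD.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(version):
    """Split a GKE version string into a tuple of four integers."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized GKE version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_secondary_boot_disk(version, autopilot=False):
    """Return True if the version meets the minimum for the cluster mode."""
    minimum = MIN_AUTOPILOT if autopilot else MIN_STANDARD
    # Tuples compare element by element, so this is a numeric comparison.
    return parse_gke_version(version) >= parse_gke_version(minimum)
```

Comparing parsed integers avoids the pitfalls of comparing version strings lexically, where "1.9" would sort after "1.28".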
Prepare the disk image with data ready during build time or with preloaded container images. Ensure that your cluster has access to the disk image to load onto the nodes.
Best practice: Automate disk image creation in a CI/CD pipeline.
Limitations
Secondary boot disks have the following limitations:
- You can't update secondary boot disks for existing nodes. To attach a new disk image, create a new node pool.
- You can't use secondary boot disks to preload data in GKE Autopilot clusters.
Pricing
When you create node pools with secondary boot disks, GKE attaches a Persistent Disk to each node within the node pool. Persistent Disks are billed based on Compute Engine disk pricing.
Prepare the secondary boot disk image
To prepare the secondary boot disk image, follow the Images instructions to preload container images or the Data instructions to preload data:
Images
GKE provides a tool called gke-disk-image-builder to create a virtual machine (VM), pull the container images on a disk, and then create a disk image from that disk.
To create a disk image with multiple preloaded container images, complete the following steps:
- Create a Cloud Storage bucket to store the execution logs of gke-disk-image-builder.
- Create a disk image with gke-disk-image-builder.
go run ./cli \
--project-name=PROJECT_ID \
--image-name=DISK_IMAGE_NAME \
--zone=LOCATION \
--gcs-path=gs://LOG_BUCKET_NAME \
--disk-size-gb=10 \
--container-image=docker.io/library/python:latest \
--container-image=docker.io/library/nginx:latest
Replace the following:
- PROJECT_ID: the name of your Google Cloud project.
- DISK_IMAGE_NAME: the name of the disk image. For example, nginx-python-image.
- LOCATION: the cluster location.
- LOG_BUCKET_NAME: the name of the Cloud Storage bucket to store the execution logs. For example, gke-secondary-disk-image-logs/.
When you create a disk image with gke-disk-image-builder, Google Cloud creates multiple resources to complete the process (for example, a VM instance, a temporary disk, and a persistent disk). After its execution, the image builder cleans up all the resources, except the disk image that you created.
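If you automate image creation, it can help to assemble the gke-disk-image-builder invocation from a list of container images. The following sketch only builds the argument list by using the flags from the command shown earlier; it does not run the tool. The function name and the sample values (my-project, us-central1-a, my-log-bucket) are hypothetical.

```python
import shlex


def disk_image_builder_command(project, image_name, zone, log_bucket,
                               container_images, disk_size_gb=10):
    """Return the argument list for one gke-disk-image-builder run."""
    if not container_images:
        raise ValueError("at least one container image is required")
    args = [
        "go", "run", "./cli",
        f"--project-name={project}",
        f"--image-name={image_name}",
        f"--zone={zone}",
        f"--gcs-path=gs://{log_bucket}",
        f"--disk-size-gb={disk_size_gb}",
    ]
    # The --container-image flag is repeated once per image to preload.
    args += [f"--container-image={image}" for image in container_images]
    return args


command = disk_image_builder_command(
    project="my-project",           # hypothetical project
    image_name="nginx-python-image",
    zone="us-central1-a",           # hypothetical zone
    log_bucket="my-log-bucket",     # hypothetical bucket
    container_images=[
        "docker.io/library/python:latest",
        "docker.io/library/nginx:latest",
    ],
)
# Print a shell-quoted form that you can review before running it.
print(shlex.join(command))
```

Returning a list rather than a single string keeps each argument intact if you later pass the command to subprocess.run from the tool's source directory.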
Data
Create a custom disk image as the data source by completing the following steps:
Configure the secondary boot disk
You can configure the secondary boot disk in a GKE Autopilot or Standard cluster.
Use an Autopilot cluster for a fully managed Kubernetes experience. To choose the GKE mode of operation that's the best fit for your workloads, see Choose a GKE mode of operation.
Use GKE Autopilot
In this section, you create a disk image allowlist to allow the disk image in an existing GKE Autopilot cluster. Then, you modify the Pod node selector to use a secondary boot disk.
Allow the disk images in your project
In this section, you create a GCPResourceAllowlist to allow GKE to create nodes with secondary boot disks from the disk images in your Google Cloud project.
Save the following manifest as allowlist-disk.yaml:
apiVersion: "node.gke.io/v1"
kind: GCPResourceAllowlist
metadata:
  name: gke-secondary-boot-disk-allowlist
spec:
  allowedResourcePatterns:
  - "projects/PROJECT_ID/global/images/.*"
Replace PROJECT_ID with the ID of the project that hosts the disk image.
Apply the manifest:
kubectl apply -f allowlist-disk.yaml
GKE can now create nodes with secondary boot disks from any disk image in the project.
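If you manage several projects, you might generate the allowlist manifest instead of editing it by hand. The following sketch builds the same GCPResourceAllowlist as a dictionary and prints it as JSON, which kubectl apply also accepts. The pattern_allows helper rests on an assumption that is not confirmed on this page: that each pattern is a regular expression matched against the full image path. The function names and my-project are hypothetical.

```python
import json
import re


def allowlist_manifest(project_id, name="gke-secondary-boot-disk-allowlist"):
    """Build the GCPResourceAllowlist manifest as a plain dictionary."""
    return {
        "apiVersion": "node.gke.io/v1",
        "kind": "GCPResourceAllowlist",
        "metadata": {"name": name},
        "spec": {
            "allowedResourcePatterns": [
                f"projects/{project_id}/global/images/.*",
            ],
        },
    }


def pattern_allows(pattern, image_path):
    """Assumption: patterns are regexes matched against the whole path."""
    return re.fullmatch(pattern, image_path) is not None


manifest = allowlist_manifest("my-project")  # hypothetical project ID
print(json.dumps(manifest, indent=2))
```

You could write the printed JSON to a file and pass it to kubectl apply -f in place of the YAML manifest.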
Update the Pod node selector to use a secondary boot disk
In this section, you modify the Pod specification so that GKE creates the nodes with the secondary boot disk.
Add a nodeSelector to your Pod template:
nodeSelector:
  cloud.google.com.node-restriction.kubernetes.io/gke-secondary-boot-disk-DISK_IMAGE_NAME: CONTAINER_IMAGE_CACHE.PROJECT_ID
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- PROJECT_ID: your project ID to host the disk image.
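Because the selector key and value are both assembled from placeholders, a small helper can reduce copy-and-paste mistakes. The following sketch builds the nodeSelector entry in the format shown above. It also checks the general Kubernetes limit of 63 characters for the name part of a label key and for a label value. The function name and my-project are hypothetical.

```python
LABEL_PREFIX = "cloud.google.com.node-restriction.kubernetes.io"
LABEL_NAME_PREFIX = "gke-secondary-boot-disk-"
MAX_LABEL_LENGTH = 63  # Kubernetes limit for a label name part and value


def secondary_boot_disk_selector(disk_image_name, project_id,
                                 mode="CONTAINER_IMAGE_CACHE"):
    """Return the nodeSelector entry that requests a secondary boot disk."""
    name_part = f"{LABEL_NAME_PREFIX}{disk_image_name}"
    value = f"{mode}.{project_id}"
    if len(name_part) > MAX_LABEL_LENGTH:
        raise ValueError(f"label name is too long: {name_part!r}")
    if len(value) > MAX_LABEL_LENGTH:
        raise ValueError(f"label value is too long: {value!r}")
    return {f"{LABEL_PREFIX}/{name_part}": value}


selector = secondary_boot_disk_selector("nginx-python-image", "my-project")
for key, value in selector.items():
    print(f"{key}: {value}")
```

The returned dictionary can be placed directly under nodeSelector in a Pod template that you build as a Python structure.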
Use the kubectl apply command to apply the Kubernetes specification with the Pod template.
Confirm that the secondary boot disk cache is in use:
kubectl get events --all-namespaces
The output is similar to the following:
75s Normal SecondaryDiskCachin node/gke-pd-cache-demo-default-pool-75e78709-zjfm Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache
Check the image pull latency:
kubectl describe pod POD_NAME
Replace POD_NAME with the name of the Pod.
The output is similar to the following:
… Normal Pulled 15m kubelet Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s …
The expected image pull latency for the cached container image should be significantly reduced, regardless of image size.
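To compare pull latency across Pods or over time, you can extract the pull duration from the event text. The following sketch parses lines like the sample output above. It assumes that durations use Go-style units and handles only h, m, s, and ms; the function name is hypothetical.

```python
import re

# Matches: Successfully pulled image "IMAGE" in DURATION
_PULLED_RE = re.compile(r'Successfully pulled image "([^"]+)" in (\S+)')
# Assumption: durations are Go-style, such as 0.879149587s or 1m2.5s.
_PART_RE = re.compile(r"(\d+(?:\.\d+)?)(h|ms|m|s)")
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def pull_latencies(describe_output):
    """Map each pulled image to its pull time in seconds."""
    result = {}
    for image, duration in _PULLED_RE.findall(describe_output):
        parts = _PART_RE.findall(duration)
        if not parts:
            continue  # Skip durations in a format this sketch doesn't know.
        result[image] = sum(float(v) * _UNIT_SECONDS[u] for v, u in parts)
    return result


sample = ('Normal Pulled 15m kubelet Successfully pulled image '
          '"docker.io/library/nginx:latest" in 0.879149587s')
print(pull_latencies(sample))
```

You could feed this function the text returned by kubectl describe pod and flag any image whose pull time exceeds a threshold that you choose.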
Use GKE Standard
To create a GKE Standard cluster and a node pool, follow the Images instructions to preload container images or the Data instructions to preload data onto the secondary boot disk:
Images
To configure a secondary boot disk, either use the Google Cloud CLI or Terraform:
gcloud
Create a GKE Standard cluster with image streaming enabled:
gcloud container clusters create CLUSTER_NAME \
--location=LOCATION \
--cluster-version=VERSION \
--enable-image-streaming
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- LOCATION: the cluster location.
- VERSION: the GKE version to use. The GKE version must be 1.28.3-gke.1067000 or later.
Create a node pool with a secondary boot disk in the same project:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location LOCATION \
--enable-image-streaming \
--secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE
Replace the following:
- NODE_POOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
- DISK_IMAGE_NAME: the name of your disk image.
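This page uses three forms of the --secondary-boot-disk value: with mode=CONTAINER_IMAGE_CACHE for container images, without a mode for preloaded data, and with a projects/ prefix for an image in a different project. The following sketch builds each form from its parts. The function name and sample values are hypothetical, and the sketch only formats the flag; it does not validate it against the gcloud CLI.

```python
def secondary_boot_disk_flag(disk_image_name, mode=None, image_project=None):
    """Format the --secondary-boot-disk flag for a node pool command."""
    if image_project:
        # Disk image that belongs to a different project.
        path = f"projects/{image_project}/global/images/{disk_image_name}"
    else:
        # Disk image in the same project as the cluster.
        path = f"global/images/{disk_image_name}"
    value = f"disk-image={path}"
    if mode:
        value += f",mode={mode}"
    return f"--secondary-boot-disk={value}"


# Container image cache from an image in the same project.
print(secondary_boot_disk_flag("nginx-python-image",
                               mode="CONTAINER_IMAGE_CACHE"))
# Preloaded data: no mode is set.
print(secondary_boot_disk_flag("ml-model-image"))
# Container image cache from an image in another (hypothetical) project.
print(secondary_boot_disk_flag("nginx-python-image",
                               mode="CONTAINER_IMAGE_CACHE",
                               image_project="image-project"))
```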
To create a node pool with a secondary boot disk from the disk image in a different project, complete the steps in Use a secondary boot disk in a different project.
Add a nodeSelector to your Pod template:
nodeSelector:
  cloud.google.com/gke-nodepool: NODE_POOL_NAME
Confirm that the secondary boot disk cache is in use:
kubectl get events --all-namespaces
The output is similar to the following:
75s Normal SecondaryDiskCachin node/gke-pd-cache-demo-default-pool-75e78709-zjfm Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache
Check the image pull latency by running the following command:
kubectl describe pod POD_NAME
Replace POD_NAME with the name of the Pod.
The output is similar to the following:
… Normal Pulled 15m kubelet Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s …
The expected image pull latency for the cached container image should be no more than a few seconds, regardless of image size.
Terraform
To create a cluster with the default node pool using Terraform, refer to the following example:
Create a node pool with a secondary boot disk in the same project:
To learn more about using Terraform, see Terraform support for GKE.
Add a nodeSelector to your Pod template:
nodeSelector:
  cloud.google.com/gke-nodepool: NODE_POOL_NAME
Confirm that the secondary boot disk cache is in use:
kubectl get events --all-namespaces
The output is similar to the following:
75s Normal SecondaryDiskCachin node/gke-pd-cache-demo-default-pool-75e78709-zjfm Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache
Check the image pull latency by running the following command:
kubectl describe pod POD_NAME
Replace POD_NAME with the name of the Pod.
The output is similar to the following:
… Normal Pulled 15m kubelet Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s …
The expected image pull latency for the cached container image should be no more than a few seconds, regardless of image size.
Data
You can configure a secondary boot disk and preload data by using the Google Cloud CLI or Terraform:
gcloud
Create a GKE Standard cluster with image streaming enabled:
gcloud container clusters create CLUSTER_NAME \
--location=LOCATION \
--cluster-version=VERSION \
--enable-image-streaming
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- LOCATION: the cluster location.
- VERSION: the GKE version to use. The GKE version must be 1.28.3-gke.1067000 or later.
Create a node pool with a secondary boot disk by using the --secondary-boot-disk flag:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location LOCATION \
--enable-image-streaming \
--secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME
Replace the following:
- NODE_POOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
- DISK_IMAGE_NAME: the name of your disk image.
To create a node pool with a secondary boot disk from the disk image in a different project, complete the steps in Use a secondary boot disk in a different project.
GKE creates a node pool where each node has a secondary disk with preloaded data. GKE attaches and mounts the secondary boot disk on the node.
To access the data, mount the secondary boot disk image in the Pod containers by using a hostPath volume mount. Set /usr/local/data_path_sbd to the path in your container where you want the data to reside:
apiVersion: v1
kind: Pod
metadata:
  name: pod-name
spec:
  containers:
  ...
    volumeMounts:
    - mountPath: /usr/local/data_path_sbd
      name: data-path-sbd
  ...
  volumes:
  - name: data-path-sbd
    hostPath:
      path: /mnt/disks/gke-secondary-disks/gke-DISK_IMAGE_NAME-disk
Replace DISK_IMAGE_NAME with the name of your disk image.
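The host path is derived from the disk image name, so it can be generated together with the rest of the Pod manifest. The following sketch builds a Pod with the hostPath volume described above and prints it as JSON, which kubectl apply also accepts. The function names, the container name app, and the sample values are hypothetical.

```python
import json

# Mount root used on the node, as shown in the manifest above.
SECONDARY_DISK_ROOT = "/mnt/disks/gke-secondary-disks"


def secondary_disk_host_path(disk_image_name):
    """Return the node path where the secondary boot disk is mounted."""
    return f"{SECONDARY_DISK_ROOT}/gke-{disk_image_name}-disk"


def pod_with_preloaded_data(pod_name, container_image, disk_image_name,
                            mount_path="/usr/local/data_path_sbd"):
    """Build a Pod manifest that mounts the preloaded data via hostPath."""
    volume_name = "data-path-sbd"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": "app",  # hypothetical container name
                "image": container_image,
                "volumeMounts": [
                    {"mountPath": mount_path, "name": volume_name},
                ],
            }],
            "volumes": [{
                "name": volume_name,
                "hostPath": {"path": secondary_disk_host_path(disk_image_name)},
            }],
        },
    }


pod = pod_with_preloaded_data("pod-name", "docker.io/library/python:latest",
                              "ml-model-image")
print(json.dumps(pod, indent=2))
```

Keeping the path logic in one function means that a renamed disk image requires a change in only one place.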
Terraform
To create a cluster with the default node pool using Terraform, refer to the following example:
Create a node pool with a secondary boot disk in the same project:
To learn more about using Terraform, see Terraform support for GKE.
To access the data, mount the secondary boot disk image in the Pod containers by using a hostPath volume mount. Set /usr/local/data_path_sbd to the path in your container where you want the data to reside:
apiVersion: v1
kind: Pod
metadata:
  name: pod-name
spec:
  containers:
  ...
    volumeMounts:
    - mountPath: /usr/local/data_path_sbd
      name: data-path-sbd
  ...
  volumes:
  - name: data-path-sbd
    hostPath:
      path: /mnt/disks/gke-secondary-disks/gke-DISK_IMAGE_NAME-disk
Replace DISK_IMAGE_NAME with the name of your disk image.
Cluster autoscaling with secondary boot disks
To create a node pool with a secondary boot disk and cluster autoscaling enabled, use the Google Cloud CLI:
gcloud container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location LOCATION \
--enable-image-streaming \
--secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE \
--enable-autoscaling \
--num-nodes NUM_NODES \
--min-nodes MIN_NODES \
--max-nodes MAX_NODES
Replace the following:
- NODE_POOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
- DISK_IMAGE_NAME: the name of your disk image.
- NUM_NODES: the initial number of nodes in the node pool in each of the cluster's zones.
- MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
- MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
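Because the per-zone and total limits can't be combined, a script that generates node pool commands can reject invalid combinations before it calls gcloud. The following sketch builds the autoscaling flags and enforces the mutual exclusion described above. The function name is hypothetical, and the check that the minimum doesn't exceed the maximum is an added sanity check, not a rule stated on this page.

```python
def autoscaling_flags(min_nodes=None, max_nodes=None,
                      total_min_nodes=None, total_max_nodes=None):
    """Return autoscaling flags for 'gcloud container node-pools create'."""
    per_zone = min_nodes is not None or max_nodes is not None
    total = total_min_nodes is not None or total_max_nodes is not None
    if per_zone and total:
        raise ValueError(
            "--total-min-nodes/--total-max-nodes are mutually exclusive "
            "with --min-nodes/--max-nodes")
    if not per_zone and not total:
        raise ValueError("specify per-zone limits or total limits")

    if per_zone:
        pairs = [("--min-nodes", min_nodes), ("--max-nodes", max_nodes)]
    else:
        pairs = [("--total-min-nodes", total_min_nodes),
                 ("--total-max-nodes", total_max_nodes)]

    low, high = pairs[0][1], pairs[1][1]
    if low is not None and high is not None and low > high:
        raise ValueError("minimum exceeds maximum")  # added sanity check

    flags = ["--enable-autoscaling"]
    for name, value in pairs:
        if value is not None:  # 0 is a valid limit, so compare with None
            flags += [name, str(value)]
    return flags


print(" ".join(autoscaling_flags(min_nodes=1, max_nodes=5)))
```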
Node auto-provisioning with secondary boot disks
In GKE 1.30.1-gke.1329000 and later, you can configure node auto-provisioning to automatically create and delete node pools to meet the resource demands of your workloads.
Create a disk image allowlist custom resource for secondary boot disks for GKE node auto-provisioning, similar to the following:
apiVersion: "node.gke.io/v1"
kind: GCPResourceAllowlist
metadata:
  name: gke-secondary-boot-disk-allowlist
spec:
  allowedResourcePatterns:
  - "projects/PROJECT_ID/global/images/.*"
Replace PROJECT_ID with the ID of the project that hosts the disk image.
To deploy the allowlist custom resource in the cluster, run the following command:
kubectl apply -f ALLOWLIST_FILE
Replace ALLOWLIST_FILE with the manifest filename.
Update the Pod node selector to use the secondary boot disk:
nodeSelector:
  cloud.google.com.node-restriction.kubernetes.io/gke-secondary-boot-disk-DISK_IMAGE_NAME: CONTAINER_IMAGE_CACHE.PROJECT_ID
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- PROJECT_ID: your project ID to host the disk image.
Use a secondary boot disk in a different project
When you create a node pool with a secondary boot disk, you can tell GKE to use the disk image in a different project by using the --secondary-boot-disk flag.
Create a node pool with a secondary boot disk from the disk image in a different project by using the --secondary-boot-disk flag. For example:
gcloud beta container node-pools create NODE_POOL_NAME \
--cluster=CLUSTER_NAME \
--location LOCATION \
--enable-image-streaming \
--secondary-boot-disk=disk-image=projects/IMAGE_PROJECT_ID/global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- IMAGE_PROJECT_ID: the name of the project that the disk image belongs to.
GKE creates a node pool where each node has a secondary disk with preloaded data. GKE attaches and mounts the secondary boot disk onto the node.
To let the cluster use disk images that belong to a different project, grant the Compute Image User role (roles/compute.imageUser) to the following cluster service accounts:
- Default compute service account: CLUSTER_PROJECT_NUMBER@cloudservices.gserviceaccount.com
- GKE service account: service-CLUSTER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com
gcloud projects add-iam-policy-binding IMAGE_PROJECT_ID \
--member serviceAccount:CLUSTER_PROJECT_NUMBER@cloudservices.gserviceaccount.com \
--role roles/compute.imageUser
gcloud projects add-iam-policy-binding IMAGE_PROJECT_ID \
--member serviceAccount:service-CLUSTER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com \
--role roles/compute.imageUser
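Both bindings differ only in the service account, so they can be generated from the cluster's project number. The following sketch derives the two member strings and the matching gcloud argument lists from the commands above; it prints the commands and doesn't run them. The function names, image-project, and the project number 123456789012 are hypothetical.

```python
import shlex

ROLE = "roles/compute.imageUser"


def image_user_members(cluster_project_number):
    """Return the two service account members named on this page."""
    n = cluster_project_number
    return [
        f"serviceAccount:{n}@cloudservices.gserviceaccount.com",
        f"serviceAccount:service-{n}@container-engine-robot.iam.gserviceaccount.com",
    ]


def image_user_binding_commands(image_project_id, cluster_project_number):
    """Return one add-iam-policy-binding argument list per member."""
    return [
        ["gcloud", "projects", "add-iam-policy-binding", image_project_id,
         "--member", member, "--role", ROLE]
        for member in image_user_members(cluster_project_number)
    ]


# Hypothetical image project and cluster project number.
for command in image_user_binding_commands("image-project", "123456789012"):
    print(shlex.join(command))
```

Note that the bindings are applied to the project that hosts the disk image, while the service accounts belong to the project that hosts the cluster.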
What's next
- Use Image streaming to pull container images by streaming the image data as your workloads need it.
- See Improve workload efficiency using NCCL Fast Socket to learn how to use the NVIDIA Collective Communication Library (NCCL) Fast Socket plugin.