Optimize Autopilot Pod performance by choosing a machine series


This page shows you how to place workloads on specific Compute Engine machine series for optimal workload performance in your Google Kubernetes Engine (GKE) Autopilot clusters.

Ensure that you're familiar with the following:

How machine series selection works

You can add a cloud.google.com/machine-family node selector to your Pod specification so that Autopilot allocates specific Compute Engine hardware for that Pod. For example, you can choose the C3 machine series for Pods that need more CPU power, or the M3 machine series for Pods that need more memory. Autopilot provisions one of the predefined machine types from the selected machine series to optimally run your workload.

In addition to optimal Pod performance, choosing a specific machine series offers the following benefits:

  • Efficient node utilization: By default, Autopilot optimizes node resource usage by scheduling as many Pods as possible that request the same machine series onto each node. This approach optimizes resource usage on the node, which improves the price-to-performance ratio. If your workload needs access to all of the resources on the node, you can optionally configure your workload to request one Pod for each node.

  • Burstable workloads: You can configure Pods to burst into unused resource capacity on the node by setting your resource limits higher than your requests. For details, see Configure Pod bursting in GKE.

Request a dedicated node for each Pod

If you have CPU-intensive workloads that need reliable access to all of the node resources, you can optionally configure a Pod that requests a machine series so that Autopilot places it on its own node.

Dedicated nodes per Pod are recommended when you run large-scale, CPU-intensive workloads, like AI/ML training workloads or high performance computing (HPC) batch workloads.

Choose between multiple-Pod and single-Pod scheduling

Use the following guidance to choose a Pod scheduling behavior based on your requirements:

Pricing

You're billed for the underlying VM and any attached hardware by Compute Engine, plus a premium for Autopilot node management and scalability. For details, see GKE pricing.

Before you begin

Before you start, make sure you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
  • Ensure that you have an existing Autopilot cluster running version 1.30.1-gke.1396000 or later. To create a cluster, see Create an Autopilot cluster.

Select a machine series

This section shows you how to select a specific Compute Engine machine series in a Pod.

  1. Save the following manifest as machine-series-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: machine-series-pod
    spec:
      nodeSelector:
        cloud.google.com/machine-family: MACHINE_SERIES
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 5
            memory: "25Gi"
          limits:
            cpu: 20
            memory: 100Gi
    

    Replace MACHINE_SERIES with the Compute Engine machine series for your Pod, like c3. For supported values, see Supported machine series on this page.

  2. Deploy the Pod:

    kubectl apply -f machine-series-pod.yaml
    

This manifest lets Autopilot optimize node resource usage by efficiently scheduling other Pods that select the same machine series onto the same node if there's available capacity.

Use Local SSDs

Pods that select a machine series can use Local SSDs for ephemeral storage if you specify a machine series that offers Local SSD. Autopilot considers ephemeral storage requests when choosing a Compute Engine machine type for the Pod.

  1. Save the following manifest as local-ssd-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: local-ssd-pod
    spec:
      nodeSelector:
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 6
            memory: "25Gi"
            ephemeral-storage: "100Gi"
          limits:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
    

    Replace MACHINE_SERIES with a supported machine series that also supports Local SSDs. If your specified machine series doesn't support Local SSDs, the deployment fails with an error.

  2. Deploy the Pod:

    kubectl apply -f local-ssd-pod.yaml
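
One way that a container might consume the Local SSD-backed ephemeral storage is through an emptyDir volume. The following manifest is a minimal sketch, not part of the original example: the Pod name local-ssd-scratch-pod, the volume name scratch, the mount path /scratch, and the 50Gi size limit are hypothetical, and the sketch assumes that emptyDir volumes on these nodes are backed by Local SSD and count toward the Pod's ephemeral storage request.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: local-ssd-scratch-pod    # hypothetical name
spec:
  nodeSelector:
    cloud.google.com/machine-family: MACHINE_SERIES
    cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
  containers:
  - name: my-container
    image: "k8s.gcr.io/pause"
    resources:
      requests:
        cpu: 6
        memory: "25Gi"
        ephemeral-storage: "100Gi"
    volumeMounts:
    - name: scratch              # hypothetical volume name
      mountPath: /scratch        # hypothetical mount path
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: "50Gi"          # illustrative value, kept below the 100Gi request
```

As in the previous manifest, replace MACHINE_SERIES with a machine series that supports Local SSDs.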
    

Request a dedicated node for a Pod

If your Pod has specific performance requirements, like needing reliable access to all of the resources of your node, you can request a dedicated node for each Pod by specifying the cloud.google.com/compute-class: Performance node selector along with your machine series node selector. This selector tells Autopilot to place your Pod on a new node that uses the specified machine series and is dedicated to that Pod. It also prevents Autopilot from scheduling other Pods on that node.

  1. Save the following manifest as dedicated-node-pod.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: dedicated-node-pod
    spec:
      nodeSelector:
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/compute-class: Performance
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
    

    Replace MACHINE_SERIES with a supported machine series that also supports one Pod per node scheduling. If the specified machine series doesn't support one Pod per node scheduling, the deployment fails with an error.

  2. Deploy the Pod:

    kubectl apply -f dedicated-node-pod.yaml
    

When you deploy this manifest, Autopilot does the following:

  • Ensures that the deployed Pod requests at least the minimum resources for the performance-optimized node.
  • Calculates the total resource requests of the deployed Pod and any DaemonSets in the cluster.
  • Provisions a node that's backed by the selected machine series.
  • Modifies the Pod manifest with a combination of node selectors and tolerations to ensure that the Pod runs on its own node.
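
The same pair of node selectors can also be set in a Pod template, for example in a Job for the batch workloads mentioned earlier on this page. The following manifest is a sketch under stated assumptions: the Job name dedicated-node-job and the parallelism and completions values are hypothetical, and the pause image is only a placeholder that never exits, so a real Job would use the image of your batch workload. Based on the behavior described above, each Pod that the Job creates should be placed on its own node.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: dedicated-node-job       # hypothetical name
spec:
  parallelism: 2                 # illustrative: two Pods run at the same time
  completions: 2
  template:
    spec:
      restartPolicy: Never       # Jobs require Never or OnFailure
      nodeSelector:
        cloud.google.com/machine-family: MACHINE_SERIES
        cloud.google.com/compute-class: Performance
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"   # placeholder; replace with your batch workload image
        resources:
          requests:
            cpu: 12
            memory: "50Gi"
            ephemeral-storage: "200Gi"
```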

Supported machine series

The machine-family selector supports the following machine series:

(Preview with allowlist *)
(always bundled)

* This feature requires you to be added to an allowlist. To receive access, contact your account team.

If you don't specify a machine series, c4 is the default, provided that c4 is available in the region.

To compare these machine series and their use cases, see Machine series comparison in the Compute Engine documentation.

Compatibility with other GKE features

You can use Pods that select machine series with the following GKE capabilities and features:

Spot Pods and extended run time Pods are mutually exclusive. GKE doesn't enforce higher minimum resource requests for dedicated Pods per node, even though they use workload separation.

How GKE selects a machine type

To select a machine type in the specified machine series, GKE calculates the total CPU, total memory, and total ephemeral storage requests of the Pods and any DaemonSets that will run on the new node. GKE rounds these values up to the nearest available Compute Engine machine type that supports all of these totals.

  • Example 1: Consider a Deployment with four replicas that selects the C3D machine series. You don't request dedicated nodes per Pod. The resource requests of each replica are as follows:

    • 500m vCPU
    • 1 GiB of memory

    Autopilot places all of the Pods on a node that's backed by the c3d-standard-4 machine type, which has 4 vCPUs and 16 GB of memory.

  • Example 2: Consider a Pod that selects the C3D machine series and Local SSDs for ephemeral storage. You request a dedicated node for the Pod. The total resource requests including DaemonSets are as follows:

    • 12 vCPU
    • 50 GiB of memory
    • 200 GiB of ephemeral storage

    Autopilot places the Pod on a node that uses the c3d-standard-16-lssd machine type, which has 16 vCPUs, 64 GiB of memory, and 365 GiB of Local SSD capacity.
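
A Deployment along the lines of Example 1 might look like the following sketch. The Deployment name, the app label, and the lowercase value c3d for the machine-family selector are assumptions made for illustration. The four replicas request 4 x 500m = 2 vCPU and 4 x 1 GiB = 4 GiB of memory in total, before any DaemonSet requests are added, which fits within the 4 vCPUs and 16 GB of the c3d-standard-4 machine type named in the example.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: c3d-deployment           # hypothetical name
spec:
  replicas: 4                    # 4 x 500m = 2 vCPU, 4 x 1Gi = 4 GiB in total
  selector:
    matchLabels:
      app: c3d-example           # hypothetical label
  template:
    metadata:
      labels:
        app: c3d-example
    spec:
      nodeSelector:
        cloud.google.com/machine-family: c3d   # assumed selector value for the C3D series
      containers:
      - name: my-container
        image: "k8s.gcr.io/pause"
        resources:
          requests:
            cpu: 500m
            memory: "1Gi"
```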

What's next