Cloud TPU quotas
This document lists the quotas that apply to Cloud TPU. For information about Cloud TPU pricing, see Cloud TPU pricing.
Google Cloud uses quotas to help ensure fairness and reduce spikes in resource use and availability. A quota restricts how much of a Google Cloud resource your Google Cloud project can use. Quotas apply to a range of resource types, including hardware, software, and network components. For example, quotas can restrict the number of API calls to a service, the number of load balancers used concurrently by your project, or the number of projects that you can create. Quotas protect the community of Google Cloud users by preventing the overloading of services. Quotas also help you to manage your own Google Cloud resources.
The Cloud Quotas system does the following:
- Monitors your consumption of Google Cloud products and services
- Restricts your consumption of those resources
- Provides a way to request changes to the quota value
In most cases, when you attempt to consume more of a resource than its quota allows, the system blocks access to the resource, and the task that you're trying to perform fails.
Quotas generally apply at the Google Cloud project level. Your use of a resource in one project doesn't affect your available quota in another project. Within a Google Cloud project, quotas are shared across all applications and IP addresses.
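The project-level behavior described above can be illustrated with a toy model. This is a minimal sketch, not how the Cloud Quotas system is implemented: the `ProjectQuota` class, the `QuotaExceeded` exception, and the project IDs are all hypothetical names introduced for illustration.

```python
from collections import defaultdict


class QuotaExceeded(Exception):
    """Raised when a request would push usage past the quota value."""


class ProjectQuota:
    """Hypothetical in-memory model of one quota, tracked per project."""

    def __init__(self, limit_per_project):
        self.limit = limit_per_project
        self.usage = defaultdict(int)  # project ID -> units currently in use

    def available(self, project):
        """Return the unused quota for a single project."""
        return self.limit - self.usage[project]

    def consume(self, project, amount):
        """Record usage, or fail if the request exceeds the project's quota."""
        if amount > self.available(project):
            raise QuotaExceeded(
                f"{project}: requested {amount}, "
                f"only {self.available(project)} available"
            )
        self.usage[project] += amount


# Illustrative values only (assumed, not real defaults for any product).
quota = ProjectQuota(limit_per_project=16)
quota.consume("project-a", 16)  # project-a is now at its limit
quota.consume("project-b", 8)   # unaffected by project-a's usage
```

In this sketch, a further `quota.consume("project-a", 1)` raises `QuotaExceeded`, mirroring the way a task fails when it would exceed quota, while `project-b` still has quota available.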
TPU quota
Each TPU version has its own quotas. For example, TPU v2 and TPU v3 have separate quotas. For each TPU version, there are two types of quota: on-demand and preemptible (which also covers Spot VMs). The following table describes each type.
Quota type | Description | Default value | How to request | Flags for TPU creation |
---|---|---|---|---|
On-demand | The number of on-demand resources to which you have access. On-demand resources won't be preempted, but on-demand quota does not guarantee that enough Cloud TPU resources will be available to satisfy your request. | v3-8 and v2-8: 16 TensorCores. All others: 0. | See Request additional quota. | No flags needed, selected by default. |
Preemptible | The number of preemptible Cloud TPU resources to which you have access. This quota applies to both preemptible TPUs and TPU Spot VMs. Preemptible resources may be preempted to make room for higher-priority jobs. Preemptible quota does not guarantee that enough Cloud TPU resources will be available to satisfy your request. For more information, see Preemptible TPUs and Manage TPU Spot VMs. | v3-8 and v2-8: 48 TensorCores. All others: 0. | See Request additional quota. | |
TPU quotas are specified in terms of TPU cores per project per zone or TPU cores per project per region.
TPU v5p quotas
You can use your TPU v5p quota in any combination of cores. For example, if you have quota for 32 cores, you can use this quota to create four TPU slices each with 8 cores.
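The arithmetic in this example can be sketched as a small check. This is an illustrative helper only: `fits_quota` and `remaining_cores` are hypothetical names, and the sketch checks only the core total, not whether a given slice size is actually offered.

```python
def fits_quota(quota_cores, slice_cores):
    """Return True if the requested slices fit within the core quota.

    quota_cores: total cores allowed by the quota.
    slice_cores: list with the core count of each slice to create.
    """
    return sum(slice_cores) <= quota_cores


def remaining_cores(quota_cores, slice_cores):
    """Return the cores left over after creating the requested slices."""
    used = sum(slice_cores)
    if used > quota_cores:
        raise ValueError(f"requested {used} cores, quota is {quota_cores}")
    return quota_cores - used


# The example from the text: a 32-core quota and four 8-core slices.
four_slices = [8, 8, 8, 8]
print(fits_quota(32, four_slices))       # True: 4 * 8 = 32 cores
print(remaining_cores(32, four_slices))  # 0: the quota is fully used
print(fits_quota(32, [8] * 5))           # False: 5 * 8 = 40 cores
```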
Preemptible quotas:
- Preemptible TPU v5p cores per project per region
- Preemptible TPU v5p cores per project per zone
On-demand quotas:
- TPU v5p cores per project per region
- TPU v5p cores per project per zone
TPU v5e quotas
TPU v5e can be used for training and serving. There are separate quotas for training and for serving, and separate quotas for single-host (lite cores) and multi-host (lite pod cores) configurations.
Serving quotas
Preemptible serving quotas:
- Preemptible TPU v5 lite pod cores for serving per project per region
- Preemptible TPU v5 lite pod cores for serving per project per zone
On-demand serving quotas:
- TPU v5 lite pod cores for serving per project per region
- TPU v5 lite pod cores for serving per project per zone
Training quotas
Preemptible training quotas:
- Preemptible TPU v5 lite cores per project per region
- Preemptible TPU v5 lite cores per project per zone
- Preemptible TPU v5 lite pod cores per project per region
- Preemptible TPU v5 lite pod cores per project per zone
On-demand training quotas:
- TPU v5 lite cores per project per region
- TPU v5 lite cores per project per zone
- TPU v5 lite pod cores per project per region
- TPU v5 lite pod cores per project per zone
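The v5e quota names listed above follow a regular pattern, which the following sketch composes from four choices. The function `v5e_quota_name` and its parameters are hypothetical, introduced only to show how the names relate to each other. Because the serving lists above contain only lite pod core quotas, the sketch rejects a single-host serving request rather than guess at a name.

```python
def v5e_quota_name(workload, multi_host, preemptible, scope):
    """Compose a TPU v5e quota name as it appears in the lists above.

    workload: "training" or "serving".
    multi_host: True for lite pod cores, False for lite cores.
    preemptible: True for preemptible quota, False for on-demand.
    scope: "region" or "zone".
    """
    if workload not in ("training", "serving"):
        raise ValueError(f"unknown workload: {workload!r}")
    if scope not in ("region", "zone"):
        raise ValueError(f"unknown scope: {scope!r}")
    if workload == "serving" and not multi_host:
        # Assumption: only the quotas listed in this document exist.
        raise ValueError("no single-host serving quota is listed")

    cores = "lite pod cores" if multi_host else "lite cores"
    name = f"TPU v5 {cores}"
    if workload == "serving":
        name += " for serving"
    if preemptible:
        name = "Preemptible " + name
    return f"{name} per project per {scope}"


print(v5e_quota_name("serving", True, True, "zone"))
print(v5e_quota_name("training", False, False, "region"))
```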
TPU v4 quotas
You can use your TPU v4 quota in any combination of cores. For example, if you have quota for 32 cores, you can use this quota to create four TPU slices each with 8 cores.
Preemptible quotas:
- Preemptible TPU v4 pod cores per project per region
- Preemptible TPU v4 pod cores per project per zone
On-demand quotas:
- TPU v4 pod cores per project per region
- TPU v4 pod cores per project per zone
TPU v3 quotas
There are separate TPU v3 quotas for single-host TPUs (core) and multi-host TPUs (pod). You must use v3 pod quotas to create TPUs with more than 8 cores.
Preemptible quotas:
- Preemptible TPU v3 cores per project per region
- Preemptible TPU v3 cores per project per zone
- Preemptible TPU v3 pod cores per project per region
- Preemptible TPU v3 pod cores per project per zone
On-demand quotas:
- TPU v3 cores per project per region
- TPU v3 cores per project per zone
- TPU v3 pod cores per project per region
- TPU v3 pod cores per project per zone
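The 8-core rule above can be sketched as a small helper that picks which v3 quota a request counts against. The function `v3_quota_name` and the constant `SINGLE_HOST_MAX_CORES` are hypothetical names. The sketch also assumes that TPUs with 8 or fewer cores count against the single-host (core) quota, which the text implies but does not state outright.

```python
# From the text: TPUs with more than 8 cores require v3 pod quota.
SINGLE_HOST_MAX_CORES = 8


def v3_quota_name(cores, preemptible=False, scope="zone"):
    """Return the TPU v3 quota name that a request of `cores` would use."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    if scope not in ("region", "zone"):
        raise ValueError(f"unknown scope: {scope!r}")

    # Assumption: 8 or fewer cores uses the single-host "cores" quota.
    kind = "pod cores" if cores > SINGLE_HOST_MAX_CORES else "cores"
    prefix = "Preemptible " if preemptible else ""
    return f"{prefix}TPU v3 {kind} per project per {scope}"


print(v3_quota_name(8))
print(v3_quota_name(32, preemptible=True, scope="region"))
```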
TPU v2 quotas
There are separate TPU v2 quotas for single-host TPUs (core) and multi-host TPUs (pod).
Preemptible quotas:
- Preemptible TPU v2 cores per project per region
- Preemptible TPU v2 cores per project per zone
- Preemptible TPU v2 pod cores per project per region
- Preemptible TPU v2 pod cores per project per zone
On-demand quotas:
- TPU v2 cores per project per region
- TPU v2 cores per project per zone
- TPU v2 pod cores per project per region
- TPU v2 pod cores per project per zone
For more information about TPU chips and TensorCores, see TPU System architecture.
View and request additional quota
You can view the quota allocated for your Google Cloud project on the Quotas page in the Google Cloud console. If you need additional Cloud TPU quota, you can request it from the Quotas page. For more information, see Request a higher quota limit.
When a Google Cloud service increases the default quota values for resources and APIs, these changes take place gradually. This might result in ongoing rollouts across different regions or resources. During the rollout, the quota value that appears in the Google Cloud console or Cloud Quotas API won't reflect the new, increased quota value until the rollout completes. For more information, see View ongoing rollouts.