To ensure that VM resources are available when your Dataflow jobs need them, you can use Compute Engine reservations. Reservations provide a high level of assurance in obtaining capacity for Compute Engine zonal resources.
To use Compute Engine reservations with Dataflow, perform the following steps:
Create a Compute Engine reservation. It can be a single-project reservation or a shared reservation. For more information, see the Compute Engine documentation on creating single-project and shared reservations.
The reservation can include GPU accelerators.
When you submit your Dataflow job, pass one of the following service options, depending on which version of the Beam SDK you are using:
- Beam version < 2.29: --experiments=automatically_use_created_reservation
- Beam version >= 2.29: --dataflow_service_options=automatically_use_created_reservation
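The version-dependent choice above can be sketched as a small helper. This is a minimal illustration, not part of the Beam SDK: the function name reservation_flag is hypothetical, and it assumes that older SDKs accept the same option name through --experiments.

```python
# Hypothetical helper: picks the reservation flag for a given Beam SDK version.
# Assumption: SDKs older than 2.29 take the same option name via --experiments.
RESERVATION_OPTION = "automatically_use_created_reservation"


def reservation_flag(beam_version: str) -> str:
    """Return the command-line flag that opts a job in to reservation use."""
    # Compare only the major and minor components, for example "2.29" in "2.29.0".
    major, minor = (int(part) for part in beam_version.split(".")[:2])
    if (major, minor) >= (2, 29):
        return f"--dataflow_service_options={RESERVATION_OPTION}"
    return f"--experiments={RESERVATION_OPTION}"


if __name__ == "__main__":
    print(reservation_flag("2.28.0"))
    print(reservation_flag("2.29.0"))
```

The returned string can be appended to the argument list that you pass to your pipeline's launcher.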
To prevent low-priority workloads in the same project from competing for reservations with Dataflow, set the reservation affinity to none when you create VMs for those workloads. For more information, see Consuming reserved instances.
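As an illustration of setting the reservation affinity to none, the following sketch assembles a gcloud command for a VM that should not draw from reservations. The helper name, VM name, and zone are placeholder assumptions. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import List


def create_vm_command(name: str, zone: str) -> List[str]:
    """Build a gcloud command for a VM that does not consume reservations."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        # Opt this VM out of reservation consumption so that it does not
        # compete with Dataflow workers for reserved capacity.
        "--reservation-affinity=none",
    ]


if __name__ == "__main__":
    # Placeholder VM name and zone, for illustration only.
    print(shlex.join(create_vm_command("batch-worker-1", "us-central1-a")))
```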
To use the reservation, the Dataflow workers must match the reservation configuration, so you might need to set the worker machine type for the job. For more information, see Workers.
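A minimal sketch of job arguments that line up the workers with a reservation, assuming a Beam Python SDK of version 2.29 or later. The helper name and all values (project, region, zone, machine type) are placeholders. Because reservations are zonal, the sketch also pins the worker zone; whether you need to do so depends on your setup.

```python
from typing import List


def reserved_job_args(project: str, region: str, zone: str,
                      machine_type: str) -> List[str]:
    """Build Dataflow pipeline arguments that match a zonal reservation."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        # The zone and machine type should match the reservation's properties.
        f"--worker_zone={zone}",
        f"--worker_machine_type={machine_type}",
        "--dataflow_service_options=automatically_use_created_reservation",
    ]


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    for arg in reserved_job_args("my-project", "us-central1",
                                 "us-central1-a", "n2-standard-4"):
        print(arg)
```

You can pass the resulting list to PipelineOptions or add it to the command line that launches the pipeline.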
Limitations
All limitations of Compute Engine reservations apply when Dataflow workers consume reservations. See How reservations work.
Dataflow relies on the default consumption order in Compute Engine. As a result, the following limitations apply:
- Dataflow does not consume a reservation created with the --require-specific-reservation flag.
- Other workloads in the same project or organization that do not specify the --reservation flag might compete with Dataflow workloads for project-specific or shared reservations.
Dataflow Prime jobs don't consume Compute Engine reservations.
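The two Dataflow-specific limitations above can be expressed as a quick pre-flight check. This is a hypothetical helper, not a Dataflow API. It assumes the reservation is given as a dictionary shaped like the Compute Engine reservation resource, where specificReservationRequired is true for reservations created with --require-specific-reservation.

```python
def dataflow_can_consume(reservation: dict, is_prime_job: bool = False) -> bool:
    """Return False if a known limitation rules out reservation use."""
    # Dataflow Prime jobs don't consume Compute Engine reservations.
    if is_prime_job:
        return False
    # Reservations that require specific targeting are not consumed, because
    # Dataflow relies on the default consumption order.
    return not reservation.get("specificReservationRequired", False)


if __name__ == "__main__":
    print(dataflow_can_consume({"specificReservationRequired": True}))
    print(dataflow_can_consume({"specificReservationRequired": False}))
```

A result of True means only that neither limitation applies. The workers must still match the reservation configuration.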
Pricing
Reserved Compute Engine VMs are billed by Dataflow while the Dataflow job is running, and are billed by Compute Engine when the VMs are not being used by Dataflow.
If you use your Compute Engine reservations with Dataflow, then those reserved resources aren't eligible for Compute Engine committed use discounts. Usage is billed by using the Dataflow pricing model.
What's next
To learn more about Compute Engine reservations, see Reservations of Compute Engine zonal resources.