Enable connection draining

Connection draining is a process that ensures that existing, in-progress requests are given time to complete when a virtual machine (VM) instance is removed from an instance group or when an endpoint is removed from network endpoint groups (NEGs) that are zonal in scope.

The information on this page applies only to instance groups and the following types of NEGs that are zonal in scope:

To enable connection draining, you set a connection draining timeout on the backend service. The timeout duration must be from 0 to 3600 seconds inclusive.
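The range rule above can be checked on the client side before a value is submitted. The following is a minimal sketch only; the function name `validate_draining_timeout` and the constants are hypothetical and are not part of any Google Cloud library.

```python
# Bounds taken from the text above: 0 to 3600 seconds, inclusive.
MIN_DRAINING_TIMEOUT_SEC = 0
MAX_DRAINING_TIMEOUT_SEC = 3600


def validate_draining_timeout(seconds: int) -> int:
    """Return `seconds` unchanged if it is a whole number from 0 to 3600 inclusive.

    Hypothetical helper for illustration; raises TypeError or ValueError otherwise.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("timeout must be a whole number of seconds")
    if not MIN_DRAINING_TIMEOUT_SEC <= seconds <= MAX_DRAINING_TIMEOUT_SEC:
        raise ValueError(
            f"timeout must be from {MIN_DRAINING_TIMEOUT_SEC} to "
            f"{MAX_DRAINING_TIMEOUT_SEC} seconds inclusive, got {seconds}"
        )
    return seconds
```

Both boundary values, 0 and 3600, pass the check; 3601 or a negative value raises `ValueError`.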

For the specified duration of the timeout, existing requests to the removed VM or endpoint are given time to complete. The load balancer doesn't send new connections to the removed VM or endpoint. After the timeout duration is reached, the load balancer stops sending all traffic to the removed VM or endpoint.

Connection draining begins whenever any of the following occurs:

You manually remove a VM from an instance group.
You remove an instance from a managed instance group by performing a resize(), deleteInstances(), recreateInstances(), or abandonInstances() call.
An instance group is removed from a backend service. This isn't supported for internal passthrough Network Load Balancers.
Google Cloud deletes an instance as part of autoscaling.
You perform an update to the managed instance group using the Managed Instance Group Updater.
You manually remove an endpoint from a zonal NEG.

It can take up to 60 seconds after your specified timeout duration has passed for the instance to be terminated.
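The timing described above can be worked through as simple arithmetic. The sketch below is illustrative only: the names `DrainTimeline` and `drain_timeline` are hypothetical, times are plain seconds on an arbitrary clock, and the 60-second figure is treated as an upper bound on the extra delay before termination, as stated in the text.

```python
from dataclasses import dataclass

# "Up to 60 seconds" of additional delay after the timeout, per the text above.
TERMINATION_SLACK_SEC = 60


@dataclass(frozen=True)
class DrainTimeline:
    new_connections_stop_at: float      # no new connections from this point on
    existing_connections_end_at: float  # the draining timeout has elapsed
    latest_termination_at: float        # latest expected instance termination


def drain_timeline(removed_at: float, timeout_sec: int) -> DrainTimeline:
    """Estimate key moments for a VM removed at `removed_at` (in seconds)."""
    if not 0 <= timeout_sec <= 3600:
        raise ValueError("timeout must be from 0 to 3600 seconds inclusive")
    return DrainTimeline(
        new_connections_stop_at=removed_at,
        existing_connections_end_at=removed_at + timeout_sec,
        latest_termination_at=removed_at + timeout_sec + TERMINATION_SLACK_SEC,
    )
```

For example, with a timeout of 300 seconds and a removal at time 0, existing connections are given until 300 seconds, and the instance can be expected to terminate no later than about 360 seconds.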

If you enable connection draining on multiple backend services that share the same instance groups or NEGs, the largest timeout value is used. For example, suppose that the same instance group or zonal NEG is a backend for two backend services, where one backend service has a connection draining timeout of 100 seconds and the other has a connection draining timeout of 200 seconds. Google Cloud uses 200 seconds as the effective connection draining timeout, so existing connections are allowed to persist for 200 seconds before Google Cloud terminates them. If the backend is a managed instance group, operations that delete the instance are delayed by at least 200 seconds.
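The largest-value rule can be expressed in one line. The following sketch uses a hypothetical helper name, `effective_draining_timeout`, and takes the timeouts of every backend service that shares the instance group or NEG.

```python
from typing import Iterable


def effective_draining_timeout(timeouts_sec: Iterable[int]) -> int:
    """Return the effective timeout for a shared backend: the largest value.

    Hypothetical helper for illustration; `timeouts_sec` holds the connection
    draining timeout of each backend service that references the backend.
    """
    timeouts = list(timeouts_sec)
    if not timeouts:
        raise ValueError("at least one backend service timeout is required")
    return max(timeouts)
```

With the values from the example above, `effective_draining_timeout([100, 200])` returns 200.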

The following is a list of specifications about connection draining:

Connection draining is available for backend services that are part of the following load balancers:
Both internal passthrough Network Load Balancers and external passthrough Network Load Balancers support connection draining for TCP, UDP, and other non-TCP protocols.
Connection draining is also available for backend services that are part of Cloud Service Mesh deployments.
When a connection draining timeout is set, and an instance is removed from the instance group or an endpoint is removed from a zonal NEG, Google Cloud load balancers and Cloud Service Mesh behave in the following way:
- No new connections are sent to the removed instance or endpoint.
- Active sessions supporting existing connections to the removed instance or endpoint can persist until the configured connection draining timeout has elapsed. After the timeout period ends, Google Cloud ends existing connections on the removed instance or endpoint.
If you don't set a connection draining timeout, or if the connection draining timeout is set to zero (0), Google Cloud ends existing connections on the removed instance or endpoint as quickly as possible.
If you're using connection pooling, you might see that new requests, using a previously established connection, are still being received on VMs that are getting drained, causing connection errors when those VMs are eventually deleted.

To enable connection draining, complete the following steps.

Console

Update a load balancer

Go to the Load balancing page in the Google Cloud console.
Click Edit for your load balancer or create a new load balancer.
Click Backend configuration.
Click Advanced configurations at the bottom of your backend service.
In the Connection draining timeout field, enter a value from 0 to 3600. A setting of 0 disables connection draining.

Update Cloud Service Mesh

Go to the Cloud Service Mesh page in the Google Cloud console.
Click the Name of your service.
Click Advanced configurations at the bottom of your service.
In the Connection draining timeout field, enter a value from 0 to 3600. A setting of 0 disables connection draining.
Click Save.

gcloud

Enable connection draining on a new or existing backend service by using the --connection-draining-timeout flag. The following examples demonstrate how to change the connection draining timeout:

For an existing global or cross-region load balancer:

gcloud compute backend-services update BACKEND_SERVICE \
    --global \
    --connection-draining-timeout=CONNECTION_TIMEOUT_SECS

For an existing regional load balancer:

gcloud compute backend-services update BACKEND_SERVICE \
    --region=REGION \
    --connection-draining-timeout=CONNECTION_TIMEOUT_SECS

Replace the placeholders with valid values:

BACKEND_SERVICE: The backend service that you're updating.
REGION: If applicable, the region of the backend service that you're updating.
CONNECTION_TIMEOUT_SECS: The number of seconds to wait before existing connections to instances or endpoints are terminated, from 0 to 3600 seconds, inclusive. A setting of 0 disables connection draining. The connection draining timeout applies to all backends of the backend service.

You can also use the gcloud compute backend-services edit command to update an existing backend service.
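If you script these updates, the choice between `--global` and `--region` can be made in one place. The sketch below only assembles the argument list for the `gcloud compute backend-services update` command shown above and does not run it; the function name `build_update_command` and the example service name are hypothetical.

```python
from typing import List, Optional


def build_update_command(
    backend_service: str,
    timeout_sec: int,
    region: Optional[str] = None,
) -> List[str]:
    """Build the gcloud argument list for setting a connection draining timeout.

    Hypothetical helper for illustration. Pass `region` for a regional backend
    service; omit it for a global one.
    """
    if not 0 <= timeout_sec <= 3600:
        raise ValueError("timeout must be from 0 to 3600 seconds inclusive")
    scope_flag = f"--region={region}" if region else "--global"
    return [
        "gcloud", "compute", "backend-services", "update", backend_service,
        scope_flag,
        f"--connection-draining-timeout={timeout_sec}",
    ]
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the gcloud CLI is installed and authenticated. For example, `build_update_command("web-backend", 300, region="us-central1")` targets a regional backend service with the placeholder name `web-backend`.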

API

To enable connection draining in the API when creating or updating a backend service, make a request to the respective API URI and include the connectionDraining field in your request body. The following examples demonstrate how to set that field by editing an existing backend service. For information about other required attributes, see the documentation for each load balancer.

For an existing global or cross-region load balancer, send the same request body as in the regional example to the following URI:
```
PATCH https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/backendServices/BACKEND_SERVICE
```
Note: Classic Application Load Balancers and classic proxy Network Load Balancers always use global backend services, even when backends can be in only one region because the load balancer is set to use the Standard Network Tier.

For an existing regional load balancer:
```
PATCH https://www.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/backendServices/BACKEND_SERVICE

{
   "name": "BACKEND_SERVICE",
   "connectionDraining": {
     "drainingTimeoutSec": CONNECTION_TIMEOUT_SECS
   }
}
```
where:
- PROJECT_ID is the project ID that contains your load balancer or Cloud Service Mesh deployment.
- REGION is the region of the backend service, if the load balancer is regional.
- BACKEND_SERVICE is the backend service used by your load balancer or Cloud Service Mesh deployment.
- CONNECTION_TIMEOUT_SECS is the number of seconds to wait before existing connections to instances or endpoints are terminated, from 0 to 3600 seconds, inclusive. This timeout duration applies to all instance groups or NEGs that are referenced by the backend service.
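The request URI and body can also be assembled programmatically. The sketch below only builds the two strings and sends nothing; it assumes the Compute Engine REST path layout `global/backendServices/BACKEND_SERVICE` and `regions/REGION/backendServices/BACKEND_SERVICE`, and the function name `build_patch_request` is hypothetical. Authentication and the HTTP call itself are out of scope here.

```python
import json
from typing import Optional, Tuple

API_ROOT = "https://www.googleapis.com/compute/v1"


def build_patch_request(
    project_id: str,
    backend_service: str,
    timeout_sec: int,
    region: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (url, json_body) for a PATCH that sets connectionDraining.

    Hypothetical helper for illustration. Pass `region` for a regional backend
    service; omit it for a global one.
    """
    if not 0 <= timeout_sec <= 3600:
        raise ValueError("timeout must be from 0 to 3600 seconds inclusive")
    scope = f"regions/{region}" if region else "global"
    url = (
        f"{API_ROOT}/projects/{project_id}/{scope}"
        f"/backendServices/{backend_service}"
    )
    body = json.dumps(
        {
            "name": backend_service,
            "connectionDraining": {"drainingTimeoutSec": timeout_sec},
        }
    )
    return url, body
```

The returned pair can be handed to any HTTP client that attaches valid credentials, for example as the URL and body of an authorized PATCH request with a `Content-Type: application/json` header.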

What's next

For general information on backend services, see Backend services overview.