This page explains how to upgrade Google Distributed Cloud.
Overview of the upgrade process
You can upgrade directly to any version that is in the same minor release or the next minor release. For example, you can upgrade from 1.8.0 to 1.8.3, or from 1.8.1 to 1.9.0.
If you are upgrading to a version that is not part of the next minor release, you must upgrade through one version of each minor release between your current version and your desired version. For example, if you are upgrading from version 1.8.2 to version 1.10.0, it is not possible to upgrade directly. You must first upgrade from version 1.8.2 to version 1.9.x, and then upgrade to version 1.10.0.
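The version rule above can be sketched as a small helper. This is a minimal illustration, not part of any Google Distributed Cloud tooling: the function names are hypothetical, and it assumes plain MAJOR.MINOR.PATCH version strings without a build suffix.

```python
def parse_version(version):
    """Split a version like '1.8.2' into (major, minor, patch) integers."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def upgrade_path(current, target):
    """Return the hops needed to go from current to target.

    Intermediate hops are written as 'MAJOR.MINOR.x' because any patch
    version of an intermediate minor release satisfies the rule.
    """
    cur_major, cur_minor, _ = parse_version(current)
    tgt_major, tgt_minor, _ = parse_version(target)
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles one major version")
    if parse_version(target) <= parse_version(current):
        raise ValueError("target must be newer than current")
    # One hop per skipped minor release, then the final hop to the target.
    hops = [f"{cur_major}.{m}.x" for m in range(cur_minor + 1, tgt_minor)]
    hops.append(target)
    return hops


print(upgrade_path("1.8.0", "1.8.3"))   # ['1.8.3']
print(upgrade_path("1.8.1", "1.9.0"))   # ['1.9.0']
print(upgrade_path("1.8.2", "1.10.0"))  # ['1.9.x', '1.10.0']
```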
First upgrade the admin workstation, then the user clusters, and last the admin cluster. You do not have to upgrade the admin cluster immediately after upgrading the user clusters if you want to keep the admin cluster on its current version.
- Download the gkeadm tool. The version of gkeadm must be the same as the target version of your upgrade.
- Upgrade your admin workstation.
- From your admin workstation, upgrade your user clusters.
- From your admin workstation, upgrade your admin cluster.
Example of a recommended upgrade process from version 1.9.x to version 1.10.x
Suppose your admin workstation, admin cluster, and user clusters currently use version 1.9.x, and you want to upgrade both your admin cluster and your user clusters to version 1.10.x. Following an upgrade path like the one below, and testing in a canary cluster before you proceed further, minimizes the risk of disruption.
The following is a high-level overview of a recommended upgrade process. Before you begin, create a canary user cluster that uses version 1.9.x, if you have not done so already.
- Test version 1.10.x in a canary cluster.
- Upgrade the admin workstation to version 1.10.x.
- Run the gkectl prepare command, as described subsequently, to set up the upgrade.
- Upgrade the canary user cluster to version 1.10.x.
- Upgrade all production user clusters to version 1.10.x when you are confident with version 1.10.x.
- Upgrade the admin cluster to version 1.10.x.
Locating your configuration and information files to prepare for upgrade
Before you created your admin workstation, you filled in an admin workstation configuration file that was generated by gkeadm create config. The default name for this file is admin-ws-config.yaml.
In addition, your workstation has an information file. The default name of this file is the same as the name of your current admin workstation.
Locate your admin workstation configuration file and your information file. You
need them to do the steps in this guide. If these files are in your current
directory and they have their default names, then you won't need to specify
them when you run the upgrade commands. If these files are in
another directory, or if you have changed the filenames, then you specify them
by using the --config and --info-file flags.
If your output information file is missing, you can re-create it.
Re-create an information file if missing
If the output information file for your admin workstation is missing, you must re-create this file so you can then proceed with the upgrade. This file was created when you initially created your workstation, and if you have since done an upgrade, it was updated with new information.
The output information file has this format:
Admin workstation version: GKEADM_VERSION
Created using gkeadm version: GKEADM_VERSION
VM name: ADMIN_WS_NAME
IP: ADMIN_WS_IP
SSH key used: FULL_PATH_TO_ADMIN_WS_SSH_KEY
To access your admin workstation:
ssh -i FULL_PATH_TO_ADMIN_WS_SSH_KEY ubuntu@ADMIN_WS_IP
Here is a sample output information file:
Admin workstation version: v1.10.3-gke.49
Created using gkeadm version: v1.10.3-gke.49
VM name: admin-ws-janedoe
IP: 172.16.91.21
SSH key used: /usr/local/google/home/janedoe/.ssh/gke-admin-workstation
Upgraded from (rollback version): v1.10.0-gke.194
To access your admin workstation:
ssh -i /usr/local/google/home/janedoe/.ssh/gke-admin-workstation ubuntu@172.16.91.21
Create the file in an editor, substituting the appropriate parameters. Save the file with a filename that is the same as the VM name in the directory from which gkeadm is run. For example, if the VM name is admin-ws-janedoe, save the file as admin-ws-janedoe.
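If you prefer to generate the file rather than type it, the following is a minimal sketch that fills in the format shown above. The function name is hypothetical, the one-field-per-line layout is an assumption based on that format, and the optional "Upgraded from (rollback version)" line is left out.

```python
def render_info_file(version, vm_name, ip, ssh_key_path):
    """Return the text of an admin workstation information file.

    Assumes the workstation version and the gkeadm version that created
    it are the same, as in the format shown above.
    """
    lines = [
        f"Admin workstation version: {version}",
        f"Created using gkeadm version: {version}",
        f"VM name: {vm_name}",
        f"IP: {ip}",
        f"SSH key used: {ssh_key_path}",
        "To access your admin workstation:",
        f"ssh -i {ssh_key_path} ubuntu@{ip}",
    ]
    return "\n".join(lines) + "\n"


text = render_info_file(
    version="v1.10.3-gke.49",
    vm_name="admin-ws-janedoe",
    ip="172.16.91.21",
    ssh_key_path="/usr/local/google/home/janedoe/.ssh/gke-admin-workstation",
)
print(text)
```

To use the result, write the returned text to a file named after the VM in the directory from which gkeadm is run.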
Upgrading your admin workstation
Make sure your gkectl and clusters are at the appropriate version level for an upgrade, and that you have
downloaded the appropriate bundle.
Run this command:
gkeadm upgrade admin-workstation --config [AW_CONFIG_FILE] --info-file [INFO_FILE]
where:
- [AW_CONFIG_FILE] is the path of your admin workstation configuration file. You can omit this flag if the file is in your current directory and has the name admin-ws-config.yaml.
- [INFO_FILE] is the path of your information file. You can omit this flag if the file is in your current directory. The default name of this file is the same as the name of your admin workstation.
The preceding command performs the following tasks:
Backs up all files in the home directory of your current admin workstation. These include:
- Your admin cluster configuration file. The default name is admin-cluster.yaml.
- Your user cluster configuration file. The default name is user-cluster.yaml.
- The kubeconfig files for your admin cluster and your user clusters.
- The root certificate for your vCenter server. Note that this file must have owner read and owner write permission.
- The JSON key file for your component access service account. Note that this file must have owner read and owner write permission.
- The JSON key files for your connect-register and logging-monitoring service accounts.
Creates a new admin workstation, and copies all the backed-up files to the new admin workstation.
Deletes the old admin workstation.
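The rules for when the --config and --info-file flags can be omitted from gkeadm upgrade admin-workstation can be expressed as a short helper that assembles the argument list. This is only a sketch: the helper names and the example paths are hypothetical, and it assumes that "default" means the default filename located in the current directory.

```python
import os

DEFAULT_CONFIG_NAME = "admin-ws-config.yaml"


def _resolve(cwd, path):
    """Resolve path against cwd without touching the file system."""
    return os.path.normpath(os.path.join(cwd, path))


def upgrade_command(config_path, info_path, workstation_name, cwd):
    """Build the argument list for gkeadm upgrade admin-workstation."""
    args = ["gkeadm", "upgrade", "admin-workstation"]
    # --config is needed unless the file has its default name in cwd.
    if _resolve(cwd, config_path) != _resolve(cwd, DEFAULT_CONFIG_NAME):
        args += ["--config", config_path]
    # The information file's default name is the workstation name.
    if _resolve(cwd, info_path) != _resolve(cwd, workstation_name):
        args += ["--info-file", info_path]
    return args


print(upgrade_command("admin-ws-config.yaml", "admin-ws-janedoe",
                      "admin-ws-janedoe", "/home/janedoe"))
print(upgrade_command("/etc/gke/ws.yaml", "admin-ws-janedoe",
                      "admin-ws-janedoe", "/home/janedoe"))
```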
Verify that enough IP addresses are available
Do the steps in this section on your new admin workstation.
Before you upgrade, be sure that you have enough IP addresses available for your clusters. You can set aside additional IPs as needed, as described for each of DHCP and static IPs.
DHCP
When you upgrade the admin cluster, Google Distributed Cloud creates one temporary node in the admin cluster. When you upgrade a user cluster, Google Distributed Cloud creates a temporary node in that user cluster. The purpose of the temporary node is to ensure uninterrupted availability. Before you upgrade a cluster, make sure that your DHCP server can provide enough IP addresses for the temporary node. For more information, see IP addresses needed for admin and user clusters.
Static IPs
When you upgrade the admin cluster, Google Distributed Cloud creates one temporary node in the admin cluster. When you upgrade a user cluster, Google Distributed Cloud creates a temporary node in that user cluster. The purpose of the temporary node is to ensure uninterrupted availability. Before you upgrade a cluster, verify that you have reserved enough IP addresses. For each cluster, you must reserve at least one more IP address than the number of cluster nodes. For more information, see Configuring static IP addresses.
Determine the number of nodes in your admin cluster:
kubectl --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] get nodes
where [ADMIN_CLUSTER_KUBECONFIG] is the path of your admin cluster's kubeconfig file.
Next, view the addresses reserved for your admin cluster:
kubectl get cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] -o yaml
In the output, in the reservedAddresses field, you can see the number of
IP addresses that are reserved for the admin cluster nodes. For example, the
following output shows that there are five IP addresses reserved for the
admin cluster nodes:
...
reservedAddresses:
- gateway: 21.0.135.254
  hostname: admin-node-1
  ip: 21.0.133.41
  netmask: 21
- gateway: 21.0.135.254
  hostname: admin-node-2
  ip: 21.0.133.50
  netmask: 21
- gateway: 21.0.135.254
  hostname: admin-node-3
  ip: 21.0.133.56
  netmask: 21
- gateway: 21.0.135.254
  hostname: admin-node-4
  ip: 21.0.133.47
  netmask: 21
- gateway: 21.0.135.254
  hostname: admin-node-5
  ip: 21.0.133.44
  netmask: 21
The number of reserved IP addresses should be at least one more than the number of nodes in the admin cluster.
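The "one more than the number of nodes" rule is simple arithmetic, sketched below. The function name is hypothetical; the input is assumed to be the list under reservedAddresses after you have loaded the Cluster object's YAML, and the node count is whatever kubectl get nodes reported.

```python
def additional_ips_needed(node_count, reserved_addresses):
    """Return how many more addresses must be reserved before upgrading.

    An upgrade needs one address for the temporary node, so the cluster
    needs at least node_count + 1 reserved addresses.
    """
    return max(0, node_count + 1 - len(reserved_addresses))


# The five reserved addresses from the example output above.
reserved = [
    {"hostname": "admin-node-1", "ip": "21.0.133.41"},
    {"hostname": "admin-node-2", "ip": "21.0.133.50"},
    {"hostname": "admin-node-3", "ip": "21.0.133.56"},
    {"hostname": "admin-node-4", "ip": "21.0.133.47"},
    {"hostname": "admin-node-5", "ip": "21.0.133.44"},
]

print(additional_ips_needed(4, reserved))  # 0
print(additional_ips_needed(5, reserved))  # 1
```

With four nodes, the five reserved addresses are enough. With five nodes, one more address has to be added before the upgrade.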
For version 1.7 and later, to add IP addresses to the admin cluster:
First, edit the IP block file, as shown in this example.
blocks:
- netmask: "255.255.252.0"
  ips:
  - ip: 172.16.20.10
    hostname: admin-host1
  - ip: 172.16.20.11
    hostname: admin-host2
  # Newly-added IPs.
  - ip: 172.16.20.12
    hostname: admin-host3
Next, run this command to update the configuration.
gkectl update admin --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] --config [ADMIN_CONFIG_FILE]
[ADMIN_CLUSTER_KUBECONFIG] is the path of your kubeconfig file.
[ADMIN_CONFIG_FILE] is the path of your admin config file. You can omit this flag if the file is in your current directory and has the name admin-config.yaml.
You can only add IP addresses; you cannot remove them.
For versions prior to 1.7, you can add an additional address by editing the Cluster object directly.
Open the Cluster object for editing:
kubectl edit cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG]
Under reservedAddresses, add an additional block that has gateway, hostname, ip, and netmask.
Important: Starting from 1.5.0, the same procedure does not work for user clusters, and
you must use gkectl update cluster for each of them.
To determine the number of nodes in a user cluster:
kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get nodes
where [USER_CLUSTER_KUBECONFIG] is the path of your user cluster's kubeconfig file.
To view the addresses reserved for a user cluster:
kubectl get cluster --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] \
    -n [USER_CLUSTER_NAME] [USER_CLUSTER_NAME] -o yaml
where:
[ADMIN_CLUSTER_KUBECONFIG] is the path of your admin cluster's kubeconfig file.
[USER_CLUSTER_NAME] is the name of the user cluster.
The number of reserved IP addresses should be at least one more than the number of nodes in the user cluster. If this is not the case, you can open the user cluster's IP block file for editing:
- If any of the addresses reserved for a user cluster are included in the IP block file, add them to the corresponding block based on netmask and gateway.
- Add as many additional static IP addresses to the corresponding block as required, and then run gkectl update cluster.
Install bundle for upgrade
To make a version available for cluster creation or upgrade, you must install the corresponding bundle. Follow these steps to install a bundle for TARGET_VERSION, which is the number of the version to which you want to upgrade.
To check the current gkectl and cluster versions, run this command. Use the flag --details/-d for more detailed information.
gkectl version --kubeconfig ADMIN_CLUSTER_KUBECONFIG --details
Here is example output:
gkectl version: 1.7.2-gke.2 (git-5b8ef94a3)
onprem user cluster controller version: 1.6.2-gke.0
current admin cluster version: 1.6.2-gke.0
current user cluster versions (VERSION: CLUSTER_NAMES):
- 1.6.2-gke.0: user-cluster1
available admin cluster versions:
- 1.6.2-gke.0
available user cluster versions:
- 1.6.2-gke.0
- 1.7.2-gke.2
Info: The admin workstation and gkectl is NOT ready to upgrade to "1.8" yet, because there are "1.6" clusters.
Info: The admin cluster can't be upgraded to "1.7", because there are still "1.6" user clusters.
Based on the output you get, look for the following issues, and fix them as needed.
- If the gkectl version is lower than 1.7, the new upgrade flow is not available directly. Follow the original upgrade flow to upgrade all your clusters to 1.6, and then upgrade your admin workstation to 1.7 to start using the new upgrade flow.
- If the current admin cluster version is more than one minor version lower than the TARGET_VERSION, upgrade all your clusters to be one minor version lower than the TARGET_VERSION.
- If the gkectl version is lower than the TARGET_VERSION, upgrade the admin workstation to the TARGET_VERSION, following the instructions.
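The three checks above can be sketched as a function over the versions reported by gkectl version. This is an illustrative helper, not something gkectl provides: the names are hypothetical, and as a simplification the "-gke.N" build suffix is ignored when comparing versions.

```python
def parse_version(version):
    """Parse '1.7.2-gke.2' into (1, 7, 2); the '-gke.N' suffix is ignored."""
    core = version.lstrip("v").split("-")[0]
    parts = [int(p) for p in core.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def preupgrade_issues(gkectl_version, admin_cluster_version, target_version):
    """Return a list of issues to fix before installing the target bundle."""
    gkectl = parse_version(gkectl_version)
    admin = parse_version(admin_cluster_version)
    target = parse_version(target_version)
    issues = []
    if gkectl[:2] < (1, 7):
        issues.append("gkectl is older than 1.7: use the original upgrade flow first")
    # "More than one minor version lower" means admin minor < target minor - 1.
    if (target[0], target[1] - 1) > admin[:2]:
        issues.append("admin cluster is more than one minor version behind the target")
    if gkectl < target:
        issues.append("gkectl is older than the target: upgrade the admin workstation")
    return issues


# Versions taken from the example output above; 1.8.0 is a hypothetical target.
print(preupgrade_issues("1.7.2-gke.2", "1.6.2-gke.0", "1.7.2-gke.2"))  # []
print(len(preupgrade_issues("1.7.2-gke.2", "1.6.2-gke.0", "1.8.0")))   # 2
```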
When you have determined that your gkectl and cluster versions are appropriate for an upgrade, download the bundle.
Check whether the bundle tarball already exists on the admin workstation.
stat /var/lib/gke/bundles/gke-onprem-vsphere-TARGET_VERSION.tgz
If the bundle is not on the admin workstation, download it.
gcloud storage cp gs://gke-on-prem-release/gke-onprem-bundle/TARGET_VERSION/gke-onprem-vsphere-TARGET_VERSION.tgz /var/lib/gke/bundles/
If you have run the pre-upgrade tool, skip to the next step. Otherwise, run gkectl prepare to import
OS images to vSphere:
gkectl prepare --bundle-path /var/lib/gke/bundles/gke-onprem-vsphere-TARGET_VERSION.tgz --kubeconfig ADMIN_CLUSTER_KUBECONFIG
where:
- ADMIN_CLUSTER_KUBECONFIG is the path of your kubeconfig file. You can omit
this flag if the file is in your current directory and has the name kubeconfig.
List available cluster versions, and make sure the target version is included in the available user cluster versions.
gkectl version --kubeconfig ADMIN_CLUSTER_KUBECONFIG --details
You can now create a user cluster at the target version, or upgrade a user cluster to the target version.
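The bundle locations used in the stat and gcloud storage cp commands above follow a fixed pattern derived from TARGET_VERSION. The sketch below builds both strings and checks for the local file; the helper names are hypothetical, and the version value is only an illustration.

```python
import os

BUNDLE_DIR = "/var/lib/gke/bundles"
RELEASE_BUCKET = "gs://gke-on-prem-release/gke-onprem-bundle"


def bundle_filename(target_version):
    """Return the bundle tarball name for a target version."""
    return f"gke-onprem-vsphere-{target_version}.tgz"


def local_bundle_path(target_version):
    """Return where the bundle is expected on the admin workstation."""
    return f"{BUNDLE_DIR}/{bundle_filename(target_version)}"


def remote_bundle_url(target_version):
    """Return the Cloud Storage URL to download the bundle from."""
    return f"{RELEASE_BUCKET}/{target_version}/{bundle_filename(target_version)}"


def bundle_is_present(target_version):
    """Equivalent in spirit to the stat check: is the tarball already there?"""
    return os.path.isfile(local_bundle_path(target_version))


version = "1.10.3-gke.49"  # illustrative value
print(local_bundle_path(version))
print(remote_bundle_url(version))
print(bundle_is_present(version))
```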
Roll back an admin workstation after an upgrade
You can roll back the admin workstation to the version used before the upgrade.
During the upgrade, gkeadm records the version before it was upgraded in the output information file. During the rollback, gkeadm uses the version listed to download the older file.
To roll back your admin workstation to the previous version:
gkeadm rollback admin-workstation --config=AW_CONFIG_FILE
You can omit --config=AW_CONFIG_FILE if your admin workstation configuration file is the default admin-ws-config.yaml. Otherwise, replace AW_CONFIG_FILE with the path to the admin workstation configuration file.
The rollback command performs these steps:
- Downloads the rollback version of gkeadm.
- Backs up the home directory of the current admin workstation.
- Creates a new admin workstation using the rollback version of gkeadm.
- Deletes the original admin workstation.
Upgrading a user cluster
Take note of the following before you proceed with the upgrade:
- If a user cluster is unregistered, you must register that user cluster before you can upgrade it to version 1.10.
- The gkectl upgrade command runs preflight checks. If the preflight checks fail, the command is blocked. You must fix the failures or use the flag --skip-preflight-check-blocking with the command to unblock it.
- As of version 1.10, Google Distributed Cloud includes the konnectivityServerNodePort for the manual load balancer. Before upgrading, specify an appropriate value for this node port, configure the load balancer to use it, and add it to the configuration file. See Manual load balancing.
Proceed with these steps on your admin workstation:
- Make sure the bundlepath field in the admin cluster configuration file matches the path of the bundle to which you want to upgrade.
- Make sure the gkeOnPremVersion field in the user cluster configuration file matches the version to which you want to upgrade.
- If you make any other changes to the fields in the admin cluster configuration file or the user cluster configuration file, these changes are ignored during the upgrade. To make those changes take effect, first upgrade the cluster, and then run an update command with the changed configuration file.
Upgrade with the following command.
gkectl upgrade cluster \
    --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] \
    --config [USER_CLUSTER_CONFIG_FILE] \
    [FLAGS]
where:
[ADMIN_CLUSTER_KUBECONFIG] is the admin cluster's kubeconfig file.
[USER_CLUSTER_CONFIG_FILE] is the Google Distributed Cloud user cluster configuration file on your new admin workstation.
[FLAGS] is an optional set of flags. For example, you could include the --skip-validation-infra flag to skip checking of your vSphere infrastructure.
Resuming an upgrade
If a user cluster upgrade is interrupted, you can resume the user cluster upgrade by running the
same upgrade command with the --skip-validation-all flag:
gkectl upgrade cluster \
    --kubeconfig [ADMIN_CLUSTER_KUBECONFIG] \
    --config [USER_CLUSTER_CONFIG_FILE] \
    --skip-validation-all
Upgrading the admin cluster
Do the steps in this section on your new admin workstation. Make sure your gkectl and clusters are at the appropriate version level for an upgrade, and that you have downloaded the appropriate bundle.
- Make sure the bundlepath field in the admin cluster configuration file matches the path of the bundle to which you want to upgrade.
- If you make any other changes to the fields in the admin cluster configuration file, these changes are ignored during the upgrade. To make those changes take effect, first upgrade the cluster, and then run an update command with the changed configuration file.
The gkectl version must be equal to or greater than the target upgrade version. Thus, if your gkectl version is 1.10.0, it can upgrade a cluster to either 1.10.0 or 1.9.x. The admin cluster can be upgraded to a minor version only when all user clusters have been upgraded to that minor version.
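As a rough illustration of these two conditions, the sketch below lists what would block an admin cluster upgrade. The function names and the cluster names are hypothetical, and versions are compared on MAJOR.MINOR.PATCH only, ignoring any "-gke.N" suffix.

```python
def parse_version(version):
    """Parse '1.10.0' or '1.10.0-gke.194' into (major, minor, patch)."""
    core = version.split("-")[0]
    major, minor, patch = (int(p) for p in core.split("."))
    return major, minor, patch


def admin_upgrade_blockers(gkectl_version, target_version, user_cluster_versions):
    """Return reasons why the admin cluster cannot yet be upgraded."""
    blockers = []
    target = parse_version(target_version)
    if parse_version(gkectl_version) < target:
        blockers.append("gkectl is older than the target version")
    # Every user cluster must already be at the target minor version.
    for name, version in sorted(user_cluster_versions.items()):
        if parse_version(version)[:2] < target[:2]:
            blockers.append(f"user cluster {name} is still at {version}")
    return blockers


clusters = {"user-cluster1": "1.10.0", "user-cluster2": "1.9.3"}
print(admin_upgrade_blockers("1.10.0", "1.10.0", clusters))
# ['user cluster user-cluster2 is still at 1.9.3']
print(admin_upgrade_blockers("1.10.0", "1.9.3", {"user-cluster1": "1.9.3"}))
# []
```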
Run the following command:
gkectl upgrade admin \
    --kubeconfig ADMIN_CLUSTER_KUBECONFIG \
    --config ADMIN_CLUSTER_CONFIG_FILE \
    FLAGS
Replace the following:
ADMIN_CLUSTER_KUBECONFIG is the admin cluster's kubeconfig file.
ADMIN_CLUSTER_CONFIG_FILE is the Google Distributed Cloud admin cluster configuration file on your new admin workstation.
FLAGS is an optional set of flags. For example, you could include the --skip-validation-infra flag to skip checking of your vSphere infrastructure.
If you downloaded a full bundle, and you have successfully run the gkectl prepare and gkectl upgrade admin commands, you should now delete the full bundle to save disk space on the admin workstation. Use this command:
rm /var/lib/gke/bundles/gke-onprem-vsphere-${TARGET_VERSION}-full.tgz
Resuming an admin cluster upgrade
If an admin cluster upgrade is interrupted or fails, the upgrade can be resumed if the admin cluster checkpoint contains the state required to restore the state prior to the interruption.
Follow these steps:
- Check if the admin control plane is healthy before you begin the initial upgrade attempt.
- If the admin control plane is unhealthy prior to the initial upgrade attempt, repair the admin control plane with the gkectl repair admin-master command.
- When you rerun the upgrade command after an upgrade has been interrupted or has failed, use the same bundle and target version as you did in the previous upgrade attempt.
When you rerun the upgrade command, the resumed upgrade recreates the state in the kind cluster from the checkpoint and reruns the entire upgrade. If the admin control plane is unhealthy, it will first be restored before proceeding to upgrade again.
The upgrade will resume from the point where it failed or exited if the admin cluster checkpoint is available. If the checkpoint is unavailable, the upgrade will fall back to relying on the admin control plane, and therefore the admin control plane must be healthy in order to proceed with the upgrade. After a successful upgrade, the checkpoint is regenerated.
If gkectl exits unexpectedly during an admin cluster upgrade, the kind cluster is not cleaned up. Before you rerun the upgrade command to resume the upgrade, delete the kind cluster:
docker stop gkectl-control-plane && docker rm gkectl-control-plane
After deleting the kind cluster, rerun the upgrade command.
Troubleshooting the upgrade process
If you experience an issue when following the recommended upgrade process, follow these recommendations to resolve it. These suggestions assume that you have begun with a version 1.6.2 setup, and are proceeding through the recommended upgrade process.
Troubleshooting a user cluster upgrade issue
Suppose you find an issue with 1.7 when testing the canary cluster, or upgrading a user cluster. You determine from Google Support that the issue will be fixed in an upcoming patch release 1.7.x. You can proceed as follows:
- Continue using 1.6.2 for production.
- Test the 1.7.x patch release in a canary cluster when it is released.
- Upgrade all production user clusters to 1.7.x when you are confident with it.
- Upgrade the admin cluster to 1.7.x.
Managing a 1.6.x patch release when testing 1.7
Suppose you are in the process of testing or migrating to 1.7, but not confident with it yet, and your admin cluster still uses 1.6.2. You find that a significant 1.6.x patch release has been released. You can still take advantage of this 1.6.x patch release while continuing to test 1.7. Follow this upgrade process:
- Install the 1.6.x-gke.0 bundle.
- Upgrade all 1.6.2 production user clusters to 1.6.x.
- Upgrade the admin cluster to 1.6.x.
Troubleshooting an admin cluster upgrade issue
If you encounter an issue when upgrading the admin cluster, you must contact Google Support to resolve the issue with the admin cluster.
In the meantime, with the new upgrade flow, you can still benefit from new user cluster features without being blocked by the admin cluster upgrade, which allows you to reduce the upgrade frequency of the admin cluster if you want. For example, you might want to use the Container-Optimized OS nodepool released in version 1.7. Your upgrade process can proceed as follows:
- Upgrade production user clusters to 1.7.
- Keep the admin cluster at 1.6 and continue receiving security patches.
- Test the admin cluster upgrade from 1.6 to 1.7 in a test environment, and report any issues.
- If your issue is solved by a 1.7 patch release, you can then choose to upgrade the production admin cluster from 1.6 to this 1.7 patch release if desired.
Known issues
The following known issues affect upgrading clusters.
Upgrading the admin workstation might fail if the data disk is nearly full
If you upgrade the admin workstation with the gkeadm upgrade admin-workstation command, the upgrade might fail if the data disk is nearly full, because the system attempts to back up the current admin workstation locally while upgrading to a new admin workstation. If you cannot clear sufficient space on the data disk, use the gkeadm upgrade admin-workstation command with the additional flag --backup-to-local=false to prevent making a local backup of the current admin workstation.
Version 1.7.0: Changes to Config Management updates
In versions earlier than 1.7.0, Google Distributed Cloud included the images required to install and upgrade Config Management. Beginning with 1.7.0, the Config Management software is no longer included in the Google Distributed Cloud bundle, and you need to add it separately. If you were previously using Config Management on your cluster or clusters, the software is not upgraded until you take action.
To learn more about installing Config Management, see Installing Config Management.
Version 1.1.0-gke.6, 1.2.0-gke.6: stackdriver.proxyconfigsecretname field removed
The stackdriver.proxyconfigsecretname field was removed in version
1.1.0-gke.6. Google Distributed Cloud's preflight checks will return an error if
the field is present in your configuration file.
To work around this, before you upgrade to 1.2.0-gke.6, delete the
proxyconfigsecretname field from your configuration file.
Stackdriver references old version
Before version 1.2.0-gke.6, a known issue prevents Stackdriver from updating its configuration after cluster upgrades. Stackdriver still references an old version, which prevents Stackdriver from receiving the latest features of its telemetry pipeline. This issue can make it difficult for Google Support to troubleshoot clusters.
After you upgrade clusters to 1.2.0-gke.6, run the following command against admin and user clusters:
kubectl --kubeconfig=[KUBECONFIG] \
    -n kube-system --type=json patch stackdrivers stackdriver \
    -p '[{"op":"remove","path":"/spec/version"}]'
where [KUBECONFIG] is the path to the cluster's kubeconfig file.
Disruption for workloads with PodDisruptionBudgets
Currently, upgrading clusters can cause disruption or downtime for workloads that use PodDisruptionBudgets (PDBs).
Version 1.2.0-gke.6: Prometheus and Grafana disabled after upgrading
In user clusters, Prometheus and Grafana get automatically disabled during upgrade. However, the configuration and metrics data are not lost. In admin clusters, Prometheus and Grafana stay enabled.
For instructions, refer to the Google Distributed Cloud release notes.
Version 1.1.2-gke.0: Deleted user cluster nodes aren't removed from vSAN datastore
For instructions, refer to the Google Distributed Cloud release notes.
Version 1.1.1-gke.2: Data disk in vSAN datastore folder can be deleted
If you're using a vSAN datastore, you need to create a folder in which to save
the VMDK. A known issue requires that you provide the folder's universally
unique identifier (UUID) path, rather than its file path, to vcenter.datadisk.
This mismatch can cause upgrades to fail.
For instructions, refer to the Google Distributed Cloud release notes.
Upgrading to version 1.1.0-gke.6 from version 1.0.2-gke.3: OIDC issue
Version 1.0.11, 1.0.1-gke.5, and 1.0.2-gke.3 clusters that have OpenID Connect (OIDC) configured cannot be upgraded to version 1.1.0-gke.6. This issue is fixed in version 1.1.1-gke.2.
If you configured a version 1.0.11, 1.0.1-gke.5, or 1.0.2-gke.3 cluster with OIDC during installation, you are not able to upgrade it. Instead, you should create new clusters.
Upgrading to version 1.0.2-gke.3 from version 1.0.11
Version 1.0.2-gke.3 introduces the following OIDC fields (usercluster.oidc).
These fields enable logging in to a cluster from Google Cloud console:
usercluster.oidc.kubectlredirecturl
usercluster.oidc.clientsecret
usercluster.oidc.usehttpproxy
If you want to use OIDC, the clientsecret field is required even if you don't
want to log in to a cluster from Google Cloud console. To use OIDC, you might
need to provide a placeholder value for clientsecret:
oidc:
  clientsecret: "secret"
Nodes fail to complete their upgrade process
If you have PodDisruptionBudget objects configured that are unable to
allow any additional disruptions, node upgrades might fail to upgrade to the
control plane version after repeated attempts. To prevent this failure, we
recommend that you scale up the Deployment or HorizontalPodAutoscaler to
allow the node to drain while still respecting the PodDisruptionBudget
configuration.
To see all PodDisruptionBudget objects that do not allow any disruptions:
kubectl get poddisruptionbudget --all-namespaces -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.name}/{.metadata.namespace}{"\n"}{end}'
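If you would rather post-process the objects in a script, the same filter can be applied to the JSON form of the list (for example, the output of kubectl get poddisruptionbudget --all-namespaces -o json). The following is a minimal sketch: the function name is hypothetical, and the sample object names are made up for illustration.

```python
import json


def blocking_pdbs(pdb_list):
    """Return 'name/namespace' for every PDB that allows no disruptions."""
    return [
        f"{item['metadata']['name']}/{item['metadata']['namespace']}"
        for item in pdb_list.get("items", [])
        if item.get("status", {}).get("disruptionsAllowed") == 0
    ]


# Hypothetical sample in the shape of a Kubernetes list response.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "web-pdb", "namespace": "shop"},
     "status": {"disruptionsAllowed": 0}},
    {"metadata": {"name": "api-pdb", "namespace": "shop"},
     "status": {"disruptionsAllowed": 1}}
  ]
}
""")

print(blocking_pdbs(sample))  # ['web-pdb/shop']
```

The output uses the same name/namespace order as the jsonpath expression above.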
Appendix
About VMware DRS rules enabled in version 1.1.0-gke.6
As of version 1.1.0-gke.6, Google Distributed Cloud automatically creates VMware Distributed Resource Scheduler (DRS) anti-affinity rules for your user cluster's nodes, causing them to be spread across at least three physical hosts in your datacenter. As of version 1.1.0-gke.6, this feature is automatically enabled for new clusters and existing clusters.
Before you upgrade, be sure that your vSphere environment meets the following conditions:
- VMware DRS is enabled. VMware DRS requires vSphere Enterprise Plus license edition. To learn how to enable DRS, see Enabling VMware DRS in a cluster.
- The vSphere username provided in your credentials configuration file has the Host.Inventory.EditCluster permission.
- There are at least three physical hosts available.
If your vSphere environment does not meet the preceding conditions, you can still upgrade, but for upgrading a user cluster from 1.3.x to 1.4.x, you need to disable anti-affinity groups. For more information, see this known issue in the Google Distributed Cloud release notes.
Downtime
About downtime during upgrades
Resource | Description |
---|---|
Admin cluster | When an admin cluster is down, user cluster control planes and workloads on user clusters continue to run, unless they were affected by a failure that caused the downtime. |
User cluster control plane | Typically, you should expect no noticeable downtime to user cluster control planes. However, long-running connections to the Kubernetes API server might break and would need to be re-established. In those cases, the API caller should retry until it establishes a connection. In the worst case, there can be up to one minute of downtime during an upgrade. |
User cluster nodes | If an upgrade requires a change to user cluster nodes, Google Distributed Cloud recreates the nodes in a rolling fashion, and reschedules Pods running on these nodes. You can prevent impact to your workloads by configuring appropriate PodDisruptionBudgets and anti-affinity rules. |
Known issues
See Known issues.
Troubleshooting
See Troubleshooting cluster creation and upgrade