This document shows how to create an admin cluster for Google Distributed Cloud. The admin cluster manages user clusters that run your workloads.
This page is for Admins, Architects, and Operators who set up, monitor, and manage the tech infrastructure. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE Enterprise user roles and tasks.
For more details about the admin cluster, see the installation overview.
Procedure overview
These are the primary steps involved in creating an admin cluster:
- Prepare an admin workstation.
- This machine has the necessary tools to create new clusters.
- Fill in your configuration files.
- Specify the details for your new admin cluster by completing and validating an admin cluster configuration file, a credentials configuration file, and possibly an IP block file.
- Import OS images to vSphere, and push container images to the private registry if applicable.
- Run
gkectl prepare
.
- Create an admin cluster.
- Use
gkectl
to create a new admin cluster as specified in your completed configuration files. When Google Distributed Cloud creates an admin cluster, it deploys a Kubernetes in Docker (kind) cluster to temporarily host the Kubernetes controllers needed to create the admin cluster. This transient cluster is called a bootstrap cluster. User clusters are created and upgraded by their managing admin without the use of a bootstrap cluster.
- Verify that your admin cluster is running.
- Use
kubectl
to view your cluster nodes.
At the end of this procedure, you will have a running admin cluster that you can use to create and manage user clusters.
If you use VPC Service Controls, you might see errors when you run some
gkectl
commands, such as "Validation Category: GCP - [UNKNOWN] GCP
service: [Stackdriver] could not get GCP services"
. To avoid these errors, add
the --skip-validation-gcp
parameter to your commands.
Before you begin
Review the IP addresses planning document. Ensure that you have enough IP addresses available for the three control-plane nodes and a control-plane VIP. If you plan to create any kubeception user clusters, then you must have enough IP addresses available for the control-plane nodes of those user clusters.
Review the load balancing overview and revisit your decision about the kind of load balancer you want to use. For certain load balancers, you must set up the load balancer before you create your admin cluster.
Look ahead at the
privateRegistry
section, and decide whether you want to use a public or private registry for Google Distributed Cloud components.Look ahead at the osImageType field, and decide what type of operating system you want to run on your admin cluster nodes.
If your organization requires outbound traffic to pass through a proxy server, make sure to allowlist required APIs and the Container Registry address.
In version 1.29 and higher, server-side preflight checks are enabled by default. Server-side preflight checks require additional firewall rules. In Firewall rules for admin clusters, search for "Preflight checks" and make sure all required firewall rules are configured. Server-side preflight checks are run on the bootstrap cluster instead of locally on the admin workstation.
Prepare your admin workstation
Make sure you have set up and can sign in to your admin workstation as described in Create an admin workstation. The admin workstation has the tools you need to create your admin cluster.
Do all the remaining steps in this document on your admin workstation.
Fill in your configuration file
If you used gkeadm
to create your admin workstation, it generated a configuration
file named admin-cluster.yaml
.
If you didn't use gkeadm
to create your admin workstation, then generate
admin-cluster.yaml
by running this command on your admin workstation:
gkectl create-config admin
This configuration file is for creating your admin cluster.
Familiarize yourself with the configuration file by scanning the admin cluster configuration file document. You might want to keep this document open in a separate tab or window, because you will refer to it as you complete the following steps.
name
If you want to specify a name for your admin cluster, fill in the
name
field.
bundlePath
The bundle is a zipped file that contains cluster components. It is included with the admin workstation. This field is already filled in for you.
vCenter
The fields in this section are already filled in with values that you entered when you created your admin workstation.
network
Fill in the
network.controlPlaneIPBlock
section and the
network.hostConfig
section. Also set
adminMaster.replicas
to 3
.
The network.podCIDR and network.serviceCIDR fields have prepopulated values that you can leave unchanged unless they conflict with addresses already being used in your network. Kubernetes uses these ranges to assign IP addresses to Pods and Services in your cluster.
Fill in the rest of the fields in the network section of the configuration file as needed.
loadBalancer
Set aside a VIP for the Kubernetes API server of your admin cluster. Provide
your VIP as the value for
loadBalancer.vips.controlPlaneVIP
For more information, see VIPs in the admin cluster subnet.
Decide what type of load balancing you want to use. The options are:
MetalLB bundled load balancing. Set
loadBalancer.kind
to"MetalLB"
.Integrated load balancing with F5 BIG-IP. Set
loadBalancer.kind
to"F5BigIP"
, and fill in thef5BigIP
section.Manual load balancing. Set
loadBalancer.kind
to"ManualLB"
, and fill in themanualLB
section.
For more information about load balancing options, see Overview of load balancing.
antiAffinityGroups
Set antiAffinityGroups.enabled
to true
or false
according to your preference.
Use this field to specify whether you want Google Distributed Cloud to create VMware Distributed Resource Scheduler (DRS) anti-affinity rules for your admin cluster nodes, causing them to be spread across at least three physical hosts in your data center.
adminMaster
If you want to specify CPU and memory for the control-plane nodes of the admin
cluster, fill in the cpus
and memoryMB
fields in the
adminMaster
section.
Set the replicas
field in the adminMaster
section to 3
.
proxy
If the network that will have your admin cluster nodes is behind a proxy server,
fill in the
proxy
section.
privateRegistry
Decide where you want to keep container images for the Google Distributed Cloud components. The options are:
Container Registry
Your own private Docker registry.
If you want to use your own private registry, fill in the
privateRegistry
section.
componentAccessServiceAccountKeyPath
Google Distributed Cloud uses your component access service account to download cluster components from Container Registry. This field holds the path of a JSON key file for your component access service account.
This field is already filled in for you.
gkeConnect
Register your admin cluster
to a Google Cloud fleet by filling in the
gkeConnect
section. If you include the stackdriver
and cloudAuditLogging
sections in the
configuration file, the ID in gkeConnect.projectID
must be the same as the ID
set in stackdriver.projectID
and cloudAuditLogging.projectID
. If the project
IDs aren't the same, cluster creation fails.
In 1.28 and later, you can optionally specify a region where the Fleet and
Connect services run in gkeConnect.location
. If you don't include this field,
the cluster uses the global instances of these services.
If you include gkeConnect.location
, the region that you specify must be the
same as the region configured in cloudAuditLogging.clusterLocation
,
stackdriver.clusterLocation
, and gkeOnPremAPI.location
. If the regions
aren't the same, cluster creation fails.
gkeOnPremAPI
If the GKE On-Prem API is enabled in your
Google Cloud project, all clusters in the project are
enrolled in the GKE On-Prem API
automatically in the region configured in stackdriver.clusterLocation
.
The gkeOnPremAPI.location
region must be the same as the region specified in
cloudAuditLogging.clusterLocation
, gkeConnect.location
,
and stackdriver.clusterLocation
. If the regions aren't the same, cluster
creation fails.
If you want to enroll all clusters in the project in the GKE On-Prem API, be sure to do the steps in Before you begin to activate and use the GKE On-Prem API in the project.
If you don't want to enroll the cluster in the GKE On-Prem API, include this section and set
gkeOnPremAPI.enabled
tofalse
. If you don't want to enroll any clusters in the project, disablegkeonprem.googleapis.com
(the service name for the GKE On-Prem API) in the project. For instructions, see Disabling services.
stackdriver
If you want to enable
Cloud Logging and Cloud Monitoring
for your cluster, fill in the
stackdriver
section.
This section is required by default. That is, if you don't fill in this
section, then you must include the --skip-validation-stackdriver
flag when you
run gkectl create admin
.
Note the following requirements for new clusters:
The ID in
stackdriver.projectID
must be the same as the ID ingkeConnect.projectID
andcloudAuditLogging.projectID
.The Google Cloud region set in
stackdriver.clusterLocation
must be the same as the region set incloudAuditLogging.clusterLocation
andgkeConnect.location
(if the field is included in your configuration file). Additionally, ifgkeOnPremAPI.enabled
istrue
, the same region must be set ingkeOnPremAPI.location
.
If the project IDs and regions aren't the same, cluster creation fails.
cloudAuditLogging
If you want to integrate the audit logs from your cluster's Kubernetes API
server with Cloud Audit Logs, fill in the
cloudAuditLogging
section.
Note the following requirements for new clusters:
The ID in
cloudAuditLogging.projectID
must be the same as the ID ingkeConnect.projectID
andstackdriver.projectID
.The Google Cloud region set in
cloudAuditLogging.clusterLocation
must be the same as the region set instackdriver.clusterLocation
andgkeConnect.location
(if the field is included in your configuration file). Additionally, ifgkeOnPremAPI.enabled
istrue
, the same region must be set ingkeOnPremAPI.location
.
If the project IDs and regions aren't the same, cluster creation fails.
clusterBackup
If you want to enable
backing up of the admin cluster,
set
clusterBackup.datastore
to the
vSphere datastore
where you want to save cluster backups.
autoRepair
If you want to
enable automatic node repair
for your admin cluster, set
autoRepair.enabled
to true
.
secretsEncryption
If you want to enable
always-on Secrets encryption,
fill in the
secretsEncryption
section.
osImageType
Decide what type of OS image you want to use for the admin cluster nodes, and
fill in the
osImageType
section accordingly.
Example of filled-in configuration files
Here is an example of a filled-in admin cluster configuration file. The configuration enables some, but not all, of the available features.
vc-01-admin-cluster.yaml
apiVersion: v1 kind: AdminCluster name: "gke-admin-01" bundlePath: "/var/lib/gke/bundles/gke-onprem-vsphere-1.28.0-gke.1-full.tgz" vCenter: address: "vc01.example" datacenter: "vc-01" cluster: "vc01-workloads-1" resourcePool: "vc-01-pool-1" datastore: "vc01-datastore-1" caCertPath: "/usr/local/google/home/me/certs/vc01-cert.pem"" credentials: fileRef: path: "credential.yaml" entry: "vCenter" network: hostConfig: dnsServers: - "203.0.113.1" - "198.51.100.1" ntpServers: - "216.239.35.4" serviceCIDR: "10.96.232.0/24" podCIDR: "192.168.0.0/16" vCenter: networkName: "vc01-net-1" controlPlaneIPBlock: netmask: "255.255.248.0" gateway: "21.0.143.254" ips: - ip: "21.0.140.226" hostname: "admin-cp-vm-1" - ip: "21.0.141.48" hostname: "admin-cp-vm-2" - ip: "21.0.141.65" hostname: "admin-cp-vm-3" loadBalancer: vips: controlPlaneVIP: "172.16.20.59" kind: "MetalLB" antiAffinityGroups: enabled: true adminMaster: cpus: 4 memoryMB: 16384 replicas: 3 componentAccessServiceAccountKeyPath: "sa-key.json" gkeConnect: projectID: "my-project-123" registerServiceAccountKeyPath: "connect-register-sa-2203040617.json" stackdriver: projectID: "my-project-123" clusterLocation: "us-central1" enableVPC: false serviceAccountKeyPath: "log-mon-sa-2203040617.json" disableVsphereResourceMetrics: false clusterBackup: datastore: "vc-01-datastore-bu" autoRepair: enabled: true osImageType: "ubuntu_containerd"
Validate your configuration file
After you've filled in your admin cluster configuration file, run
gkectl check-config
to verify that the file is valid:
gkectl check-config --config ADMIN_CLUSTER_CONFIG
Replace ADMIN_CLUSTER_CONFIG with the path of your admin cluster configuration file.
If the command returns any failure messages, fix the issues and validate the file again.
If you want to skip the more time-consuming validations, pass the --fast
flag.
To skip individual validations, use the --skip-validation-xxx
flags. To
learn more about the check-config
command, see
Running preflight checks.
Get OS images
Run gkectl prepare
to initialize your vSphere environment:
gkectl prepare --config ADMIN_CLUSTER_CONFIG
The gkectl prepare
command performs the following preparatory tasks:
Imports OS images to vSphere and marks them as VM templates.
If you are using a private Docker registry, pushes the container images to your registry.
Optionally, validates the container images' build attestations, thereby verifying the images were built and signed by Google and are ready for deployment.
Create the admin cluster
Create the admin cluster:
gkectl create admin --config ADMIN_CLUSTER_CONFIG
If you use VPC Service Controls, you might see errors when you run some
gkectl
commands, such as "Validation Category: GCP - [UNKNOWN] GCP
service: [Stackdriver] could not get GCP services"
. To avoid these errors, add
the --skip-validation-gcp
parameter to your commands.
Resume creation of the admin cluster after a failure
If the admin cluster creation fails or is canceled, you can run the create
command again:
gkectl create admin --config ADMIN_CLUSTER_CONFIG
Locate the admin cluster kubeconfig file
The gkectl create admin
command creates a kubeconfig file named
kubeconfig
in the current directory. You will need this kubeconfig file
later to interact with your admin cluster.
The kubeconfig file contains the name of your admin cluster. To view the cluster name, you can run:
kubectl config get-clusters --kubeconfig ADMIN_CLUSTER_KUBECONFIG
The output shows the name of the cluster. For example:
NAME gke-admin-tqk8x
If you like, you can change the name and location of your kubeconfig file.
Manage the checkpoint.yaml
file
When you ran the gkectl create admin
command to create the admin cluster, it
created a checkpoint file in the same datastore folder as the admin
cluster data disk. By default, this file has the name
DATA_DISK_NAME‑checkpoint.yaml
. If the length of
DATA_DISK_NAME is greater than or equal to 245 characters, then, due
to the vSphere limit on filename length, the name is
DATA_DISK_NAME.yaml
.
This file contains the admin cluster state and credentials, and is used for future upgrades. Do not delete this file unless you are following the process for deleting an admin cluster.
If you have enabled VM encryption in your instance of vCenter Server, then you
must have the
Cryptographic operations.Direct Access
privilege before you create or upgrade your admin
cluster. Otherwise the checkpoint will not be uploaded. If you cannot obtain
this privilege, then you can disable uploading the checkpoint file by using the
hidden flag --disable-checkpoint
when you run a relevant command.
The checkpoint.yaml
file is automatically updated when you run the
gkectl upgrade admin
command, or when you run a gkectl update
command that
affects the admin cluster.
Verify that your admin cluster is running
Verify that your admin cluster is running:
kubectl get nodes --kubeconfig ADMIN_CLUSTER_KUBECONFIG
Replace ADMIN_CLUSTER_KUBECONFIG with the path of your admin cluster kubeconfig file.
The output shows the admin cluster nodes. For example:
admin-cp-vm-1 Ready control-plane,master ... admin-cp-vm-2 Ready control-plane,master ... admin-cp-vm-3 Ready control-plane,master ...
Back up files
We recommend that you back up your admin cluster kubeconfig file. That is, copy the kubeconfig file from your admin workstation to another location. Then if you lose access to the admin workstation, or if the kubeconfig file on your admin workstation gets accidentally deleted, you still have access to the admin cluster.
We also recommend that you back up the private SSH key for your admin cluster. Then if you lose access to the admin cluster, you can still use SSH to connect to the admin cluster nodes. This will allow you to troubleshoot and investigate any issues with connectivity to the admin cluster.
Extract the SSH key from the admin cluster to a file named
admin-cluster-ssh-key
:
kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get secrets -n kube-system sshkeys \ -o jsonpath='{.data.vsphere_tmp}' | base64 -d > admin-cluster-ssh-key
Now you can back up admin-cluster-ssh-key
to another location of your choice.
RBAC policies
When you fill in the
gkeConnect
section
in your admin cluster configuration file, the cluster is registered to your
fleet during creation or update. To enable
fleet management functionality, Google Cloud deploys the
Connect agent and creates a Google
service account that represents the project that the cluster is registered to.
The Connect agent establishes a connection with the service account to handle
requests to the cluster's Kubernetes API server. This enables access to
cluster and workload management features in Google Cloud, including access
to the Google Cloud console, which lets you interact with
your cluster.
The admin cluster's Kubernetes API server needs to be able to authorize requests from the Connect agent. To ensure this, the following role-based access control (RBAC) policies are configured on the service account:
An impersonation policy that authorizes the Connect agent to send requests to the Kubernetes API server on behalf of the service account.
A permissions policy that specifies the operations that are allowed on other Kubernetes resources.
The service account and RBAC policies are needed so that you can manage the lifecycle of your user clusters in the Google Cloud console.
Troubleshooting
See Troubleshooting cluster creation and upgrade.