Setting up Knative serving

This guide shows you how to set up a Google Kubernetes Engine (GKE) cluster and enable Knative serving. You can use either the Google Cloud console or the Google Cloud CLI to enable Knative serving on standard and private GKE clusters.

Enabling Knative serving installs Istio and Knative Serving into the cluster to connect and manage your stateless workloads. For more information, see Architectural overview of Knative serving.

Before you begin

  1. Knative serving is an add-on for Google Kubernetes Engine. A free trial is available until September 30, 2021.
  2. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  3. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  4. Make sure that billing is enabled for your Google Cloud project.

Setting up the command-line environment

You can manage Knative serving with either the Google Cloud console or the gcloud CLI, but some tasks require the gcloud CLI.

To set up the gcloud and kubectl command-line tools for Knative serving:

  1. Install and initialize the Google Cloud CLI.

  2. Set the default project for the gcloud CLI to the one you just created, or to an existing project you want to use:

    gcloud config set project PROJECT-ID

    Replace PROJECT-ID with the project ID of the project you created.

  3. Set the default zone for your cluster. You can use any zone where GKE is supported:

    gcloud config set compute/zone ZONE

    Replace ZONE with your zone.

  4. Enable the following APIs for the project. They are required to create a cluster, build a container, and publish a container to Container Registry:

    gcloud services enable container.googleapis.com containerregistry.googleapis.com cloudbuild.googleapis.com

  5. Install the kubectl command-line tool:

    gcloud components install kubectl

  6. Update installed gcloud CLI components:

    gcloud components update
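
Before continuing, it can help to confirm that both command-line tools are actually on your PATH. The following is a minimal sketch and not part of the official setup. The function name require_tools is hypothetical.

```shell
# Hypothetical helper: report any required command-line tool that is missing.
# Returns 0 when every named tool is found on PATH, 1 otherwise.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, run require_tools gcloud kubectl and continue only if it succeeds.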

Enabling Knative serving

Knative serving runs on a GKE cluster. You can enable Knative serving on an existing cluster, or you can create a new cluster with Knative serving enabled.

Choose how you want to set up Knative serving:

Knative serving can also be enabled on private GKE clusters. For information on how to create a private GKE cluster, see Creating a private cluster in the GKE documentation.

Creating a new GKE cluster with Knative serving enabled

These instructions create a cluster with the following configurations:

  • Knative serving enabled
  • Kubernetes version: available GKE versions
  • 4 nodes with 4 vCPU
  • Default namespace: default

These are the recommended cluster configurations for testing Knative serving. For production workloads, you should configure your GKE cluster to meet your specific needs. For information about the different kinds of GKE clusters and their configuration options, see Types of clusters in the GKE documentation.

To create a cluster and enable Knative serving:

  1. Create a cluster:

    gcloud container clusters create CLUSTER-NAME \
    --zone=ZONE \
    --addons=HttpLoadBalancing,CloudRun \
    --machine-type=e2-standard-4 \
    --num-nodes=4 \
    --cluster-version=GKE-VERSION \
    --enable-stackdriver-kubernetes

    Replace the following:

    • CLUSTER-NAME: The name you want for the new cluster.
    • ZONE: The zone for the cluster. For example, us-central1-a.
    • GKE-VERSION: A GKE version that is available in that zone.

    Cluster autoscaling is not enabled by default, although Knative serving automatically scales the number of instances within your cluster based on available capacity.

  2. Wait for the cluster creation to complete.
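
If you check on the cluster from a second terminal, or started the creation asynchronously, one way to wait is to poll the cluster status. The sketch below assumes the cluster reports the status RUNNING once it is ready. The function name wait_for_cluster and the POLL_SECONDS variable are hypothetical.

```shell
# Hypothetical helper: poll the cluster status until it reports RUNNING.
# Arguments: cluster name, zone, optional maximum attempts (default 30).
# POLL_SECONDS (default 10) sets the delay between attempts.
wait_for_cluster() {
  cluster="$1"
  zone="$2"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(gcloud container clusters describe "$cluster" \
      --zone="$zone" --format='value(status)')
    if [ "$status" = "RUNNING" ]; then
      echo "cluster $cluster is RUNNING"
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_SECONDS:-10}"
  done
  echo "cluster $cluster not RUNNING after $attempts attempts" >&2
  return 1
}
```

For example: wait_for_cluster CLUSTER-NAME ZONE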

Enabling Knative serving on an existing cluster

Your GKE cluster must have the following minimum configuration:

You can use either the gcloud CLI or the Google Cloud console to enable Knative serving on a cluster:

Console

To enable Knative serving on an existing cluster:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console:

    Go to Google Kubernetes Engine

  2. Click the name of the cluster you want to enable Knative serving on.

  3. Click Edit.

  4. Click Enable Knative serving.

  5. Click Save. After the update completes, the cluster will support Knative serving.

Command line

To enable Knative serving on an existing cluster:

  1. Enable the cluster using the following command:

    gcloud container clusters update \
    CLUSTER_NAME \
    --update-addons=CloudRun=ENABLED,HttpLoadBalancing=ENABLED \
    --zone=ZONE

    Replace the following:

    • CLUSTER_NAME: The cluster's name.
    • ZONE: The cluster's zone. For example, us-central1-a.

  2. Wait for the update to complete. Upon success, the command line returns a message similar to the following:

    Updating your-cluster-name...done.

Configuring default settings for the gcloud CLI

After you create the cluster, you can set default values for the Google Cloud CLI to use. When you use the command line, this removes subsequent prompts for any defaults that you set, for example cluster name or location.

You can configure default settings for:

  • Cluster name
  • Cluster location
  • Credentials
  • Namespace
  • Platform

To set defaults:

  1. Set your default cluster and cluster location, and then get credentials by running the following commands:

    gcloud config set run/platform gke
    gcloud config set run/cluster CLUSTER
    gcloud config set run/cluster_location ZONE
    gcloud container clusters get-credentials CLUSTER

    Replace:

    • CLUSTER with the name of the cluster.
    • ZONE with the location of the cluster.

  2. By default, your cluster is created with a namespace named default. To learn about namespaces, and why you might want to create and use a namespace other than default, refer to namespace.

    To create a new namespace, run:

    kubectl create namespace NAMESPACE

    Replace NAMESPACE with the name of the namespace that you want to create.

  3. If you created a new namespace in the previous step, you can set that namespace as the default namespace that is used each time you invoke the Google Cloud CLI. Otherwise, the default namespace is used. To set your new namespace, run the following command:

    gcloud config set run/namespace NAMESPACE

    Replace NAMESPACE with the name of the namespace that you want the gcloud CLI tool to use by default.
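
To review which defaults are currently set, you can read each property back from the active gcloud configuration. This is a minimal sketch. The function name show_run_defaults is hypothetical, and it prints "unset" for any property that has no value.

```shell
# Hypothetical helper: print the Knative serving defaults stored in the
# active gcloud configuration, one "key = value" line per property.
show_run_defaults() {
  for key in run/platform run/cluster run/cluster_location run/namespace; do
    value=$(gcloud config get-value "$key" 2>/dev/null)
    echo "$key = ${value:-unset}"
  done
}
```

For example: show_run_defaults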

Enabling metrics on a cluster with Workload Identity

When Workload Identity is enabled, Knative serving doesn't report certain metrics, such as revision request count or request latency, to Google Cloud Observability. It continues to report metrics for CPU and memory.

To enable all metrics, you need to manually set permissions to write metrics to Cloud Monitoring by granting the Monitoring Metric Writer role to the Google service account (GSA) associated with your Knative serving service.

Grant the Monitoring Metric Writer role permissions to your service's GSA:

gcloud projects add-iam-policy-binding PROJECT_ID \
--member=serviceAccount:GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \
--role=roles/monitoring.metricWriter

Replace:

  • PROJECT_ID with the project ID for a cluster project that hosts your Kubernetes service account (KSA).
  • GSA_NAME with the name of the GSA.
  • GSA_PROJECT with the project ID for a GSA that's not in the cluster. You can use any GSA in your organization.

For more information, see Granting, changing, and revoking access to resources.
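
To confirm that the binding took effect, one option is to list the members that hold the role on the project and look for your GSA. This is a sketch under that assumption. The function names list_metric_writers and has_metric_writer are hypothetical.

```shell
# Hypothetical helper: list the members that hold the Monitoring Metric
# Writer role on a project, one per line. Argument: project ID.
list_metric_writers() {
  gcloud projects get-iam-policy "$1" \
    --flatten='bindings[].members' \
    --filter='bindings.role:roles/monitoring.metricWriter' \
    --format='value(bindings.members)'
}

# Hypothetical helper: succeed if the given service account email is in
# that list. Arguments: project ID, service account email.
has_metric_writer() {
  list_metric_writers "$1" | grep -qxF "serviceAccount:$2"
}
```

For example: has_metric_writer PROJECT_ID GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com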

To set up services provided by Google Cloud APIs such as the Compute APIs, Storage and Database APIs, or Machine Learning APIs from within your GKE cluster, see Using Workload Identity.

Developing in a multi-tenant setup

In multi-tenant use cases, you'll need to manage and deploy Knative serving services to a Google Kubernetes Engine cluster that is outside your current project. For more information about GKE multi-tenancy, see Cluster multi-tenancy.

To learn how to configure multi-tenancy for Knative serving, see Cross-project multi-tenancy.

Setting up a private, internal network

Deploying services on an internal network is useful for enterprises that provide internal apps to their staff, and for services that are used by clients that run outside the Knative serving cluster. This configuration allows other resources in your network to communicate with the service using a private, internal (RFC 1918) IP address that can't be accessed by the public.

To create your internal network, you configure Istio's Ingress Gateway to use Internal TCP/UDP Load Balancing instead of a public, external network load balancer. You can then deploy your Knative serving services on an internal IP address within your VPC network.

Before you begin

  • You must have admin permissions on your cluster.
  • Only Google Cloud CLI versions 310.0 or above are supported. For more details, see Setting up gcloud.

To set up the internal load balancer:

  1. Update the Istio Ingress Gateway to use Internal TCP/UDP Load Balancing by creating a new cluster or updating an existing cluster:

    • Create a new cluster with an internal load balancer:

      gcloud container clusters create CLUSTER_NAME \
      --addons=HttpLoadBalancing,CloudRun \
      --machine-type=n1-standard-2 \
      --num-nodes=3 \
      --enable-stackdriver-kubernetes \
      --cloud-run-config=load-balancer-type=INTERNAL

    • Update an existing cluster to use an internal load balancer:

      gcloud container clusters update CLUSTER_NAME \
      --update-addons=CloudRun=ENABLED \
      --cloud-run-config=load-balancer-type=INTERNAL

    It might take a few minutes for the change to take effect.

  2. Run the following command to watch updates to your GKE cluster:

    kubectl -n gke-system get svc istio-ingress --watch
    
    1. Note the annotation "cloud.google.com/load-balancer-type: Internal".
    2. Look for the value of IP in the Ingress load balancer to change to a private IP address.
    3. Press Ctrl+C to stop the updates once you see a private IP address in the IP field.
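
To check whether the address you see falls in an RFC 1918 private range, a small shell test can help. This is a minimal sketch that matches on address prefixes only and assumes its input is a well-formed dotted IPv4 address. The function name is_rfc1918 is hypothetical.

```shell
# Hypothetical helper: succeed if the IPv4 address is in an RFC 1918 range:
# 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
# Prefix match only; the input is not validated as an IPv4 address.
is_rfc1918() {
  case "$1" in
    10.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    192.168.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, pass it the value of the IP field from the kubectl output: is_rfc1918 10.128.0.5 succeeds, and is_rfc1918 34.120.1.1 fails.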

To verify internal connectivity after your changes:

  1. Deploy a service called sample to Knative serving in the default namespace:

    gcloud run deploy sample \
    --image us-docker.pkg.dev/knative-samples/knative-samples-backup/helloworld-java \
    --namespace default
    
  2. Create a Compute Engine virtual machine (VM) in the same zone as the GKE cluster:

    VM=cloudrun-gke-ilb-tutorial-vm
    
    gcloud compute instances create $VM
    
  3. Store the private IP address of the Istio Ingress Gateway in an environment variable called EXTERNAL_IP and a file called external-ip.txt:

    export EXTERNAL_IP=$(kubectl -n gke-system get svc istio-ingress \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}' | tee external-ip.txt)
    
  4. Copy the file containing the IP address to the VM:

    gcloud compute scp external-ip.txt $VM:~
    
  5. Connect to the VM using SSH:

    gcloud compute ssh $VM
    
  6. While in the SSH session, test the sample service:

    curl -s -w'\n' -H Host:sample.default.example.com $(cat external-ip.txt)
    

    The output is as follows:

    Hello World!
    
  7. Leave the SSH session:

    exit
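
The Host header in the curl command above is made up of the service name, the namespace, and the domain. The sketch below builds that value, assuming your cluster uses this name.namespace.domain pattern with example.com as the domain, as in the command above. The function name knative_host is hypothetical.

```shell
# Hypothetical helper: build the Host header value for a service.
# Arguments: service name, optional namespace (default: default),
# optional domain (default: example.com).
knative_host() {
  echo "$1.${2:-default}.${3:-example.com}"
}
```

For example, in the SSH session: curl -s -w'\n' -H "Host: $(knative_host sample)" "$(cat external-ip.txt)"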
    

Using a separate Istio installation

The following instructions show you how to connect Cloud Service Mesh, the Istio on GKE add-on, or a custom Istio installation with Knative serving in addition to the Istio components already installed by default in Knative serving.

The Istio components included in the default Knative serving install don't currently support automatic sidecar injection. However, you can use an additional Istio installation to enable Istio sidecar injection in the namespace where your services run.

To use an additional Istio installation, you need to verify that the Istio Ingress Gateway is named istio-ingressgateway in the istio-system namespace. Knative serving can support and handle external traffic from Istio Ingress Gateways installed at:

  • The istio-system namespace, with the cluster local domain istio-ingressgateway.istio-system.svc.cluster.local that is set up by default when you use an additional Istio installation.
  • The gke-system namespace, with the cluster local domain istio-ingress.gke-system.svc.cluster.local that is set up with the default Knative serving install.

Important: If you configure and use Istio's AuthorizationPolicy, you must address a known vulnerability with path-type matching for access control. For details about preventing exposure to the vulnerability, see Security best practices.

To verify which additional Istio Ingress Gateway Knative serving uses:

  1. Open the config-istio ConfigMap:

    kubectl get configmap config-istio --namespace knative-serving -oyaml
    
  2. Verify your additional Istio Ingress Gateway is named istio-ingressgateway and is in the istio-system namespace.
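
Instead of reading the ConfigMap by eye, you can search its YAML for the two cluster local domains listed earlier. This is a sketch that only looks for those strings and does not parse the YAML. The function name detect_gateway is hypothetical.

```shell
# Hypothetical helper: read ConfigMap YAML on stdin and print the namespace
# of the Istio Ingress Gateway it references. If both domains appear,
# istio-system is reported. Returns 1 if neither domain is found.
detect_gateway() {
  yaml=$(cat)
  case "$yaml" in
    *istio-ingressgateway.istio-system.svc.cluster.local*) echo "istio-system" ;;
    *istio-ingress.gke-system.svc.cluster.local*) echo "gke-system" ;;
    *) echo "unknown"; return 1 ;;
  esac
}
```

For example: kubectl get configmap config-istio --namespace knative-serving -oyaml | detect_gateway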

Enabling HTTPS and custom domains

If you want to use HTTPS and custom domains that apply to the cluster, refer to Enabling HTTPS and automatic TLS certs and mapping custom domains.

Disabling Knative serving

To disable Knative serving in your cluster:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console:

    Go to Google Kubernetes Engine

  2. Click the cluster where you want to disable Knative serving.

  3. Click Edit.

  4. From the Knative serving menu, select Disable.

  5. Click Save.
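
The same change can likely be made from the command line. The sketch below assumes that the --update-addons flag used to enable the add-on also accepts CloudRun=DISABLED. The function name disable_knative_serving is hypothetical.

```shell
# Hypothetical helper: disable the Knative serving add-on on a cluster.
# Arguments: cluster name, zone.
# Assumption: CloudRun=DISABLED mirrors the CloudRun=ENABLED value used
# when enabling the add-on on an existing cluster.
disable_knative_serving() {
  gcloud container clusters update "$1" \
    --update-addons=CloudRun=DISABLED \
    --zone="$2"
}
```

For example: disable_knative_serving CLUSTER_NAME ZONE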

What's next