This page provides an overview of the logging options available in Google Kubernetes Engine (GKE).
Overview
Logs that GKE sends to Cloud Logging are stored in a dedicated, persistent datastore. While GKE itself stores logs, it does not store them permanently. For example, GKE container logs are removed when their host Pod is removed, when the disk on which they are stored runs out of space, or when they are replaced by newer logs. System logs are periodically removed to free up space for new logs. Cluster events are removed after one hour.
GKE logging agent
For container and system logs, GKE deploys, by default, a per-node logging agent that reads container logs, adds helpful metadata, and then stores them in Cloud Logging. The GKE logging agent checks for container logs in the following sources:
Standard output and standard error logs from containerized processes
kubelet and container runtime logs
Logs for system components, such as VM startup scripts
For events, GKE uses a deployment in the `kube-system` namespace that automatically collects events and sends them to Logging.
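As a quick check, you can look for this event-collection deployment on a running cluster; in current GKE versions it is typically named `event-exporter-gke`, though the name is version-dependent and should be treated as an assumption:

```shell
# List the event-collection deployment in kube-system.
# The deployment name (event-exporter-gke) may vary by GKE version.
kubectl get deployment --namespace kube-system event-exporter-gke

# Inspect its logs if cluster events are not appearing in Cloud Logging.
kubectl logs --namespace kube-system deployment/event-exporter-gke
```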
What logs are collected
By default, GKE collects several types of logs from your cluster and stores them in Cloud Logging:
Audit logs include the Admin Activity log, Data Access log, and the Events log. For detailed information about the Audit Logs for GKE, refer to the Audit Logs for GKE documentation. Audit logs for GKE cannot be disabled.
System logs include logs from the following sources:
- All Pods running in the `kube-system`, `istio-system`, `knative-serving`, `gke-system`, and `config-management-system` namespaces.
- Key services that are not containerized, including the `docker`/`containerd` runtime, `kubelet`, `kubelet-monitor`, `node-problem-detector`, and `kube-container-runtime-monitor`.
- The node's serial port output, if the VM instance metadata `serial-port-logging-enable` is set to true. As of GKE 1.16-13-gke.400, serial port output for nodes is collected by the Logging agent. To disable serial port output logging, set `--metadata serial-port-logging-enable=false` during cluster creation. Serial port output is useful for troubleshooting crashes, failed boots, startup issues, or shutdown issues with GKE nodes. Disabling these logs might make troubleshooting issues more difficult.
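For example, the serial port flag mentioned above can be set at cluster creation time. A sketch, where the cluster name and zone are placeholders:

```shell
# Create a Standard cluster with serial port output logging disabled.
# Cluster name and zone are placeholders.
gcloud container clusters create example-cluster \
    --zone us-central1-a \
    --metadata serial-port-logging-enable=false
```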
Application logs include all logs generated by non-system containers running on user nodes.
Optionally, GKE can collect additional types of logs from certain Kubernetes control plane components and store them in Cloud Logging:
API server logs include all logs generated by the Kubernetes API server (`kube-apiserver`).
Scheduler logs include all logs generated by the Kubernetes Scheduler (`kube-scheduler`).
Controller Manager logs include all logs generated by the Kubernetes Controller Manager (`kube-controller-manager`).
To learn more about each of these control plane components, see GKE cluster architecture.
Collecting your logs
When you create a new GKE cluster, integration with Cloud Logging is enabled by default.
System and application logs are delivered to the Log Router in Cloud Logging.
From there, logs can be ingested into Cloud Logging, excluded, or exported to BigQuery, Pub/Sub, or Cloud Storage.
Beginning with GKE version 1.15.7, you can configure a Standard cluster to only capture system logs and not collect application logs. For both Autopilot and Standard clusters, exclusion filters let you reduce the volume of logs sent to Cloud Logging.
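A sketch of both options, assuming an existing Standard cluster named `example-cluster` (cluster name, zone, sink exclusion name, and namespace are placeholders): the `--logging` flag selects which log components GKE collects, and `gcloud logging sinks update` can add an exclusion filter to the project's `_Default` sink.

```shell
# Collect only system logs (no application logs) on a Standard cluster.
gcloud container clusters update example-cluster \
    --zone us-central1-a \
    --logging=SYSTEM

# Reduce log volume with an exclusion filter on the _Default sink,
# here dropping INFO-level container logs from a hypothetical namespace.
gcloud logging sinks update _Default \
    --add-exclusion=name=drop-noisy-info,filter='resource.type="k8s_container" AND resource.labels.namespace_name="noisy-namespace" AND severity="INFO"'
```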
Logging throughput
When system logging is enabled, GKE automatically deploys and manages a dedicated Cloud Logging agent. It runs on all GKE nodes in a cluster to collect logs, adds helpful metadata about the container, Pod, and cluster, and then sends the logs to Cloud Logging. The agent is fluentbit-based.
If any GKE nodes require more than the default log throughput and your GKE Standard cluster is using control plane version 1.23.13-gke.1000 or later, you can configure GKE to deploy an alternative configuration of the Logging agent designed to maximize logging throughput.
For more information, see Adjust log throughput.
Log collection with custom fluentd or fluentbit
GKE's default logging agent provides a managed solution to deploy and manage the agents that send the logs for your clusters to Cloud Logging. Depending on your GKE control plane version, either fluentd or fluentbit is used to collect logs. Starting with GKE 1.17, logs are collected using a fluentbit-based agent. GKE clusters using versions prior to GKE 1.17 use a fluentd-based agent. If you want to alter the default behavior of the fluentd agents, you can run a customized fluentd agent.
Common use cases include:
removing sensitive data from your logs
collecting additional logs not written to `STDOUT` or `STDERR`
using specific performance-related settings
customized log formatting
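For instance, the first use case, removing sensitive data, could look like the following filter in a customized fluentd agent configuration. This is a sketch only: the tag pattern `kubernetes.**` and the field name `credit_card` are hypothetical, and `record_transformer` is fluentd's built-in record-rewriting plugin.

```
# Mask a sensitive field before logs leave the node (sketch only).
<filter kubernetes.**>
  @type record_transformer
  <record>
    credit_card "REDACTED"
  </record>
</filter>
```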
Collecting Linux auditd logs for GKE nodes
You can enable verbose operating system audit logs on GKE nodes running Container-Optimized OS. Operating system logs on your nodes provide valuable information about the state of your cluster and workloads, such as error messages, login attempts, and binary executions. You can use this information to debug problems or investigate security incidents.
To learn more, see Enabling Linux auditd logs on GKE nodes.
GKE Audit Logs
For detailed information about log entries that apply to the Kubernetes Cluster and GKE Cluster Operations resource types, go to Audit logging.
Logging Access Control
There are two aspects of logging access control: application access and user access. Cloud Logging provides Identity and Access Management (IAM) roles that you can use to grant appropriate access.
Application Access
Applications need permissions to write logs to Cloud Logging. You grant this by assigning the IAM role `roles/logging.logWriter` to the service account attached to the underlying node pool.
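A sketch of granting that role, assuming a placeholder project `example-project` and a placeholder node service account:

```shell
# Grant the log-writer role to the node pool's service account.
# Project ID and service account email are placeholders.
gcloud projects add-iam-policy-binding example-project \
    --member="serviceAccount:example-sa@example-project.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"
```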
User View Access
You need the `roles/logging.viewer` role to view your logs in your project. If you need access to the Data Access logs, you need the `logging.privateLogViewer` IAM permission.
For more information about permissions and roles, go to the Access control guide. You can also review Best practices for Cloud Audit Logs, which also apply to Cloud Logging in general.
User Admin Access
The IAM roles `roles/logging.configWriter` and `roles/logging.admin` provide the administrative capabilities. The `roles/logging.configWriter` role is required to create a logging sink, which is commonly used to direct your logs to a specific or centralized project. For example, you might want to use a logging sink along with a logging filter to direct all of your logs for a namespace to a centralized logging bucket.
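The namespace example above might be sketched as follows, assuming a destination log bucket `central-bucket` in a project `central-project` and a namespace `team-a` (all placeholders):

```shell
# Route one namespace's container logs to a centralized log bucket.
gcloud logging sinks create team-a-sink \
    logging.googleapis.com/projects/central-project/locations/global/buckets/central-bucket \
    --log-filter='resource.type="k8s_container" AND resource.labels.namespace_name="team-a"'
```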
To learn more, go to the Access Control guide for Cloud Logging.
Best practices
- Structured logging: The logging agent integrated with GKE reads JSON documents serialized to single-line strings and written to standard output or standard error, and sends them to Google Cloud Observability as structured log entries.
- See Structured logging for more details on working with an integrated logging agent.
- You can use Advanced logs filters to filter logs based on the JSON document's fields.
- Logs generated with glog have the common fields parsed, for example `severity`, `pid`, `source_file`, and `source_line`. However, the message payload itself is unparsed and shows up verbatim in the resulting log message in Google Cloud Observability.
- Severities: By default, logs written to standard output are on the `INFO` level and logs written to standard error are on the `ERROR` level. Structured logs can include a `severity` field, which defines the log's severity.
- Exporting to BigQuery: For additional analysis, you can export logs to external services, such as BigQuery or Pub/Sub. Logs exported to BigQuery retain their format and structure. See Routing and storage overview for further information.
- Alerting: When Logging logs unexpected behavior, you can use logs-based metrics to set up alerting policies. For an example, see Create an alerting policy on a counter metric. For detailed information on logs-based metrics, see Overview of logs-based metrics.
- Error reporting: To collect errors from applications running on your clusters, you can use Error Reporting.
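The structured-logging convention described in the best practices above can be illustrated with a minimal sketch: an application that prints one JSON document per line to standard output, including a `severity` field, produces entries the agent can ingest as structured logs. The `component` field and the function name are hypothetical.

```shell
# Emit one single-line JSON document per log entry on stdout.
# The GKE logging agent parses such lines into structured log entries,
# using the "severity" field instead of the default INFO/ERROR mapping.
log_json() {
  severity="$1"
  message="$2"
  printf '{"severity":"%s","message":"%s","component":"checkout"}\n' \
    "$severity" "$message"
}

log_json INFO "order received"
log_json ERROR "payment failed"
```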
Control plane logs
You can configure a GKE cluster to send logs emitted by the Kubernetes API server, Scheduler, and Controller Manager to Cloud Logging.
Requirements
Sending logs emitted by Kubernetes control plane components to Cloud Logging requires GKE control plane version 1.22.0 or later and requires that the collection of system logs be enabled.
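Enabling these components can be sketched with the same `--logging` flag shown earlier, assuming a placeholder cluster named `example-cluster`; note that `SYSTEM` must remain in the component list.

```shell
# Enable control plane log components alongside the required system logs.
gcloud container clusters update example-cluster \
    --zone us-central1-a \
    --logging=SYSTEM,WORKLOAD,API_SERVER,SCHEDULER,CONTROLLER_MANAGER
```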
Configuring collection of control plane logs
See the instructions to configure logging support for a new cluster or for an existing cluster.
Pricing
GKE control plane logs are exported to Cloud Logging. Cloud Logging pricing applies.
Quota
Control plane logs consume the "Write requests per minute" quota of the Cloud Logging API. Before enabling control plane logs, check your recent peak usage of that quota. If you have many clusters in the same project or are already approaching the quota limit, then you can request a quota-limit increase before enabling control plane logs.
Access controls
If you want to limit access within your organization to Kubernetes control plane logs, you can create a separate log bucket with more limited access controls. Control plane logs stored in such a restricted log bucket are not automatically accessible to anyone with `roles/logging.viewer` access to the project. Additionally, if you decide to delete certain control plane logs due to privacy or security concerns, storing them in a separate log bucket with limited access makes it possible to delete the logs without impacting logs from other components or services.
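A sketch of that setup with placeholder names: create a restricted bucket, then route control plane logs into it with a sink. The resource type `k8s_control_plane_component` is the type Cloud Logging uses for GKE control plane logs.

```shell
# Create a dedicated, restricted log bucket for control plane logs.
gcloud logging buckets create control-plane-logs \
    --location=global \
    --retention-days=30

# Route control plane component logs into the restricted bucket.
gcloud logging sinks create control-plane-sink \
    logging.googleapis.com/projects/example-project/locations/global/buckets/control-plane-logs \
    --log-filter='resource.type="k8s_control_plane_component"'
```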