This page provides an overview of the logging options available in Google Kubernetes Engine (GKE).
Overview
Logs that GKE sends to Cloud Logging are stored in a dedicated, persistent datastore. While GKE itself stores logs, it does not store them permanently. For example, GKE container logs are removed when their host Pod is removed, when the disk on which they are stored runs out of space, or when they are replaced by newer logs. System logs are periodically removed to free up space for new logs. Cluster events are removed after one hour.
GKE logging agent
For container and system logs, GKE deploys, by default, a per-node logging agent that reads container logs, adds helpful metadata, and then stores them in Cloud Logging. The GKE logging agent checks for container logs in the following sources:
Standard output and standard error logs from containerized processes
kubelet and container runtime logs
Logs for system components, such as VM startup scripts
For events, GKE uses a deployment in the `kube-system` namespace that automatically collects events and sends them to Logging.
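As a quick check, you can look for this event-collection deployment on a running cluster; in current GKE versions it is typically named `event-exporter-gke`, though the name is version-dependent and should be treated as an assumption:

```shell
# List the event-collection deployment in kube-system.
# The deployment name (event-exporter-gke) may vary by GKE version.
kubectl get deployment --namespace kube-system event-exporter-gke

# Inspect its logs if cluster events are not appearing in Cloud Logging.
kubectl logs --namespace kube-system deployment/event-exporter-gke
```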
What logs are collected
By default, GKE collects several types of logs from your cluster and stores them in Cloud Logging:
Audit logs include the Admin Activity log, Data Access log, and the Events log. For detailed information about the Audit Logs for GKE, refer to the Audit Logs for GKE documentation. Audit logs for GKE cannot be disabled.
System logs include logs from the following sources:
- All Pods running in the `kube-system`, `istio-system`, `knative-serving`, `gke-system`, and `config-management-system` namespaces.
- Key services that are not containerized, including the `docker`/`containerd` runtime, `kubelet`, `kubelet-monitor`, `node-problem-detector`, and `kube-container-runtime-monitor`.
- The node's serial port output, if the VM instance metadata `serial-port-logging-enable` is set to true. As of GKE 1.16-13-gke.400, serial port output for nodes is collected by the Logging agent. To disable serial port output logging, set `--metadata serial-port-logging-enable=false` during cluster creation. Serial port output is useful for troubleshooting crashes, failed boots, startup issues, or shutdown issues with GKE nodes. Disabling these logs might make troubleshooting issues more difficult.
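For example, the serial port flag mentioned above can be set at cluster creation time. A sketch, where the cluster name and zone are placeholders:

```shell
# Create a Standard cluster with serial port output logging disabled.
# Cluster name and zone are placeholders.
gcloud container clusters create example-cluster \
    --zone us-central1-a \
    --metadata serial-port-logging-enable=false
```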
Application logs include all logs generated by non-system containers running on user nodes.
Optionally, GKE can collect additional types of logs from certain Kubernetes control plane components and store them in Cloud Logging:
API server logs include all logs generated by the Kubernetes API server (`kube-apiserver`).
Scheduler logs include all logs generated by the Kubernetes Scheduler (`kube-scheduler`).
Controller Manager logs include all logs generated by the Kubernetes Controller Manager (`kube-controller-manager`).
To learn more about each of these control plane components, see GKE cluster architecture.
Collecting your logs
When you create a new GKE cluster, integration with Cloud Logging is enabled by default.
System and application logs are delivered to the Log Router in Cloud Logging.
From there, logs can be ingested into Cloud Logging, excluded, or exported to BigQuery, Pub/Sub, or Cloud Storage.
Beginning with GKE version 1.15.7, you can configure a Standard cluster to only capture system logs and not collect application logs. For both Autopilot and Standard clusters, exclusion filters let you reduce the volume of logs sent to Cloud Logging.
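A sketch of both options, assuming an existing Standard cluster named `example-cluster` (cluster name, zone, sink exclusion name, and namespace are placeholders): the `--logging` flag selects which log components GKE collects, and `gcloud logging sinks update` can add an exclusion filter to the project's `_Default` sink.

```shell
# Collect only system logs (no application logs) on a Standard cluster.
gcloud container clusters update example-cluster \
    --zone us-central1-a \
    --logging=SYSTEM

# Reduce log volume with an exclusion filter on the _Default sink,
# here dropping INFO-level container logs from a hypothetical namespace.
gcloud logging sinks update _Default \
    --add-exclusion=name=drop-noisy-info,filter='resource.type="k8s_container" AND resource.labels.namespace_name="noisy-namespace" AND severity="INFO"'
```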
Logging throughput
When system logging is enabled, GKE automatically deploys and manages a dedicated Cloud Logging agent. It runs on all GKE nodes in a cluster to collect logs, adds helpful metadata about the container, Pod, and cluster, and then sends the logs to Cloud Logging. The agent is fluentbit-based.
If any GKE nodes require more than the default log throughput and your GKE Standard cluster is using control plane version 1.23.13-gke.1000 or later, you can configure GKE to deploy an alternative configuration of the Logging agent designed to maximize logging throughput.
For more information, see Adjust log throughput.
Log collection with custom fluentd or fluentbit
GKE's default logging agent provides a managed solution to deploy and manage the agents that send the logs for your clusters to Cloud Logging. Depending on your GKE control plane version, either fluentd or fluentbit is used to collect logs. Starting with GKE 1.17, logs are collected using a fluentbit-based agent. GKE clusters using versions prior to GKE 1.17 use a fluentd-based agent. If you want to alter the default behavior of the fluentd agents, you can run a customized fluentd agent.
Common use cases include:
removing sensitive data from your logs
collecting additional logs not written to `STDOUT` or `STDERR`
using specific performance-related settings
customized log formatting
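For instance, the first use case, removing sensitive data, could look like the following filter in a customized fluentd agent configuration. This is a sketch only: the tag pattern `kubernetes.**` and the field name `credit_card` are hypothetical, and `record_transformer` is fluentd's built-in record-rewriting plugin.

```
# Mask a sensitive field before logs leave the node (sketch only).
<filter kubernetes.**>
  @type record_transformer
  <record>
    credit_card "REDACTED"
  </record>
</filter>
```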
Collecting Linux auditd logs for GKE nodes
You can enable verbose operating system audit logs on GKE nodes running Container-Optimized OS. Operating system logs on your nodes provide valuable information about the state of your cluster and workloads, such as error messages, login attempts, and binary executions. You can use this information to debug problems or investigate security incidents.
To learn more, see Enabling Linux auditd logs on GKE nodes.
GKE Audit Logs
For detailed information about log entries that apply to the Kubernetes Cluster and GKE Cluster Operations resource types, go to Audit logging.
Logging Access Control
There are two aspects of logging access control: application access and user access. Cloud Logging provides Identity and Access Management (IAM) roles that you can use to grant appropriate access.
Application Access
Applications need permissions to write logs to Cloud Logging. You grant this by assigning the IAM role `roles/logging.logWriter` to the service account attached to the underlying node pool.
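A sketch of granting that role, assuming a placeholder project `example-project` and a placeholder node service account:

```shell
# Grant the log-writer role to the node pool's service account.
# Project ID and service account email are placeholders.
gcloud projects add-iam-policy-binding example-project \
    --member="serviceAccount:example-sa@example-project.iam.gserviceaccount.com" \
    --role="roles/logging.logWriter"
```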
User View Access
You need the `roles/logging.viewer` role to view your logs in your project. If you need access to the Data Access logs, you need the `logging.privateLogViewer` IAM permission.
For more information about permissions and roles, go to the Access control guide. You can also review Best practices for Cloud Audit Logs, which also apply to Cloud Logging in general.
User Admin Access
The IAM roles `roles/logging.configWriter` and `roles/logging.admin` provide the administrative capabilities. The `roles/logging.configWriter` role is required to create a logging sink, which is commonly used to direct your logs to a specific or centralized project. For example, you might want to use a logging sink along with a logging filter to direct all of your logs for a namespace to a centralized logging bucket.
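The namespace example above might be sketched as follows, assuming a destination log bucket `central-bucket` in a project `central-project` and a namespace `team-a` (all placeholders):

```shell
# Route one namespace's container logs to a centralized log bucket.
gcloud logging sinks create team-a-sink \
    logging.googleapis.com/projects/central-project/locations/global/buckets/central-bucket \
    --log-filter='resource.type="k8s_container" AND resource.labels.namespace_name="team-a"'
```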
To learn more, go to the Access Control guide for Cloud Logging.
Best practices
- Structured logging: The logging agent integrated with GKE reads JSON documents serialized to single-line strings and written to standard output or standard error, and sends them to Google Cloud Observability as structured log entries.
- See Structured logging for more details on working with an integrated logging agent.
- You can use Advanced logs filters to filter logs based on the JSON document's fields.
- Logs generated with glog have the common fields parsed, for example `severity`, `pid`, `source_file`, and `source_line`. However, the message payload itself is unparsed and shows up verbatim in the resulting log message in Google Cloud Observability.
- Severities: By default, logs written to standard output are on the `INFO` level and logs written to standard error are on the `ERROR` level. Structured logs can include a `severity` field, which defines the log's severity.
- Exporting to BigQuery: For additional analysis, you can export logs to external services, such as BigQuery or Pub/Sub. Logs exported to BigQuery retain their format and structure. See Routing and storage overview for further information.
- Alerting: When Logging logs unexpected behavior, you can use logs-based metrics to set up alerting policies. For an example, see Create an alerting policy on a counter metric. For detailed information on logs-based metrics, see Overview of logs-based metrics.
- Error reporting: To collect errors from applications running on your clusters, you can use Error Reporting.
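The structured-logging convention described in the best practices above can be illustrated with a minimal sketch: an application that prints one JSON document per line to standard output, including a `severity` field, produces entries the agent can ingest as structured logs. The `component` field and the function name are hypothetical.

```shell
# Emit one single-line JSON document per log entry on stdout.
# The GKE logging agent parses such lines into structured log entries,
# using the "severity" field instead of the default INFO/ERROR mapping.
log_json() {
  severity="$1"
  message="$2"
  printf '{"severity":"%s","message":"%s","component":"checkout"}\n' \
    "$severity" "$message"
}

log_json INFO "order received"
log_json ERROR "payment failed"
```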
Control plane logs
You can configure a GKE cluster to send logs emitted by the Kubernetes API server, Scheduler, and Controller Manager to Cloud Logging.
Requirements
Sending logs emitted by Kubernetes control plane components to Cloud Logging requires GKE control plane version 1.22.0 or later and requires that the collection of system logs be enabled.
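Enabling these components can be sketched with the same `--logging` flag shown earlier, assuming a placeholder cluster named `example-cluster`; note that `SYSTEM` must remain in the component list.

```shell
# Enable control plane log components alongside the required system logs.
gcloud container clusters update example-cluster \
    --zone us-central1-a \
    --logging=SYSTEM,WORKLOAD,API_SERVER,SCHEDULER,CONTROLLER_MANAGER
```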
Configuring collection of control plane logs
See the instructions to configure logging support for a new cluster or for an existing cluster.
Pricing
GKE control plane logs are exported to Cloud Logging. Cloud Logging pricing applies.
Quota
Control plane logs consume the "Write requests per minute" quota of the Cloud Logging API. Before enabling control plane logs, check your recent peak usage of that quota. If you have many clusters in the same project or are already approaching the quota limit, then you can request a quota-limit increase before enabling control plane logs.
Access controls
If you want to limit access within your organization to Kubernetes control plane logs, you can create a separate log bucket with more limited access controls. Control plane logs stored in such a restricted log bucket are not automatically accessible to anyone with `roles/logging.viewer` access to the project. Additionally, if you decide to delete certain control plane logs due to privacy or security concerns, storing them in a separate log bucket with limited access makes it possible to delete the logs without impacting logs from other components or services.
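A sketch of that setup with placeholder names: create a restricted bucket, then route control plane logs into it with a sink. The resource type `k8s_control_plane_component` is the type Cloud Logging uses for GKE control plane logs.

```shell
# Create a dedicated, restricted log bucket for control plane logs.
gcloud logging buckets create control-plane-logs \
    --location=global \
    --retention-days=30

# Route control plane component logs into the restricted bucket.
gcloud logging sinks create control-plane-sink \
    logging.googleapis.com/projects/example-project/locations/global/buckets/control-plane-logs \
    --log-filter='resource.type="k8s_control_plane_component"'
```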