The cluster config.
Optional. A Cloud Storage bucket used to store ephemeral cluster and job data, such as Spark and MapReduce history files. If you do not specify a temp bucket, Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster’s temp bucket according to the Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket. The default bucket has a TTL of 90 days, but you can use any TTL (or none) if you specify a bucket.
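For example, the temp bucket can be pinned at cluster creation time. Below is a minimal sketch using the google-cloud-dataproc Python client; the project, region, cluster, and bucket names are placeholders::

    from google.cloud import dataproc_v1

    # Cluster operations require the regional service endpoint.
    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": "us-central1-dataproc.googleapis.com:443"}
    )

    cluster = {
        "project_id": "my-project",         # placeholder
        "cluster_name": "example-cluster",  # placeholder
        "config": {
            # Supplying temp_bucket keeps ephemeral job data in a bucket
            # (and under a TTL policy) that you manage yourself.
            "temp_bucket": "my-dataproc-temp-bucket",  # placeholder
        },
    }

    operation = client.create_cluster(
        request={
            "project_id": "my-project",
            "region": "us-central1",
            "cluster": cluster,
        }
    )
    operation.result()  # blocks until cluster creation completes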
Optional. The Compute Engine config settings for the master instance in a cluster.
Optional. The Compute Engine config settings for additional worker instances in a cluster.
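In the v1 API these two fields correspond to master_config and secondary_worker_config, each an InstanceGroupConfig. A sketch of the relevant config fragment, with illustrative machine types and instance counts::

    config = {
        "master_config": {
            "num_instances": 1,
            "machine_type_uri": "n1-standard-4",  # illustrative
            "disk_config": {"boot_disk_size_gb": 100},
        },
        # Additional workers on top of worker_config; often preemptible.
        "secondary_worker_config": {
            "num_instances": 2,
            "preemptibility": "PREEMPTIBLE",
        },
    }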
Optional. Commands to execute on each node after config is
completed. By default, executables are run on master and all
worker nodes. You can test a node’s role metadata to run an
executable on a master or worker node, as shown below using
curl (you can also use wget)::

    ROLE=$(curl -H Metadata-Flavor:Google http://metadata/computeMetadata/v1/instance/attributes/dataproc-role)
    if [[ "${ROLE}" == 'Master' ]]; then
        ... master specific actions ...
    else
        ... worker specific actions ...
    fi
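On the API side, each action is a NodeInitializationAction that points at an executable in Cloud Storage. A sketch of the matching config fragment; the script path is a placeholder::

    config = {
        "initialization_actions": [
            {
                # Runs on every node after config completes; the script
                # itself can branch on the dataproc-role metadata as above.
                "executable_file": "gs://my-bucket/scripts/setup.sh",  # placeholder
                # Fail the action (and cluster creation) after 10 minutes.
                "execution_timeout": {"seconds": 600},
            }
        ],
    }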
Optional. Autoscaling config, which associates an autoscaling policy with the cluster. The cluster does not autoscale if this field is unset.
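The policy is attached by URI. A sketch, with a placeholder policy name::

    config = {
        "autoscaling_config": {
            # The policy must already exist in the same project and region.
            "policy_uri": "projects/my-project/regions/us-central1/autoscalingPolicies/my-policy",
        },
    }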
Optional. Lifecycle setting for the cluster.
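Lifecycle settings can, for example, delete a cluster that sits idle. A sketch using the v1 LifecycleConfig fields, with illustrative durations::

    config = {
        "lifecycle_config": {
            # Delete the cluster after 30 idle minutes (no running jobs).
            "idle_delete_ttl": {"seconds": 1800},
            # Unconditionally delete the cluster 8 hours after creation.
            "auto_delete_ttl": {"seconds": 28800},
        },
    }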