This page introduces the Cloud Data Fusion: Console, also known as the control plane. It's a set of API operations and a Google Cloud console interface that let you manage a Cloud Data Fusion instance. For example, using the console, you can create, delete, restart, or update an instance.
Before you begin
- Enable the Cloud Data Fusion API.
- Understand the costs of Cloud Data Fusion editions.
- Understand access control and service accounts in Cloud Data Fusion.
Cloud Data Fusion: Console overview
The following sections describe important aspects of the console.
Instances
An instance is a unique, independent deployment of Cloud Data Fusion. To start using Cloud Data Fusion, you create an instance in the Google Cloud console. You can create multiple instances in a single Google Cloud project and specify a Google Cloud region for each one. Each instance contains a set of services that handle pipeline lifecycle management, orchestration, coordination, and metadata management. These services run on long-running resources in a tenant project.
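As an illustration of the API operations behind the console, the following Python sketch builds the REST URL that an instance creation request would target. It assumes the v1 endpoint layout (`projects/*/locations/*/instances` with an `instanceId` query parameter). The project, region, and instance IDs are placeholders, and the helper names are hypothetical. The sketch only formats strings and sends no request.

```python
from urllib.parse import urlencode

# Assumed root of the Cloud Data Fusion v1 REST API.
API_ROOT = "https://datafusion.googleapis.com/v1"


def instance_parent(project_id: str, region: str) -> str:
    """Return the parent resource path for instances in one project and region."""
    return f"projects/{project_id}/locations/{region}"


def create_instance_url(project_id: str, region: str, instance_id: str) -> str:
    """Return the URL that an instance creation request would POST to."""
    query = urlencode({"instanceId": instance_id})
    return f"{API_ROOT}/{instance_parent(project_id, region)}/instances?{query}"


# Placeholder values, for illustration only.
url = create_instance_url("my-project", "us-central1", "dev-instance")
print(url)
```

Because each instance belongs to exactly one project and region, two instances in the same project differ only in the region and instance ID segments of this path.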
When you create the instance, consider the following options.
Edition
You create the instance in one of the following Cloud Data Fusion editions: Developer, Basic, or Enterprise. Choose the edition based on the following criteria:
- Cost
- Concurrency limits for pipeline execution
- Role-based access control (RBAC) availability
The editions are intended for the following use cases:
Cloud Data Fusion edition | Use case |
---|---|
Developer edition | For development, testing, or small-scale integrations |
Basic edition | For production with moderate needs |
Enterprise edition | For large-scale, mission-critical data pipelines with RBAC |
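The use cases in the table can be read as a simple decision rule. The following Python sketch is one possible encoding of that rule, not an official selection tool. The function name and its boolean parameters are hypothetical, and the returned strings assume the edition identifiers used by the v1 API (`DEVELOPER`, `BASIC`, `ENTERPRISE`). Cost and concurrency limits are not modeled.

```python
def choose_edition(production: bool, needs_rbac: bool, large_scale: bool = False) -> str:
    """Pick an edition from coarse requirements, following the use-case table."""
    # RBAC and large-scale, mission-critical pipelines map to Enterprise.
    if needs_rbac or large_scale:
        return "ENTERPRISE"
    # Production workloads with moderate needs map to Basic.
    if production:
        return "BASIC"
    # Development, testing, and small-scale integrations map to Developer.
    return "DEVELOPER"


print(choose_edition(production=False, needs_rbac=False))
print(choose_edition(production=True, needs_rbac=False))
print(choose_edition(production=True, needs_rbac=True))
```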
Public or private instance
Depending on your requirements, decide if you need a public or a private instance. The key differences between private and public instances in Cloud Data Fusion are network connectivity and security:
Cloud Data Fusion instance type | Behavior |
---|---|
Public instance | |
Private instance | |
Authorization and service account
Cloud Data Fusion typically has two service accounts:
- Design-time service account: This Google-managed service account, called the Cloud Data Fusion API Service Agent, is used in the tenant project of Cloud Data Fusion to access customer project resources.
- Execution-time service account: This is the default Compute Engine service account, which Cloud Data Fusion uses to deploy jobs that access other Google Cloud resources. By default, it attaches to a Dataproc cluster VM so that Cloud Data Fusion can access Dataproc resources during a pipeline run.
For more information, see Service accounts in Cloud Data Fusion.
Logging and monitoring
Cloud Logging and Cloud Monitoring are crucial for gaining insights into the health and performance of your Cloud Data Fusion pipelines. You enable Logging and Monitoring only when you create the Cloud Data Fusion instance.
Enabling Logging and Monitoring lets you view Cloud Data Fusion pipeline logs in the Google Cloud console on the Logging viewer page.
Monitoring provides built-in dashboards for Cloud Data Fusion. You can also create custom dashboards to monitor specific metrics.
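When an instance is created through the API rather than the console, these settings are expressed as boolean fields in the request body. The following Python sketch assumes the v1 field names `enableStackdriverLogging` and `enableStackdriverMonitoring`; verify them against the current API reference before relying on them. The helper name and the `BASIC` edition value are illustrative choices.

```python
import json


def logging_settings(enable_logging: bool = True, enable_monitoring: bool = True) -> dict:
    """Return the logging and monitoring fields for an instance request body."""
    return {
        # Assumed v1 field names; they are not stated in this page.
        "enableStackdriverLogging": enable_logging,
        "enableStackdriverMonitoring": enable_monitoring,
    }


# Example request body fragment with both integrations turned on.
body = {"type": "BASIC", **logging_settings()}
print(json.dumps(body, indent=2))
```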
Lineage integration with Dataplex
Cloud Data Fusion provides an integration with Dataplex for lineage. For more information, see View lineage in Dataplex.
Encryption
Customer-managed encryption keys (CMEK) enable encryption of data at rest with a key that you can control through the Cloud Key Management Service. CMEK provides user control over the data written to Google Cloud internal resources in tenant projects and data written by Cloud Data Fusion pipelines. For more information, see Customer managed data encryption.
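A CMEK key is referenced by its full Cloud KMS resource name. The following Python sketch assembles that name and wraps it in the shape that an instance request body is assumed to expect (`cryptoKeyConfig.keyReference` in the v1 API). The function name and all IDs are hypothetical placeholders. The check is purely syntactic and does not confirm that the key exists.

```python
import re

# Cloud KMS key resource name: projects/*/locations/*/keyRings/*/cryptoKeys/*
KEY_NAME_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


def crypto_key_config(project_id: str, location: str, key_ring: str, key: str) -> dict:
    """Build the assumed CMEK fragment of an instance request body."""
    name = f"projects/{project_id}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key}"
    # Reject empty segments or segments that contain a slash.
    if not KEY_NAME_PATTERN.match(name):
        raise ValueError(f"not a well-formed Cloud KMS key name: {name!r}")
    return {"cryptoKeyConfig": {"keyReference": name}}


# Placeholder IDs, for illustration only.
print(crypto_key_config("my-project", "us-central1", "my-ring", "my-key"))
```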
Manage permissions with role-based access control (RBAC)
Cloud Data Fusion lets you control access with Identity and Access Management (IAM).
For granular permissions on actions performed in Cloud Data Fusion: Studio, use RBAC. For more information, see the RBAC overview.
Version upgrades
Cloud Data Fusion releases are versioned. You can upgrade an instance to a later version in the Cloud Data Fusion console. For more information, see Versioning in Cloud Data Fusion.
What's next
- Learn more about Cloud Data Fusion: Studio.