Cloud Data Fusion pricing
This document explains the pricing for Cloud Data Fusion. To see the pricing for other products, read the Pricing documentation.
For pricing purposes, usage is measured as the length of time, in minutes, from when a Cloud Data Fusion instance is created until it is deleted. Although the pricing rate is defined per hour, Cloud Data Fusion is billed by the minute: usage is converted to hours (for example, 30 minutes is 0.5 hours) so that hourly pricing can be applied to minute-by-minute use.
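As a minimal sketch of this proration (the 90-minute lifetime and the Basic edition rate of $1.80 per hour below are illustrative values, not part of any published example):

```python
# Sketch: per-minute billing against an hourly rate.
# Usage runs from instance creation to deletion, measured in minutes,
# and is converted to fractional hours before the hourly rate applies.

def instance_cost(minutes_running: float, hourly_rate: float) -> float:
    """Cost of one instance, billed by the minute at an hourly rate."""
    hours = minutes_running / 60  # for example, 30 minutes -> 0.5 hours
    return hours * hourly_rate

# Illustrative: a Basic edition instance ($1.80/hour) deleted after 90 minutes.
print(f"${instance_cost(90, 1.80):.2f}")  # $2.70
```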
If you pay in a currency other than USD, the prices listed in your currency on Google Cloud SKUs apply.
Pricing overview
Cloud Data Fusion pricing is split across two functions: pipeline development and execution.
Development
For pipeline development, Cloud Data Fusion offers the following three editions:
Cloud Data Fusion Edition | Price per instance per hour |
---|---|
Developer | $0.35 (~$250 per month) |
Basic | $1.80 (~$1100 per month) |
Enterprise | $4.20 (~$3000 per month) |
The Basic edition offers the first 120 hours per month per account free.
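The approximate monthly figures follow from the hourly rates. A sketch, assuming a 730-hour month (an assumption, not stated on this page) and applying the Basic edition's 120 free hours:

```python
# Sketch: rough monthly cost per edition, assuming a 730-hour month.
# Subtracting the Basic edition's 120 free hours is why its estimate
# (~$1100) is lower than a full 730 hours at $1.80.

HOURS_PER_MONTH = 730  # assumed average hours in a month

def monthly_estimate(hourly_rate: float, free_hours: float = 0.0) -> float:
    billable_hours = max(HOURS_PER_MONTH - free_hours, 0.0)
    return billable_hours * hourly_rate

print(f"{monthly_estimate(0.35):.2f}")                  # Developer:  255.50 (~$250/month)
print(f"{monthly_estimate(1.80, free_hours=120):.2f}")  # Basic:     1098.00 (~$1100/month)
print(f"{monthly_estimate(4.20):.2f}")                  # Enterprise: 3066.00 (~$3000/month)
```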
Execution
For pipeline execution, you are charged for the Dataproc clusters that Cloud Data Fusion creates to run your pipelines at the current Dataproc rates.
Comparison of Developer, Basic, and Enterprise editions
Capability | Developer | Basic | Enterprise |
---|---|---|---|
Number of concurrent users | 2 | Limited* | Limited* |
Workloads | Development, product exploration | Testing, sandbox, PoC | Production |
Internal IP support | | | |
Role-based access control (RBAC) | | | |
Visual Designer | | | |
Connector ecosystem | | | |
Visual transformations | | | |
Structured, unstructured, semi-structured | | | |
Streaming pipelines | | | |
Integration lineage - field and dataset level | | | |
Integration with Dataplex | | | |
High Availability | Zonal | Regional | Regional |
Create and customize compute profiles | | | |
DevOps support: REST API, Source Control Management | | | |
Triggers and schedules | | | |
Execution environment selection | | | |
Concurrent pipeline execution | Limited** | Limited** | |
Developer SDK for extensibility | | | |
Usage of other Google Cloud resources
In addition to the development cost of a Cloud Data Fusion instance, you pay only for the resources that you use to run your pipelines, such as Dataproc, Cloud Storage, and BigQuery.
Supported regions
Currently, pricing for Cloud Data Fusion is the same for all supported regions.
Region | Location |
---|---|
africa-south1 * | Johannesburg, South Africa |
asia-east1 | Changhua County, Taiwan |
asia-east2 | Hong Kong |
asia-northeast1 | Tokyo, Japan |
asia-northeast2 | Osaka, Japan |
asia-northeast3 | Seoul, South Korea |
asia-south1 | Mumbai, India |
asia-south2 | Delhi, India |
asia-southeast1 | Jurong West, Singapore |
asia-southeast2 | Jakarta, Indonesia |
australia-southeast1 | Sydney, Australia |
europe-north1 | Hamina, Finland |
europe-southwest1 | Madrid, Spain |
europe-west1 | St. Ghislain, Belgium |
europe-west2 | London, England, UK |
europe-west3 | Frankfurt, Germany |
europe-west4 | Eemshaven, Netherlands |
europe-west6 | Zürich, Switzerland |
europe-west8 | Milan, Italy |
europe-west9 | Paris, France |
europe-west12 * | Turin, Italy |
me-central1 * | Doha, Qatar |
me-central2 * | Dammam, Saudi Arabia |
me-west1 | Tel Aviv, Israel |
northamerica-northeast1 | Montréal, Québec, Canada |
southamerica-east1 | Osasco (São Paulo), Brazil |
southamerica-west1 | Santiago, Chile |
us-central1 | Council Bluffs, Iowa, North America |
us-east1 | Moncks Corner, South Carolina, North America |
us-east4 | Ashburn, Northern Virginia, North America |
us-east5 | Columbus, Ohio, North America |
us-south1 | Dallas, Texas, North America |
us-west1 | The Dalles, Oregon, North America |
us-west2 | Los Angeles, California, North America |
* To see pricing for africa-south1, me-central1, me-central2, or europe-west12, see Google Cloud SKUs.
Pricing example
Consider a Cloud Data Fusion instance that has been running for 24 hours, with no free hours remaining for the Basic edition. The Cloud Data Fusion instance charge, based on the edition, is summarized in the following table:
Edition | Cost per hour | Number of hours | Development cost |
---|---|---|---|
Developer | $0.35 | 24 | 24 × $0.35 = $8.40 |
Basic | $1.80 | 24 | 24 × $1.80 = $43.20 |
Enterprise | $4.20 | 24 | 24 × $4.20 = $100.80 |
During this 24-hour period, you ran a pipeline that read raw data from Cloud Storage, performed transformations, and wrote the data to BigQuery every hour. Each run took approximately 15 minutes to complete. In other words, the Dataproc clusters that were created for these runs were alive for 15 minutes (0.25 hours) each. Assume that the configuration of each Dataproc cluster was the following:
Item | Machine Type | Virtual CPUs | Attached Persistent Disk | Number in cluster |
---|---|---|---|---|
Master Node | n1-standard-4 | 4 | 500 GB | 1 |
Worker Nodes | n1-standard-4 | 4 | 500 GB | 5 |
The Dataproc clusters each have 24 virtual CPUs: 4 for the master and 20 spread across the workers. For Dataproc billing purposes, the pricing for this cluster would be based on those 24 virtual CPUs and the length of time each cluster ran.
Across all runs of your pipeline, the total charge incurred for Dataproc can be calculated as:
Dataproc charge = number of vCPUs × number of clusters × hours per cluster × Dataproc price = 24 × 24 × 0.25 × $0.01 = $1.44
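A sketch that reproduces this calculation and adds the development charges from the earlier table (it assumes the example's $0.01 per vCPU per hour Dataproc rate, and excludes the Compute Engine, persistent disk, and storage charges described next):

```python
# Sketch: reproduce the example's Dataproc charge and per-edition totals.
# Assumes the $0.01 per vCPU per hour Dataproc rate used in the example;
# Compute Engine, persistent disk, and storage charges are not included.

VCPUS_PER_CLUSTER = 4 * 1 + 4 * 5  # one n1-standard-4 master + five n1-standard-4 workers = 24 vCPUs
NUM_CLUSTERS = 24                  # one pipeline run (and one cluster) per hour for 24 hours
HOURS_PER_CLUSTER = 0.25           # each cluster ran for about 15 minutes
DATAPROC_RATE = 0.01               # example Dataproc price per vCPU per hour, in USD

dataproc_charge = VCPUS_PER_CLUSTER * NUM_CLUSTERS * HOURS_PER_CLUSTER * DATAPROC_RATE
print(f"Dataproc charge: ${dataproc_charge:.2f}")  # Dataproc charge: $1.44

# Development cost plus Dataproc charge for the 24-hour example:
for edition, rate in [("Developer", 0.35), ("Basic", 1.80), ("Enterprise", 4.20)]:
    print(f"{edition}: ${24 * rate + dataproc_charge:.2f}")
# Developer: $9.84, Basic: $44.64, Enterprise: $102.24
```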
The Dataproc clusters use other Google Cloud products, which are billed separately. Specifically, these clusters incur charges for Compute Engine and Standard Persistent Disk Provisioned Space. You also incur storage charges for Cloud Storage and BigQuery, depending on the amount of data your pipeline processes.
To determine these additional costs based on current rates, you can use the pricing calculator.
What's next
- Read the Cloud Data Fusion documentation.
- Get started with Cloud Data Fusion.
- Try the Pricing calculator.