Create materialized view replicas
This document describes how to create materialized view replicas in BigQuery. A materialized view replica is a replication of external Amazon Simple Storage Service (Amazon S3), Apache Iceberg, or Salesforce Data Cloud data in a BigQuery dataset so that the data is available locally in BigQuery, which can help you avoid data egress costs and improve query performance.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
- Ensure that you have the required Identity and Access Management (IAM) permissions to perform the tasks in this document.
Required roles
To get the permissions that you need to perform the tasks in this document,
ask your administrator to grant you the
BigQuery Admin (roles/bigquery.admin
) IAM role.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to perform the tasks in this document. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to perform the tasks in this document:
-
bigquery.tables.create
-
bigquery.tables.get
-
bigquery.tables.getData
-
bigquery.tables.replicateData
-
bigquery.jobs.create
You might also be able to get these permissions with custom roles or other predefined roles.
For more information about BigQuery IAM, see Introduction to IAM in BigQuery.
Prepare a dataset for materialized view replicas
Before creating a materialized view replica, you must complete the following tasks:
- Create a dataset in a region that supports Amazon S3
- Create a source table in the dataset you created in the preceding step. The
source table can be any of the following table types:
- An Amazon S3 BigLake table that has metadata caching enabled and doesn't use an Iceberg file format.
- A BigLake external table for Apache Iceberg.
- A Data Cloud table.
Create materialized view replicas
Select one of the following options:
Console
In the Google Cloud console, go to the BigQuery page.
In the Explorer pane, navigate to the project and dataset where you want to create the materialized view replica, and then click > Create table.
View actionsIn the Source section of the Create table dialog, do the following:
- For Create table from, select Existing table/view.
- For Project, enter the project where the source table or view is located.
- For Dataset, enter the dataset where the source table or view is located.
- For View, enter the source table or view that you are replicating. If you choose a view, it must be an authorized view, or if not, all tables that are used to generate that view must be located in the view's dataset.
Optional: For Local materialized view max staleness, enter a
max_staleness
value for your local materialized view.In the Destination section of the Create table dialog, do the following:
- For Project, enter the project in which you want to create the materialized view replica.
- For Dataset, enter the dataset in which you want to create the materialized view replica.
- For Replica materialized view name, enter a name for your replica.
Optional: Specify tags and advanced options for your materialized view replica. If you don't specify a dataset for Local Materialized View Dataset, then one is automatically created in the same project and region as the source data and named
bq_auto_generated_local_mv_dataset
. If you don't specify a name for Local Materialized View Name, then one is automatically created in the same project and region as the source data and given the prefixbq_auto_generated_local_mv_
.Click Create table.
A new local materialized view is created (if it wasn't specified) and authorized in the source dataset. Then the materialized view replica is created in the destination dataset.
SQL
- Create a materialized view over the base table in the dataset that you created. You can also create the materialized view in a different dataset that is in an Amazon S3 region.
- Authorize the materialized view on the datasets that contain the source tables used in the query that created the materialized view.
- If you configured manual metadata cache refreshing for the source table,
run the
BQ.REFRESH_EXTERNAL_METADATA_CACHE
system procedure to refresh the metadata cache. - Run the
BQ.REFRESH_MATERIALIZED_VIEW
system procedure to refresh the materialized view. Create materialized view replicas by using the
CREATE MATERIALIZED VIEW AS REPLICA OF
statement:CREATE MATERIALIZED VIEW PROJECT_ID.BQ_DATASET.REPLICA_NAME OPTIONS(replication_interval_seconds=REPLICATION_INTERVAL) AS REPLICA OF PROJECT_ID.S3_DATASET.MATERIALIZED_VIEW_NAME;
Replace the following:
PROJECT_ID
: the name of your project in which you want to create the materialized view replica—for example,myproject
.BQ_DATASET
: the name of the BigQuery dataset that you want to create the materialized view replica in—for example,bq_dataset
. The dataset must be in the BigQuery region that maps to the region of the source materialized view.REPLICA_NAME
: the name of the materialized view replica that you want to create—for example,my_mv_replica
.REPLICATION_INTERVAL
: specifies how often to replicate the data from the source materialized view to the replica, in seconds. Must be a value between 60 and 3,600, inclusive. Defaults to 300 (5 minutes).S3_DATASET
: the name of the dataset that contains the source materialized view—for example,s3_dataset
.MATERIALIZED_VIEW_NAME
: the name of the materialized view to replicate—for example,my_mv
.
The following example creates a materialized view replica named
mv_replica
inbq_dataset
:CREATE MATERIALIZED VIEW `myproject.bq_dataset.mv_replica` OPTIONS( replication_interval_seconds=600 ) AS REPLICA OF `myproject.s3_dataset.my_s3_mv`
After you create the materialized view replica, the replication process polls
the source materialized view for changes and replicates data to the materialized
view replica, refreshing the data at the interval you specified in the
replication_interval_seconds
or max_staleness
option. If you query the
replica before the first backfill completes, you get a backfill in progress
error. You can query the data in the materialized view replica after the first
replication completes.