Introduction to search in BigQuery

BigQuery search indexes let you use GoogleSQL to efficiently find unique data elements that are buried in unstructured text and semi-structured JSON data, without having to know the table schemas in advance.

With search indexes, BigQuery provides a powerful columnar store and text search in one platform, enabling efficient row lookups when you need to find individual rows of data. A common use case is log analytics. For example, you might want to identify the rows of data associated with a user for General Data Protection Regulation (GDPR) reporting, or to find specific error codes in a text payload.

BigQuery stores and manages your indexes, so that when data becomes available in BigQuery, you can immediately retrieve it with the SEARCH function or other operators and functions, such as the equal (=), IN, or LIKE operators and certain string and JSON functions. To optimize your searches, read about best practices.

Use cases

BigQuery search indexes help you perform the following tasks:

  • Search system, network, or application logs stored in BigQuery tables.
  • Identify data elements for deletion to comply with regulatory processes.
  • Support developer troubleshooting.
  • Perform security audits.
  • Create a dashboard that requires highly selective search filters.
  • Search pre-processed data for exact matches.

For more information, see Create a search index and Search with an index.

Pricing

There is no charge for the processing required to build and refresh your search indexes when the total size of indexed tables in your organization is below your region's limit. To support indexing beyond this limit, you need to provide your own reservation for handling the index-management jobs. Search indexes incur storage costs when they are active. You can find the index storage size in the INFORMATION_SCHEMA.SEARCH_INDEXES view.

Roles and permissions

To create a search index, you need the bigquery.tables.createIndex IAM permission on the table where you're creating the index. To drop a search index, you need the bigquery.tables.deleteIndex permission. Each of the following predefined IAM roles includes the permissions that you need to work with search indexes:

  • BigQuery Data Owner (roles/bigquery.dataOwner)
  • BigQuery Data Editor (roles/bigquery.dataEditor)
  • BigQuery Admin (roles/bigquery.admin)

Limitations

  • You cannot create a search index directly on a view or materialized view, but calling the SEARCH function on a view of an indexed table makes use of the underlying search index.
  • If you rename a table after you create a search index on it, the index becomes invalid.
  • The SEARCH function is designed for point lookups. Fuzzy searching, typo correction, wildcards, and other types of document searches are not available.
  • If the search index is not yet at 100% coverage, you are still charged for all index storage that is reported in the INFORMATION_SCHEMA.SEARCH_INDEXES view.
  • Queries that contain the SEARCH function are not accelerated by BigQuery BI Engine.
  • Search indexes are not used when the indexed table is modified by a DML statement, but they can be used when the predicate that is optimizable by search indexes is part of a subquery in a DML statement.

    • A search index is not used in the following query:
    DELETE FROM my_dataset.indexed_table
    WHERE SEARCH(user_id, '123');
    • A search index can be used in the following query:
    DELETE FROM my_dataset.other_table
    WHERE
      user_id IN (
        SELECT user_id
        FROM my_dataset.indexed_table
        WHERE SEARCH(user_id, '123')
      );
  • Search indexes are not used when the query references Materialized Views.

  • Search indexes are not used in a multi-statement transaction query.

What's next