DataDiscoverySpec

Spec for a data discovery scan.

JSON representation

{
  "bigqueryPublishingConfig": {
    object (BigQueryPublishingConfig)
  },

  // Union field resource_config can be only one of the following:
  "storageConfig": {
    object (StorageConfig)
  }
  // End of list of possible types for union field resource_config.
}

Fields
`bigqueryPublishingConfig`	`object (BigQueryPublishingConfig)` Optional. Configuration for metadata publishing.
Union field `resource_config`. The configurations of the data discovery scan resource. `resource_config` can be only one of the following:
`storageConfig`	`object (StorageConfig)` Cloud Storage related configurations.
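
As an illustration of the structure above, the following sketch assembles a `DataDiscoverySpec`-shaped payload as a plain Python dictionary. The helper name `build_data_discovery_spec` and the sample values are hypothetical; only the field names come from this reference.

```python
import json


def build_data_discovery_spec(storage_config=None, bigquery_publishing_config=None):
    """Assemble a DataDiscoverySpec-shaped dict, omitting unset optional fields."""
    spec = {}
    if bigquery_publishing_config is not None:
        spec["bigqueryPublishingConfig"] = bigquery_publishing_config
    # storageConfig is the only documented member of the resource_config union.
    if storage_config is not None:
        spec["storageConfig"] = storage_config
    return spec


# Sample values are illustrative only.
spec = build_data_discovery_spec(
    storage_config={"includePatterns": ["sales/*.csv"], "csvOptions": {"headerRows": 1}},
    bigquery_publishing_config={"tableType": "BIGLAKE"},
)
print(json.dumps(spec, indent=2))
```

The resulting dictionary serializes to the JSON representation shown above and could be embedded in a larger request body.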

BigQueryPublishingConfig

Describes BigQuery publishing configurations.

JSON representation
{
  "tableType": enum (TableType),
  "connection": string,
  "location": string
}

Fields
`tableType`	`enum (TableType)` Optional. Determines whether to publish discovered tables as BigLake external tables or non-BigLake external tables.
`connection`	`string` Optional. The BigQuery connection used to create BigLake tables. Must be in the form `projects/{projectId}/locations/{locationId}/connections/{connection_id}`.
`location`	`string` Optional. The location of the BigQuery dataset to publish BigLake external or non-BigLake external tables to.
1. If the Cloud Storage bucket is multi-region, the BigQuery dataset can be in the same multi-region or in any single region included in that multi-region. The datascan can be created in any single region included in that multi-region.
2. If the Cloud Storage bucket is dual-region, the BigQuery dataset can be in any region included in the dual-region, or in a multi-region that includes the dual-region. The datascan can be created in any single region included in that dual-region.
3. If the Cloud Storage bucket is in a single region, the BigQuery dataset can be in the same single region or in any multi-region that includes that region. The datascan is created in the same single region as the bucket.
4. If the BigQuery dataset is in a single region, it must be in the same single region as the datascan.
For supported values, refer to https://cloud.google.com/bigquery/docs/locations#supportedLocations.
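
The documented `connection` format can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `parse_connection` is hypothetical, and the assumption that each path segment may contain any character except `/` is looser than whatever the service actually enforces.

```python
import re

# Pattern derived from the documented form. Allowing any non-"/" characters
# in each segment is an assumption; the service may be stricter.
CONNECTION_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/connections/(?P<connection>[^/]+)"
)


def parse_connection(name):
    """Split a connection resource name into its three components."""
    match = CONNECTION_RE.fullmatch(name)
    if match is None:
        raise ValueError(
            "expected projects/{projectId}/locations/{locationId}"
            "/connections/{connection_id}, got: " + name
        )
    return match.groupdict()


parts = parse_connection("projects/my-project/locations/us/connections/my-conn")
print(parts["location"])
```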

TableType

Determines how discovered tables are published.

Enums
`TABLE_TYPE_UNSPECIFIED`	Table type unspecified.
`EXTERNAL`	Default. Discovered tables are published as BigQuery external tables whose data is accessed using the credentials of the user querying the table.
`BIGLAKE`	Discovered tables are published as BigLake external tables whose data is accessed using the credentials of the associated BigQuery connection.
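
The difference between the two published table types comes down to whose credentials are used to read the data. The sketch below models the enum in Python. The helper `credentials_used` and its return strings are hypothetical, and treating `TABLE_TYPE_UNSPECIFIED` like `EXTERNAL` is an assumption based on `EXTERNAL` being described as the default.

```python
from enum import Enum


class TableType(Enum):
    TABLE_TYPE_UNSPECIFIED = "TABLE_TYPE_UNSPECIFIED"
    EXTERNAL = "EXTERNAL"
    BIGLAKE = "BIGLAKE"


def credentials_used(table_type):
    """Return a label for whose credentials are used to access the data."""
    if table_type is TableType.BIGLAKE:
        # BigLake tables are read with the associated BigQuery connection.
        return "bigquery_connection"
    # Assumption: an unspecified type behaves like EXTERNAL, the documented default.
    return "querying_user"


print(credentials_used(TableType.BIGLAKE))
```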

StorageConfig

Configurations related to Cloud Storage as the data source.

JSON representation
{
  "includePatterns": [
    string
  ],
  "excludePatterns": [
    string
  ],
  "csvOptions": {
    object (CsvOptions)
  },
  "jsonOptions": {
    object (JsonOptions)
  }
}

Fields
`includePatterns[]`	`string` Optional. Defines the data to include during discovery when only a subset of the data should be considered. Provide a list of patterns that identify the data to include. For Cloud Storage bucket assets, these patterns are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these patterns are interpreted as patterns to match table names.
`excludePatterns[]`	`string` Optional. Defines the data to exclude during discovery. Provide a list of patterns that identify the data to exclude. For Cloud Storage bucket assets, these patterns are interpreted as glob patterns used to match object names. For BigQuery dataset assets, these patterns are interpreted as patterns to match table names.
`csvOptions`	`object (CsvOptions)` Optional. Configuration for CSV data.
`jsonOptions`	`object (JsonOptions)` Optional. Configuration for JSON data.
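
To get a feel for how include and exclude patterns narrow down a set of object names, the following sketch applies them locally with Python's `fnmatch` module. It rests on several assumptions that this reference does not confirm: that `fnmatch` semantics approximate the service's glob matching, that an empty include list means everything is included, and that exclusion takes precedence over inclusion. The function name `select_objects` is hypothetical.

```python
from fnmatch import fnmatchcase


def select_objects(object_names, include_patterns=(), exclude_patterns=()):
    """Filter object names by glob-style include and exclude patterns."""
    selected = []
    for name in object_names:
        # Assumption: no include patterns means every object is a candidate.
        included = not include_patterns or any(
            fnmatchcase(name, pattern) for pattern in include_patterns
        )
        # Assumption: an exclude match wins over an include match.
        excluded = any(fnmatchcase(name, pattern) for pattern in exclude_patterns)
        if included and not excluded:
            selected.append(name)
    return selected


objects = ["sales/2024/jan.csv", "sales/2024/feb.json", "tmp/scratch.csv"]
kept = select_objects(objects, include_patterns=["sales/*"], exclude_patterns=["*.json"])
print(kept)
```

Note that in `fnmatch`, `*` also matches `/`, which may differ from how the service treats path separators.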

CsvOptions

Describes CSV and similar semi-structured data formats.

JSON representation
{
  "headerRows": integer,
  "delimiter": string,
  "encoding": string,
  "typeInferenceDisabled": boolean,
  "quote": string
}

Fields
`headerRows`	`integer` Optional. The number of rows to interpret as header rows that should be skipped when reading data rows.
`delimiter`	`string` Optional. The delimiter that is used to separate values. The default is `,` (comma).
`encoding`	`string` Optional. The character encoding of the data. The default is UTF-8.
`typeInferenceDisabled`	`boolean` Optional. Whether to disable the inference of data types for CSV data. If true, all columns are registered as strings.
`quote`	`string` Optional. The character used to quote column values. Accepts `"` (double quotation mark) or `'` (single quotation mark). If unspecified, defaults to `"` (double quotation mark).
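
The effect of these options can be approximated locally with Python's standard `csv` module. This is only an illustration of what each option means, not a description of how the scan itself parses files; the function name `read_csv_rows` and the sample data are hypothetical.

```python
import csv
import io


def read_csv_rows(raw_bytes, header_rows=0, delimiter=",", encoding="utf-8", quote='"'):
    """Decode CSV bytes and return the data rows, skipping the header rows."""
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter, quotechar=quote)
    rows = list(reader)
    # Every value stays a string, similar in spirit to typeInferenceDisabled = true.
    return rows[header_rows:]


sample = b"id|name\n1|'Doe, Jane'\n"
rows = read_csv_rows(sample, header_rows=1, delimiter="|", quote="'")
print(rows)
```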

JsonOptions

Describes JSON data format.

JSON representation
{
  "encoding": string,
  "typeInferenceDisabled": boolean
}

Fields
`encoding`	`string` Optional. The character encoding of the data. The default is UTF-8.
`typeInferenceDisabled`	`boolean` Optional. Whether to disable the inference of data types for JSON data. If true, all columns are registered as their primitive types (string, number, or boolean).
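
One way to picture registering columns by primitive type is to classify each JSON value as a string, number, or boolean. The sketch below does this for a single record. It is an interpretation rather than the service's actual logic: the function name `primitive_type` is hypothetical, and labeling nulls, arrays, and nested objects as `"other"` is an assumption, since this reference does not say how they are handled.

```python
import json


def primitive_type(value):
    """Classify a decoded JSON value as string, number, boolean, or other."""
    # bool must be checked before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    # Assumption: nulls, arrays, and objects fall outside the three primitives.
    return "other"


record = json.loads('{"id": 7, "name": "Ada", "active": true, "joined": "2024-01-31"}')
column_types = {key: primitive_type(value) for key, value in record.items()}
print(column_types)
```

In this sketch a date-like string such as `joined` stays a string, because no further inference is attempted.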
