Client(
project=None,
credentials=None,
_http=None,
location=None,
default_query_job_config=None,
client_info=None,
client_options=None,
)
Client to bundle configuration needed for API requests.
Parameters
Name | Description |
project | str. Project ID for the project which the client acts on behalf of. Will be passed when creating a dataset / job. If not passed, falls back to the default inferred from the environment. |
credentials | google.auth.credentials.Credentials. (Optional) The OAuth2 credentials to use for this client. If not passed (and if no ``_http`` object is passed), falls back to the default inferred from the environment. |
_http | requests.Session. (Optional) HTTP object to make requests. Can be any object that defines ``request()`` with the same interface as ``requests.Session.request``. If not passed, an ``_http`` object is created that is bound to the ``credentials`` for the current client. |
location | str. (Optional) Default location for jobs / datasets / tables. |
default_query_job_config | google.cloud.bigquery.job.QueryJobConfig. (Optional) Default ``QueryJobConfig``. Will be merged into job configs passed into the ``query`` method. |
client_info | google.api_core.client_info.ClientInfo. The client info used to send a user-agent string along with API requests. If ``None``, default info is used. |
client_options | Union[`google.api_core.client_options.ClientOptions`, dict]. (Optional) Client options used to set user options on the client. The API endpoint should be set through client_options. |
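A minimal construction sketch (the project ID and location are placeholders; credentials fall back to the environment, e.g. GOOGLE_APPLICATION_CREDENTIALS):
from google.cloud import bigquery

client = bigquery.Client(project="my-project", location="US")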
Inheritance
builtins.object > google.cloud.client._ClientFactoryMixin > google.cloud.client.Client > builtins.object > google.cloud.client._ClientProjectMixin > google.cloud.client.ClientWithProject > Client
Properties
location
Default location for jobs / datasets / tables.
Methods
__init__
__init__(
project=None,
credentials=None,
_http=None,
location=None,
default_query_job_config=None,
client_info=None,
client_options=None,
)
Initialize self. See help(type(self)) for accurate signature.
cancel_job
cancel_job(job_id, project=None, location=None, retry=<google.api_core.retry.Retry object>)
Attempt to cancel a job from a job ID.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
Name | Description |
job_id | str. Unique job identifier. |
project | str. (Optional) ID of the project which owns the job (defaults to the client's project). |
location | str. Location where the job was run. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
Union[google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob] | Job instance, based on the resource returned by the API. |
close
close()
Clean up transport, if set.
Suggested use:
import contextlib

with contextlib.closing(client):  # closes on exit
    do_something_with(client)
copy_table
copy_table(sources, destination, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Copy one or more tables to another table.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.copy
Name | Description |
sources | Union[Table, TableReference, str, Sequence[Union[Table, TableReference, str]]]. Table or tables to be copied. |
destination | Union[Table, TableReference, str]. Table into which data is to be copied. |
Type | Description |
google.cloud.bigquery.job.CopyJob | A new copy job instance. |
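A short usage sketch, assuming ``client`` is an existing Client and both table IDs are placeholders for tables that exist:
source = "my-project.my_dataset.source_table"      # placeholder
destination = "my-project.my_dataset.dest_table"   # placeholder
copy_job = client.copy_table(source, destination)
copy_job.result()  # block until the copy job completes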
create_dataset
create_dataset(dataset, exists_ok=False, retry=<google.api_core.retry.Retry object>)
API call: create the dataset via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/insert
Name | Description |
dataset | Union[Dataset, DatasetReference, str]. A Dataset to create. If ``dataset`` is a reference, an empty dataset is created with the specified ID and client's default location. |
exists_ok | bool. Defaults to False. If True, ignore "already exists" errors when creating the dataset. |
retry | google.api_core.retry.Retry. Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | A new ``Dataset`` returned from the API. |
Example:
>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> dataset = bigquery.Dataset(client.dataset('my_dataset'))
>>> dataset = client.create_dataset(dataset)
create_routine
create_routine(routine, exists_ok=False, retry=<google.api_core.retry.Retry object>)
[Beta] Create a routine via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/insert
Name | Description |
routine | Routine. A Routine to create. The dataset that the routine belongs to must already exist. |
exists_ok | bool. Defaults to False. If True, ignore "already exists" errors when creating the routine. |
retry | google.api_core.retry.Retry. Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.routine.Routine | A new ``Routine`` returned from the service. |
create_table
create_table(table, exists_ok=False, retry=<google.api_core.retry.Retry object>)
API call: create a table via a PUT request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/insert
Name | Description |
table | Union[Table, TableReference, str]. A Table to create. If ``table`` is a reference, an empty table is created with the specified ID. The dataset that the table belongs to must already exist. |
exists_ok | bool. Defaults to False. If True, ignore "already exists" errors when creating the table. |
retry | google.api_core.retry.Retry. Optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.Table | A new ``Table`` returned from the service. |
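A sketch of creating a table with an explicit schema (dataset and table IDs are placeholders):
from google.cloud import bigquery

schema = [
    bigquery.SchemaField("full_name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="NULLABLE"),
]
table_ref = client.dataset("my_dataset").table("my_table")
table = client.create_table(bigquery.Table(table_ref, schema=schema))  # API request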
dataset
dataset(dataset_id, project=None)
Construct a reference to a dataset.
Name | Description |
dataset_id |
str
ID of the dataset. |
project |
str
(Optional) project ID for the dataset (defaults to the project of the client). |
Type | Description |
DatasetReference | a new ``DatasetReference`` instance |
delete_dataset
delete_dataset(dataset, delete_contents=False, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
Delete a dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
Name | Description |
dataset | Union[Dataset, DatasetReference, str]. A reference to the dataset to delete. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
delete_contents | bool. (Optional) If True, delete all the tables in the dataset. If False and the dataset contains tables, the request will fail. Default is False. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
not_found_ok | bool. Defaults to False. If True, ignore "not found" errors when deleting the dataset. |
delete_model
delete_model(model, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
[Beta] Delete a model.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/models/delete
Name | Description |
model | Union[Model, ModelReference, str]. A reference to the model to delete. If a string is passed in, this method attempts to create a model reference from a string using from_string. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
not_found_ok | bool. Defaults to False. If True, ignore "not found" errors when deleting the model. |
delete_routine
delete_routine(routine, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
[Beta] Delete a routine.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/delete
Name | Description |
routine | Union[Routine, RoutineReference, str]. A reference to the routine to delete. If a string is passed in, this method attempts to create a routine reference from a string using from_string. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
not_found_ok | bool. Defaults to False. If True, ignore "not found" errors when deleting the routine. |
delete_table
delete_table(table, retry=<google.api_core.retry.Retry object>, not_found_ok=False)
Delete a table.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete
Name | Description |
table | Union[Table, TableReference, str]. A reference to the table to delete. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
not_found_ok | bool. Defaults to False. If True, ignore "not found" errors when deleting the table. |
extract_table
extract_table(source, destination_uris, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Start a job to extract a table into Cloud Storage files.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.extract
Name | Description |
source | TableReference. Table to be extracted. |
destination_uris | Union[str, Sequence[str]]. URIs of Cloud Storage file(s) into which table data is to be extracted; in format ``gs://<bucket_name>/<object_name_or_glob>``. |
job_id | str. (Optional) The ID of the job. |
job_id_prefix | str. (Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of the source table. |
project | str. Project ID of the project where to run the job. Defaults to the client's project. |
job_config | google.cloud.bigquery.job.ExtractJobConfig. (Optional) Extra configuration options for the job. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.ExtractJob | A new extract job instance. |
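A sketch of exporting a table to Cloud Storage as sharded CSV (bucket, dataset, and table names are placeholders):
source_ref = client.dataset("my_dataset").table("my_table")
extract_job = client.extract_table(
    source_ref,
    "gs://my-bucket/my_table-*.csv",  # placeholder bucket
    location="US",  # must match the source table's location
)
extract_job.result()  # block until the export completes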
from_service_account_info
from_service_account_info(info, *args, **kwargs)
Factory to retrieve JSON credentials while creating client.
Name | Description |
info | dict. The JSON object with a private key and other credentials information (downloaded from the Google APIs console). |
args | tuple. Remaining positional arguments to pass to constructor. |
Type | Description |
TypeError | if there is a conflict with the kwargs and the credentials created by the factory. |
Type | Description |
`_ClientFactoryMixin` | The client created with the retrieved JSON credentials. |
from_service_account_json
from_service_account_json(json_credentials_path, *args, **kwargs)
Factory to retrieve JSON credentials while creating client.
Name | Description |
json_credentials_path | str. The path to a private key file (this file was given to you when you created the service account). This file must contain a JSON object with a private key and other credentials information (downloaded from the Google APIs console). |
args | tuple. Remaining positional arguments to pass to constructor. |
Type | Description |
TypeError | if there is a conflict with the kwargs and the credentials created by the factory. |
Type | Description |
`_ClientFactoryMixin` | The client created with the retrieved JSON credentials. |
get_dataset
get_dataset(dataset_ref, retry=<google.api_core.retry.Retry object>)
Fetch the dataset referenced by dataset_ref.
Name | Description |
dataset_ref |
Union[ DatasetReference, str, ]
A reference to the dataset to fetch from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | A ``Dataset`` instance. |
get_job
get_job(job_id, project=None, location=None, retry=<google.api_core.retry.Retry object>)
Fetch a job for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
Name | Description |
job_id | str. Unique job identifier. |
project | str. (Optional) ID of the project which owns the job (defaults to the client's project). |
location | str. Location where the job was run. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
Union[google.cloud.bigquery.job.LoadJob, google.cloud.bigquery.job.CopyJob, google.cloud.bigquery.job.ExtractJob, google.cloud.bigquery.job.QueryJob] | Job instance, based on the resource returned by the API. |
get_model
get_model(model_ref, retry=<google.api_core.retry.Retry object>)
[Beta] Fetch the model referenced by model_ref.
Name | Description |
model_ref |
Union[ ModelReference, str, ]
A reference to the model to fetch from the BigQuery API. If a string is passed in, this method attempts to create a model reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.model.Model | A ``Model`` instance. |
get_routine
get_routine(routine_ref, retry=<google.api_core.retry.Retry object>)
[Beta] Get the routine referenced by routine_ref.
Name | Description |
routine_ref |
Union[ Routine, RoutineReference, str, ]
A reference to the routine to fetch from the BigQuery API. If a string is passed in, this method attempts to create a reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the API call. |
Type | Description |
google.cloud.bigquery.routine.Routine | A ``Routine`` instance. |
get_service_account_email
get_service_account_email(project=None)
Get the email address of the project's BigQuery service account.
.. note::
This is the service account that BigQuery uses to manage tables encrypted by a key in KMS.
Name | Description |
project | str, optional. Project ID to use for retrieving the service account email. Defaults to the client's project. |
Type | Description |
str | Service account email address. |
Example:
>>> from google.cloud import bigquery
>>> client = bigquery.Client()
>>> client.get_service_account_email()
my_service_account@my-project.iam.gserviceaccount.com
get_table
get_table(table, retry=<google.api_core.retry.Retry object>)
Fetch the table referenced by table.
Name | Description |
table |
Union[ Table, TableReference, str, ]
A reference to the table to fetch from the BigQuery API. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
retry |
`google.api_core.retry.Retry`
(Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.Table | A ``Table`` instance. |
insert_rows
insert_rows(table, rows, selected_fields=None, **kwargs)
Insert rows into a table via the streaming API.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll
Name | Description |
table | Union[Table, TableReference, str]. The destination table for the row data, or a reference to it. |
rows | Union[Sequence[Tuple], Sequence[dict]]. Row data to be inserted. If a list of tuples is given, each tuple should contain data for each schema field on the current table and in the same order as the schema fields. If a list of dictionaries is given, the keys must include all required fields in the schema. Keys which do not correspond to a field in the schema are ignored. |
selected_fields | Sequence[SchemaField]. The fields to return. Required if ``table`` is a TableReference. |
kwargs | dict. Keyword arguments to insert_rows_json. |
Type | Description |
ValueError | if table's schema is not set |
Type | Description |
Sequence[Mappings] | One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
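A streaming-insert sketch with tuple rows (the table ID is a placeholder; fetching the table first makes its schema available locally):
table = client.get_table("my-project.my_dataset.my_table")
rows = [("Alice", 30), ("Bob", 25)]  # one value per schema field, in order
errors = client.insert_rows(table, rows)
if errors:
    print("Row insert errors:", errors)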
insert_rows_from_dataframe
insert_rows_from_dataframe(
table, dataframe, selected_fields=None, chunk_size=500, **kwargs
)
Insert rows into a table from a dataframe via the streaming API.
Name | Description |
table | Union[Table, TableReference, str]. The destination table for the row data, or a reference to it. |
dataframe | pandas.DataFrame. A pandas.DataFrame containing the data to load. |
selected_fields | Sequence[SchemaField]. The fields to return. Required if ``table`` is a TableReference. |
chunk_size | int. The number of rows to stream in a single chunk. Must be positive. |
kwargs | dict. Keyword arguments to insert_rows_json. |
Type | Description |
ValueError | if table's schema is not set |
Type | Description |
Sequence[Sequence[Mappings]] | A list with insert errors for each insert chunk. Each element is a list containing one mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
insert_rows_json
insert_rows_json(table, json_rows, row_ids=None, skip_invalid_rows=None, ignore_unknown_values=None, template_suffix=None, retry=<google.api_core.retry.Retry object>)
Insert rows into a table without applying local type conversions.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll
Name | Description |
table | Union[Table, TableReference, str]. The destination table for the row data, or a reference to it. |
json_rows | Sequence[dict]. Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations. |
row_ids | Sequence[str]. (Optional) Unique IDs, one per row being inserted. If omitted, unique IDs are created. |
skip_invalid_rows | bool. (Optional) Insert all valid rows of a request, even if invalid rows exist. The default value is False, which causes the entire request to fail if any invalid rows exist. |
ignore_unknown_values | bool. (Optional) Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is False, which treats unknown values as errors. |
template_suffix | str. (Optional) Treat ``name`` as a template table and provide a suffix. BigQuery will create the table ``<name> + <template_suffix>`` based on the schema of the template table. See https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
Sequence[Mappings] | One mapping per row with insert errors: the "index" key identifies the row, and the "errors" key contains a list of the mappings describing one or more problems with the row. |
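A sketch of streaming JSON rows (the table ID and field names are placeholders matching an assumed schema):
rows_to_insert = [
    {"full_name": "Alice", "age": 30},
    {"full_name": "Bob", "age": 25},
]
errors = client.insert_rows_json("my-project.my_dataset.my_table", rows_to_insert)
if errors:
    print("Row insert errors:", errors)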
job_from_resource
job_from_resource(resource)
Detect correct job type from resource and instantiate.
Name | Description |
resource |
dict
one job resource from API response |
Type | Description |
One of: LoadJob, CopyJob, ExtractJob, or QueryJob | the job instance, constructed via the resource |
list_datasets
list_datasets(project=None, include_all=False, filter=None, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List datasets for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list
Name | Description |
project | str. Optional. Project ID to use for retrieving datasets. Defaults to the client's project. |
include_all | bool. Optional. True if results include hidden datasets. Defaults to False. |
filter | str. Optional. An expression for filtering the results by label. For syntax, see https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/list#filter. |
max_results | int. Optional. Maximum number of datasets to return. |
page_token | str. Optional. Token representing a cursor into the datasets. If not passed, the API will return the first page of datasets. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
retry | google.api_core.retry.Retry. Optional. How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of DatasetListItem associated with the project. |
list_jobs
list_jobs(project=None, max_results=None, page_token=None, all_users=None, state_filter=None, retry=<google.api_core.retry.Retry object>, min_creation_time=None, max_creation_time=None)
List jobs for the project associated with this client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/list
Name | Description |
project | str, optional. Project ID to use for retrieving jobs. Defaults to the client's project. |
max_results | int, optional. Maximum number of jobs to return. |
page_token | str, optional. Opaque marker for the next "page" of jobs. If not passed, the API will return the first page of jobs. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
all_users | bool, optional. If true, include jobs owned by all users in the project. Defaults to False. |
state_filter | str, optional. If set, include only jobs matching the given state. One of: "done", "pending", "running". |
retry | google.api_core.retry.Retry, optional. How to retry the RPC. |
min_creation_time | datetime.datetime, optional. Min value for job creation time. If set, only jobs created after or at this timestamp are returned. If the datetime has no time zone, UTC is assumed. |
max_creation_time | datetime.datetime, optional. Max value for job creation time. If set, only jobs created before or at this timestamp are returned. If the datetime has no time zone, UTC is assumed. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterable of job instances. |
list_models
list_models(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
[Beta] List models in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/models/list
Name | Description |
dataset | Union[Dataset, DatasetReference, str]. A reference to the dataset whose models to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results | int. (Optional) Maximum number of models to return. If not passed, defaults to a value set by the API. |
page_token | str. (Optional) Token representing a cursor into the models. If not passed, the API will return the first page of models. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of Model contained within the requested dataset. |
list_partitions
list_partitions(table, retry=<google.api_core.retry.Retry object>)
List the partitions in a table.
Name | Description |
table | Union[Table, TableReference, str]. The table or reference from which to get partition info. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
List[str] | A list of the partition IDs present in the partitioned table. |
list_projects
list_projects(max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List projects accessible to the current client.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/projects/list
Name | Description |
max_results | int. (Optional) Maximum number of projects to return. If not passed, defaults to a value set by the API. |
page_token | str. (Optional) Token representing a cursor into the projects. If not passed, the API will return the first page of projects. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
`google.api_core.page_iterator.Iterator` | Iterator of Project accessible to the current client. |
list_routines
list_routines(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
[Beta] List routines in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/routines/list
Name | Description |
dataset | Union[Dataset, DatasetReference, str]. A reference to the dataset whose routines to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results | int. (Optional) Maximum number of routines to return. If not passed, defaults to a value set by the API. |
page_token | str. (Optional) Token representing a cursor into the routines. If not passed, the API will return the first page of routines. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of all Routines contained within the requested dataset, limited by ``max_results``. |
list_rows
list_rows(table, selected_fields=None, max_results=None, page_token=None, start_index=None, page_size=None, retry=<google.api_core.retry.Retry object>)
List the rows of the table.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/list
.. note:: This method assumes that the provided schema is up-to-date with the schema as defined on the back-end: if the two schemas are not identical, the values returned may be incomplete. To ensure that the local copy of the schema is up-to-date, call client.get_table.
Name | Description |
table | Union[Table, TableListItem, TableReference, str]. The table to list, or a reference to it. When the table object does not contain a schema and ``selected_fields`` is not supplied, this method calls get_table to fetch the table schema. |
selected_fields | Sequence[SchemaField]. The fields to return. If not supplied, data for all columns are downloaded. |
max_results | int. (Optional) Maximum number of rows to return. |
page_token | str. (Optional) Token representing a cursor into the table's rows. If not passed, the API will return the first page of the rows. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
start_index | int. (Optional) The zero-based index of the starting row to read. |
page_size | int. Optional. The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.table.RowIterator | Iterator of Row instances. During each page, the iterator will have the ``total_rows`` attribute set, which counts the total number of rows **in the table** (this is distinct from the total number of rows in the current page: ``iterator.page.num_items``). |
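A paging sketch (the table ID is a placeholder):
rows = client.list_rows("my-project.my_dataset.my_table", max_results=10)
print("Total rows in table:", rows.total_rows)
for row in rows:
    print(row)  # values are accessible by field name or index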
list_tables
list_tables(dataset, max_results=None, page_token=None, retry=<google.api_core.retry.Retry object>)
List tables in the dataset.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list
Name | Description |
dataset | Union[Dataset, DatasetReference, str]. A reference to the dataset whose tables to list from the BigQuery API. If a string is passed in, this method attempts to create a dataset reference from a string using from_string. |
max_results | int. (Optional) Maximum number of tables to return. If not passed, defaults to a value set by the API. |
page_token | str. (Optional) Token representing a cursor into the tables. If not passed, the API will return the first page of tables. The token marks the beginning of the iterator to be returned, and the value of the ``page_token`` can be accessed at ``next_page_token`` of the returned iterator. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.api_core.page_iterator.Iterator | Iterator of TableListItem contained within the requested dataset. |
load_table_from_dataframe
load_table_from_dataframe(
dataframe,
destination,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
parquet_compression="snappy",
)
Upload the contents of a table from a pandas DataFrame.
Similar to load_table_from_uri, this method creates, starts, and returns a LoadJob.
Name | Description |
dataframe | pandas.DataFrame. A pandas.DataFrame containing the data to load. |
destination | google.cloud.bigquery.table.TableReference. The destination table to use for loading the data. If it is an existing table, the schema of the DataFrame must match the schema of the destination table. If the table does not yet exist, the schema is inferred from the DataFrame. |
num_retries | int, optional. Number of upload retries. |
job_id | str, optional. Name of the job. |
job_id_prefix | str, optional. The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of the destination table. |
project | str, optional. Project ID of the project where to run the job. Defaults to the client's project. |
job_config | LoadJobConfig, optional. Extra configuration options for the job. To override the default pandas data type conversions, supply a value for schema with column names matching those of the dataframe. The BigQuery schema is used to determine the correct data type conversion. Indexes are not loaded. Requires the `pyarrow` library. |
parquet_compression | str, optional. [Beta] The compression method to use if intermittently serializing ``dataframe`` to a parquet file. If ``pyarrow`` and job config schema are used, the argument is directly passed as the ``compression`` argument to the underlying ``pyarrow.parquet.write_table()`` method (the default value "snappy" gets converted to uppercase). See https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html. If either ``pyarrow`` or job config schema are missing, the argument is directly passed as the ``compression`` argument to the underlying ``DataFrame.to_parquet()`` method. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_parquet.html. |
Type | Description |
ImportError | If a usable parquet engine cannot be found. This method requires `pyarrow` or `fastparquet` to be installed. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
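A load sketch, assuming pandas and pyarrow are installed (dataset and table IDs are placeholders):
import pandas

dataframe = pandas.DataFrame({"full_name": ["Alice", "Bob"], "age": [30, 25]})
table_ref = client.dataset("my_dataset").table("my_table")
load_job = client.load_table_from_dataframe(dataframe, table_ref)
load_job.result()  # block until the load completes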
load_table_from_file
load_table_from_file(
file_obj,
destination,
rewind=False,
size=None,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
)
Upload the contents of a table from a file-like object.
Similar to load_table_from_uri, this method creates, starts, and returns a LoadJob.
Name | Description |
file_obj | file. A file handle opened in binary mode for reading. |
destination | Union[Table, TableReference, str]. Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
rewind | bool. If True, seek to the beginning of the file handle before reading the file. |
size | int. The number of bytes to read from the file handle. If size is ``None`` or large, a resumable upload will be used. Otherwise, a multipart upload will be used. |
num_retries | int. Number of upload retries. Defaults to 6. |
job_id | str. (Optional) Name of the job. |
job_id_prefix | str. (Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of the destination table. |
project | str. Project ID of the project where to run the job. Defaults to the client's project. |
job_config | google.cloud.bigquery.job.LoadJobConfig. (Optional) Extra configuration options for the job. |
Type | Description |
ValueError | If ``size`` is not passed in and can not be determined, or if the ``file_obj`` can be detected to be a file opened in text mode. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
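A sketch of loading a local CSV file (the file path and table ID are placeholders):
from google.cloud import bigquery

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)
with open("data.csv", "rb") as source_file:  # binary mode is required
    load_job = client.load_table_from_file(
        source_file, "my-project.my_dataset.my_table", job_config=job_config
    )
load_job.result()  # block until the load completes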
load_table_from_json
load_table_from_json(
json_rows,
destination,
num_retries=6,
job_id=None,
job_id_prefix=None,
location=None,
project=None,
job_config=None,
)
Upload the contents of a table from a JSON string or dict.
Name | Description |
json_rows | Iterable[Dict[str, Any]]. Row data to be inserted. Keys must match the table schema fields and values must be JSON-compatible representations. |
destination | Union[Table, TableReference, str]. Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
num_retries | int, optional. Number of upload retries. |
job_id | str. (Optional) Name of the job. |
job_id_prefix | str. (Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of the destination table. |
project | str. Project ID of the project where to run the job. Defaults to the client's project. |
job_config | google.cloud.bigquery.job.LoadJobConfig. (Optional) Extra configuration options for the job. The ``source_format`` setting is always set to NEWLINE_DELIMITED_JSON. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
load_table_from_uri
load_table_from_uri(source_uris, destination, job_id=None, job_id_prefix=None, location=None, project=None, job_config=None, retry=<google.api_core.retry.Retry object>)
Starts a job for loading data into a table from Cloud Storage.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.load
Name | Description |
source_uris | Union[str, Sequence[str]]. URIs of data files to be loaded; in format ``gs://<bucket_name>/<object_name_or_glob>``. |
destination | Union[Table, TableReference, str]. Table into which data is to be loaded. If a string is passed in, this method attempts to create a table reference from a string using from_string. |
job_id | str. (Optional) Name of the job. |
job_id_prefix | str. (Optional) The user-provided prefix for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of the destination table. |
project | str. Project ID of the project where to run the job. Defaults to the client's project. |
job_config | google.cloud.bigquery.job.LoadJobConfig. (Optional) Extra configuration options for the job. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.LoadJob | A new load job. |
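A sketch of loading newline-delimited JSON from Cloud Storage (URI and table ID are placeholders):
from google.cloud import bigquery

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/data.json",
    "my-project.my_dataset.my_table",
    job_config=job_config,
)
load_job.result()  # block until the load completes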
query
query(query, job_config=None, job_id=None, job_id_prefix=None, location=None, project=None, retry=<google.api_core.retry.Retry object>)
Run a SQL query.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query
Name | Description |
query | str. SQL query to be executed. Defaults to the standard SQL dialect. Use the ``job_config`` parameter to change dialects. |
job_config | google.cloud.bigquery.job.QueryJobConfig. (Optional) Extra configuration options for the job. To override any options that were previously set in the ``default_query_job_config`` given to the ``Client`` constructor, manually set those options to ``None``, or whatever value is preferred. |
job_id | str. (Optional) ID to use for the query job. |
job_id_prefix | str. (Optional) The prefix to use for a randomly generated job ID. This parameter will be ignored if a ``job_id`` is also given. |
location | str. Location where to run the job. Must match the location of any table used in the query as well as the destination table. |
project | str. Project ID of the project where to run the job. Defaults to the client's project. |
retry | google.api_core.retry.Retry. (Optional) How to retry the RPC. |
Type | Description |
google.cloud.bigquery.job.QueryJob | A new query job instance. |
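A minimal query sketch:
query_job = client.query("SELECT 1 AS x")  # starts the job
for row in query_job.result():  # result() blocks until the job finishes
    print(row.x)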
schema_from_json
schema_from_json(file_or_path)
Takes a file object or file path that contains JSON describing a table schema, and returns a list of schema field objects.
schema_to_json
schema_to_json(schema_list, destination)
Takes a list of schema field objects and serializes them as JSON to a file. Destination is a file path or a file object.
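A round-trip sketch (the file path is a placeholder):
from google.cloud import bigquery

schema = [
    bigquery.SchemaField("full_name", "STRING"),
    bigquery.SchemaField("age", "INTEGER"),
]
client.schema_to_json(schema, "schema.json")     # serialize to a JSON file
loaded = client.schema_from_json("schema.json")  # read the schema back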
update_dataset
update_dataset(dataset, fields, retry=<google.api_core.retry.Retry object>)
Change some fields of a dataset.
Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in dataset, it will be deleted.
If dataset.etag is not None, the update will only succeed if the dataset on the server has the same ETag. Thus reading a dataset with get_dataset, changing its fields, and then passing it to update_dataset will ensure that the changes will only be saved if no modifications to the dataset occurred since the read.
Name | Description |
dataset | google.cloud.bigquery.dataset.Dataset. The dataset to update. |
fields | Sequence[str]. The properties of ``dataset`` to change (e.g. "friendly_name"). |
retry | google.api_core.retry.Retry, optional. How to retry the RPC. |
Type | Description |
google.cloud.bigquery.dataset.Dataset | The modified ``Dataset`` instance. |
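A read-modify-write sketch (the dataset ID is a placeholder; the ETag read by get_dataset guards the update):
dataset = client.get_dataset(client.dataset("my_dataset"))
dataset.description = "Updated description"
dataset = client.update_dataset(dataset, ["description"])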
update_model
update_model(model, fields, retry=<google.api_core.retry.Retry object>)
[Beta] Change some fields of a model.
Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in model, the field value will be deleted.
If model.etag is not None, the update will only succeed if the model on the server has the same ETag. Thus reading a model with get_model, changing its fields, and then passing it to update_model will ensure that the changes will only be saved if no modifications to the model occurred since the read.
Name | Description |
model | google.cloud.bigquery.model.Model. The model to update. |
fields | Sequence[str]. The fields of ``model`` to change, spelled as the Model properties (e.g. "friendly_name"). |
retry | google.api_core.retry.Retry. (Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.model.Model | The model resource returned from the API call. |
update_routine
update_routine(routine, fields, retry=<google.api_core.retry.Retry object>)
[Beta] Change some fields of a routine.
Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in routine, the field value will be deleted.
.. warning:: During beta, partial updates are not supported. You must provide all fields in the resource.
If etag is not None, the update will only succeed if the resource on the server has the same ETag. Thus reading a routine with get_routine, changing its fields, and then passing it to this method will ensure that the changes will only be saved if no modifications to the resource occurred since the read.
Name | Description |
routine | google.cloud.bigquery.routine.Routine. The routine to update. |
fields | Sequence[str]. The fields of ``routine`` to change, spelled as the Routine properties (e.g. "description"). |
retry | google.api_core.retry.Retry. (Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.routine.Routine | The routine resource returned from the API call. |
update_table
update_table(table, fields, retry=<google.api_core.retry.Retry object>)
Change some fields of a table.
Use fields to specify which fields to update. At least one field must be provided. If a field is listed in fields and is None in table, the field value will be deleted.
If table.etag is not None, the update will only succeed if the table on the server has the same ETag. Thus reading a table with get_table, changing its fields, and then passing it to update_table will ensure that the changes will only be saved if no modifications to the table occurred since the read.
Name | Description |
table | google.cloud.bigquery.table.Table. The table to update. |
fields | Sequence[str]. The fields of ``table`` to change, spelled as the Table properties (e.g. "friendly_name"). |
retry | google.api_core.retry.Retry. (Optional) A description of how to retry the API call. |
Type | Description |
google.cloud.bigquery.table.Table | The table resource returned from the API call. |
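A read-modify-write sketch (the table ID is a placeholder; the ETag read by get_table guards the update):
table = client.get_table("my-project.my_dataset.my_table")
table.description = "Updated description"
table = client.update_table(table, ["description"])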