Class QueryJob (3.4.0)

QueryJob(job_id, query, client, job_config=None)

Asynchronous job: query tables.

Parameters
Name	Description
`job_id`	`str` the job's ID, within the project belonging to `client`.
`query`	`str` SQL query string.
`client`	`google.cloud.bigquery.client.Client` A client which holds credentials and project configuration for the dataset (which requires a project).
`job_config`	`Optional[google.cloud.bigquery.job.QueryJobConfig]` Extra configuration options for the query job.

Inheritance

builtins.object > google.api_core.future.base.Future > google.api_core.future.polling.PollingFuture > google.cloud.bigquery.job.base._AsyncJob > QueryJob

Properties

allow_large_results

See QueryJobConfig.allow_large_results.

billing_tier

Return billing tier from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.billing_tier

Returns
Type	Description
`Optional[int]`	Billing tier used by the job, or None if job is not yet complete.

cache_hit

Return whether or not query results were served from cache.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit

Returns
Type	Description
`Optional[bool]`	whether the query results were returned from cache, or None if job is not yet complete.

clustering_fields

See QueryJobConfig.clustering_fields.

connection_properties

See QueryJobConfig.connection_properties.

New in version 2.29.0.

create_disposition

See QueryJobConfig.create_disposition.

create_session

See QueryJobConfig.create_session.

New in version 2.29.0.

created

Datetime at which the job was created.

Returns
Type	Description
`Optional[datetime.datetime]`	the creation time (None until set from the server).

ddl_operation_performed

Optional[str]: Return the DDL operation performed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_operation_performed

ddl_target_routine

Optional[google.cloud.bigquery.routine.RoutineReference]: Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_routine

ddl_target_table

Optional[google.cloud.bigquery.table.TableReference]: Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_table

destination_encryption_configuration

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See QueryJobConfig.destination_encryption_configuration.

dry_run

See QueryJobConfig.dry_run.

ended

Datetime at which the job finished.

Returns
Type	Description
`Optional[datetime.datetime]`	the end time (None until set from the server).

error_result

Error information about the job as a whole.

Returns
Type	Description
`Optional[Mapping]`	the error information (None until set from the server).

errors

Information about individual errors generated by the job.

Returns
Type	Description
`Optional[List[Mapping]]`	the error information (None until set from the server).
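
Both `error_result` and `errors` can be None, so code that reports failures has to guard for that. A minimal sketch follows; `describe_errors` is a hypothetical helper, the `reason` and `message` keys are assumed from the BigQuery error representation, and the `SimpleNamespace` objects are stand-ins for real jobs.

```python
from types import SimpleNamespace
from typing import Any, List


def describe_errors(job: Any) -> List[str]:
    """Return one human-readable line per error reported by the job."""
    lines = []
    # error_result is None until the server reports a job-level failure.
    if job.error_result is not None:
        lines.append("job failed: " + job.error_result.get("message", "<no message>"))
    # errors is also None until set from the server.
    for err in job.errors or []:
        reason = err.get("reason", "unknown")
        message = err.get("message", "<no message>")
        lines.append(f"{reason}: {message}")
    return lines


# Hypothetical stand-ins for a failed job and a job with no error data yet.
syntax_error = {"reason": "invalidQuery", "message": "Syntax error"}
failed = SimpleNamespace(error_result=syntax_error, errors=[syntax_error])
pending = SimpleNamespace(error_result=None, errors=None)

print(describe_errors(failed))
print(describe_errors(pending))
```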

estimated_bytes_processed

Return the estimated number of bytes processed by the query.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.estimated_bytes_processed

Returns
Type	Description
`Optional[int]`	estimated number of bytes processed by the query, or None if job is not yet complete.

etag

ETag for the job resource.

Returns
Type	Description
`Optional[str]`	the ETag (None until set from the server).

flatten_results

See QueryJobConfig.flatten_results.

job_id

str: ID of the job.

job_type

Type of job.

Returns
Type	Description
`str`	one of 'load', 'copy', 'extract', 'query'.

labels

Dict[str, str]: Labels for the job.

location

str: Location where the job runs.

maximum_billing_tier

See QueryJobConfig.maximum_billing_tier.

maximum_bytes_billed

See QueryJobConfig.maximum_bytes_billed.

num_child_jobs

The number of child jobs executed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs

num_dml_affected_rows

Return the number of DML rows affected by the job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.num_dml_affected_rows

Returns
Type	Description
`Optional[int]`	number of DML rows affected by the job, or None if job is not yet complete.

parent_job_id

Return the ID of the parent job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id

Returns
Type	Description
`Optional[str]`	parent job id.

path

URL path for the job's APIs.

Returns
Type	Description
`str`	the path based on project and job ID.

priority

See QueryJobConfig.priority.

project

Project bound to the job.

Returns
Type	Description
`str`	the project (derived from the client).

query

str: The query text used in this query job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.query

query_parameters

See QueryJobConfig.query_parameters.

query_plan

Return query plan from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.query_plan

Returns
Type	Description
`List[google.cloud.bigquery.job.QueryPlanEntry]`	mappings describing the query plan, or an empty list if the query has not yet completed.
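
Because `query_plan` is an empty list until the query completes, it can be iterated without a None check. A minimal sketch follows; `plan_summary` is a hypothetical helper, the `name`, `records_read`, and `records_written` attributes are assumed from `QueryPlanEntry`, and the `SimpleNamespace` objects are stand-ins for real plan entries.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple


def plan_summary(job: Any) -> List[Tuple[str, int, int]]:
    """Return (stage name, records read, records written) per plan stage."""
    return [
        (stage.name, stage.records_read, stage.records_written)
        for stage in job.query_plan
    ]


# Hypothetical stand-ins for a completed job and a job without a plan yet.
completed = SimpleNamespace(
    query_plan=[
        SimpleNamespace(name="S00: Input", records_read=1000, records_written=10),
        SimpleNamespace(name="S01: Output", records_read=10, records_written=1),
    ]
)
not_started = SimpleNamespace(query_plan=[])

for row in plan_summary(completed):
    print(row)
```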

range_partitioning

See QueryJobConfig.range_partitioning.

referenced_tables

Return referenced tables from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.referenced_tables

Returns
Type	Description
`List[Dict]`	mappings describing the referenced tables, or an empty list if the query has not yet completed.

reservation_usage

Job resource usage breakdown by reservation.

Returns
Type	Description
`List[google.cloud.bigquery.job.ReservationUsage]`	Reservation usage stats. Can be empty if not set from the server.

schema

The schema of the results.

Present only for successful dry run of non-legacy SQL queries.

schema_update_options

See QueryJobConfig.schema_update_options.

script_statistics

Statistics for a child job of a script.

self_link

URL for the job resource.

Returns
Type	Description
`Optional[str]`	the URL (None until set from the server).

session_info

[Preview] Information about the session if this job is part of one.

New in version 2.29.0.

slot_millis

Union[int, None]: Slot-milliseconds used by this query job.

started

Datetime at which the job was started.

Returns
Type	Description
`Optional[datetime.datetime]`	the start time (None until set from the server).

state

Status of the job.

Returns
Type	Description
`Optional[str]`	the state (None until set from the server).

statement_type

Return statement type from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.statement_type

Returns
Type	Description
`Optional[str]`	type of statement used by the job, or None if job is not yet complete.

table_definitions

See QueryJobConfig.table_definitions.

time_partitioning

See QueryJobConfig.time_partitioning.

timeline

List[TimelineEntry]: Return the query execution timeline from job statistics.

total_bytes_billed

Return total bytes billed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.total_bytes_billed

Returns
Type	Description
`Optional[int]`	Total bytes billed by the job, or None if job is not yet complete.

total_bytes_processed

Return total bytes processed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.total_bytes_processed

Returns
Type	Description
`Optional[int]`	Total bytes processed by the job, or None if job is not yet complete.
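
The statistics properties (`cache_hit`, `total_bytes_processed`, `total_bytes_billed`) return None until the job completes, so a report has to pass None through. A minimal sketch follows; `usage_summary` and `_to_mib` are hypothetical helpers, and the `SimpleNamespace` objects are stand-ins for real jobs.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

MIB = 2 ** 20


def _to_mib(value: Optional[int]) -> Optional[float]:
    """Convert a byte count to MiB, passing None through unchanged."""
    return None if value is None else value / MIB


def usage_summary(job: Any) -> Dict[str, Any]:
    """Collect cache and byte statistics from a QueryJob-like object."""
    return {
        "cache_hit": job.cache_hit,
        "processed_mib": _to_mib(job.total_bytes_processed),
        "billed_mib": _to_mib(job.total_bytes_billed),
    }


# Hypothetical stand-ins for a finished job and a job that is still running.
finished = SimpleNamespace(
    cache_hit=False,
    total_bytes_processed=5 * MIB,
    total_bytes_billed=10 * MIB,
)
pending = SimpleNamespace(
    cache_hit=None, total_bytes_processed=None, total_bytes_billed=None
)

print(usage_summary(finished))
print(usage_summary(pending))
```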

transaction_info

Information about the multi-statement transaction if this job is part of one.

Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the list_jobs method with the parent_job parameter to iterate over child jobs.

New in version 2.24.0.
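
The child-job iteration described above can be sketched as follows. `transaction_ids` is a hypothetical helper; the `transaction_id` attribute is assumed from the library's `TransactionInfo` type, and `_FakeClient` is a stand-in so the sketch runs without credentials.

```python
from types import SimpleNamespace
from typing import Any, List


def transaction_ids(client: Any, parent_job: Any) -> List[str]:
    """Collect transaction IDs from the child jobs of a script job."""
    ids = []
    for child in client.list_jobs(parent_job=parent_job):
        # Child jobs are not necessarily query jobs, so read defensively.
        info = getattr(child, "transaction_info", None)
        if info is not None:
            ids.append(info.transaction_id)
    return ids


class _FakeClient:
    """Hypothetical stand-in returning two child jobs."""

    def list_jobs(self, parent_job=None):
        return [
            SimpleNamespace(transaction_info=SimpleNamespace(transaction_id="txn-1")),
            SimpleNamespace(transaction_info=None),
        ]


found = transaction_ids(_FakeClient(), parent_job="script-job-id")
print(found)
```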

udf_resources

See QueryJobConfig.udf_resources.

undeclared_query_parameters

Return undeclared query parameters from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.undeclared_query_parameters

Returns
Type	Description
`List[Union[ google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter ]]`	Undeclared parameters, or an empty list if the query has not yet completed.

use_legacy_sql

See QueryJobConfig.use_legacy_sql.

use_query_cache

See QueryJobConfig.use_query_cache.

user_email

E-mail address of user who submitted the job.

Returns
Type	Description
`Optional[str]`	the e-mail address (None until set from the server).

write_disposition

See QueryJobConfig.write_disposition.

bi_engine_stats

API documentation for bigquery.job.QueryJob.bi_engine_stats property.

dml_stats

API documentation for bigquery.job.QueryJob.dml_stats property.

Methods

add_done_callback

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameter
Name	Description
`fn`	`Callable[Future]` The callback to execute when the operation is complete.
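
A minimal sketch of the callback shape. It uses `concurrent.futures.Future` from the standard library as a stand-in, on the assumption that the callback receives the future itself, as it does there; unlike the stand-in, a real job polls for completion in a helper thread.

```python
from concurrent.futures import Future

finished = []


def on_done(future):
    # The callback receives the completed future (for BigQuery, the job).
    finished.append(future)


job = Future()  # stand-in for a QueryJob
job.add_done_callback(on_done)

# Completing the stand-in triggers the callback; a real job completes
# when the background polling observes that it is done.
job.set_result("rows")
print(len(finished))
```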

cancel

cancel(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: cancel job via a POST request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel

Parameters
Name	Description
`timeout`	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
`client`	`Optional[google.cloud.bigquery.client.Client]` The client to use. If not passed, falls back to the `client` stored on the current job.
`retry`	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

Returns
Type	Description
`bool`	Boolean indicating that the cancel request was sent.

cancelled

cancelled()

Check if the job has been cancelled.

This always returns False. It's not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.

Returns
Type	Description
`bool`	False

done

done(retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, reload: bool = True)

Checks if the job is complete.

Parameters
Name	Description
`timeout`	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
`reload`	`Optional[bool]` If `True`, make an API call to refresh the job state of unfinished jobs before checking. Default `True`.
`retry`	`Optional[google.api_core.retry.Retry]` How to retry the RPC. If the job state is `DONE`, retrying is aborted early, as the job will not change anymore.

Returns
Type	Description
`bool`	True if the job is complete, False otherwise.

exception

exception(timeout=None)

Get the exception from the operation, blocking if necessary.

Parameter
Name	Description
`timeout`	`int` How long to wait for the operation to complete. If None, wait indefinitely.

Returns
Type	Description
`Optional[google.api_core.GoogleAPICallError]`	The operation's error.

exists

exists(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: test for the existence of the job via a GET request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
Name	Description
`timeout`	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
`client`	`Optional[google.cloud.bigquery.client.Client]` The client to use. If not passed, falls back to the `client` stored on the current job.
`retry`	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

Returns
Type	Description
`bool`	Boolean indicating existence of the job.

from_api_repr

from_api_repr(resource: dict, client: Client)

Factory: construct a job given its API representation

Parameters
Name	Description
`resource`	`Dict` Job representation returned from the API.
`client`	`google.cloud.bigquery.client.Client` Client which holds credentials and project configuration for the dataset.

Returns
Type	Description
`google.cloud.bigquery.job.QueryJob`	Job parsed from `resource`.

reload

reload(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)

API call: refresh job properties via a GET request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
Name	Description
`timeout`	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`.
`client`	`Optional[google.cloud.bigquery.client.Client]` The client to use. If not passed, falls back to the `client` stored on the current job.
`retry`	`Optional[google.api_core.retry.Retry]` How to retry the RPC.

result

result(page_size: int = None, max_results: int = None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, start_index: int = None, job_retry: retries.Retry = <google.api_core.retry.Retry object>)

Start the job and wait for it to complete and get the result.

Parameters
Name	Description
`page_size`	`Optional[int]` The maximum number of rows in each page of results from this request. Non-positive values are ignored.
`max_results`	`Optional[int]` The maximum total number of rows from this request.
`timeout`	`Optional[float]` The number of seconds to wait for the underlying HTTP transport before using `retry`. If multiple requests are made under the hood, `timeout` applies to each individual request.
`start_index`	`Optional[int]` The zero-based index of the starting row to read.
`retry`	`Optional[google.api_core.retry.Retry]` How to retry the call that retrieves rows. This only applies to making RPC calls. It isn't used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is `DONE`, retrying is aborted early even if the results are not available, as this will not change anymore.
`job_retry`	`Optional[google.api_core.retry.Retry]` How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing `None` disables job retry. Not all jobs can be retried. If `job_id` was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if non-`None` non-default `job_retry` is also provided.

Exceptions
Type	Description
`google.cloud.exceptions.GoogleAPICallError`	If the job failed and retries aren't successful.
`concurrent.futures.TimeoutError`	If the job did not complete in the given timeout.
`TypeError`	If a non-`None`, non-default `job_retry` is provided and the job is not retryable.

Returns
Type	Description
`google.cloud.bigquery.table.RowIterator`	Iterator of row data (`Row` objects). During each page, the iterator will have the `total_rows` attribute set, which counts the total number of rows in the result set (this is distinct from the total number of rows in the current page: `iterator.page.num_items`). If the query is a special query that produces no results, e.g. a DDL query, an `_EmptyRowIterator` instance is returned.

running

running()

True if the operation is currently running.

set_exception

set_exception(exception)

Set the Future's exception.

set_result

set_result(result)

Set the Future's result.

to_api_repr

to_api_repr()

Generate a resource for _begin.

to_arrow

to_arrow(
    progress_bar_type: str = None,
    bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None,
    create_bqstorage_client: bool = True,
    max_results: Optional[int] = None,
)

[Beta] Create a `pyarrow.Table` by loading all pages of a table or query.

Parameters
Name	Description
`progress_bar_type`	`Optional[str]` If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the `tqdm` package to use this feature. Possible values of `progress_bar_type`: `None` (no progress bar); `'tqdm'` (use the `tqdm.tqdm` function to print a progress bar to `sys.stdout`); `'tqdm_notebook'` (use the `tqdm.notebook.tqdm` function to display a progress bar as a Jupyter notebook widget); `'tqdm_gui'` (use the `tqdm.tqdm_gui` function to display a progress bar as a graphical dialog box).
`bqstorage_client`	`Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]` A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires `google-cloud-bigquery-storage` library. Reading from a specific partition or snapshot is not currently supported by this method.
`create_bqstorage_client`	`Optional[bool]` If `True` (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the `bqstorage_client` parameter for more information. This argument does nothing if `bqstorage_client` is supplied. New in version 1.24.0.
`max_results`	`Optional[int]` Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.

to_dataframe

to_dataframe(
    bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None,
    dtypes: Dict[str, Any] = None,
    progress_bar_type: str = None,
    create_bqstorage_client: bool = True,
    max_results: Optional[int] = None,
    geography_as_object: bool = False,
)

Return a pandas DataFrame from a QueryJob

Parameters
Name	Description
`bqstorage_client`	`Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]` A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the `fastavro` and `google-cloud-bigquery-storage` libraries. Reading from a specific partition or snapshot is not currently supported by this method.
`dtypes`	`Optional[Map[str, Union[str, pandas.Series.dtype]]]` A dictionary mapping column names to pandas `dtype`s. The provided `dtype` is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
`progress_bar_type`	`Optional[str]` If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the `tqdm` package to use this feature. See RowIterator.to_dataframe for details. New in version 1.11.0.
`create_bqstorage_client`	`Optional[bool]` If `True` (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the `bqstorage_client` parameter for more information. This argument does nothing if `bqstorage_client` is supplied. New in version 1.24.0.
`max_results`	`Optional[int]` Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.
`geography_as_object`	`Optional[bool]` If `True`, convert GEOGRAPHY data to `shapely` geometry objects. If `False` (default), don't cast geography data to `shapely` geometry objects. New in version 2.24.0.

Exceptions
Type	Description
`ValueError`	If the `pandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if `geography_as_object` is `True`, but the `shapely` library cannot be imported.

Returns
Type	Description
`pandas.DataFrame`	A `pandas.DataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.
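
A minimal sketch of loading a bounded preview with `to_dataframe`, using the `max_results` and `create_bqstorage_client` parameters documented above. `preview_dataframe` is a hypothetical helper, and `_RecordingJob` is a stand-in that only records its arguments so the sketch runs without `pandas` or credentials.

```python
from typing import Any


def preview_dataframe(job: Any, limit: int = 100) -> Any:
    """Load at most `limit` rows of a query job into a DataFrame.

    create_bqstorage_client=False asks the method not to create a
    BigQuery Storage API client for the download.
    """
    return job.to_dataframe(max_results=limit, create_bqstorage_client=False)


# Hypothetical stand-in that records the arguments it receives; a real
# QueryJob would return a pandas.DataFrame here.
class _RecordingJob:
    def to_dataframe(self, **kwargs):
        self.kwargs = kwargs
        return "dataframe placeholder"


job = _RecordingJob()
frame = preview_dataframe(job, limit=10)
print(job.kwargs)
```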

to_geodataframe

to_geodataframe(
    bqstorage_client: bigquery_storage.BigQueryReadClient = None,
    dtypes: Dict[str, Any] = None,
    progress_bar_type: str = None,
    create_bqstorage_client: bool = True,
    max_results: Optional[int] = None,
    geography_column: Optional[str] = None,
)

Return a GeoPandas GeoDataFrame from a QueryJob

Parameters
Name	Description
`dtypes`	`Optional[Map[str, Union[str, pandas.Series.dtype]]]` A dictionary mapping column names to pandas `dtype`s. The provided `dtype` is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
`progress_bar_type`	`Optional[str]` If set, use the tqdm library (https://tqdm.github.io/) to display a progress bar while the data downloads. Install the `tqdm` package to use this feature. See to_dataframe for details. New in version 1.11.0.
`create_bqstorage_client`	`Optional[bool]` If `True` (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the `bqstorage_client` parameter for more information. This argument does nothing if `bqstorage_client` is supplied. New in version 1.24.0.
`max_results`	`Optional[int]` Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.
`geography_column`	`Optional[str]` If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there's only one GEOGRAPHY column.
`bqstorage_client`	`Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]` A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the `fastavro` and `google-cloud-bigquery-storage` libraries. Reading from a specific partition or snapshot is not currently supported by this method.

Exceptions
Type	Description
`ValueError`	If the `geopandas` library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. New in version 2.24.0.

Returns
Type	Description
`geopandas.GeoDataFrame`	A `geopandas.GeoDataFrame` populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.

__init__

__init__(job_id, query, client, job_config=None)

Initialize self. See help(type(self)) for accurate signature.
