The CREATE MODEL statement for remote models over Vertex AI hosted models
This document describes the CREATE MODEL statement for creating remote models in BigQuery over models deployed to Vertex AI.
CREATE MODEL syntax
{CREATE MODEL | CREATE MODEL IF NOT EXISTS | CREATE OR REPLACE MODEL}
`project_id.dataset.model_name`
INPUT (field_name field_type)
OUTPUT (field_name field_type)
REMOTE WITH CONNECTION `project_id.region.connection_id`
OPTIONS(ENDPOINT = vertex_ai_https_endpoint);
CREATE MODEL
Creates and trains a new model in the specified dataset. If the model name exists, CREATE MODEL returns an error.
CREATE MODEL IF NOT EXISTS
Creates and trains a new model only if the model doesn't exist in the specified dataset.
CREATE OR REPLACE MODEL
Creates and trains a model and replaces an existing model with the same name in the specified dataset.
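As a hedged sketch of the two non-default variants, the statements below reuse the placeholder project, dataset, connection, and endpoint values that appear in this document's examples. None of them refer to real resources, and the field lists are assumptions taken from the sample request and response shown later in this document.

```sql
-- Sketch only: every name below is a placeholder; substitute your own values.

-- Creates the model only if mydataset does not already contain mymodel.
CREATE MODEL IF NOT EXISTS `myproject.mydataset.mymodel`
  INPUT(f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>)
  OUTPUT(out1 INT64, out2 INT64)
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS(ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234');

-- Creates the model, replacing any existing model named mymodel in mydataset.
CREATE OR REPLACE MODEL `myproject.mydataset.mymodel`
  INPUT(f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>)
  OUTPUT(out1 INT64, out2 INT64)
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS(ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234');
```

The IF NOT EXISTS form suits setup scripts that may run more than once, while OR REPLACE suits cases where the endpoint or field definitions have changed.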
model_name
The name of the model you're creating or replacing. The model name must be unique in the dataset: no other model or table can have the same name. The model name must follow the same naming rules as a BigQuery table. A model name can:
- Contain up to 1,024 characters
- Contain letters (upper or lower case), numbers, and underscores
model_name is not case-sensitive.
If you don't have a default project configured, then you must prepend the project ID to the model name in the following format, including backticks:
`[PROJECT_ID].[DATASET].[MODEL]`
For example, `myproject.mydataset.mymodel`.
INPUT and OUTPUT clauses
You must specify the INPUT and OUTPUT clauses when you create a remote model with an HTTPS endpoint. The INPUT clause must contain the fields needed for the Vertex AI endpoint request, and the OUTPUT clause must contain the fields needed for the Vertex AI endpoint response.
Supported data types
You can use the following BigQuery data types in the INPUT and OUTPUT clauses:
Field name format
The INPUT and OUTPUT field names must be identical to the field names of the Vertex AI endpoint request and response. For a Vertex AI endpoint with a single output, there is no field name in the response, so you can specify any field name in the OUTPUT clause.
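For the single-output case, a minimal sketch might look like the following. It assumes an endpoint that returns one unnamed numeric value per instance. The model name mymodel_single, the output field name predicted_value, and its FLOAT64 type are hypothetical choices for illustration, and the remaining values are the placeholders used elsewhere in this document.

```sql
-- Sketch only: assumes the endpoint returns a single unnamed value per instance.
-- predicted_value is an arbitrary, hypothetical field name; because the response
-- has no field name to match, any valid name could be used here.
CREATE MODEL `myproject.mydataset.mymodel_single`
  INPUT(f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>)
  OUTPUT(predicted_value FLOAT64)
  REMOTE WITH CONNECTION `myproject.us.test_connection`
  OPTIONS(ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234');
```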
Example
If the Vertex AI request looks like the following example:
{
  "instances": [
    { "f1": 10, "f2": 12.3, "f3": "abc", "f4": [1, 2, 3, 4] },
    { "f1": 40, "f2": 32.5, "f3": "def", "f4": [11, 12, 13, 14] }
  ]
}
The INPUT clause must be:
INPUT(f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>)
If the Vertex AI response looks like the following example:
{
  "predictions": [
    {
      "out1": 300,
      "out2": 40
    },
    {
      "out1": 200,
      "out2": 30
    }
  ]
}
The OUTPUT clause must be:
OUTPUT(out1 INT64, out2 INT64)
REMOTE WITH CONNECTION
Syntax
`[PROJECT_ID].[LOCATION].[CONNECTION_ID]`
BigQuery uses a Cloud resource connection to interact with the Vertex AI endpoint.
The connection elements are as follows:
- PROJECT_ID: the project ID of the project that contains the connection.
- LOCATION: the location used by the connection. The connection must be in the same location as the dataset that contains the model.
- CONNECTION_ID: the connection ID, for example, myconnection.
  To find your connection ID, view the connection details in the Google Cloud console. The connection ID is the value in the last section of the fully qualified connection ID that is shown in Connection ID, for example, projects/myproject/locations/connection_location/connections/myconnection.
Example
`myproject.us.my_connection`
ENDPOINT
Syntax
ENDPOINT = vertex_ai_https_endpoint
Description
For vertex_ai_https_endpoint, specify the HTTPS endpoint that represents a model deployed to Vertex AI.
After you create a remote model based on a model deployed to Vertex AI, you can use the model with ML.PREDICT to perform inference.
The following example shows how to specify an HTTPS endpoint:
ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234'
Example
The following example creates a BigQuery ML remote model over a model deployed to a Vertex AI endpoint:
CREATE MODEL `project_id.mydataset.mymodel`
INPUT(f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>)
OUTPUT(out1 INT64, out2 INT64)
REMOTE WITH CONNECTION `myproject.us.test_connection`
OPTIONS(ENDPOINT = 'https://us-central1-aiplatform.googleapis.com/v1/projects/myproject/locations/us-central1/endpoints/1234')
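Once the remote model exists, inference with ML.PREDICT can be sketched as follows. The table `project_id.mydataset.mytable` is a hypothetical source table, assumed to have columns f1, f2, f3, and f4 whose types match the model's INPUT clause. The result is expected to contain the out1 and out2 columns declared in the OUTPUT clause alongside the input columns.

```sql
-- Sketch only: mytable is a hypothetical table whose columns are assumed to
-- match the model's INPUT clause (f1 INT64, f2 FLOAT64, f3 STRING, f4 ARRAY<INT64>).
SELECT *
FROM ML.PREDICT(
  MODEL `project_id.mydataset.mymodel`,
  (
    SELECT f1, f2, f3, f4
    FROM `project_id.mydataset.mytable`
  )
);
```

Selecting only the columns named in the INPUT clause keeps the request payload aligned with what the endpoint expects.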
What's next
For more information about the supported SQL statements and functions for remote models that use HTTPS endpoints, see End-to-end user journey for each model.