Vertex AI RAG Engine is a component of the Vertex AI platform that facilitates Retrieval-Augmented Generation (RAG). RAG Engine enables large language models (LLMs) to access and incorporate data from external knowledge sources, such as documents and databases. By using RAG, LLMs can generate more accurate and informative responses.
Example syntax
This section provides syntax to create a RAG corpus.
curl
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
-d '{
"display_name" : "...",
"description": "..."
}'
Python
corpus = rag.create_corpus(display_name=..., description=...)
print(corpus)
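The same request can also be assembled in Python without the SDK. The following is a minimal sketch that uses only the standard library. `build_create_corpus_request` is a hypothetical helper, not part of any Google library, and it only builds the request (URL, headers, and body) without sending it. The access token is assumed to come from elsewhere, for example from `gcloud auth print-access-token`.

```python
import json
import urllib.request


def build_create_corpus_request(project_id, location, display_name,
                                description, access_token):
    """Build (but do not send) a POST request that creates a RAG corpus."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    body = json.dumps(
        {"display_name": display_name, "description": description}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_create_corpus_request(
    "my-project", "us-central1", "my corpus", "demo corpus", "TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request.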
Parameters list
This section lists the following:
Parameters | Examples |
---|---|
See Corpus management parameters. | See Corpus management examples. |
See File management parameters. | See File management examples. |
Corpus management parameters
For information about a RAG corpus, see Corpus management.
Create a RAG corpus
This table lists the parameters used to create a RAG corpus.
Body Request
Parameters | |
---|---|
|
Required: The display name of the RAG corpus. |
|
Optional: The description of the RAG corpus. |
|
Optional: Immutable: The configuration for the Vector DBs. |
RagVectorDbConfig
Parameters | |
---|---|
|
If no vector database is specified, |
|
Specifies your Pinecone instance. |
|
This is the name used to create the Pinecone index that's used with the RAG corpus. This value can't be changed after it's set. You can leave it empty in
the |
|
Specifies your Vertex Vector Search instance. |
|
This is the resource name of the Vector Search index that's used with the RAG corpus. Format: This value can't be changed after it's set. You can leave it empty in
the |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: This value can't be changed after it's set. You can leave it empty in
the |
|
This is the full resource name of the secret that is stored in Secret Manager, which contains your Pinecone API key. Format: You can leave it empty in the |
|
Optional: Immutable: The embedding model to use for the RAG corpus. This value can't be changed after it's set. If you leave it empty, we use text-embedding-004 as the embedding model. |
Update a RAG corpus
This table lists the parameters used to update a RAG corpus.
Body Request
Parameters | |
---|---|
|
Optional: The display name of the RAG corpus. |
|
Optional: The description of the RAG corpus. |
|
This is the name used to create the Pinecone index that's used with the RAG corpus. If your |
|
This is the resource name of the Vector Search index that's used with the RAG corpus. Format: If your |
|
This is the resource name of the Vector Search index endpoint that's used with the RAG corpus. Format: If your |
|
The full resource name of the secret that is stored in Secret Manager, which contains your Pinecone API key. Format: |
List RAG corpora
This table lists the parameters used to list RAG corpora.
Parameters | |
---|---|
|
Optional: The standard list page size. |
|
Optional: The standard list page token. Typically obtained from |
Get a RAG corpus
This table lists parameters used to get a RAG corpus.
Parameters | |
---|---|
|
The name of the |
Delete a RAG corpus
This table lists parameters used to delete a RAG corpus.
Parameters | |
---|---|
|
The name of the |
File management parameters
For information about a RAG file, see File management.
Upload a RAG file
This table lists parameters used to upload a RAG file.
Body Request
Parameters | |
---|---|
|
The name of the |
|
Required: The file to upload. |
|
Required: The configuration for the |
RagFile |
|
---|---|
|
Required: The display name of the RAG file. |
|
Optional: The description of the RAG file. |
UploadRagFileConfig |
|
---|---|
|
Number of tokens each chunk has. |
|
The overlap between chunks. |
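To make the relationship between chunk size and chunk overlap concrete, here is a small sliding-window sketch. It is an illustration only, not RAG Engine's actual chunker: `chunk_tokens` is a hypothetical function, and whitespace-separated words stand in for tokens because the tokenizer isn't specified here.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


tokens = "the quick brown fox jumps over the lazy dog today".split()
for chunk in chunk_tokens(tokens, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

With a chunk size of 4 and an overlap of 1, each chunk repeats the last token of the previous chunk, so text that straddles a boundary appears in both.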
Import RAG files
This table lists parameters used to import a RAG file.
Parameters | |
---|---|
|
Required: The name of the Format: |
|
Cloud Storage location. Supports importing individual files as well as entire Cloud Storage directories. |
|
Cloud Storage URI that contains the upload file. |
|
Google Drive location. Supports importing individual files as well as Google Drive folders. |
|
The Slack channel where the file is uploaded. |
|
The Jira query where the file is uploaded. |
|
The SharePoint sources where the file is uploaded. |
|
Number of tokens each chunk has. |
|
The overlap between chunks. |
|
Optional: The maximum number of queries per minute that this job is allowed to make to the embedding model specified on the corpus. This value is specific to this job and not shared across other import jobs. Consult the Quotas page on the project to set an appropriate value. If unspecified, a default value of 1,000 QPM is used. |
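As a rough way to reason about this limit, the rate cap puts a lower bound on how long an import can take. The sketch below assumes one embedding request per chunk, which is an assumption made for illustration; this page doesn't state how requests are batched. `min_import_minutes` is a hypothetical helper.

```python
def min_import_minutes(num_chunks, max_embedding_requests_per_min=1000):
    """Lower bound on import time, assuming one embedding request per chunk."""
    if max_embedding_requests_per_min <= 0:
        raise ValueError("max_embedding_requests_per_min must be positive")
    # Integer ceiling division: a partial minute still counts as a minute.
    return -(-num_chunks // max_embedding_requests_per_min)


estimate = min_import_minutes(4500, max_embedding_requests_per_min=900)
print(estimate)
```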
GoogleDriveSource |
|
---|---|
|
Required: The ID of the Google Drive resource. |
|
Required: The type of the Google Drive resource. |
SlackSource |
|
---|---|
|
Repeated: Slack channel information, including the ID and time range to import. |
|
Required: The Slack channel ID. |
|
Optional: The starting timestamp for messages to import. |
|
Optional: The ending timestamp for messages to import. |
|
Required: The full resource name of the secret that is stored in Secret Manager,
which contains a Slack channel access token that has access to the Slack channel IDs.
Format: |
JiraSource |
|
---|---|
|
Repeated: A list of Jira projects to import in their entirety. |
|
Repeated: A list of custom Jira queries to import. For information about JQL (Jira Query Language), see
|
|
Required: The Jira email address. |
|
Required: The Jira server URI. |
|
Required: The full resource name of the secret that is stored in Secret Manager,
which contains the Jira API key.
Format: |
SharePointSources |
|
---|---|
|
The path of the SharePoint folder to download from. |
|
The ID of the SharePoint folder to download from. |
|
The name of the drive to download from. |
|
The ID of the drive to download from. |
|
The Application ID for the app registered in Microsoft Azure Portal.
|
|
Required: The full resource name of the secret that is stored in Secret Manager, which contains the application secret for the app registered in Azure. Format: |
|
Unique identifier of the Azure Active Directory Instance. |
|
The name of the SharePoint site to download from. This can be the site name or the site ID. |
Get a RAG file
This table lists parameters used to get a RAG file.
Parameters | |
---|---|
|
The name of the |
Delete a RAG file
This table lists parameters used to delete a RAG file.
Parameters | |
---|---|
|
The name of the |
Retrieval and prediction
This section lists the retrieval and prediction parameters.
Retrieval parameters
This table lists parameters for the RetrieveContexts API.
Parameters | |
---|---|
|
Required: The resource name of the Location to retrieve Format: |
|
The data source for Vertex RagStore. |
|
Required: Single RAG retrieve query. |
VertexRagStore
VertexRagStore |
|
---|---|
|
list: The representation of the RAG source. It can be used to specify the corpus
only or |
|
Optional:
Format: |
|
list: A list of Format: |
RagQuery |
|
---|---|
|
The query in text format to get relevant contexts. |
|
Optional: The retrieval configuration for the query. |
RagRetrievalConfig |
|
---|---|
|
Optional: The number of contexts to retrieve. |
|
Only returns contexts with a vector distance smaller than the threshold. |
|
Only returns contexts with vector similarity larger than the threshold. |
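The interaction between the number of contexts and the distance threshold can be illustrated with a small sketch. This shows the semantics only and is not how RAG Engine implements retrieval: `select_contexts` and its parameter names are hypothetical, and the distances are made-up values where smaller means closer to the query.

```python
def select_contexts(scored_contexts, top_k, vector_distance_threshold=None):
    """Keep contexts under the distance threshold, then return the closest top_k.

    scored_contexts is a list of (text, distance) pairs.
    """
    if vector_distance_threshold is not None:
        scored_contexts = [
            pair for pair in scored_contexts
            if pair[1] < vector_distance_threshold
        ]
    ranked = sorted(scored_contexts, key=lambda pair: pair[1])
    return ranked[:top_k]


candidates = [("a", 0.2), ("b", 0.9), ("c", 0.5), ("d", 0.1)]
selected = select_contexts(candidates, top_k=2, vector_distance_threshold=0.6)
print(selected)
```

A similarity threshold works the same way with the comparison reversed: only contexts whose similarity is larger than the threshold are kept.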
Prediction parameters
This table lists prediction parameters.
GenerateContentRequest |
|
---|---|
|
Set to use a data source powered by Vertex AI RAG store. |
See VertexRagStore for details.
Corpus management examples
This section provides examples of how to use the API to manage your RAG corpus.
Create a RAG corpus example
These code samples demonstrate how to create a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
- CORPUS_DESCRIPTION: The description of the RAG corpus.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora
Request JSON body:
{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION"
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and run the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora"
PowerShell
Save the request body in a file named request.json, and run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora" | Select-Object -Expand Content
You should receive a successful status code (2xx).
The following example demonstrates how to create a RAG corpus by using the REST API.
# CreateRagCorpus
# Input: LOCATION, PROJECT_ID, CORPUS_DISPLAY_NAME
# Output: CreateRagCorpusOperationMetadata
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora \
-d '{
"display_name" : "CORPUS_DISPLAY_NAME"
}'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
- CORPUS_DESCRIPTION: The description of the RAG corpus.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
display_name = "CORPUS_DISPLAY_NAME"
description = "CORPUS_DESCRIPTION"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
# Configure embedding model
embedding_model_config = rag.EmbeddingModelConfig(
publisher_model="publishers/google/models/text-embedding-004"
)
corpus = rag.create_corpus(
display_name=display_name,
description=description,
embedding_model_config=embedding_model_config,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...
Update a RAG corpus example
You can update your RAG corpus with a new display name, description, and vector database configuration. However, you can't change the following parameters in your RAG corpus:
- The vector database type. For example, you can't change the vector database from Weaviate to Vertex AI Feature Store.
- If you're using the managed database option, you can't update the vector database configuration.
These examples demonstrate how to update a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- CORPUS_ID: The corpus ID of your RAG corpus.
- CORPUS_DISPLAY_NAME: The display name of the RAG corpus.
- CORPUS_DESCRIPTION: The description of the RAG corpus.
- INDEX_NAME: The resource name of the Vector Search index. Format: projects/{project}/locations/{location}/indexes/{index}.
- INDEX_ENDPOINT_NAME: The resource name of the Vector Search index endpoint. Format: projects/{project}/locations/{location}/indexEndpoints/{index_endpoint}.
HTTP method and URL:
PATCH https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID
Request JSON body:
{
  "display_name" : "CORPUS_DISPLAY_NAME",
  "description": "CORPUS_DESCRIPTION",
  "rag_vector_db_config": {
    "vertex_vector_search": {
      "index": "INDEX_NAME",
      "index_endpoint": "INDEX_ENDPOINT_NAME"
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and run the following command:
curl -X PATCH \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID"
PowerShell
Save the request body in a file named request.json, and run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method PATCH `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/CORPUS_ID" | Select-Object -Expand Content
You should receive a successful status code (2xx).
List RAG corpora example
These code samples demonstrate how to list all of the RAG corpora.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- PAGE_SIZE: The standard list page size. You can adjust the number of RAG corpora to return per page by updating the page_size parameter.
- PAGE_TOKEN: The standard list page token. Typically obtained using ListRagCorporaResponse.next_page_token of the previous VertexRagDataService.ListRagCorpora call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Run the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
You should receive a successful status code (2xx) and a list of RAG corpora under the given PROJECT_ID.
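Because the page size and page token are both optional, a client typically leaves the token out on the first call and adds it only when a previous response supplied one. A minimal sketch of building the list URL that way follows. `build_list_corpora_url` is a hypothetical helper, and it relies on the assumption that the service accepts a request without those query parameters.

```python
from urllib.parse import urlencode


def build_list_corpora_url(project_id, location, page_size=None, page_token=None):
    """Build the ListRagCorpora URL, adding only the query parameters provided."""
    base = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/ragCorpora"
    )
    params = {}
    if page_size is not None:
        params["page_size"] = page_size
    if page_token:
        params["page_token"] = page_token
    # urlencode percent-escapes characters such as "/" and "=" in the token.
    return f"{base}?{urlencode(params)}" if params else base


first_url = build_list_corpora_url("my-project", "us-central1", page_size=10)
next_url = build_list_corpora_url(
    "my-project", "us-central1", page_size=10, page_token="abc/="
)
print(first_url)
print(next_url)
```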
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
LOCATION = "us-central1"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
corpora = rag.list_corpora()
print(corpora)
# Example response:
# ListRagCorporaPager<rag_corpora {
# name: "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/2305843009213693952"
# display_name: "test_corpus"
# create_time {
# ...
Get a RAG corpus example
These code samples demonstrate how to get a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RAG corpus resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Run the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the RagCorpus resource.
The following example uses the get and list commands to show how RagCorpus uses the rag_embedding_model_config field within vector_db_config, which points to the embedding model that you chose.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
```sh
# GetRagCorpus
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID
# Output: RagCorpus
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID

# ListRagCorpora
curl -sS -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/
```
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RAG corpus resource.
from vertexai import rag
import vertexai
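PROJECT_ID = "PROJECT_ID"
LOCATION = "LOCATION"
RAG_CORPUS_ID = "RAG_CORPUS_ID"
# Use an f-string so that the placeholders are substituted into the resource name.
corpus_name = f"projects/{PROJECT_ID}/locations/{LOCATION}/ragCorpora/{RAG_CORPUS_ID}"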
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location=LOCATION)
corpus = rag.get_corpus(name=corpus_name)
print(corpus)
# Example response:
# RagCorpus(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description',
# ...
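Most of the get and delete calls on this page take a full resource name rather than a bare ID. The following sketch shows one way to build and validate such a name before passing it to the SDK. `corpus_resource_name` and `parse_corpus_resource_name` are hypothetical helpers, and the pattern is inferred from the example responses on this page.

```python
import re

_CORPUS_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus>[^/]+)$"
)


def corpus_resource_name(project_id, location, rag_corpus_id):
    """Return the full resource name of a RAG corpus."""
    return f"projects/{project_id}/locations/{location}/ragCorpora/{rag_corpus_id}"


def parse_corpus_resource_name(name):
    """Split a corpus resource name into its parts, or raise ValueError."""
    match = _CORPUS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a RAG corpus resource name: {name!r}")
    return match.groupdict()


name = corpus_resource_name("1234567890", "us-central1", "1234567890")
parts = parse_corpus_resource_name(name)
print(name)
print(parts)
```

Validating the name up front catches a common mistake: a string with unsubstituted `{PLACEHOLDER}` segments still matches the pattern, but a bare ID or a name with a missing segment doesn't.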
Delete a RAG corpus example
These code samples demonstrate how to delete a RAG corpus.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID
To send your request, choose one of these options:
curl
Run the following command:
curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID" | Select-Object -Expand Content
A successful response returns the DeleteOperationMetadata.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
rag.delete_corpus(name=corpus_name)
print(f"Corpus {corpus_name} deleted.")
# Example response:
# Successfully deleted the RagCorpus.
# Corpus projects/[PROJECT_ID]/locations/us-central1/ragCorpora/123456789012345 deleted.
File management examples
This section provides examples of how to use the API to manage RAG files.
Upload a RAG file example
These code samples demonstrate how to upload a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- LOCAL_FILE_PATH: The local path to the file to be uploaded.
- DISPLAY_NAME: The display name of the RAG file.
- DESCRIPTION: The description of the RAG file.
To send your request, use the following command:
curl -X POST \
-H "X-Goog-Upload-Protocol: multipart" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-F metadata="{'rag_file': {'display_name':'DISPLAY_NAME', 'description':'DESCRIPTION'}}" \
-F file=@LOCAL_FILE_PATH \
"https://LOCATION-aiplatform.googleapis.com/upload/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:upload"
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- LOCAL_FILE_PATH: The local path to the file to be uploaded.
- DISPLAY_NAME: The display name of the RAG file.
- DESCRIPTION: The description of the RAG file.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
path = "path/to/local/file.txt"
display_name = "file_display_name"
description = "file description"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
rag_file = rag.upload_file(
corpus_name=corpus_name,
path=path,
display_name=display_name,
description=description,
)
print(rag_file)
# RagFile(name='projects/[PROJECT_ID]/locations/us-central1/ragCorpora/1234567890/ragFiles/09876543',
# display_name='file_display_name', description='file description')
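The REST upload sample passes the file's display name and description in a `metadata` form field. When those values come from user input, hand-writing the string is error-prone because quotes and other special characters need escaping. The following sketch generates the metadata with `json.dumps` instead. `build_upload_metadata` is a hypothetical helper; the `rag_file`, `display_name`, and `description` keys come from the REST sample, and sending strict JSON in that field is an assumption.

```python
import json


def build_upload_metadata(display_name, description=""):
    """Return the metadata form field for a RAG file upload as a JSON string."""
    rag_file = {"display_name": display_name}
    if description:
        rag_file["description"] = description
    return json.dumps({"rag_file": rag_file})


metadata = build_upload_metadata("notes.txt", 'Meeting notes, "Q3" review')
print(metadata)
```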
Import RAG files example
Files and folders can be imported from Drive or Cloud Storage. You can use response.metadata to view partial failures, request time, and response time in the SDK's response object.
The response.skipped_rag_files_count refers to the number of files that were skipped during import. A file is skipped when all of the following conditions are met:
- The file has already been imported.
- The file hasn't changed.
- The chunking configuration for the file hasn't changed.
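The skip rule above can be sketched as a small predicate. This is an illustration of the stated conditions, not RAG Engine's implementation: `ImportRecord` and `should_skip` are hypothetical names, and comparing a content hash is only an assumed way to detect that a file hasn't changed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImportRecord:
    """What is known about one file at import time (hypothetical structure)."""
    content_hash: str
    chunk_size: int
    chunk_overlap: int


def should_skip(previous: Optional[ImportRecord], current: ImportRecord) -> bool:
    """Skip only if the file was imported before and nothing relevant changed."""
    if previous is None:  # never imported
        return False
    unchanged_file = previous.content_hash == current.content_hash
    unchanged_chunking = (
        (previous.chunk_size, previous.chunk_overlap)
        == (current.chunk_size, current.chunk_overlap)
    )
    return unchanged_file and unchanged_chunking


earlier = ImportRecord("abc123", chunk_size=512, chunk_overlap=100)
print(should_skip(earlier, ImportRecord("abc123", 512, 100)))
print(should_skip(earlier, ImportRecord("abc123", 256, 100)))
```

In the second call the file is identical but the chunk size differs, so the file would be imported again.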
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- FOLDER_RESOURCE_ID: The resource ID of your Drive folder.
- GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
- CHUNK_SIZE: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Number of tokens that overlap between chunks.
- EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import
Request JSON body:
{
"import_rag_files_config": {
"gcs_source": {
"uris": "GCS_URIS"
},
"rag_file_chunking_config": {
"chunk_size": "CHUNK_SIZE",
"chunk_overlap": "CHUNK_OVERLAP"
}
}
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and run the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import"
PowerShell
Save the request body in a file named request.json, and run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import" | Select-Object -Expand Content
A successful response returns the ImportRagFilesOperationMetadata resource.
The following sample demonstrates how to import a file from Cloud Storage. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- GCS_URIS: A list of Cloud Storage locations. Example: gs://my-bucket1.
- CHUNK_SIZE: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Number of tokens that overlap between chunks.
- EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.
# ImportRagFiles
# Import a single Cloud Storage file or all files in a Cloud Storage bucket.
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, GCS_URIS
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
"import_rag_files_config": {
"gcs_source": {
"uris": "GCS_URIS"
},
"rag_file_chunking_config": {
"chunk_size": CHUNK_SIZE,
"chunk_overlap": CHUNK_OVERLAP
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
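When the request body is generated by a program, building it as a dictionary and serializing it avoids quoting mistakes. A minimal sketch follows. `build_import_body` is a hypothetical helper, the field names are the ones used in the curl sample above, and it relies on two assumptions: that `uris` takes a JSON array (GCS_URIS is described as a list) and that the chunking and rate fields are JSON numbers.

```python
import json


def build_import_body(gcs_uris, chunk_size, chunk_overlap,
                      max_embedding_requests_per_min=None):
    """Build the ImportRagFiles request body for Cloud Storage sources."""
    config = {
        "gcs_source": {"uris": list(gcs_uris)},
        "rag_file_chunking_config": {
            "chunk_size": chunk_size,
            "chunk_overlap": chunk_overlap,
        },
    }
    # Leave the field out to fall back to the documented default rate.
    if max_embedding_requests_per_min is not None:
        config["max_embedding_requests_per_min"] = max_embedding_requests_per_min
    return {"import_rag_files_config": config}


body = build_import_body(
    ["gs://my-bucket1"], chunk_size=512, chunk_overlap=100,
    max_embedding_requests_per_min=900,
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with the curl command.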
The following sample demonstrates how to import a file from Drive. Use the max_embedding_requests_per_min control field to limit the rate at which RAG Engine calls the embedding model during the ImportRagFiles indexing process. The field has a default value of 1000 calls per minute.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- FOLDER_RESOURCE_ID: The resource ID of your Drive folder.
- CHUNK_SIZE: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Number of tokens overlap between chunks.
- EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.
# ImportRagFiles
# Import all files in a Google Drive folder.
# Input: LOCATION, PROJECT_ID, RAG_CORPUS_ID, FOLDER_RESOURCE_ID
# Output: ImportRagFilesOperationMetadataNumber
# Use ListRagFiles to find the server-generated rag_file_id.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles:import \
-d '{
"import_rag_files_config": {
"google_drive_source": {
"resource_ids": {
"resource_id": "FOLDER_RESOURCE_ID",
"resource_type": "RESOURCE_TYPE_FOLDER"
}
},
"max_embedding_requests_per_min": EMBEDDING_MODEL_QPM_RATE
}
}'
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The corpus ID of your RAG corpus.
- FOLDER_RESOURCE_ID: The resource ID of your Drive folder.
- CHUNK_SIZE: Number of tokens each chunk should have.
- CHUNK_OVERLAP: Number of tokens overlap between chunks.
- EMBEDDING_MODEL_QPM_RATE: The QPM rate to limit RAG's access to your embedding model. Example: 1,000.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]
# Supports Google Cloud Storage and Google Drive Links
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
response = rag.import_files(
corpus_name=corpus_name,
paths=paths,
chunk_size=512, # Optional
chunk_overlap=100, # Optional
max_embedding_requests_per_min=900, # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
# Example response:
# Imported 2 files.
List RAG files example
These code samples demonstrate how to list RAG files.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- PAGE_SIZE: The standard list page size. You can adjust the number of RagFiles to return per page by updating the page_size parameter.
- PAGE_TOKEN: The standard list page token. Obtained using ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN
To send your request, choose one of these options:
curl
Run the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles?page_size=PAGE_SIZE&page_token=PAGE_TOKEN" | Select-Object -Expand Content
You should receive a successful status code (2xx) along with a list of RagFiles under the given RAG_CORPUS_ID.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- PAGE_SIZE: The standard list page size. You can adjust the number of RagFiles to return per page by updating the page_size parameter.
- PAGE_TOKEN: The standard list page token. Obtained using ListRagFilesResponse.next_page_token of the previous VertexRagDataService.ListRagFiles call.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
files = rag.list_files(corpus_name=corpus_name)
for file in files:
    print(file.display_name)
    print(file.name)
# Example response:
# g-drive_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/222222222222
# g_cloud_file.txt
# projects/1234567890/locations/us-central1/ragCorpora/111111111111/ragFiles/333333333333
Get a RAG file example
These code samples demonstrate how to get a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
HTTP method and URL:
GET https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Run the following command:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
A successful response returns the RagFile resource.
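The request URL is assembled from the four placeholders listed for this request. As a minimal sketch, the helper below builds that URL in Python; the function name `rag_file_url` and the sample values are illustrative assumptions, not part of any SDK.

```python
def rag_file_url(project_id: str, location: str, corpus_id: str, file_id: str) -> str:
    """Build the v1 RagFile URL from its four components."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"ragCorpora/{corpus_id}/ragFiles/{file_id}"
    )


# Sample values are made up for illustration only.
url = rag_file_url("my-project", "us-central1", "111", "222")
print(url)
```

Note that LOCATION appears twice: once in the host name and once in the resource path.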
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
file_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
rag_file = rag.get_file(name=file_name)
print(rag_file)
# Example response:
# RagFile(name='projects/1234567890/locations/LOCATION/ragCorpora/11111111111/ragFiles/22222222222',
# display_name='file_display_name', description='file description')
Delete a RAG file example
These code samples demonstrate how to delete a RAG file.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource, which is the last segment of the full resource name. Full resource name format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.
HTTP method and URL:
DELETE https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID
To send your request, choose one of these options:
curl
Run the following command:
curl -X DELETE \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
PowerShell
Run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method DELETE `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID" | Select-Object -Expand Content
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_ID: The ID of the RagCorpus resource.
- RAG_FILE_ID: The ID of the RagFile resource, which is the last segment of the full resource name. Full resource name format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}.
from vertexai import rag
import vertexai
PROJECT_ID = "PROJECT_ID"
file_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID/ragFiles/RAG_FILE_ID"
# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")
rag.delete_file(name=file_name)
print(f"File {file_name} deleted.")
# Example response:
# Successfully deleted the RagFile.
# File projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222 deleted.
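The delete samples work with both a bare RAG_FILE_ID (in the REST URL) and the full resource name (in the Python call). As a rough sketch, the helper below splits a full RagFile resource name into its parts using the format given in this section. The function name `parse_rag_file_name` is hypothetical and not an SDK function.

```python
import re
from typing import Dict

# Pattern follows the documented format:
# projects/{project}/locations/{location}/ragCorpora/{rag_corpus}/ragFiles/{rag_file_id}
_RAG_FILE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/ragCorpora/(?P<rag_corpus>[^/]+)/ragFiles/(?P<rag_file_id>[^/]+)$"
)


def parse_rag_file_name(name: str) -> Dict[str, str]:
    """Return the named segments of a RagFile resource name, or raise ValueError."""
    match = _RAG_FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a RagFile resource name: {name!r}")
    return match.groupdict()


parts = parse_rag_file_name(
    "projects/1234567890/locations/us-central1/ragCorpora/1111111111/ragFiles/2222222222"
)
print(parts["rag_file_id"])
```

Validating the name before calling a delete operation can help catch a bare ID passed where a full resource name is expected.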
Retrieval query
When a user asks a question or provides a prompt, the retrieval component in RAG searches through its knowledge base to find information that is relevant to the query.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
- TEXT: The query text to get relevant contexts.
- SIMILARITY_TOP_K: The number of top contexts to retrieve.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts
Request JSON body:
{
  "vertex_rag_store": {
    "rag_resources": {
      "rag_corpus": "RAG_CORPUS_RESOURCE"
    },
    "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
  },
  "query": {
    "text": "TEXT",
    "similarity_top_k": SIMILARITY_TOP_K
  }
}
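Hand-written JSON is easy to get wrong (a missing comma or an unquoted string is enough to make the request fail). One way to avoid that, sketched below, is to generate the body with Python's `json` module. The field layout follows the request body shown in this section; the function name `build_retrieve_contexts_body` and the sample values are assumptions for illustration.

```python
import json


def build_retrieve_contexts_body(
    rag_corpus: str,
    text: str,
    similarity_top_k: int,
    vector_distance_threshold: float,
) -> str:
    """Serialize a retrieveContexts request body as a JSON string."""
    body = {
        "vertex_rag_store": {
            "rag_resources": {"rag_corpus": rag_corpus},
            # Numeric value, so it is emitted without quotes.
            "vector_distance_threshold": vector_distance_threshold,
        },
        "query": {
            # String value, so json.dumps quotes and escapes it.
            "text": text,
            "similarity_top_k": similarity_top_k,
        },
    }
    return json.dumps(body, indent=2)


# Sample values are made up for illustration only.
request_json = build_retrieve_contexts_body(
    rag_corpus="projects/my-project/locations/us-central1/ragCorpora/111",
    text="What is RAG?",
    similarity_top_k=5,
    vector_distance_threshold=0.5,
)
print(request_json)
```

The resulting string can be written to request.json and sent with the curl or PowerShell commands in this section.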
curl
Save the request body in a file named request.json, and run the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts"
PowerShell
Save the request body in a file named request.json, and run the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION:retrieveContexts" | Select-Object -Expand Content
You should receive a successful status code (2xx) and a list of related RagFiles.
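To illustrate how VECTOR_DISTANCE_THRESHOLD and SIMILARITY_TOP_K interact, here is a toy sketch that filters candidates by distance and then keeps the closest few. It assumes cosine distance and two-dimensional hand-made vectors purely for illustration; the metric actually used depends on the vector database configured for the corpus, and none of these function names come from the SDK.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 0.0 for identical directions, 1.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def retrieve(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    top_k: int,
    threshold: float,
) -> List[Tuple[float, str]]:
    """Keep chunks whose distance is below the threshold, closest first, at most top_k."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] < threshold]
    kept.sort(key=lambda pair: pair[0])
    return kept[:top_k]


# Hand-made toy embeddings; real embeddings come from an embedding model.
chunks = [
    ("sky", [1.0, 0.0]),
    ("sea", [0.8, 0.6]),
    ("tax", [0.0, 1.0]),
]
results = retrieve([1.0, 0.0], chunks, top_k=2, threshold=0.5)
print([text for _, text in results])
```

In this sketch the threshold is applied first, so a small threshold can return fewer than top_k results.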
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- VECTOR_DISTANCE_THRESHOLD: Only contexts with a vector distance smaller than the threshold are returned.
- TEXT: The query text to get relevant contexts.
- SIMILARITY_TOP_K: The number of top contexts to retrieve.
from vertexai import rag
import vertexai

PROJECT_ID = "PROJECT_ID"
# Use an f-string with braces so that PROJECT_ID is substituted into the name.
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="TEXT",
    similarity_top_k=SIMILARITY_TOP_K,  # Optional
    vector_distance_threshold=VECTOR_DISTANCE_THRESHOLD,  # Optional
)
print(response)
# Example response:
# contexts {
# contexts {
# source_uri: "gs://your-bucket-name/file.txt"
# text: "....
# ....
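The example response above shows each context carrying a source_uri and a text field. As a small sketch of post-processing, the code below groups retrieved text by source. The `Context` dataclass is a hypothetical stand-in that carries only those two fields; it is not the SDK's response type, and the sample data is made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Context:
    """Hypothetical stand-in with the two fields shown in the example response."""
    source_uri: str
    text: str


def group_by_source(contexts: List[Context]) -> Dict[str, List[str]]:
    """Collect context text per source URI, preserving retrieval order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for context in contexts:
        grouped[context.source_uri].append(context.text)
    return dict(grouped)


# Made-up sample data for illustration only.
sample = [
    Context("gs://your-bucket-name/file.txt", "first chunk"),
    Context("gs://your-bucket-name/other.txt", "second chunk"),
    Context("gs://your-bucket-name/file.txt", "third chunk"),
]
grouped = group_by_source(sample)
print(grouped)
```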
Generation
The LLM generates a grounded response using the retrieved contexts.
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. Example: gemini-1.5-pro-002.
- GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent.
- INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded RAG files.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
- VECTOR_DISTANCE_THRESHOLD: Optional: Only contexts with a vector distance smaller than the threshold are returned.
- USER: Your username.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD
Request JSON body:
{
  "contents": {
    "role": "USER",
    "parts": {
      "text": "INPUT_PROMPT"
    }
  },
  "tools": {
    "retrieval": {
      "disable_attribution": false,
      "vertex_rag_store": {
        "rag_resources": {
          "rag_corpus": "RAG_CORPUS_RESOURCE"
        },
        "similarity_top_k": SIMILARITY_TOP_K,
        "vector_distance_threshold": VECTOR_DISTANCE_THRESHOLD
      }
    }
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:GENERATION_METHOD" | Select-Object -Expand Content
A successful response returns the generated content with citations.
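The generation URL differs from the other endpoints in that the method is appended after a colon. The following sketch builds the URL and rejects any method other than the two options listed for GENERATION_METHOD. The helper name `generation_url` and the sample project value are assumptions for illustration.

```python
# The two options listed for GENERATION_METHOD in this section.
GENERATION_METHODS = ("generateContent", "streamGenerateContent")


def generation_url(project_id: str, location: str, model_id: str, method: str) -> str:
    """Build the model endpoint URL, validating the generation method first."""
    if method not in GENERATION_METHODS:
        raise ValueError(
            f"Unsupported method {method!r}; expected one of {GENERATION_METHODS}"
        )
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:{method}"
    )


# The project value is made up; the model ID is the example given in this section.
gen_url = generation_url("my-project", "us-central1", "gemini-1.5-pro-002", "generateContent")
print(gen_url)
```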
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
Replace the following variables used in the code sample:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request.
- MODEL_ID: LLM model for content generation. Example: gemini-1.5-pro-002.
- GENERATION_METHOD: LLM method for content generation. Options: generateContent, streamGenerateContent.
- INPUT_PROMPT: The text sent to the LLM for content generation. Try to use a prompt relevant to the uploaded RAG files.
- RAG_CORPUS_RESOURCE: The name of the RagCorpus resource. Format: projects/{project}/locations/{location}/ragCorpora/{rag_corpus}.
- SIMILARITY_TOP_K: Optional: The number of top contexts to retrieve.
- VECTOR_DISTANCE_THRESHOLD: Optional: Only contexts with a vector distance smaller than the threshold are returned.
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

PROJECT_ID = "PROJECT_ID"
corpus_name = f"projects/{PROJECT_ID}/locations/LOCATION/ragCorpora/RAG_CORPUS_ID"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="LOCATION")

# Wrap the RAG corpus as a retrieval tool that the model can call.
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus="RAG_CORPUS_RESOURCE",
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            similarity_top_k=SIMILARITY_TOP_K,  # Optional
            vector_distance_threshold=VECTOR_DISTANCE_THRESHOLD,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="MODEL_ID", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
# The sky appears blue due to a phenomenon called Rayleigh scattering.
# Sunlight, which contains all colors of the rainbow, is scattered
# by the tiny particles in the Earth's atmosphere....
# ...
What's next
- To learn more about supported generation models, see Generative AI models that support RAG.
- To learn more about supported embedding models, see Embedding models.
- To learn more about open models, see Open models.
- To learn more about RAG Engine, see RAG Engine overview.