The Document Vision Service (DVS) combines Translation and OCR to provide a document processing feature that uses the Translate Document API to translate formatted documents, such as PDF files, directly. Compared to plain text translation, the feature preserves the original formatting and layout in your translated documents, which helps you retain much of the original context, such as paragraph breaks. DVS supports document translation both inline and from storage buckets.
This quickstart guides the Application Operator (AO) through the process of using the Vertex AI Translate Document pre-trained API on Google Distributed Cloud (GDC) air-gapped.
Supported formats
DVS supports the following input file types and their associated output file types.
Inputs | Document MIME type | Output |
---|---|---|
PDF | application/pdf | PDF, DOCX |
DOC | application/msword | DOC, DOCX |
DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | DOCX |
PPT | application/vnd.ms-powerpoint | PPT, PPTX |
PPTX | application/vnd.openxmlformats-officedocument.presentationml.presentation | PPTX |
XLS | application/vnd.ms-excel | XLS, XLSX |
XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | XLSX |
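As a quick pre-flight check before sending a request, the supported-formats table can be expressed as a lookup. The following Python sketch is illustrative only: `SUPPORTED_FORMATS`, `mime_type_for`, and `can_convert` are hypothetical helper names, not part of any DVS client library.

```python
import os

# Input extension -> (document MIME type, allowed output formats),
# transcribed from the supported-formats table.
SUPPORTED_FORMATS = {
    "pdf": ("application/pdf", ("pdf", "docx")),
    "doc": ("application/msword", ("doc", "docx")),
    "docx": ("application/vnd.openxmlformats-officedocument.wordprocessingml.document", ("docx",)),
    "ppt": ("application/vnd.ms-powerpoint", ("ppt", "pptx")),
    "pptx": ("application/vnd.openxmlformats-officedocument.presentationml.presentation", ("pptx",)),
    "xls": ("application/vnd.ms-excel", ("xls", "xlsx")),
    "xlsx": ("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", ("xlsx",)),
}


def _extension(path):
    """Return the lowercase file extension without the leading dot."""
    return os.path.splitext(path)[1].lstrip(".").lower()


def mime_type_for(path):
    """Return the document MIME type for a supported input file."""
    ext = _extension(path)
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported input file type: {path!r}")
    return SUPPORTED_FORMATS[ext][0]


def can_convert(path, output_format):
    """Return True if the input file can be translated into output_format."""
    ext = _extension(path)
    return ext in SUPPORTED_FORMATS and output_format.lower() in SUPPORTED_FORMATS[ext][1]
```

For example, `can_convert("slides.ppt", "pptx")` is true, while `can_convert("notes.docx", "doc")` is false because a DOCX input only produces DOCX output.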
Original and scanned PDF document translations
DVS supports both original and scanned PDF files, including translations to or from right-to-left languages. Also, DVS preserves hyperlinks, font size, and font color from files.
Before you begin
Follow these steps before trying DVS:
- Create the dvs-project project. For information about creating and using projects, see Create a project.

  Alternatively, you can create the project using a custom resource (CR):

    apiVersion: resourcemanager.gdc.goog/v1
    kind: Project
    metadata:
      labels:
        atat.config.google.com/clin-number: CLIN_NUMBER
        atat.config.google.com/task-order-number: TASK_ORDER_NUMBER
      name: dvs-project
      namespace: platform

- Ask your Project IAM Admin to grant you the AI Translation Developer (ai-translation-developer) role in the dvs-project project namespace. For more information, see Grant access to project resources.
- Download the gdcloud command-line interface (CLI).
- Install Vertex AI client libraries. You must download the Vision and Translation client libraries according to your operating system.
Set up your service account
Set up your service account with a name, project ID, and service key.
${HOME}/gdcloud init # set URI and project
${HOME}/gdcloud auth login
${HOME}/gdcloud iam service-accounts create SERVICE_ACCOUNT --project=PROJECT_ID
${HOME}/gdcloud iam service-accounts keys create "SERVICE_KEY".json --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT
Replace the following:
- SERVICE_ACCOUNT: the name you want to give to your service account.
- PROJECT_ID: your project ID number.
- SERVICE_KEY: the name of the JSON file for the service key.
Grant access to project resources
Grant access to the Translation API service account by providing your project ID, the name of your service account, and the ai-translation-developer role.
${HOME}/gdcloud iam service-accounts add-iam-policy-binding --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT --role=role/ai-translation-developer
Authenticate the gdcloud CLI
You must get a token to authenticate the gdcloud CLI before sending requests to the Translation pre-trained services. Follow these steps:
- Install the google-auth client library.

    pip install google-auth

- Save the following code to a Python script.

    import google.auth
    from google.auth.transport import requests

    api_endpoint = "https://ENDPOINT.GDC_URL"

    creds, project_id = google.auth.default()
    creds = creds.with_gdch_audience(api_endpoint)

    def test_get_token():
        req = requests.Request()
        creds.refresh(req)
        print(creds.token)

    if __name__ == "__main__":
        test_get_token()

  Replace the following:

  - ENDPOINT: the Translation endpoint.
  - GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.

  For more information, see View service statuses and endpoints.

- Run the script to fetch the token.
For any grpcurl or curl request, you must replace TOKEN with the fetched token in the header, as in the following example:
-H "Authorization: Bearer TOKEN"
Translate documents
DVS in Distributed Cloud provides the following two types of document translations:
Translate a document from a storage bucket
To translate a document that is stored in a bucket, follow these steps:
Prepare your environment
Before using the Translation API to translate a document from a storage bucket, follow these steps:
- Create a storage bucket in the dvs-project project, using the Standard class.
- Grant read and write permissions on the bucket to the Vertex AI Translation system service account (ai-translation-system-sa) used by the Translation service.
Alternatively, you can follow these steps to create the storage bucket, role, and role binding using custom resources (CR):
- Create the storage bucket by deploying a Bucket CR in the dvs-project namespace:

    apiVersion: object.gdc.goog/v1
    kind: Bucket
    metadata:
      name: dvs-bucket
      namespace: dvs-project
    spec:
      description: bucket for document vision service
      storageClass: Standard
      bucketPolicy:
        lockingPolicy:
          defaultObjectRetentionDays: 90

- Create the role by deploying a Role CR in the dvs-project namespace:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: dvs-reader-writer
      namespace: dvs-project
    rules:
    - apiGroups:
      - object.gdc.goog
      resources:
      - buckets
      verbs:
      - read-object
      - write-object

- Create the role binding by deploying a RoleBinding CR in the dvs-project namespace:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: dvs-reader-writer-rolebinding
      namespace: dvs-project
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dvs-reader-writer
    subjects:
    - kind: ServiceAccount
      name: ai-translation-system-sa
      namespace: ai-translation-system
Upload files to the storage bucket
You must upload your documents to the storage bucket to let the Translation service process the files.
To upload files to the storage bucket, follow these steps:
- Configure the gdcloud CLI storage by following the instructions from Configure the gdcloud CLI for object storage.
- Upload your document to the storage bucket you created. For more information about how to upload objects to storage buckets, see Upload and download storage objects in projects.
The following example translates a file from a bucket and outputs the result to another bucket path. The response also returns a byte stream. You can specify the MIME type; if you don't, DVS determines it by using the input file's extension.
If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
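The optional fields described above can be made explicit when the request body is built in code. The following Python sketch assumes the same field names as the request.json samples in this guide. The function name `build_bucket_request` and the example values are hypothetical.

```python
import json


def build_bucket_request(project_id, target_language, input_uri,
                         source_language=None, mime_type=None):
    """Build a Translate Document request body for a file in a storage bucket."""
    input_config = {"s3_source": {"input_uri": input_uri}}
    if mime_type is not None:
        # When omitted, DVS determines the MIME type from the file extension.
        input_config["mime_type"] = mime_type

    body = {
        "parent": f"projects/{project_id}/locations/global",
        "target_language_code": target_language,
        "document_input_config": input_config,
    }
    if source_language is not None:
        # When omitted, DVS detects the source language.
        body["source_language_code"] = source_language
    return body


if __name__ == "__main__":
    # Hypothetical project ID, language code, and object path.
    request = build_bucket_request("my-project", "es", "s3://dvs-bucket/report.pdf")
    print(json.dumps(request, indent=2))
```

Leaving out `source_language` and `mime_type` produces a body without those keys, which relies on the detection behavior described above.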
HTTP
The following example uses the curl tool to make an HTTP call with an input PDF document in a storage bucket.

Save the following request.json file:

    cat <<- EOF > request.json
    {
      "parent": "projects/PROJECT_ID/locations/global",
      "source_language_code": "SOURCE_LANGUAGE",
      "target_language_code": "TARGET_LANGUAGE",
      "document_input_config": {
        "mime_type": "application/pdf",
        "s3_source": {
          "input_uri": "s3://INPUT_FILE_PATH"
        }
      },
      "document_output_config": {
        "mime_type": "application/pdf"
      },
      "enable_rotation_correction": "true"
    }
    EOF
Replace the following:
- PROJECT_ID: the ID of the project that you want to use.
- SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
- TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
- INPUT_FILE_PATH: the path of your document file in the storage bucket.
Use the curl tool to call the endpoint and take the request from the request.json file:

    curl --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT.GDC_URL:443/v3/projects/PROJECT_ID:translateDocument < request.json

If the request shows an error, add the x-goog-user-project field to the request metadata:

    curl -vv --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" https://ENDPOINT.GDC_URL:443/v3/projects/PROJECT_ID:translateDocument < request.json
Replace the following:
- CACERT: the path to find the CA certificate.
- TOKEN: the token you obtained when you authenticated the gdcloud CLI.
- ENDPOINT: the Translation endpoint that you use for your organization.
- GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
- PROJECT_ID: the ID of the project that you want to use.
The output is returned after you run the command.
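The same HTTP call can be made from Python with only the standard library. The following is a minimal sketch, not an official client: the function names `build_translate_request` and `send` are hypothetical, the `endpoint` argument stands for the combined ENDPOINT.GDC_URL host, and the URL follows the `:translateDocument` form used with curl above.

```python
import json
import ssl
import urllib.request


def build_translate_request(endpoint, project_id, token, body, user_project=False):
    """Build (but do not send) the POST request that the curl example makes."""
    url = f"https://{endpoint}:443/v3/projects/{project_id}:translateDocument"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    if user_project:
        # Optional metadata field to add when the plain request shows an error.
        headers["x-goog-user-project"] = f"projects/{project_id}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def send(request, cacert_path):
    """Send the request, trusting the CA certificate stored at cacert_path."""
    context = ssl.create_default_context(cafile=cacert_path)
    with urllib.request.urlopen(request, context=context) as response:
        return json.loads(response.read())
```

Building the request separately from sending it makes it possible to inspect the URL and headers before any network call is made.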
gRPC
If you don't have grpcurl installed, download and install it from a resource outside of Distributed Cloud (https://github.com/fullstorydev/grpcurl#from-source).

The following example uses the grpcurl tool to make a gRPC call with an input PDF document in a storage bucket.

Save the following request.json file:

    cat <<- EOF > request.json
    {
      "parent": "projects/PROJECT_ID",
      "source_language_code": "SOURCE_LANGUAGE",
      "target_language_code": "TARGET_LANGUAGE",
      "document_input_config": {
        "mime_type": "application/pdf",
        "s3_source": {
          "input_uri": "s3://INPUT_FILE_PATH"
        }
      },
      "document_output_config": {
        "mime_type": "application/pdf"
      },
      "enable_rotation_correction": "true"
    }
    EOF
Replace the following:
- PROJECT_ID: the ID of the project that you want to use.
- SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
- TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
- INPUT_FILE_PATH: the path of your document file in the storage bucket.
Use the grpcurl tool to call the endpoint and take the request from the request.json file:

    grpcurl --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @ -H "Authorization: Bearer TOKEN" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument < request.json

If the request shows an error, add the x-goog-user-project field to the request metadata:

    grpcurl -vv --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @ -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument < request.json
Replace the following:
- CACERT: the path to find the CA certificate.
- ENDPOINT: the Translation endpoint that you use for your organization.
- GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
- TOKEN: the token you obtained when you authenticated the gdcloud CLI.
- PROJECT_ID: the ID of the project that you want to use.
The output is returned after you run the command.
Translate a document inline
The following example sends a document inline as part of the request. You must include the MIME type for inline document translations.
If you don't specify a source language code, DVS detects the language for you. The detected language is included in the output in the detectedLanguageCode field.
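For inline translation, the document bytes are base64-encoded and placed in the request body together with the required MIME type. The following Python sketch mirrors the `base64 -w 0` step used in the shell examples of this section. The function name `build_inline_request` and its parameters are hypothetical, and the field names are taken from the sample requests in this guide.

```python
import base64


def build_inline_request(project_id, target_language, document_bytes, mime_type,
                         source_language=None, output_mime_type=None):
    """Build a Translate Document request body that embeds the document content."""
    # b64encode produces a single unwrapped string, like `base64 -w 0`.
    content = base64.b64encode(document_bytes).decode("ascii")

    body = {
        "parent": f"projects/{project_id}/locations/global",
        "target_language_code": target_language,
        "document_input_config": {
            "mime_type": mime_type,  # required for inline translations
            "content": content,
        },
        # Default to the input format when no output format is requested.
        "document_output_config": {"mime_type": output_mime_type or mime_type},
    }
    if source_language is not None:
        # When omitted, DVS detects the source language.
        body["source_language_code"] = source_language
    return body
```

To use it with a local file, read the file in binary mode, for example `open(path, "rb").read()`, and pass the resulting bytes as `document_bytes`.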
HTTP
The following example uses the curl tool to make an HTTP call with an inline PDF document.
echo '{"parent": "projects/PROJECT_ID/locations/global","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | curl --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" https://ENDPOINT.GDC_URL/v3/projects/PROJECT_ID/locations/global:translateDocument
If the request shows an error, add the x-goog-user-project field to the request metadata:
echo '{"parent": "projects/PROJECT_ID/locations/global","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | curl --cacert CACERT --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" https://ENDPOINT.GDC_URL/v3/projects/PROJECT_ID/locations/global:translateDocument
Replace the following:
- PROJECT_ID: the ID of the project that you want to use.
- SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
- TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
- INPUT_FILE_PATH: the local path of your document file.
- CACERT: the path to find the CA certificate.
- ENDPOINT: the Translation endpoint that you use for your organization.
- GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
- TOKEN: the token you obtained when you authenticated the gdcloud CLI.
The output is returned after you run the command.
gRPC
If you don't have grpcurl installed, download and install it from a resource outside of Distributed Cloud (https://github.com/fullstorydev/grpcurl#from-source).

The following example uses the grpcurl tool to make a gRPC call with an inline PDF document.
echo '{"parent": "projects/PROJECT_ID","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | grpcurl --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @ -H "Authorization: Bearer TOKEN" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument
If the request shows an error, add the x-goog-user-project field to the request metadata:
echo '{"parent": "projects/PROJECT_ID","source_language_code": "SOURCE_LANGUAGE", "target_language_code": "TARGET_LANGUAGE", "document_input_config": { "mime_type": "application/pdf", "content": "'$(base64 -w 0 INPUT_FILE_PATH)'" }, "document_output_config": { "mime_type": "application/pdf" }, "enable_rotation_correction": "true"}' | grpcurl --cacert CACERT -authority ENDPOINT.GDC_URL -max-msg-sz 50000000 -d @ -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" ENDPOINT.GDC_URL:443 google.cloud.translation.v3.TranslationService/TranslateDocument
Replace the following:
- PROJECT_ID: the ID of the project that you want to use.
- SOURCE_LANGUAGE: the language in which your document is written. For a list of supported languages, see Get supported languages.
- TARGET_LANGUAGE: the language or languages into which you want to translate your document. For a list of supported languages, see Get supported languages.
- INPUT_FILE_PATH: the local path of your document file.
- CACERT: the path to find the CA certificate.
- ENDPOINT: the Translation endpoint that you use for your organization.
- GDC_URL: the URL of your organization in Distributed Cloud, for example, org-1.zone1.gdch.test.
- TOKEN: the token you obtained when you authenticated the gdcloud CLI.
The output is returned after you run the command.
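Because the response returns the translated document as a byte stream, the output usually needs to be decoded before it can be opened. The following Python sketch shows one way to do that, under a stated assumption: it expects the JSON response to nest its results under `documentTranslation`, with base64-encoded strings in `byteStreamOutputs` next to the `detectedLanguageCode` field. This layout is not confirmed by this guide, so verify it against the output of your own request. The function name `extract_translation` is hypothetical.

```python
import base64


def extract_translation(response):
    """Return (document_bytes, detected_language) from a parsed JSON response.

    ASSUMPTION: results are nested under "documentTranslation" with base64
    strings in "byteStreamOutputs". Check this against real service output.
    """
    translation = response.get("documentTranslation", {})
    outputs = translation.get("byteStreamOutputs", [])
    if not outputs:
        raise ValueError("response contains no translated byte stream")
    document = base64.b64decode(outputs[0])
    # The language code is only present when the service reports it.
    return document, translation.get("detectedLanguageCode")
```

The returned bytes can then be written to disk in binary mode, for example with `open("translated.pdf", "wb")`.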