This page shows you how to compute tokens for a given prompt.
Tokens are the smallest units of text that carry meaning for a language model. To prepare text for processing, models use tokenization, a process that breaks sentences or larger chunks of text into individual tokens. Each unique token is then assigned a numerical ID, which lets the model work with text as numbers. Using these tokens, a Large Language Model (LLM) can compute the statistical relationships between tokens and produce the next most likely token in a sequence.
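The idea of mapping tokens to numerical IDs can be sketched with a toy example. This is only an illustration: real models use subword tokenizers (such as SentencePiece), not whitespace splitting, and the vocabulary here is built on the fly rather than fixed in advance.

```python
# Toy illustration only: real LLMs use trained subword tokenizers,
# not simple whitespace splitting.
vocab = {}

def tokenize(text):
    """Split text into tokens and assign each unique token a numerical ID."""
    tokens = text.split()
    ids = []
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # assign the next unused ID
        ids.append(vocab[tok])
    return tokens, ids

tokens, ids = tokenize("the cat sat on the mat")
# Repeated tokens ("the") map to the same ID.
```

Note how the two occurrences of "the" receive the same ID, which is what lets the model treat text as a sequence of numbers.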
Supported models
The following foundation models support getting a list of tokens and token IDs:
text-bison
chat-bison
textembedding-gecko
code-bison
codechat-bison
code-gecko
Get a list of tokens and token IDs for a prompt
You can get a list of tokens and token IDs by using the Vertex AI API.
REST
To get a list of tokens and token IDs for a prompt using the Vertex AI API, send a POST request to the publisher model endpoint.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- MODEL_ID: The name of the model to compute tokens for. The foundation model options are:
text-bison
chat-bison
textembedding-gecko
code-bison
codechat-bison
code-gecko
To use a stable model version, append a version number such as @001
to the model name. To use the latest version, don't append a version number to the model name. To learn which *stable* model versions are available, see Available stable model versions.
- PROMPT: The prompt to compute the tokens for. Don't add quotes around the prompt here.
HTTP method and URL:
POST https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:computeTokens
Request JSON body:
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
cat > request.json << 'EOF'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
EOF
Then execute the following command to send your REST request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:computeTokens"
PowerShell
Save the request body in a file named request.json. Run the following command in the terminal to create or overwrite this file in the current directory:
@'
{
  "instances": [
    { "prompt": "PROMPT" }
  ]
}
'@ | Out-File -FilePath request.json -Encoding utf8
Then execute the following command to send your REST request:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
    -ContentType "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:computeTokens" | Select-Object -Expand Content
The output tokens are returned as base64-encoded strings. For improved readability, you can decode them back into regular strings. Here is an example response:
{
  "tokensInfo": [
    {
      "tokens": [
        "IFByb3ZpZGU=",
        "IGE=",
        "IHN1bW1hcnk=",
        "IG9m"
      ],
      "tokenIds": [
        "45895",
        "1016",
        "14292",
        "1024"
      ]
    }
  ]
}
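A short Python sketch of decoding the base64 token strings from a response like the one above back into readable text, using only the standard library:

```python
import base64

# Token strings taken from the example computeTokens response above.
tokens_b64 = ["IFByb3ZpZGU=", "IGE=", "IHN1bW1hcnk=", "IG9m"]

# Decode each base64 string back to a readable UTF-8 string.
decoded = [base64.b64decode(t).decode("utf-8") for t in tokens_b64]
# decoded == [" Provide", " a", " summary", " of"]
```

Note that the leading spaces are part of the tokens themselves; joining the decoded strings reconstructs the original prompt text.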
Example curl command
MODEL_ID="text-bison"
PROJECT_ID="my-project"
PROMPT="Provide a summary with about two sentences for the following article."
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:computeTokens -d \
$'{
  "instances": [
    { "prompt": "'"$PROMPT"'" }
  ]
}'
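The same request can also be sketched in Python using only the standard library. This is an illustrative sketch, not an official client sample: it assumes the gcloud CLI is installed and authenticated (as in the curl example), and the `compute_tokens` helper name is hypothetical. For production use, prefer the official Google Cloud client libraries.

```python
import json
import subprocess
import urllib.request

# Assumed values -- replace with your own project, model, and prompt.
PROJECT_ID = "my-project"
MODEL_ID = "text-bison"
PROMPT = "Provide a summary with about two sentences for the following article."

ENDPOINT = (
    "https://us-central1-aiplatform.googleapis.com/v1beta1"
    f"/projects/{PROJECT_ID}/locations/us-central1"
    f"/publishers/google/models/{MODEL_ID}:computeTokens"
)

# Build the request body; note there is no trailing comma (strict JSON).
body = json.dumps({"instances": [{"prompt": PROMPT}]}).encode("utf-8")

def compute_tokens():
    """Send the computeTokens request and return the parsed JSON response."""
    # Get an access token from the gcloud CLI, as in the curl example.
    token = subprocess.check_output(
        ["gcloud", "auth", "print-access-token"], text=True
    ).strip()
    req = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `compute_tokens()` returns the same `tokensInfo` structure shown in the example response above.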
Pricing and quota
There is no charge for using the ComputeTokens API. The API has a quota restriction of 3,000 requests per minute, the same quota as the CountTokens API.
What's next
- Learn how to count tokens.
- Learn how to test chat prompts.
- Learn how to test text prompts.
- Learn how to get text embeddings.