Set up Speech-to-Text model adaptation

Agent Assist uses Speech-to-Text model adaptation to improve transcription quality by recognizing certain phrases more frequently than others. This page explains how to set up model adaptation for Speech-to-Text transcription.

Use the Speech-to-Text console

You can create only global phrase sets with the Speech-to-Text console. Regional phrase sets must be created using the Speech-to-Text API.

  1. In the Google Cloud console, go to the Speech-to-Text page.
  2. Click Model Adaptations.
  3. Click New Resource.
  4. Choose the Phrase set resource and API version V1, fill in the phrases and boost values, and copy the phrase set name.
  5. Click Save.
  6. Navigate to the Agent Assist console.
  7. Click Conversation Profiles, then choose the conversation profile you want to edit.
  8. Go to the Phrase sets section and paste the phrase set name.

Use the Speech-to-Text API

  1. Create a phrase set script by following the speech recognition instructions.
  2. Run the following Python script to update your conversation profile:

    # Conversation Profile to update
    PROJECT_ID = "sample-project"
    LOCATION = "global"
    CONVERSATION_PROFILE_ID = "sample-conversation-profile"
    # Speech model adaptation resource names
    SPEECH_ADAPTATION_PHRASES = ["projects/sample-project/locations/global/phraseSets/sample-phrase-sets"]

    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    scopes = ['https://www.googleapis.com/auth/cloud-platform']
    credentials, project = google.auth.default(
        scopes=scopes,
        quota_project_id=PROJECT_ID,
    )
    session = AuthorizedSession(credentials)

    profile_url = f"https://dialogflow.googleapis.com/v2beta1/projects/{PROJECT_ID}/locations/{LOCATION}/conversationProfiles/{CONVERSATION_PROFILE_ID}"
    get_response = session.get(profile_url)
    print("Checking for existing ConversationProfile...")
    print(get_response.status_code)
    print(get_response.json())
    if get_response.status_code == 200:
        patch_response = session.patch(
            profile_url,
            params={"updateMask": "sttConfig.phraseSets"},
            json={"sttConfig": {"phraseSets": SPEECH_ADAPTATION_PHRASES}},
        )
        print("Updating ConversationProfile...")
        print(patch_response.status_code)
        print(patch_response.json())
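
Before patching the conversation profile, it can help to check that each phrase set name is well formed. The following is a minimal sketch, not part of the Agent Assist API: the helper `parse_phrase_set_name` and the pattern it uses are hypothetical, inferred only from the sample resource name in the script above.

```python
import re

# Assumed pattern, inferred from the sample resource name
# "projects/sample-project/locations/global/phraseSets/sample-phrase-sets".
PHRASE_SET_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/phraseSets/(?P<phrase_set>[^/]+)$"
)


def parse_phrase_set_name(name: str) -> dict:
    """Split a phrase set resource name into its parts, or raise ValueError."""
    match = PHRASE_SET_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a phrase set resource name: {name!r}")
    return match.groupdict()


parts = parse_phrase_set_name(
    "projects/sample-project/locations/global/phraseSets/sample-phrase-sets"
)
print(parts["location"])
```

Running a check like this over `SPEECH_ADAPTATION_PHRASES` before sending the PATCH request surfaces a mistyped name locally instead of in the API response.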

Regional phrase sets

While Speech-to-Text model adaptation supports only English (en-US), you can configure phrase sets for other language regions with the Speech-to-Text API. This is particularly useful when transcribing English conversations that take place in those regions.

Use the following sample command to create regional phrase sets with the Speech-to-Text API.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -H "X-Goog-User-Project: sample-project" \
    -d @sample_phrase_sets.json \
"https://us-speech.googleapis.com/v1/projects/sample-project/locations/us/phraseSets"

The JSON file sample_phrase_sets.json contains the following phrase set definition:

{
  "parent": "projects/sample-project/locations/us",
  "phraseSetId": "sample-phrase-sets",
  "phraseSet": {
    "name": "sample-phrase-sets",
    "phrases": [
      {
        "value": "Some phrase",
        "boost": 20
      }
    ]
  }
}
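
If you maintain many phrases, you might generate this file instead of writing it by hand. The following is a minimal sketch that builds the same structure as the sample above. The function `build_phrase_set_request` and its parameters are hypothetical names chosen for illustration; only the field names come from the sample file.

```python
import json


def build_phrase_set_request(project_id, location, phrase_set_id, boosts):
    """Return a request body shaped like the sample_phrase_sets.json example.

    `boosts` maps each phrase to its boost value.
    """
    return {
        "parent": f"projects/{project_id}/locations/{location}",
        "phraseSetId": phrase_set_id,
        "phraseSet": {
            "name": phrase_set_id,
            "phrases": [
                {"value": phrase, "boost": boost}
                for phrase, boost in boosts.items()
            ],
        },
    }


body = build_phrase_set_request(
    "sample-project", "us", "sample-phrase-sets", {"Some phrase": 20}
)
print(json.dumps(body, indent=2))
```

Redirecting the printed output to sample_phrase_sets.json produces a file you can pass to the curl command with `-d @sample_phrase_sets.json`.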

For a conversation profile in a single Dialogflow region, the following table shows the corresponding Speech-to-Text region in which to create your phrase set.

Dialogflow region          Speech-to-Text region
us                         us
us-central1                us
us-east1                   us
us-east7                   us
us-west1                   us
northamerica-northeast1    us
northamerica-northeast2    us
eu                         eu
europe-west1               eu
europe-west2               eu
europe-west3               eu
europe-west4               eu
australia-southeast1       global
asia-northeast1            global
asia-south1                global
asia-southeast1            global
me-west1                   global
global                     global
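
The mapping in the table above can also be expressed as a lookup, which is convenient when a script creates phrase sets for several conversation profiles. This is a minimal sketch that transcribes the table as-is; the names `DIALOGFLOW_TO_SPEECH_REGION`, `speech_region_for`, and `phrase_set_parent` are hypothetical helpers, not part of any Google client library.

```python
# Dialogflow region -> Speech-to-Text region, transcribed from the table.
DIALOGFLOW_TO_SPEECH_REGION = {
    "us": "us",
    "us-central1": "us",
    "us-east1": "us",
    "us-east7": "us",
    "us-west1": "us",
    "northamerica-northeast1": "us",
    "northamerica-northeast2": "us",
    "eu": "eu",
    "europe-west1": "eu",
    "europe-west2": "eu",
    "europe-west3": "eu",
    "europe-west4": "eu",
    "australia-southeast1": "global",
    "asia-northeast1": "global",
    "asia-south1": "global",
    "asia-southeast1": "global",
    "me-west1": "global",
    "global": "global",
}


def speech_region_for(dialogflow_region: str) -> str:
    """Return the Speech-to-Text region for a Dialogflow region."""
    try:
        return DIALOGFLOW_TO_SPEECH_REGION[dialogflow_region]
    except KeyError:
        raise ValueError(
            f"No mapping listed for region: {dialogflow_region!r}"
        ) from None


def phrase_set_parent(project_id: str, dialogflow_region: str) -> str:
    """Build the `parent` value used when creating the phrase set."""
    region = speech_region_for(dialogflow_region)
    return f"projects/{project_id}/locations/{region}"


print(phrase_set_parent("sample-project", "europe-west2"))
```

Raising an error for unlisted regions, rather than falling back to `global`, avoids silently creating a phrase set in a location the conversation profile cannot use.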