- NAME
-
- gcloud alpha ml speech recognizers run-short - get transcripts of short (less than 60 seconds) audio from an audio file
- SYNOPSIS
-
-
gcloud alpha ml speech recognizers run-short
(RECOGNIZER
:--location
=LOCATION
)--audio
=AUDIO
[--hint-boost
=HINT_BOOST
] [--hint-phrase-sets
=[PHRASE_SET
,…]] [--hint-phrases
=[PHRASE
,…]] [--language-codes
=[LANGUAGE_CODE
,…]] [--model
=MODEL
] [--audio-channel-count
=AUDIO_CHANNEL_COUNT
--encoding
=ENCODING
--sample-rate
=SAMPLE_RATE
] [--[no-]enable-automatic-punctuation
--[no-]enable-spoken-emojis
--[no-]enable-spoken-punctuation
--[no-]enable-word-confidence
--[no-]enable-word-time-offsets
--max-alternatives
=MAX_ALTERNATIVES
--[no-]profanity-filter
--[no-]separate-channel-recognition
--max-speaker-count
=MAX_SPEAKER_COUNT
--min-speaker-count
=MIN_SPEAKER_COUNT
] [GCLOUD_WIDE_FLAG …
]
-
- DESCRIPTION
-
(ALPHA)
Get transcripts of short (less than 60 seconds) audio from an audio file. - POSITIONAL ARGUMENTS
-
-
Recognizer resource - recognizer. The arguments in this group can be used to
specify the attributes of this resource. (NOTE) Some attributes are not given
arguments in this group but can be set in other ways.
To set the
project
attribute:-
provide the argument
recognizer
on the command line with a fully specified name; -
provide the argument
--project
on the command line; -
set the property
core/project
.
This must be specified.
RECOGNIZER
-
ID of the recognizer or fully qualified identifier for the recognizer.
To set the
recognizer
attribute:-
provide the argument
recognizer
on the command line.
This positional argument must be specified if any of the other arguments in this group are specified.
-
provide the argument
--location
=LOCATION
-
Location of the recognizer.
To set the
location
attribute:-
provide the argument
recognizer
on the command line with a fully specified name; -
provide the argument
--location
on the command line.
-
provide the argument
-
provide the argument
-
Recognizer resource - recognizer. The arguments in this group can be used to
specify the attributes of this resource. (NOTE) Some attributes are not given
arguments in this group but can be set in other ways.
- REQUIRED FLAGS
-
--audio
=AUDIO
- Location of the audio file to transcribe. Must be a audio data bytes, local file, or Google Cloud Storage URL (in the format gs://bucket/object).
- OPTIONAL FLAGS
-
--hint-boost
=HINT_BOOST
- Boost value for the phrases passed to --phrases. Can have a value between 1 and 20.
--hint-phrase-sets
=[PHRASE_SET
,…]- A list of phrase set resource names to use for speech recognition.
--hint-phrases
=[PHRASE
,…]- A list of strings containing word and phrase "hints" so that the ' 'speech recognition is more likely to recognize them. This can be ' 'used to improve the accuracy for specific words and phrases, ' 'for example, if specific commands are typically spoken by ' 'the user. This can also be used to add additional words to the ' 'vocabulary of the recognizer. ' 'See https://cloud.google.com/speech/limits#content.
--language-codes
=[LANGUAGE_CODE
,…]-
Language code is one of
en-US
,en-GB
,fr-FR
. Check documentation for using more than one language code. --model
=MODEL
- Which model to use for recognition requests. Select the model best suited to your domain to get best results. Guidance for choosing which model to use can be found in the Transcription Models Documentation and the models supported in each region can be found in the Table Of Supported Models.
-
Encoding format
--audio-channel-count
=AUDIO_CHANNEL_COUNT
- Number of channels present in the audio data sent for recognition. Required if --encoding flag is specified and is not AUTO. Must be set to a value between 1 and 8.
--encoding
=ENCODING
-
Encoding format of the provided audio. For headerless formats, must be set to
LINEAR16
,MULAW,
orALAW
. For other formats, set toAUTO
. Overrides the recognizer configuration if present, else uses recognizer encoding. --sample-rate
=SAMPLE_RATE
- Sample rate in Hertz of the audio data sent for recognition. Required if --encoding flag is specified and is not AUTO. Must be set to a value between 8000 and 48000.
-
ASR Features
--[no-]enable-automatic-punctuation
-
If set, adds punctuation to recognition result hypotheses. Use
--enable-automatic-punctuation
to enable and--no-enable-automatic-punctuation
to disable. --[no-]enable-spoken-emojis
-
If set, adds spoken emoji formatting. Use
--enable-spoken-emojis
to enable and--no-enable-spoken-emojis
to disable. --[no-]enable-spoken-punctuation
-
If set, replaces spoken punctuation with the corresponding symbols in the
request. Use
--enable-spoken-punctuation
to enable and--no-enable-spoken-punctuation
to disable. --[no-]enable-word-confidence
-
If set, the top result includes a list of words and the confidence for those
words. Use
--enable-word-confidence
to enable and--no-enable-word-confidence
to disable. --[no-]enable-word-time-offsets
-
If set, the top result includes a list of words and their timestamps. Use
--enable-word-time-offsets
to enable and--no-enable-word-time-offsets
to disable. --max-alternatives
=MAX_ALTERNATIVES
- Maximum number of recognition hypotheses to be returned. Must be set to a value between 1 and 30.
--[no-]profanity-filter
-
If set, the server will censor profanities. Use
--profanity-filter
to enable and--no-profanity-filter
to disable. --[no-]separate-channel-recognition
-
Mode for recognizing multi-channel audio using Separate Channel Recognition.
When set, the service will recognize each channel independently. Use
--separate-channel-recognition
to enable and--no-separate-channel-recognition
to disable. -
Speaker Diarization
--max-speaker-count
=MAX_SPEAKER_COUNT
-
Maximum number of speakers in the conversation. Must be greater than or equal to
--min-speaker-count. Must be set to a value between 1 and 6.
This flag argument must be specified if any of the other arguments in this group are specified.
--min-speaker-count
=MIN_SPEAKER_COUNT
-
Minimum number of speakers in the conversation. Must be less than or equal to
--max-speaker-count. Must be set to a value between 1 and 6.
This flag argument must be specified if any of the other arguments in this group are specified.
- GCLOUD WIDE FLAGS
-
These flags are available to all commands:
--access-token-file
,--account
,--billing-project
,--configuration
,--flags-file
,--flatten
,--format
,--help
,--impersonate-service-account
,--log-http
,--project
,--quiet
,--trace-token
,--user-output-enabled
,--verbosity
.Run
$ gcloud help
for details. - NOTES
- This command is currently in alpha and might change without notice. If this command fails with API permission errors despite specifying the correct project, you might be trying to access an API with an invitation-only early access allowlist.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-08-13 UTC.