Perform a server-side streaming online prediction request for Vertex LLM streaming.
Arguments
| Parameter | Description |
| --- | --- |
| endpoint | Required. The name of the Endpoint requested to serve the prediction. Format: |
| region | Required. Region of the HTTP endpoint. For example, if region is set to |
| body | Required. |
Raised exceptions
| Exception | Description |
| --- | --- |
| ConnectionError | In case of a network problem (such as a DNS failure or refused connection). |
| HttpError | If the response status is >= 400 (excluding 429 and 503). |
| TimeoutError | If a long-running operation takes longer to finish than the specified timeout limit. |
| TypeError | If an operation or function receives an argument of the wrong type. |
| ValueError | If an operation or function receives an argument of the right type but an inappropriate value. For example, a negative timeout. |
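As the table notes, statuses 429 and 503 are excluded from HttpError because they are transient and retried rather than surfaced. A minimal sketch of that retry pattern in Python (the names and backoff policy here are illustrative, not the connector's actual internals):

```python
import time

RETRYABLE_STATUSES = {429, 503}  # rate-limited / service temporarily unavailable

class HttpError(Exception):
    """Raised for non-retryable responses with status >= 400."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

def call_with_retries(request, max_attempts=3, backoff_s=0.0):
    """Call request(); retry 429/503 with exponential backoff, raise otherwise."""
    for attempt in range(max_attempts):
        status = request()
        if status < 400:
            return status
        if status in RETRYABLE_STATUSES and attempt < max_attempts - 1:
            time.sleep(backoff_s * (2 ** attempt))
            continue
        raise HttpError(status)

# Example: a request that is rate-limited once, then succeeds.
responses = iter([429, 200])
print(call_with_retries(lambda: next(responses)))  # -> 200
```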
Response
If successful, the response contains an instance of GoogleCloudAiplatformV1StreamingPredictResponse.
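The connector wraps a POST to the regional Vertex AI REST surface's :serverStreamingPredict method. The sketch below assembles such a request in Python; build_streaming_predict_request is an illustrative helper (not part of any library), the endpoint resource path in the example is a placeholder, and actually sending the request would additionally require an OAuth 2.0 access token:

```python
import json

def build_streaming_predict_request(region, endpoint, inputs, parameters=None):
    """Assemble the URL and JSON body for a serverStreamingPredict call.

    `endpoint` is the full model resource name; `inputs` and `parameters`
    follow the tensor schema shown in the subworkflow snippet below.
    """
    url = (
        f"https://{region}-aiplatform.googleapis.com"
        f"/v1/{endpoint}:serverStreamingPredict"
    )
    body = {"inputs": inputs}
    if parameters is not None:
        body["parameters"] = parameters
    return url, json.dumps(body)

# Placeholder resource name, for illustration only.
url, payload = build_streaming_predict_request(
    "us-central1",
    "projects/my-project/locations/us-central1/publishers/google/models/my-model",
    inputs=[{"stringVal": ["Hello"]}],
)
print(url)
```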
Subworkflow snippet
Some fields might be optional or required. To identify required fields, refer to the API documentation.
YAML
- serverStreamingPredict:
    call: googleapis.aiplatform.v1.projects.locations.publishers.models.serverStreamingPredict
    args:
      endpoint: ...
      region: ...
      body:
        inputs: ...
        parameters:
          boolVal: ...
          bytesVal: ...
          doubleVal: ...
          dtype: ...
          floatVal: ...
          int64Val: ...
          intVal: ...
          listVal: ...
          shape: ...
          stringVal: ...
          structVal: ...
          tensorVal: ...
          uint64Val: ...
          uintVal: ...
    result: serverStreamingPredictResult
JSON
[
  {
    "serverStreamingPredict": {
      "call": "googleapis.aiplatform.v1.projects.locations.publishers.models.serverStreamingPredict",
      "args": {
        "endpoint": "...",
        "region": "...",
        "body": {
          "inputs": "...",
          "parameters": {
            "boolVal": "...",
            "bytesVal": "...",
            "doubleVal": "...",
            "dtype": "...",
            "floatVal": "...",
            "int64Val": "...",
            "intVal": "...",
            "listVal": "...",
            "shape": "...",
            "stringVal": "...",
            "structVal": "...",
            "tensorVal": "...",
            "uint64Val": "...",
            "uintVal": "..."
          }
        }
      },
      "result": "serverStreamingPredictResult"
    }
  }
]