Module types (0.40.0)

API documentation for vision_v1p2beta1.types module.

Classes

AnnotateFileResponse

Response to a single file annotation request. A file may contain one or more images, which individually have their own responses.

Individual responses to images found within the file.

AnnotateImageRequest

Request for performing Google Cloud Vision API tasks over a user-provided image, with user-requested features.

Requested features.

AnnotateImageResponse

Response to an image annotation request.

If present, landmark detection has completed successfully.

If present, label detection has completed successfully.

If present, text (OCR) detection or document (OCR) text detection has completed successfully. This annotation provides the structural hierarchy for the OCR detected text.

If present, image properties were extracted successfully.

If present, web detection has completed successfully.

If present, contextual information is needed to understand where this image comes from.

Any

API documentation for vision_v1p2beta1.types.Any class.

AsyncAnnotateFileRequest

An offline file annotation request.

Required. Requested features.

Required. The desired output location and metadata (e.g. format).

AsyncAnnotateFileResponse

The response for a single offline file annotation request.

AsyncBatchAnnotateFilesRequest

Multiple async file annotation requests are batched into a single service call.

AsyncBatchAnnotateFilesResponse

Response to an async batch file annotation request.

BatchAnnotateImagesRequest

Multiple image annotation requests are batched into a single service call.

BatchAnnotateImagesResponse

Response to a batch image annotation request.

Block

Logical element on the page.

The bounding box for the block. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example:

- when the text is horizontal it might look like:

      0----1
      |    |
      3----2

- when it's rotated 180 degrees around the top-left corner it becomes:

      2----3
      |    |
      1----0

  and the vertex order will still be (0, 1, 2, 3).

Detected block type (text, image, etc.) for this block.
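Because the vertex order follows the text's natural orientation, the direction of the first edge (vertex 0 to vertex 1) hints at how a block is rotated. The following is a minimal sketch, not part of the library: it assumes the vertices are available as plain (x, y) tuples with y growing downward, and the helper name estimate_rotation_degrees is hypothetical.

```python
import math


def estimate_rotation_degrees(vertices):
    """Estimate rotation from the edge running from vertex 0 to vertex 1.

    `vertices` is a sequence of (x, y) tuples in the order used by the
    bounding box (0, 1, 2, 3). With y growing downward, as in image
    coordinates, positive angles correspond to clockwise rotation on screen.
    """
    (x0, y0), (x1, y1) = vertices[0], vertices[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360


# Horizontal text: vertex 0 is top-left, vertex 1 is top-right.
horizontal = [(0, 0), (10, 0), (10, 5), (0, 5)]

# Rotated 180 degrees: vertex 0 ends up bottom-right, vertex 1 bottom-left.
upside_down = [(10, 5), (0, 5), (0, 0), (10, 0)]

print(estimate_rotation_degrees(horizontal))
print(estimate_rotation_degrees(upside_down))
```

Only the first edge is inspected, so the sketch ignores skew or perspective distortion in the remaining vertices.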

BoolValue

API documentation for vision_v1p2beta1.types.BoolValue class.

BoundingPoly

A bounding polygon for the detected image annotation.

The bounding polygon normalized vertices.
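Normalized vertices usually need to be scaled back to pixel coordinates before drawing. The sketch below assumes that normalized coordinates lie in the range [0, 1] relative to the image width and height, and that vertices are given as plain (x, y) tuples; the helper name to_pixel_vertices is hypothetical.

```python
def to_pixel_vertices(normalized_vertices, width, height):
    """Scale (x, y) pairs in [0, 1] to integer pixel coordinates.

    Assumes x is a fraction of the image width and y a fraction of the
    image height.
    """
    return [
        (round(x * width), round(y * height))
        for x, y in normalized_vertices
    ]


# A polygon covering the left quarter of a 200x100 image, top half only.
polygon = [(0.0, 0.0), (0.25, 0.0), (0.25, 0.5), (0.0, 0.5)]
pixels = to_pixel_vertices(polygon, width=200, height=100)
print(pixels)
```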

BytesValue

API documentation for vision_v1p2beta1.types.BytesValue class.

CancelOperationRequest

API documentation for vision_v1p2beta1.types.CancelOperationRequest class.

Color

API documentation for vision_v1p2beta1.types.Color class.

ColorInfo

Color information consists of RGB channels, score, and the fraction of the image that the color occupies.

Image-specific score for this color. Value in range [0, 1].

CropHint

Single crop hint that is used to generate a new crop when serving an image.

Confidence of this being a salient region. Range [0, 1].

CropHintsAnnotation

Set of crop hints that are used to generate new crops when serving images.

CropHintsParams

Parameters for crop hints annotation request.

DeleteOperationRequest

API documentation for vision_v1p2beta1.types.DeleteOperationRequest class.

DominantColorsAnnotation

Set of dominant colors and their corresponding scores.

DoubleValue

API documentation for vision_v1p2beta1.types.DoubleValue class.

EntityAnnotation

Set of detected entity features.

The language code for the locale in which the entity textual description is expressed.

Overall score of the result. Range [0, 1].

The relevancy of the ICA (Image Content Annotation) label to the image. For example, the relevancy of "tower" is likely higher to an image containing the detected "Eiffel Tower" than to an image containing a detected distant towering building, even though the confidence that there is a tower in each image may be the same. Range [0, 1].

The location information for the detected entity. Multiple LocationInfo elements can be present because one location may indicate the location of the scene in the image, and another location may indicate the location of the place where the image was taken. Location information is usually present for landmarks.
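Since scores fall in the range [0, 1], a common step is to keep only the entities above a threshold and rank them. This is a minimal sketch that assumes each entity has been converted to a plain dictionary with description and score keys; the helper name top_entities and the default threshold are hypothetical choices.

```python
def top_entities(entities, min_score=0.5, limit=3):
    """Return descriptions of the highest-scoring entities.

    `entities` is a list of dicts with "description" and "score" keys,
    standing in for EntityAnnotation messages.
    """
    kept = [entity for entity in entities if entity["score"] >= min_score]
    kept.sort(key=lambda entity: entity["score"], reverse=True)
    return [entity["description"] for entity in kept[:limit]]


labels = [
    {"description": "tower", "score": 0.92},
    {"description": "sky", "score": 0.40},
    {"description": "landmark", "score": 0.75},
]
print(top_entities(labels))
```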

FaceAnnotation

A face annotation object contains the results of face detection.

The fd_bounding_poly bounding polygon is tighter than the boundingPoly, and encloses only the skin part of the face. Typically, it is used to eliminate the face from any image analysis that detects the "amount of skin" visible in an image. It is not based on the landmarker results, only on the initial face detection, hence the fd (face detection) prefix.

Roll angle, which indicates the amount of clockwise/anti-clockwise rotation of the face relative to the image vertical about the axis perpendicular to the face. Range [-180,180].

Pitch angle, which indicates the upwards/downwards angle that the face is pointing relative to the image's horizontal plane. Range [-180,180].

Face landmarking confidence. Range [0, 1].

Sorrow likelihood.

Surprise likelihood.

Blurred likelihood.

Feature

The type of Google Cloud Vision API detection to perform, and the maximum number of results to return for that type. Multiple Feature objects can be specified in the features list.

Maximum number of results of this type. Does not apply to TEXT_DETECTION, DOCUMENT_TEXT_DETECTION, or CROP_HINTS.
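A request pairs one image with a list of features. The sketch below builds plain dictionaries shaped like the AnnotateImageRequest and Feature messages. It assumes the field names type, max_results, image, source, image_uri and features, and that enum values can be given by name; whether a given client version accepts dictionaries in place of message objects should be checked against its own documentation. The helper names are hypothetical.

```python
# Feature types for which the maximum number of results does not apply.
NO_LIMIT_TYPES = {"TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION", "CROP_HINTS"}


def make_feature(feature_type, max_results=None):
    """Build a dict shaped like a Feature message."""
    feature = {"type": feature_type}
    if max_results is not None and feature_type not in NO_LIMIT_TYPES:
        feature["max_results"] = max_results
    return feature


def make_request(image_uri, features):
    """Build a dict shaped like an AnnotateImageRequest message."""
    return {
        "image": {"source": {"image_uri": image_uri}},
        "features": list(features),
    }


request = make_request(
    "gs://bucket_name/object_name",
    [
        make_feature("LABEL_DETECTION", max_results=5),
        make_feature("TEXT_DETECTION", max_results=5),  # limit is dropped
    ],
)
print(request)
```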

FloatValue

API documentation for vision_v1p2beta1.types.FloatValue class.

GcsDestination

The Google Cloud Storage location where the output will be written to.

GcsSource

The Google Cloud Storage location where the input will be read from.

GetOperationRequest

API documentation for vision_v1p2beta1.types.GetOperationRequest class.

Image

Client image to perform Google Cloud Vision API tasks over.

Google Cloud Storage image location, or publicly-accessible image URL. If both content and source are provided for an image, content takes precedence and is used to perform the image annotation request.

ImageAnnotationContext

If an image was produced from a file (e.g. a PDF), this message gives information about the source of that image.

If the file was a PDF or TIFF, this field gives the page number within the file used to produce the image.

ImageContext

Image context and/or feature-specific parameters.

List of languages to use for TEXT_DETECTION. In most cases, an empty value yields the best results since it enables automatic language detection. For languages based on the Latin alphabet, setting language_hints is not needed. In rare cases, when the language of the text in the image is known, setting a hint will help get better results (although it will be a significant hindrance if the hint is wrong). Text detection returns an error if one or more of the specified languages is not one of the supported languages (/vision/docs/languages).

Parameters for web detection.

ImageProperties

Stores image properties, such as dominant colors.

ImageSource

External image source (Google Cloud Storage or web URL image location).

The URI of the source image. Can be either:

1. A Google Cloud Storage URI of the form gs://bucket_name/object_name. Object versioning is not supported. See Google Cloud Storage Request URIs (https://cloud.google.com/storage/docs/reference-uris) for more info.

2. A publicly-accessible image HTTP/HTTPS URL. When fetching images from HTTP/HTTPS URLs, Google cannot guarantee that the request will be completed. Your request may fail if the specified host denies the request (e.g. due to request throttling or DOS prevention), or if Google throttles requests to the site for abuse prevention. You should not depend on externally-hosted images for production applications.

When both gcs_image_uri and image_uri are specified, image_uri takes precedence.
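The two precedence rules (content over source for Image, and image_uri over gcs_image_uri for ImageSource) can be sketched as a small resolver. This is an illustration only: it works on plain dictionaries shaped like the Image message, and the helper names classify_uri and effective_input are hypothetical.

```python
from urllib.parse import urlparse


def classify_uri(uri):
    """Return "gcs" for gs:// URIs and "web" for HTTP/HTTPS URLs."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "gs":
        return "gcs"
    if scheme in ("http", "https"):
        return "web"
    raise ValueError("unsupported image URI: %r" % uri)


def effective_input(image):
    """Pick the field that would be used, following the documented precedence.

    `image` is a dict shaped like an Image message.
    """
    if image.get("content"):
        return ("content", image["content"])
    source = image.get("source", {})
    if source.get("image_uri"):
        return ("image_uri", source["image_uri"])
    if source.get("gcs_image_uri"):
        return ("gcs_image_uri", source["gcs_image_uri"])
    raise ValueError("image has neither content nor a source URI")


image = {
    "source": {
        "gcs_image_uri": "gs://bucket_name/object_name",
        "image_uri": "https://example.com/photo.jpg",
    }
}
field, value = effective_input(image)
print(field, classify_uri(value))
```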

InputConfig

The desired input location and metadata.

The type of the file. Currently only "application/pdf" and "image/tiff" are supported. Wildcards are not supported.

Int32Value

API documentation for vision_v1p2beta1.types.Int32Value class.

Int64Value

API documentation for vision_v1p2beta1.types.Int64Value class.

LatLng

API documentation for vision_v1p2beta1.types.LatLng class.

LatLongRect

Rectangle determined by min and max LatLng pairs.

Max lat/long pair.

ListOperationsRequest

API documentation for vision_v1p2beta1.types.ListOperationsRequest class.

ListOperationsResponse

API documentation for vision_v1p2beta1.types.ListOperationsResponse class.

LocationInfo

Detected entity location information.

NormalizedVertex

X coordinate.

Operation

API documentation for vision_v1p2beta1.types.Operation class.

OperationInfo

API documentation for vision_v1p2beta1.types.OperationInfo class.

OperationMetadata

Contains metadata for the BatchAnnotateImages operation.

The time when the batch request was received.

OutputConfig

The desired output location and metadata.

The max number of response protos to put into each output JSON file on GCS. The valid range is [1, 100]. If not specified, the default value is 20. For example, for one PDF file with 100 pages, 100 response protos will be generated. If batch_size = 20, then 5 JSON files each containing 20 response protos will be written under the prefix gcs_destination.uri. Currently, batch_size only applies to GcsDestination, with potential future support for other output configurations.
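The arithmetic in the batch_size example can be checked with a short calculation. This sketch assumes one response proto per page, as in the example above, and that a final partially filled file is still written; the helper name output_file_count is hypothetical.

```python
import math


def output_file_count(num_responses, batch_size=20):
    """Number of output JSON files for a given number of response protos."""
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be in the range [1, 100]")
    return math.ceil(num_responses / batch_size)


# One PDF with 100 pages and batch_size = 20.
files = output_file_count(100, batch_size=20)
print(files)
```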

Page

Detected page from OCR.

Page width. For PDFs the unit is points. For images (including TIFFs) the unit is pixels.

List of blocks of text, images, etc. on this page.

Paragraph

Structural unit of text representing a number of words in certain order.

The bounding box for the paragraph. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example:

* when the text is horizontal it might look like:

      0----1
      |    |
      3----2

* when it's rotated 180 degrees around the top-left corner it becomes:

      2----3
      |    |
      1----0

  and the vertex order will still be (0, 1, 2, 3).

Confidence of the OCR results for the paragraph. Range [0, 1].

Position

A 3D position in the image, used primarily for Face detection landmarks. A valid Position must have both x and y coordinates. The position coordinates are in the same scale as the original image.

Y coordinate.

Property

A Property consists of a user-supplied name/value pair.

Value of the property.

SafeSearchAnnotation

Set of features pertaining to the image, computed by computer vision methods over safe-search verticals (for example, adult, spoof, medical, violence).

Spoof likelihood. The likelihood that a modification was made to the image's canonical version to make it appear funny or offensive.

Likelihood that this image contains violent content.
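Likelihood fields are ordinal, so they can be compared against a threshold. The sketch below is an illustration under stated assumptions: the Likelihood class is a local stand-in whose names and ordering are assumed to mirror the API's likelihood levels, the annotation is a plain dictionary keyed by the verticals named above, and the helper name flagged_categories is hypothetical.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Local stand-in; names and ordering are assumed to match the API."""
    UNKNOWN = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


def flagged_categories(annotation, threshold=Likelihood.LIKELY):
    """Return the verticals whose likelihood meets or exceeds `threshold`."""
    return sorted(
        name
        for name, value in annotation.items()
        if Likelihood(value) >= threshold
    )


annotation = {
    "adult": Likelihood.VERY_UNLIKELY,
    "spoof": Likelihood.LIKELY,
    "medical": Likelihood.UNLIKELY,
    "violence": Likelihood.VERY_LIKELY,
}
print(flagged_categories(annotation))
```

UNKNOWN sorts below every other level here, so unknown results are never flagged; whether that is the right policy depends on the application.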

Status

API documentation for vision_v1p2beta1.types.Status class.

StringValue

API documentation for vision_v1p2beta1.types.StringValue class.

Symbol

A single symbol representation.

The bounding box for the symbol. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example:

* when the text is horizontal it might look like:

      0----1
      |    |
      3----2

* when it's rotated 180 degrees around the top-left corner it becomes:

      2----3
      |    |
      1----0

  and the vertex order will still be (0, 1, 2, 3).

Confidence of the OCR results for the symbol. Range [0, 1].

TextAnnotation

TextAnnotation contains a structured representation of OCR extracted text. The hierarchy of an OCR extracted text structure is like this:

    TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol

Each structural component, starting from Page, may further have its own properties. Properties describe detected languages, breaks, etc. Please refer to the TextAnnotation.TextProperty message definition below for more detail.

UTF-8 text detected on the pages.
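The Page -> Block -> Paragraph -> Word -> Symbol hierarchy can be walked with nested loops. The sketch below uses SimpleNamespace objects as stand-ins for the real messages so that it runs without the client library; the attribute names pages, blocks, paragraphs, words, symbols and text are assumed to match the message fields, and the helper name iter_words is hypothetical.

```python
from types import SimpleNamespace as NS


def iter_words(annotation):
    """Yield each word as a string by joining the text of its symbols."""
    for page in annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    yield "".join(symbol.text for symbol in word.symbols)


def make_word(text):
    """Build a stand-in Word whose symbols are the characters of `text`."""
    return NS(symbols=[NS(text=char) for char in text])


# Stand-in for a TextAnnotation with one page, one block, one paragraph.
document = NS(
    text="Hi you",
    pages=[
        NS(blocks=[
            NS(paragraphs=[
                NS(words=[make_word("Hi"), make_word("you")]),
            ]),
        ]),
    ],
)
print(list(iter_words(document)))
```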

Timestamp

API documentation for vision_v1p2beta1.types.Timestamp class.

UInt32Value

API documentation for vision_v1p2beta1.types.UInt32Value class.

UInt64Value

API documentation for vision_v1p2beta1.types.UInt64Value class.

Vertex

X coordinate.

WaitOperationRequest

API documentation for vision_v1p2beta1.types.WaitOperationRequest class.

WebDetection

Relevant information for the image from the Internet.

Fully matching images from the Internet. Can include resized copies of the query image.

Web pages containing the matching images from the Internet.

Best guess text labels for the request image.

WebDetectionParams

Parameters for web detection request.

Word

A word representation.

The bounding box for the word. The vertices are in the order of top-left, top-right, bottom-right, bottom-left. When a rotation of the bounding box is detected, the rotation is represented as around the top-left corner as defined when the text is read in the 'natural' orientation. For example:

* when the text is horizontal it might look like:

      0----1
      |    |
      3----2

* when it's rotated 180 degrees around the top-left corner it becomes:

      2----3
      |    |
      1----0

  and the vertex order will still be (0, 1, 2, 3).

Confidence of the OCR results for the word. Range [0, 1].