Package google.cloud.language.v2

LanguageService

Provides text analysis operations such as sentiment analysis and entity recognition.

AnalyzeEntities

rpc AnalyzeEntities(AnalyzeEntitiesRequest) returns (AnalyzeEntitiesResponse)

Finds named entities (currently proper names and common nouns) in the text along with entity types, probability, mentions for each entity, and other properties.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-language
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

AnalyzeSentiment

rpc AnalyzeSentiment(AnalyzeSentimentRequest) returns (AnalyzeSentimentResponse)

Analyzes the sentiment of the provided text.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-language
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

AnnotateText

rpc AnnotateText(AnnotateTextRequest) returns (AnnotateTextResponse)

A convenience method that provides all features in one call.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-language
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ClassifyText

rpc ClassifyText(ClassifyTextRequest) returns (ClassifyTextResponse)

Classifies a document into categories.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-language
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

ModerateText

rpc ModerateText(ModerateTextRequest) returns (ModerateTextResponse)

Moderates a document for harmful and sensitive categories.

Authorization scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-language
  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

AnalyzeEntitiesRequest

The entity analysis request message.

Fields
document

Document

Required. Input document.

encoding_type

EncodingType

The encoding type used by the API to calculate offsets.
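
As an illustration, the request message above can be sketched as a JSON-style dictionary. The camelCase keys assume the standard proto3 JSON mapping of the fields listed here, and the helper name build_analyze_entities_request is hypothetical; it is not part of any client library.

```python
import json

def build_analyze_entities_request(text, encoding_type="UTF8", language_code=None):
    """Return a dict shaped like AnalyzeEntitiesRequest (proto3 JSON names assumed)."""
    document = {"type": "PLAIN_TEXT", "content": text}
    if language_code:
        # Optional: leave unset so the language is detected automatically.
        document["languageCode"] = language_code
    return {"document": document, "encodingType": encoding_type}

request = build_analyze_entities_request("Sundar Pichai spoke in Mountain View.")
print(json.dumps(request, indent=2))
```

Setting the encoding type explicitly keeps the returned offsets usable; with NONE they are reported as -1.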

AnalyzeEntitiesResponse

The entity analysis response message.

Fields
entities[]

Entity

The recognized entities in the input document.

language_code

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically detected language. See the Document.language_code field for more details.

language_supported

bool

Whether the language is officially supported. The API may still return a response when the language is not supported, but only on a best-effort basis.

AnalyzeSentimentRequest

The sentiment analysis request message.

Fields
document

Document

Required. Input document.

encoding_type

EncodingType

The encoding type used by the API to calculate sentence offsets.

AnalyzeSentimentResponse

The sentiment analysis response message.

Fields
document_sentiment

Sentiment

The overall sentiment of the input document.

language_code

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically detected language. See the Document.language_code field for more details.

sentences[]

Sentence

The sentiment for all the sentences in the document.

language_supported

bool

Whether the language is officially supported. The API may still return a response when the language is not supported, but only on a best-effort basis.

AnnotateTextRequest

The request message for the text annotation API, which can perform multiple analysis types in one call.

Fields
document

Document

Required. Input document.

features

Features

Required. The enabled features.

encoding_type

EncodingType

The encoding type used by the API to calculate offsets.

Features

All available features. Setting each one to true will enable that specific analysis for the input.

Fields
extract_entities

bool

Optional. Extract entities.

extract_document_sentiment

bool

Optional. Extract document-level sentiment.

classify_text

bool

Optional. Classify the full document into categories.

moderate_text

bool

Optional. Moderate the document for harmful and sensitive categories.
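
A minimal sketch of an AnnotateTextRequest body that enables a chosen subset of the features above. The camelCase keys assume the proto3 JSON mapping, and both helper names are hypothetical. The check against the four known flags is a local convenience, not something the API is stated to require.

```python
# Feature flags listed in the Features message (proto field names).
KNOWN_FEATURES = {
    "extract_entities",
    "extract_document_sentiment",
    "classify_text",
    "moderate_text",
}

def to_camel(name):
    """Convert a snake_case proto field name to lowerCamelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def build_annotate_text_request(text, features, encoding_type="UTF8"):
    """Return a dict shaped like AnnotateTextRequest with the given features enabled."""
    unknown = set(features) - KNOWN_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "features": {to_camel(name): True for name in sorted(features)},
        "encodingType": encoding_type,
    }

print(build_annotate_text_request("Great phone, poor battery.",
                                  ["extract_entities", "extract_document_sentiment"]))
```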

AnnotateTextResponse

The text annotations response message.

Fields
sentences[]

Sentence

Sentences in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_document_sentiment.

entities[]

Entity

Entities, along with their semantic information, in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_entities or AnnotateTextRequest.Features.extract_entity_sentiment.

document_sentiment

Sentiment

The overall sentiment for the document. Populated if the user enables AnnotateTextRequest.Features.extract_document_sentiment.

language_code

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically detected language. See the Document.language_code field for more details.

categories[]

ClassificationCategory

Categories identified in the input document.

moderation_categories[]

ClassificationCategory

Harmful and sensitive categories identified in the input document.

language_supported

bool

Whether the language is officially supported by all requested features. The API may still return a response when the language is not supported, but only on a best-effort basis.

ClassificationCategory

Represents a category returned from the text classifier.

Fields
name

string

The name of the category representing the document.

confidence

float

The classifier's confidence in the category. This number represents how certain the classifier is that the category represents the given text.

severity

float

Optional. The classifier's severity of the category. This is only present when the ModerateTextRequest.ModelVersion is set to MODEL_VERSION_2, and the corresponding category has a severity score.
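
A short sketch of how a caller might filter returned categories by confidence while tolerating a missing severity. The 0.5 cutoff, the helper name flagged, and the sample values are assumptions made for illustration only.

```python
def flagged(categories, min_confidence=0.5):
    """Return (name, confidence, severity) tuples at or above the cutoff, highest first."""
    hits = [
        (c["name"], c["confidence"], c.get("severity"))  # severity may be absent
        for c in categories
        if c["confidence"] >= min_confidence
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Made-up category entries shaped like ClassificationCategory.
sample = [
    {"name": "Toxic", "confidence": 0.62, "severity": 0.4},
    {"name": "Insult", "confidence": 0.91},
    {"name": "Health", "confidence": 0.08},
]
print(flagged(sample))
```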

ClassifyTextRequest

The document classification request message.

Fields
document

Document

Required. Input document.

ClassifyTextResponse

The document classification response message.

Fields
categories[]

ClassificationCategory

Categories representing the input document.

language_code

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically detected language. See the Document.language_code field for more details.

language_supported

bool

Whether the language is officially supported. The API may still return a response when the language is not supported, but only on a best-effort basis.

Document

Represents the input to API methods.

Fields
type

Type

Required. The type of the document. If the type is not set or is TYPE_UNSPECIFIED, the API returns an INVALID_ARGUMENT error.

language_code

string

Optional. The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted.
Language Support lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an INVALID_ARGUMENT error is returned.

Union field source. The source of the document: a string containing the content or a Google Cloud Storage URI. source can be only one of the following:
content

string

The content of the input in string format. This field is exempt from Cloud audit logging because it contains user data.

gcs_content_uri

string

The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
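
The constraints on Document described above (a concrete type, exactly one source, and a gs://bucket_name/object_name URI) can be checked locally before a request is sent. This is a sketch only: the helper names are hypothetical, the keys assume the proto3 JSON mapping (gcsContentUri for gcs_content_uri), and the API performs its own validation.

```python
def is_gcs_uri(uri):
    """True if `uri` has the form gs://bucket_name/object_name."""
    if not uri.startswith("gs://"):
        return False
    bucket, sep, obj = uri[len("gs://"):].partition("/")
    return bool(bucket and sep and obj)

def validate_document(doc):
    """Return a list of problems found in a Document-shaped dict (empty if none)."""
    problems = []
    if doc.get("type") not in ("PLAIN_TEXT", "HTML"):
        problems.append("type must be PLAIN_TEXT or HTML")
    sources = [key for key in ("content", "gcsContentUri") if key in doc]
    if len(sources) != 1:
        problems.append("exactly one of content or gcsContentUri must be set")
    if "gcsContentUri" in doc and not is_gcs_uri(doc["gcsContentUri"]):
        problems.append("gcsContentUri must look like gs://bucket_name/object_name")
    return problems

print(validate_document({"type": "PLAIN_TEXT", "content": "Hello"}))
print(validate_document({"content": "Hello", "gcsContentUri": "gs://bucket_name"}))
```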

Type

The document types enum.

Enums
TYPE_UNSPECIFIED The content type is not specified.
PLAIN_TEXT Plain text
HTML HTML

EncodingType

Represents the text encoding that the caller uses to process the output. Providing an EncodingType is recommended because the API provides the beginning offsets for various outputs, such as tokens and mentions, and languages that natively use different text encodings may access offsets differently.

Enums
NONE If EncodingType is not specified, encoding-dependent information (such as begin_offset) will be set to -1.
UTF8 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-8 encoding of the input. C++ and Go are examples of languages that use this encoding natively.
UTF16 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-16 encoding of the input. Java and JavaScript are examples of languages that use this encoding natively.
UTF32 Encoding-dependent information (such as begin_offset) is calculated based on the UTF-32 encoding of the input. Python is an example of a language that uses this encoding natively.
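
To show why the choice of EncodingType matters, the sketch below computes the offset of the same word in UTF-8 bytes, UTF-16 code units, and UTF-32 code points. It is a local illustration of the unit differences, not the API's own implementation; the function name begin_offset is chosen here to mirror the field name.

```python
def begin_offset(text, index, encoding_type):
    """Offset of the code point at `index`, measured in units of the given encoding."""
    prefix = text[:index]
    if encoding_type == "UTF8":
        return len(prefix.encode("utf-8"))           # bytes
    if encoding_type == "UTF16":
        return len(prefix.encode("utf-16-le")) // 2  # 16-bit code units
    if encoding_type == "UTF32":
        return len(prefix)                           # code points
    return -1  # NONE: encoding-dependent offsets are not provided

# "café", a space, an emoji outside the Basic Multilingual Plane, a space, "Google".
text = "caf\u00e9 \U0001F600 Google"
index = text.index("Google")
for encoding_type in ("NONE", "UTF8", "UTF16", "UTF32"):
    print(encoding_type, begin_offset(text, index, encoding_type))
# NONE -1
# UTF8 11
# UTF16 8
# UTF32 7
```

The three values differ because the accented letter takes two bytes in UTF-8 and the emoji takes four bytes in UTF-8 and two code units in UTF-16.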

Entity

Represents a phrase in the text that is a known entity, such as a person, an organization, or location. The API associates information, such as probability and mentions, with entities.

Fields
name

string

The representative name for the entity.

type

Type

The entity type.

metadata

map<string, string>

Metadata associated with the entity.

For the metadata associated with other entity types, see the Type table below.

mentions[]

EntityMention

The mentions of this entity in the input document. The API currently supports proper noun mentions.

sentiment

Sentiment

For calls to AnalyzeEntitySentimentRequest or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the aggregate sentiment expressed for this entity in the provided document.

Type

The type of the entity. The table below lists the associated fields for entities that have different metadata.

Enums
UNKNOWN Unknown
PERSON Person
LOCATION Location
ORGANIZATION Organization
EVENT Event
WORK_OF_ART Artwork
CONSUMER_GOOD Consumer product
OTHER Other types of entities
PHONE_NUMBER

Phone number

The metadata lists the phone number, formatted according to local convention, plus whichever additional elements appear in the text:

  • number - the actual number, broken down into sections as per local convention
  • national_prefix - country code, if detected
  • area_code - region or area code, if detected
  • extension - phone extension (to be dialed after connection), if detected
ADDRESS

Address

The metadata identifies the street number and locality plus whichever additional elements appear in the text:

  • street_number - street number
  • locality - city or town
  • street_name - street/route name, if detected
  • postal_code - postal code, if detected
  • country - country, if detected
  • broad_region - administrative area, such as the state, if detected
  • narrow_region - smaller administrative area, such as county, if detected
  • sublocality - used in Asian addresses to demarcate a district within a city, if detected
DATE

Date

The metadata identifies the components of the date:

  • year - four-digit year, if detected
  • month - two-digit month number, if detected
  • day - two-digit day number, if detected
NUMBER

Number

The metadata is the number itself.

PRICE

Price

The metadata identifies the value and currency.
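
Because metadata is a map of strings and each DATE component appears only if detected, callers typically need to handle missing keys. A minimal sketch, with a hypothetical helper name and a made-up metadata map:

```python
def date_parts(metadata):
    """Convert DATE metadata strings to ints; components that were not detected become None."""
    return {
        key: int(metadata[key]) if key in metadata else None
        for key in ("year", "month", "day")
    }

# Made-up metadata for a DATE entity where no day was detected.
print(date_parts({"year": "2024", "month": "03"}))
# {'year': 2024, 'month': 3, 'day': None}
```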

EntityMention

Represents a mention for an entity in the text. Currently, proper noun mentions are supported.

Fields
text

TextSpan

The mention text.

type

Type

The type of the entity mention.

sentiment

Sentiment

For calls to AnalyzeEntitySentimentRequest or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the sentiment expressed for this mention of the entity in the provided document.

probability

float

Probability score associated with the entity.

The score shows the probability that the entity mention belongs to the entity type. The score is in the (0, 1] range.

Type

The supported types of mentions.

Enums
TYPE_UNKNOWN Unknown
PROPER Proper name
COMMON Common noun (or noun compound)

ModerateTextRequest

The document moderation request message.

Fields
document

Document

Required. Input document.

model_version

ModelVersion

Optional. The model version to use for ModerateText.

ModelVersion

The model version to use for ModerateText.

Enums
MODEL_VERSION_UNSPECIFIED The default model version.
MODEL_VERSION_1 Use the v1 model. This model is used by default when no model version is provided. The v1 model only returns a probability (confidence) score for each category.
MODEL_VERSION_2 Use the v2 model. The v2 model returns a probability (confidence) score for each category, and also returns a severity score for a subset of the categories.

ModerateTextResponse

The document moderation response message.

Fields
moderation_categories[]

ClassificationCategory

Harmful and sensitive categories representing the input document.

language_code

string

The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically detected language. See the Document.language_code field for more details.

language_supported

bool

Whether the language is officially supported. The API may still return a response when the language is not supported, but only on a best-effort basis.

Sentence

Represents a sentence in the input document.

Fields
text

TextSpan

The sentence text.

sentiment

Sentiment

For calls to AnalyzeSentimentRequest or if AnnotateTextRequest.Features.extract_document_sentiment is set to true, this field will contain the sentiment for the sentence.

Sentiment

Represents the feeling associated with the entire text or entities in the text.

Fields
magnitude

float

A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).

score

float

Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
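
One way a caller might combine the two fields is sketched below. The cutoffs (0.25 for score, 1.0 for magnitude) and the reading of a near-zero score with high magnitude as "mixed" are assumptions chosen for illustration; this reference defines no thresholds, so tune them for your own data.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Map a (score, magnitude) pair to a coarse label using assumed cutoffs."""
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Near-zero score: strong but opposing sentiment can cancel out in the score
    # while still accumulating in the magnitude.
    return "mixed" if magnitude >= magnitude_cutoff else "neutral"

print(describe_sentiment(0.8, 3.2))
print(describe_sentiment(0.0, 4.0))
```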

TextSpan

Represents a text span in the input document.

Fields
content

string

The content of the text span, which is a substring of the document.

begin_offset

int32

The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
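
Since begin_offset is expressed in units of the requested EncodingType, locating a span in the original document means working in that encoding. The sketch below checks that a span's content appears at the reported offset; the helper name span_matches and the sample text are hypothetical.

```python
# Codec name and bytes per unit for each EncodingType that yields offsets.
CODECS = {"UTF8": ("utf-8", 1), "UTF16": ("utf-16-le", 2), "UTF32": ("utf-32-le", 4)}

def span_matches(document, content, begin_offset, encoding_type):
    """Check that `content` occurs in `document` at `begin_offset` encoding units."""
    codec, unit_size = CODECS[encoding_type]
    raw = document.encode(codec)
    piece = content.encode(codec)
    start = begin_offset * unit_size
    return raw[start:start + len(piece)] == piece

document = "caf\u00e9 \U0001F600 Google"
print(span_matches(document, "Google", 11, "UTF8"))   # True
print(span_matches(document, "Google", 8, "UTF16"))   # True
print(span_matches(document, "Google", 7, "UTF8"))    # False: 7 is the UTF-32 offset
```

Applying an offset in the wrong encoding lands in the middle of a multi-byte character or on the wrong word, so the request and the client-side slicing should use the same EncodingType.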