REST Resource: projects.locations.processors.processorVersions

Resource: ProcessorVersion

A processor version is an implementation of a processor. Each processor can have multiple versions, pretrained by Google internally or uptrained by the customer. A processor can only have one default version at a time. Its document-processing behavior is defined by that version.

JSON representation
{
  "name": string,
  "displayName": string,
  "documentSchema": {
    object (DocumentSchema)
  },
  "state": enum (State),
  "createTime": string,
  "latestEvaluation": {
    object (EvaluationReference)
  },
  "kmsKeyName": string,
  "kmsKeyVersionName": string,
  "googleManaged": boolean,
  "deprecationInfo": {
    object (DeprecationInfo)
  },
  "modelType": enum (ModelType),
  "satisfiesPzs": boolean,
  "satisfiesPzi": boolean,
  "genAiModelInfo": {
    object (GenAiModelInfo)
  }
}
Fields
name

string

Identifier. The resource name of the processor version. Format: projects/{project}/locations/{location}/processors/{processor}/processorVersions/{processorVersion}
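As a minimal sketch of working with this format, the following splits a resource name into its segments and rebuilds it. The helper names and the assumption that each segment is any non-empty run of characters other than "/" are illustrative and not part of the API.

```python
import re

# Pattern derived from the documented format. The per-segment character
# class ("anything except a slash") is an assumption of this sketch.
_NAME_RE = re.compile(
    r"projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor>[^/]+)"
    r"/processorVersions/(?P<processor_version>[^/]+)"
)


def parse_processor_version_name(name: str) -> dict:
    """Split a processor version resource name into its path segments."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a processor version resource name: {name!r}")
    return match.groupdict()


def build_processor_version_name(
    project: str, location: str, processor: str, processor_version: str
) -> str:
    """Assemble a resource name from its segments (hypothetical helper)."""
    return (
        f"projects/{project}/locations/{location}"
        f"/processors/{processor}/processorVersions/{processor_version}"
    )
```

Parsing a name and passing the resulting segments back to build_processor_version_name should round-trip to the original string.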

displayName

string

The display name of the processor version.

documentSchema

object (DocumentSchema)

The schema of the processor version. Describes the output.

state

enum (State)

Output only. The state of the processor version.

createTime

string (Timestamp format)

The time the processor version was created.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
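Python's datetime type stores microseconds only, so a client that needs the full nine fractional digits has to keep the nanoseconds separately. The sketch below is one way to do that; the function name and the tuple return shape are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Accepts the "Zulu" form shown above, with 0 to 9 fractional digits.
_TS_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.(\d{1,9}))?Z"
)


def parse_zulu_timestamp(value: str):
    """Return (UTC datetime truncated to microseconds, nanoseconds part)."""
    match = _TS_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 Zulu timestamp: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction = match.group(7) or ""
    # Right-pad so that ".5" means 500000000 ns rather than 5 ns.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    moment = datetime(
        year, month, day, hour, minute, second, nanos // 1000, tzinfo=timezone.utc
    )
    return moment, nanos
```

The datetime component is truncated, not rounded, so the returned nanoseconds value is the authoritative sub-second part.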

latestEvaluation

object (EvaluationReference)

The most recently invoked evaluation for the processor version.

kmsKeyName

string

The KMS key name used for encryption.

kmsKeyVersionName

string

The KMS key version with which data is encrypted.

googleManaged

boolean

Output only. Denotes that this ProcessorVersion is managed by Google.

deprecationInfo

object (DeprecationInfo)

If set, information about the eventual deprecation of this version.

modelType

enum (ModelType)

Output only. The model type of this processor version.

satisfiesPzs

boolean

Output only. Reserved for future use.

satisfiesPzi

boolean

Output only. Reserved for future use.

genAiModelInfo

object (GenAiModelInfo)

Output only. Information about Generative AI model-based processor versions.

DocumentSchema

The schema defines the output that a processor produces for a processed document.

JSON representation
{
  "displayName": string,
  "description": string,
  "entityTypes": [
    {
      object (EntityType)
    }
  ],
  "metadata": {
    object (Metadata)
  }
}
Fields
displayName

string

Display name to show to users.

description

string

Description of the schema.

entityTypes[]

object (EntityType)

Entity types of the schema.

metadata

object (Metadata)

Metadata of the schema.

EntityType

EntityType is the wrapper of a label of the corresponding model with detailed attributes and limitations for entity-based processors. Multiple types can also compose a dependency tree to represent nested types.

JSON representation
{
  "displayName": string,
  "name": string,
  "description": string,
  "baseTypes": [
    string
  ],
  "properties": [
    {
      object (Property)
    }
  ],
  "entityTypeMetadata": {
    object (EntityTypeMetadata)
  },

  // Union field value_source can be only one of the following:
  "enumValues": {
    object (EnumValues)
  }
  // End of list of possible types for union field value_source.
}
Fields
displayName

string

User defined name for the type.

name

string

Name of the type. It must be unique within the schema file and cannot be a "Common Type". The following naming conventions are used:

  • Use snake_casing.
  • Name matching is case-sensitive.
  • Maximum 64 characters.
  • Must start with a letter.
  • Allowed characters: lowercase ASCII letters, digits, underscores, and hyphens ([a-z0-9_-]). (For backward compatibility, internal infrastructure and tooling can handle any ASCII character.)
  • The / is sometimes used to denote a property of a type. For example line_item/amount. This convention is deprecated, but will still be honored for backward compatibility.
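The conventions above can be approximated with a client-side check such as the following. This is a sketch under stated assumptions: the function name is hypothetical, the 64-character limit is applied to the whole name including any "/" separators, and the uniqueness and "Common Type" rules are not checked because they depend on the rest of the schema. The service's own validation may differ.

```python
import re

MAX_NAME_LENGTH = 64

# Starts with a lowercase letter, then only [a-z0-9_-].
_STRICT_NAME = re.compile(r"[a-z][a-z0-9_-]*")
# Same rule applied to each "/"-separated segment (deprecated convention).
_LEGACY_NAME = re.compile(r"[a-z][a-z0-9_-]*(?:/[a-z][a-z0-9_-]*)*")


def is_valid_type_name(name: str, allow_legacy_slash: bool = False) -> bool:
    """Check a type or property name against the documented conventions."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    pattern = _LEGACY_NAME if allow_legacy_slash else _STRICT_NAME
    return pattern.fullmatch(name) is not None
```

With these rules, line_item passes, Line_Item and 1st_item fail, and line_item/amount passes only when allow_legacy_slash is set.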

description

string

The description of the entity type. Could be used to provide more information about the entity type for model calls.

baseTypes[]

string

The entity type that this type is derived from. For now, one and only one should be set.

properties[]

object (Property)

Describes the nested structure, or composition, of an entity.

entityTypeMetadata

object (EntityTypeMetadata)

Metadata for the entity type.

Union field value_source.

value_source can be only one of the following:

enumValues

object (EnumValues)

If specified, lists all the possible values for this entity. This should not be more than a handful of values. If there are more than 10 values, or if the values could change frequently, use the EntityType.value_ontology field and specify a list of all possible values in a value ontology file.
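A client that assembles schemas programmatically could enforce this guideline before sending the schema. The sketch below builds an EntityType dictionary with the enumValues union member set. The helper name, the hard failure above 10 values, and the removal of duplicate values are choices made for this illustration, not documented service behavior.

```python
MAX_INLINE_ENUM_VALUES = 10  # guideline taken from the field description


def enum_entity_type(name: str, values: list) -> dict:
    """Build an EntityType dict whose value_source is enumValues."""
    # Drop duplicates while keeping first-seen order (a choice of this sketch).
    unique_values = list(dict.fromkeys(values))
    if len(unique_values) > MAX_INLINE_ENUM_VALUES:
        raise ValueError(
            f"{len(unique_values)} values exceed the guideline of "
            f"{MAX_INLINE_ENUM_VALUES}; consider a value ontology instead"
        )
    return {"name": name, "enumValues": {"values": unique_values}}
```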

EnumValues

Defines a list of enum values.

JSON representation
{
  "values": [
    string
  ]
}
Fields
values[]

string

The individual values that this enum values type can include.

Property

Defines properties that can be part of the entity type.

JSON representation
{
  "name": string,
  "description": string,
  "displayName": string,
  "valueType": string,
  "occurrenceType": enum (OccurrenceType),
  "propertyMetadata": {
    object (PropertyMetadata)
  }
}
Fields
name

string

The name of the property. Follows the same guidelines as the EntityType name.

description

string

The description of the property. Could be used to provide more information about the property for model calls.

displayName

string

User defined name for the property.

valueType

string

A reference to the value type of the property. This type is subject to the same conventions as the EntityType.base_types field.

occurrenceType

enum (OccurrenceType)

Occurrence type limits the number of instances of an entity type that can appear in the document.

propertyMetadata

object (PropertyMetadata)

Any additional metadata about the property can be added here.
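To illustrate how entity types and properties can compose a tree, the sketch below walks a DocumentSchema dictionary and lists every property path reachable from a root type. It assumes that a property whose valueType matches the name of another entity type in the same schema is a nested type, and that any other valueType is a leaf. The function name, the example schema, and its valueType strings are hypothetical.

```python
def property_paths(schema: dict, root_type: str) -> list:
    """List every property path reachable from root_type, depth first."""
    types_by_name = {t["name"]: t for t in schema.get("entityTypes", [])}
    paths = []

    def walk(type_name, prefix, seen):
        entity_type = types_by_name.get(type_name)
        if entity_type is None or type_name in seen:
            return  # leaf value type, or a cycle in the schema
        for prop in entity_type.get("properties", []):
            path = prefix + (prop["name"],)
            paths.append(path)
            walk(prop.get("valueType", ""), path, seen | {type_name})

    walk(root_type, (), frozenset())
    return paths


# Hypothetical schema used only to exercise the function.
EXAMPLE_SCHEMA = {
    "entityTypes": [
        {
            "name": "invoice",
            "properties": [
                {"name": "invoice_id", "valueType": "string"},
                {"name": "line_item", "valueType": "line_item"},
            ],
        },
        {
            "name": "line_item",
            "properties": [{"name": "amount", "valueType": "money"}],
        },
    ]
}
```

Calling property_paths(EXAMPLE_SCHEMA, "invoice") yields the paths (invoice_id), (line_item), and (line_item, amount) as tuples.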

OccurrenceType

Types of occurrences of the entity type in the document. This represents the number of instances, not mentions, of an entity. For example, a bank statement might only have one account_number, but this account number can be mentioned in several places on the document. In this case, the account_number is considered a REQUIRED_ONCE entity type. If, on the other hand, a bank statement is expected to contain the status of several different accounts for the customer, the occurrence type is set to REQUIRED_MULTIPLE.

Enums
OCCURRENCE_TYPE_UNSPECIFIED Unspecified occurrence type.
OPTIONAL_ONCE There will be zero or one instance of this entity type. The same entity instance may be mentioned multiple times.
OPTIONAL_MULTIPLE The entity type will appear zero or more times.
REQUIRED_ONCE The entity type will appear exactly once. The same entity instance may be mentioned multiple times.
REQUIRED_MULTIPLE The entity type will appear one or more times.
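The enum values above map naturally to minimum and maximum instance counts. The sketch below encodes that mapping and counts instances, not mentions. Treating mentions with identical text as the same instance, and accepting any count for OCCURRENCE_TYPE_UNSPECIFIED, are assumptions of this example.

```python
from enum import Enum


class OccurrenceType(Enum):
    OCCURRENCE_TYPE_UNSPECIFIED = "OCCURRENCE_TYPE_UNSPECIFIED"
    OPTIONAL_ONCE = "OPTIONAL_ONCE"
    OPTIONAL_MULTIPLE = "OPTIONAL_MULTIPLE"
    REQUIRED_ONCE = "REQUIRED_ONCE"
    REQUIRED_MULTIPLE = "REQUIRED_MULTIPLE"


# (minimum, maximum) instance counts; None means no upper bound.
_BOUNDS = {
    OccurrenceType.OPTIONAL_ONCE: (0, 1),
    OccurrenceType.OPTIONAL_MULTIPLE: (0, None),
    OccurrenceType.REQUIRED_ONCE: (1, 1),
    OccurrenceType.REQUIRED_MULTIPLE: (1, None),
}


def count_instances(mentions: list) -> int:
    """Count distinct instances; identical mention text is one instance."""
    return len(set(mentions))


def count_is_allowed(occurrence_type: OccurrenceType, instance_count: int) -> bool:
    """Check an instance count against the bounds of an occurrence type."""
    if occurrence_type is OccurrenceType.OCCURRENCE_TYPE_UNSPECIFIED:
        return True  # assumption: an unspecified type imposes no constraint
    minimum, maximum = _BOUNDS[occurrence_type]
    return instance_count >= minimum and (maximum is None or instance_count <= maximum)
```

For the bank statement example, three mentions of the same account number count as one instance, which satisfies REQUIRED_ONCE.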

PropertyMetadata

Metadata about a property.

JSON representation
{
  "inactive": boolean,
  "fieldExtractionMetadata": {
    object (FieldExtractionMetadata)
  }
}
Fields
inactive

boolean

Whether the property should be considered as "inactive".

fieldExtractionMetadata

object (FieldExtractionMetadata)

Field extraction metadata on the property.

FieldExtractionMetadata

Metadata for how this field value is extracted.

JSON representation
{
  "summaryOptions": {
    object (SummaryOptions)
  }
}
Fields
summaryOptions

object (SummaryOptions)

Summary options config.

SummaryOptions

Metadata for document summarization.

JSON representation
{
  "length": enum (Length),
  "format": enum (Format)
}
Fields
length

enum (Length)

How long the summary should be.

format

enum (Format)

The format the summary should be in.

Length

The Length enum.

Enums
LENGTH_UNSPECIFIED Default.
BRIEF A brief summary of one or two sentences.
MODERATE A paragraph-length summary.
COMPREHENSIVE The longest option available.

Format

The Format enum.

Enums
FORMAT_UNSPECIFIED Default.
PARAGRAPH Format the output in paragraphs.
BULLETS Format the output in bullets.

EntityTypeMetadata

Metadata about an entity type.

JSON representation
{
  "inactive": boolean
}
Fields
inactive

boolean

Whether the entity type should be considered inactive.

Metadata

Metadata for global schema behavior.

JSON representation
{
  "documentSplitter": boolean,
  "documentAllowMultipleLabels": boolean,
  "prefixedNamingOnProperties": boolean,
  "skipNamingValidation": boolean
}
Fields
documentSplitter

boolean

If true, a document entity type can be applied to a subdocument (splitting). Otherwise, it can only be applied to the entire document (classification).

documentAllowMultipleLabels

boolean

If true, multiple document annotations can cover a given page.

prefixedNamingOnProperties

boolean

If set, the names of all nested entities must be prefixed with the names of their parents.

skipNamingValidation

boolean

If set, naming format validation is skipped for the schema, so the string values in DocumentSchema.EntityType.name and DocumentSchema.EntityType.Property.name are not checked.

State

The possible states of the processor version.

Enums
STATE_UNSPECIFIED The processor version is in an unspecified state.
DEPLOYED The processor version is deployed and can be used for processing.
DEPLOYING The processor version is being deployed.
UNDEPLOYED The processor version is not deployed and cannot be used for processing.
UNDEPLOYING The processor version is being undeployed.
CREATING The processor version is being created.
DELETING The processor version is being deleted.
FAILED The processor version failed and is in an indeterminate state.
IMPORTING The processor version is being imported.
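A caller typically inspects the state before sending documents. The sketch below reads the state field of a ProcessorVersion dictionary. Grouping the "is being ..." states as transitional is this example's reading of the descriptions above, and both helper names are hypothetical.

```python
# States whose descriptions say an operation is still in progress.
TRANSITIONAL_STATES = frozenset(
    {"DEPLOYING", "UNDEPLOYING", "CREATING", "DELETING", "IMPORTING"}
)


def can_process(version: dict) -> bool:
    """True only for DEPLOYED, the one state documented as usable."""
    return version.get("state") == "DEPLOYED"


def is_transitional(version: dict) -> bool:
    """True while an operation on the version appears to be in flight."""
    return version.get("state") in TRANSITIONAL_STATES
```

A polling loop could wait while is_transitional returns True and then branch on can_process.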

EvaluationReference

Gives a short summary of an evaluation, and links to the evaluation itself.

JSON representation
{
  "operation": string,
  "evaluation": string,
  "aggregateMetrics": {
    object (Metrics)
  },
  "aggregateMetricsExact": {
    object (Metrics)
  }
}
Fields
operation

string

The resource name of the Long Running Operation for the evaluation.

evaluation

string

The resource name of the evaluation.

aggregateMetrics

object (Metrics)

An aggregate of the statistics for the evaluation with fuzzy matching on.

aggregateMetricsExact

object (Metrics)

An aggregate of the statistics for the evaluation with fuzzy matching off.

DeprecationInfo

Information about the upcoming deprecation of this processor version.

JSON representation
{
  "deprecationTime": string,
  "replacementProcessorVersion": string
}
Fields
deprecationTime

string (Timestamp format)

The time at which this processor version will be deprecated.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

replacementProcessorVersion

string

If set, the processor version that will be used as a replacement.

ModelType

The possible model types of the processor version.

Enums
MODEL_TYPE_UNSPECIFIED The processor version has an unspecified model type.
MODEL_TYPE_GENERATIVE The processor version has a generative model type.
MODEL_TYPE_CUSTOM The processor version has a custom model type.

GenAiModelInfo

Information about Generative AI model-based processor versions.

JSON representation
{

  // Union field model_info can be only one of the following:
  "foundationGenAiModelInfo": {
    object (FoundationGenAiModelInfo)
  },
  "customGenAiModelInfo": {
    object (CustomGenAiModelInfo)
  }
  // End of list of possible types for union field model_info.
}
Fields
Union field model_info. The processor version is either a pretrained Google-managed foundation model or a custom Generative AI model created by the user. model_info can be only one of the following:
foundationGenAiModelInfo

object (FoundationGenAiModelInfo)

Information for a pretrained Google-managed foundation model.

customGenAiModelInfo

object (CustomGenAiModelInfo)

Information for a custom Generative AI model created by the user.
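Because model_info is a union, at most one of the two keys should appear in the JSON. The sketch below reports which member is set on a GenAiModelInfo dictionary. The function name is hypothetical, and raising an error when both keys are present is a defensive choice of this example.

```python
_MODEL_INFO_FIELDS = ("foundationGenAiModelInfo", "customGenAiModelInfo")


def which_model_info(gen_ai_model_info: dict):
    """Return the name of the union member that is set, or None."""
    present = [field for field in _MODEL_INFO_FIELDS if field in gen_ai_model_info]
    if len(present) > 1:
        raise ValueError(f"union field model_info has several members set: {present}")
    return present[0] if present else None
```

The returned key can then be used to look up the nested object, for example to read finetuningAllowed on a foundation model.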

FoundationGenAiModelInfo

Information for a pretrained Google-managed foundation model.

JSON representation
{
  "finetuningAllowed": boolean,
  "minTrainLabeledDocuments": integer
}
Fields
finetuningAllowed

boolean

Whether finetuning is allowed for this base processor version.

minTrainLabeledDocuments

integer

The minimum number of labeled documents in the training dataset required for finetuning.

CustomGenAiModelInfo

Information for a custom Generative AI model created by the user. These are created with Create New Version in either the Call foundation model or Fine tuning tabs.

JSON representation
{
  "customModelType": enum (CustomModelType),
  "baseProcessorVersionId": string
}
Fields
customModelType

enum (CustomModelType)

The type of custom model created by the user.

baseProcessorVersionId

string

The base processor version ID for the custom model.

CustomModelType

The type of custom model created by the user.

Enums
CUSTOM_MODEL_TYPE_UNSPECIFIED The model type is unspecified.
VERSIONED_FOUNDATION The model is a versioned foundation model.
FINE_TUNED The model is a finetuned foundation model.

Methods

batchProcess

LRO endpoint to batch process many documents.

delete

Deletes the processor version. All artifacts under the processor version are also deleted.

deploy

Deploys the processor version.

evaluateProcessorVersion

Evaluates a ProcessorVersion against annotated documents, producing an Evaluation.

get

Gets the details of a processor version.

importProcessorVersion

Imports a processor version from a source processor version.

list

Lists all versions of a processor.

process

Processes a single document.

train

Trains a new processor version.

undeploy

Undeploys the processor version.