ProcessOptions

Options for the Process API.

JSON representation
{
  "ocrConfig": {
    object (OcrConfig)
  },
  "layoutConfig": {
    object (LayoutConfig)
  },
  "schemaOverride": {
    object (DocumentSchema)
  },

  // Union field page_range can be only one of the following:
  "individualPageSelector": {
    object (IndividualPageSelector)
  },
  "fromStart": integer,
  "fromEnd": integer
  // End of list of possible types for union field page_range.
}
Fields
ocrConfig

object (OcrConfig)

Only applicable to OCR_PROCESSOR and FORM_PARSER_PROCESSOR. Returns error if set on other processor types.

layoutConfig

object (LayoutConfig)

Optional. Only applicable to LAYOUT_PARSER_PROCESSOR. Returns error if set on other processor types.

schemaOverride

object (DocumentSchema)

Optional. Overrides the schema of the ProcessorVersion. Returns an Invalid Argument error if this field is set when the underlying ProcessorVersion doesn't support schema override.

Union field page_range. A subset of pages to process. If not specified, all pages are processed. If a page range is set, only the given pages are extracted and processed from the document. In the output document, Document.Page.page_number refers to the page number in the original document. This configuration only applies to online processing with ProcessDocument.

page_range can be only one of the following:

individualPageSelector

object (IndividualPageSelector)

Which pages to process (1-indexed).

fromStart

integer

Process only the given number of pages, counted from the start of the document. If the document has fewer pages, all pages are processed.

fromEnd

integer

Process only the given number of pages, counted from the end of the document. If the document has fewer pages, all pages are processed.
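
The union semantics described above can be sketched in plain Python. This is a minimal illustration, not part of the API: the helper names `page_range_field` and `selected_pages` are hypothetical, and `selected_pages` only mimics locally what the field descriptions say (positive values assumed; behavior for zero or negative values is not described here).

```python
from typing import Optional, Sequence


def page_range_field(
    pages: Optional[Sequence[int]] = None,
    from_start: Optional[int] = None,
    from_end: Optional[int] = None,
) -> dict:
    """Return a dict holding at most one page_range member (hypothetical helper)."""
    given = [v for v in (pages, from_start, from_end) if v is not None]
    if len(given) > 1:
        raise ValueError("page_range is a union field: set only one member")
    if pages is not None:
        return {"individualPageSelector": {"pages": list(pages)}}
    if from_start is not None:
        return {"fromStart": from_start}
    if from_end is not None:
        return {"fromEnd": from_end}
    return {}  # nothing set: all pages are processed


def selected_pages(total_pages: int, field: dict) -> list:
    """Local illustration of which 1-indexed pages each member selects."""
    all_pages = list(range(1, total_pages + 1))
    if "individualPageSelector" in field:
        # Assumption of this sketch: page numbers outside the document are ignored.
        wanted = set(field["individualPageSelector"]["pages"])
        return [p for p in all_pages if p in wanted]
    if "fromStart" in field:
        # Slicing past the end simply returns every page.
        return all_pages[: field["fromStart"]]
    if "fromEnd" in field:
        n = field["fromEnd"]
        return all_pages[-n:] if n > 0 else []
    return all_pages
```

The returned dict is meant to be merged into a ProcessOptions JSON object, for example `{"ocrConfig": {...}, **page_range_field(from_end=2)}`.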

IndividualPageSelector

A list of individual page numbers.

JSON representation
{
  "pages": [
    integer
  ]
}
Fields
pages[]

integer

Optional. Indices of the pages (starting from 1).
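
Because `pages[]` is 1-indexed, code that tracks pages with 0-based indices needs to shift them before building the selector. A minimal sketch, assuming a hypothetical helper named `individual_page_selector`; sorting and de-duplicating are choices of this sketch, not documented API requirements.

```python
def individual_page_selector(zero_based_indices):
    """Build an IndividualPageSelector dict from 0-based indices (hypothetical helper)."""
    if any(i < 0 for i in zero_based_indices):
        raise ValueError("page indices must be non-negative")
    # The API field counts pages from 1, so shift every index by one.
    # Sorting and de-duplicating is a convenience of this sketch only.
    pages = sorted({i + 1 for i in zero_based_indices})
    return {"pages": pages}
```

For example, 0-based indices `[0, 2, 2]` become `{"pages": [1, 3]}`.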

OcrConfig

Config for Document OCR.

JSON representation
{
  "hints": {
    object (Hints)
  },
  "enableNativePdfParsing": boolean,
  "enableImageQualityScores": boolean,
  "advancedOcrOptions": [
    string
  ],
  "enableSymbol": boolean,
  "computeStyleInfo": boolean,
  "disableCharacterBoxesDetection": boolean,
  "premiumFeatures": {
    object (PremiumFeatures)
  }
}
Fields
hints

object (Hints)

Hints for the OCR model.

enableNativePdfParsing

boolean

Enables special handling for PDFs with existing text information. Results in better text extraction quality in such PDF inputs.

enableImageQualityScores

boolean

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input. Enabling this option adds latency to the process call comparable to that of regular OCR.

advancedOcrOptions[]

string

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristics-based layout detection algorithm, which serves as an alternative to the current ML-based layout detection algorithm. Choose whichever layout algorithm best suits your documents.

enableSymbol

boolean

Includes symbol level OCR information if set to true.

computeStyleInfo (deprecated)

boolean

Turns on the font identification model and returns font style information. Deprecated; use PremiumFeatures.compute_style_info instead.

disableCharacterBoxesDetection

boolean

Turns off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.

premiumFeatures

object (PremiumFeatures)

Configurations for premium OCR features.
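
Since the top-level `computeStyleInfo` field is deprecated in favor of the one under `premiumFeatures`, existing request bodies may need updating. The following is a minimal sketch of such a migration on a plain dict; the function name `migrate_compute_style_info` is hypothetical, and keeping an already-present `premiumFeatures.computeStyleInfo` value is a choice of this sketch.

```python
import copy


def migrate_compute_style_info(ocr_config: dict) -> dict:
    """Move the deprecated top-level flag under premiumFeatures (hypothetical helper)."""
    migrated = copy.deepcopy(ocr_config)  # leave the caller's dict untouched
    if "computeStyleInfo" in migrated:
        value = migrated.pop("computeStyleInfo")
        premium = migrated.setdefault("premiumFeatures", {})
        # If premiumFeatures already sets the flag, that explicit value wins.
        premium.setdefault("computeStyleInfo", value)
    return migrated
```

For example, `{"enableSymbol": True, "computeStyleInfo": True}` becomes `{"enableSymbol": True, "premiumFeatures": {"computeStyleInfo": True}}`.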

Hints

Hints for the OCR engine.

JSON representation
{
  "languageHints": [
    string
  ]
}
Fields
languageHints[]

string

List of BCP-47 language codes to use for OCR. In most cases, leaving this field unset yields the best results because it enables automatic language detection. For languages based on the Latin alphabet, hints are not needed. In rare cases, when the language of the text in the image is known, setting a hint helps produce better results, but a wrong hint significantly degrades them.
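
Given that guidance, a client may prefer to omit `hints` entirely unless a language is known for certain. A minimal sketch, assuming a hypothetical helper named `ocr_config_with_hints` that returns a fragment to merge into an OcrConfig dict:

```python
def ocr_config_with_hints(language_hints=None):
    """Return an OcrConfig fragment; omit hints so auto-detection stays on (hypothetical helper)."""
    config = {}
    if language_hints:
        # Only send hints when the caller is confident about the language.
        config["hints"] = {"languageHints": list(language_hints)}
    return config
```

Calling it with `["ja"]` yields `{"hints": {"languageHints": ["ja"]}}`; calling it with no argument yields an empty dict.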

PremiumFeatures

Configurations for premium OCR features.

JSON representation
{
  "enableSelectionMarkDetection": boolean,
  "computeStyleInfo": boolean,
  "enableMathOcr": boolean
}
Fields
enableSelectionMarkDetection

boolean

Turns on the selection mark detector in the OCR engine. Only available in OCR 2.0 (and later) processors.

computeStyleInfo

boolean

Turns on the font identification model and returns font style information.

enableMathOcr

boolean

Turn on the model that can extract LaTeX math formulas.

LayoutConfig

Serving config for the layout parser processor.

JSON representation
{
  "chunkingConfig": {
    object (ChunkingConfig)
  }
}
Fields
chunkingConfig

object (ChunkingConfig)

Optional. Config for chunking in the layout parser processor.

ChunkingConfig

Serving config for chunking.

JSON representation
{
  "chunkSize": integer,
  "includeAncestorHeadings": boolean,
  "semanticChunkingGroupSize": boolean,
  "breakpointPercentileThreshold": integer
}
Fields
chunkSize

integer

Optional. The chunk sizes to use when splitting documents, in order of level.

includeAncestorHeadings

boolean

Optional. Whether or not to include ancestor headings when splitting.

semanticChunkingGroupSize

boolean

Optional. The number of tokens to group together when evaluating semantic similarity.

breakpointPercentileThreshold

integer

Optional. The percentile of cosine dissimilarity that must be exceeded between a group of tokens and the next. The smaller this number is, the more chunks will be generated.
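
To see why a smaller `breakpointPercentileThreshold` produces more chunks, the following sketch applies a percentile cutoff to a made-up list of dissimilarity scores between consecutive token groups. It is an illustration of the general idea only, not the service's actual algorithm: the function names, the nearest-rank percentile definition, and the sample values are all assumptions.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 0 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, with a minimum rank of 1.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def breakpoints(dissimilarities, threshold_pct):
    """Indices where the dissimilarity to the next group exceeds the percentile cutoff."""
    cutoff = percentile(dissimilarities, threshold_pct)
    return [i for i, d in enumerate(dissimilarities) if d > cutoff]


# Made-up dissimilarity scores between each token group and the next one.
scores = [0.10, 0.80, 0.20, 0.60, 0.15, 0.90]
for pct in (90, 80, 50):
    print(pct, breakpoints(scores, pct))
```

With these sample scores, lowering the threshold from 90 to 50 raises the number of breakpoints from zero to three, and each breakpoint starts a new chunk.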