Page(
documentai_object: google.cloud.documentai_v1.types.document.Document.Page,
document_text: str,
)
Represents a wrapped documentai.Document.Page.
Attributes

Name | Type | Description |
---|---|---|
documentai_object | google.cloud.documentai.Document.Page | Required. The original google.cloud.documentai.Document.Page object. |
document_text | str | Required. The full text of the Document containing the Page. |
text | str | Required. UTF-8 encoded text of the page. |
page_number | int | Required. The page number of the Page. |
form_fields | List[FormField] | Required. A list of visually detected form fields on the page. |
symbols | List[Symbol] | Required. A list of visually detected text symbols (characters/letters) on the page. |
tokens | List[Token] | Required. A list of visually detected text tokens (words) on the page. |
lines | List[Line] | Required. A list of visually detected text lines on the page. A collection of tokens that a human would perceive as a line. |
paragraphs | List[Paragraph] | Required. A list of visually detected text paragraphs on the page. A collection of lines that a human would perceive as a paragraph. |
blocks | List[Block] | Required. A list of visually detected text blocks on the page. A collection of lines that a human would perceive as a block. |
tables | List[Table] | Required. A list of visually detected tables on the page. |
math_formulas | List[MathFormula] | Optional. A list of visually detected math formulas on the page. |
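The relationship between document_text and text can be pictured with a small sketch. In Document AI, layout elements carry text anchors whose segments are start/end offsets into the full document text, so a wrapper needs the full text to resolve the text of a page or token. The TextSegment class and text_from_segments function below are hypothetical stand-ins written for illustration; they are not part of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextSegment:
    """Hypothetical stand-in for one text-anchor segment (offsets into the full text)."""
    start_index: int
    end_index: int


def text_from_segments(document_text: str, segments: List[TextSegment]) -> str:
    """Join the slices of document_text that the segments point at."""
    return "".join(document_text[s.start_index:s.end_index] for s in segments)


# Illustrative data only: a two-line document whose first line is one "page".
document_text = "Invoice 42\nTotal: $10\n"
page_text = text_from_segments(document_text, [TextSegment(0, 11)])
token_text = text_from_segments(document_text, [TextSegment(8, 10)])
```

Under these assumptions, the same lookup works at every level (page, line, token), which is one reason a wrapper would keep a reference to the full document text.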
Properties
hocr_bounding_box
hOCR bounding box of the page element.
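For context, hOCR describes an element's position with a bbox property of the form "bbox x0 y0 x1 y1" in pixel coordinates. The following is a minimal sketch of how such a string could be derived from normalized vertices (values between 0 and 1) and the page dimensions. The function name hocr_bbox and its parameters are assumptions made for illustration; the exact string returned by this property is not shown in the source.

```python
from typing import List, Tuple


def hocr_bbox(
    normalized_vertices: List[Tuple[float, float]],
    page_width: int,
    page_height: int,
) -> str:
    """Build an hOCR-style bbox string from normalized (x, y) vertices.

    Hypothetical helper: scales each vertex to pixels, then takes the
    smallest and largest coordinates as the box corners.
    """
    xs = [int(x * page_width) for x, _ in normalized_vertices]
    ys = [int(y * page_height) for _, y in normalized_vertices]
    return f"bbox {min(xs)} {min(ys)} {max(xs)} {max(ys)}"


# Illustrative values: a region covering the middle of a 1000x2000 pixel page.
region = [(0.25, 0.5), (0.75, 0.5), (0.75, 0.75), (0.25, 0.75)]
region_bbox = hocr_bbox(region, 1000, 2000)
```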
Methods
__post_init__
__post_init__() -> None
Initialization order: Symbol, Token, Line, Paragraph, Block.
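The general pattern of populating derived attributes in a fixed order inside __post_init__ can be sketched with a plain dataclass. PageSketch and its fields are hypothetical and much simpler than the real class; the sketch only illustrates the dataclass mechanism and does not reproduce the library's internals. In this sketch the lines are built from the tokens, so the order matters; whether the library depends on the order in the same way is not stated in the source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageSketch:
    """Hypothetical, simplified page wrapper (not the library's Page class)."""
    raw_words: List[str]
    # Derived fields are excluded from __init__ and filled in __post_init__.
    tokens: List[str] = field(init=False)
    lines: List[str] = field(init=False)
    init_order: List[str] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # Tokens come first because the lines below are built from them.
        self.tokens = list(self.raw_words)
        self.init_order.append("tokens")
        self.lines = [" ".join(self.tokens)]
        self.init_order.append("lines")


sketch = PageSketch(raw_words=["Hello", "world"])
```

Because the dataclass-generated __init__ calls __post_init__ automatically, callers only pass the constructor arguments and the derived attributes are ready as soon as the object exists.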