Wrappers for the Document AI Page type.
Classes
FormField
FormField(
documentai_formfield: google.cloud.documentai_v1.types.document.Document.Page.FormField,
field_name: str,
field_value: str,
)
Represents a wrapped documentai.Document.Page.FormField.
Line
Line(
documentai_line: google.cloud.documentai_v1.types.document.Document.Page.Line,
text: str,
)
Represents a wrapped documentai.Document.Page.Line.
Page
Page(
documentai_page: google.cloud.documentai_v1.types.document.Document.Page, text: str
)
Represents a wrapped documentai.Document.Page.
Required. A list of visually detected text lines on the page, where each line is a collection of tokens that a human would perceive as a line.
Type: List[str]
Paragraph
Paragraph(
documentai_paragraph: google.cloud.documentai_v1.types.document.Document.Page.Paragraph,
text: str,
)
Represents a wrapped documentai.Document.Page.Paragraph.
Table
Table(
documentai_table: google.cloud.documentai_v1.types.document.Document.Page.Table,
body_rows: List[List[str]],
header_rows: List[List[str]],
)
Represents a wrapped documentai.Document.Page.Table.
Functions
_get_form_fields
_get_form_fields(
form_fields: List[
google.cloud.documentai_v1.types.document.Document.Page.FormField
],
text: str,
)
Returns a list of FormField.
Parameters | |
---|---|
Name | Description |
form_fields | List[documentai.Document.Page.FormField]: Required. A list of documentai.Document.Page.FormField objects. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[FormField] | A list of FormFields. |
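The pairing of a field name with its value can be illustrated with a minimal sketch. It assumes that each FormField carries a field_name layout and a field_value layout whose text segments index into the document text, as in the Document AI Page schema. The helpers make_layout, layout_text, and form_fields_sketch are hypothetical stand-ins written for this example; they are not part of the library, and the library's own implementation may differ.

```python
from types import SimpleNamespace
from typing import List, Tuple


def make_layout(start: int, end: int) -> SimpleNamespace:
    """Hypothetical stand-in for a Layout with a single text segment."""
    segment = SimpleNamespace(start_index=start, end_index=end)
    return SimpleNamespace(text_anchor=SimpleNamespace(text_segments=[segment]))


def layout_text(layout: SimpleNamespace, text: str) -> str:
    """Concatenate the slices of `text` referenced by the layout's segments."""
    return "".join(
        text[seg.start_index:seg.end_index]
        for seg in layout.text_anchor.text_segments
    )


def form_fields_sketch(form_fields, text: str) -> List[Tuple[str, str]]:
    """Return (name, value) pairs with surrounding whitespace removed."""
    pairs = []
    for form_field in form_fields:
        name = layout_text(form_field.field_name, text).strip()
        value = layout_text(form_field.field_value, text).strip()
        pairs.append((name, value))
    return pairs


document_text = "Name: Ada\nDate: 1843\n"
fields = [
    SimpleNamespace(field_name=make_layout(0, 6), field_value=make_layout(6, 10)),
    SimpleNamespace(field_name=make_layout(10, 16), field_value=make_layout(16, 21)),
]
pairs = form_fields_sketch(fields, document_text)
```

Here the sketch returns plain tuples rather than FormField wrappers, which keeps the example runnable without the Document AI client library installed.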
_get_lines
_get_lines(
lines: List[google.cloud.documentai_v1.types.document.Document.Page.Line], text: str
)
Returns a list of Line.
Parameters | |
---|---|
Name | Description |
lines | List[documentai.Document.Page.Line]: Required. A list of documentai.Document.Page.Line objects. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[Line] | A list of Lines. |
_get_paragraphs
_get_paragraphs(
paragraphs: List[google.cloud.documentai_v1.types.document.Document.Page.Paragraph],
text: str,
)
Returns a list of Paragraph.
Parameters | |
---|---|
Name | Description |
paragraphs | List[documentai.Document.Page.Paragraph]: Required. A list of documentai.Document.Page.Paragraph objects. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[Paragraph] | A list of Paragraphs. |
_table_rows_from_documentai_table_rows
_table_rows_from_documentai_table_rows(
table_rows: List[
google.cloud.documentai_v1.types.document.Document.Page.Table.TableRow
],
text: str,
)
Returns a list of rows from table_rows.
Parameters | |
---|---|
Name | Description |
table_rows | List[documentai.Document.Page.Table.TableRow]: Required. A list of documentai.Document.Page.Table.TableRow objects. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
List[List[str]] | A list of table rows, each a list of cell text strings. |
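As an illustration of how table rows might be turned into text, the sketch below assumes that each row exposes a cells list, that each cell has a layout, and that each row comes back as a list of cell strings (consistent with the List[List[str]] type of body_rows and header_rows on Table). The names make_layout, make_row, layout_text, and table_rows_sketch are hypothetical and exist only for this example.

```python
from types import SimpleNamespace
from typing import List, Tuple


def make_layout(start: int, end: int) -> SimpleNamespace:
    """Hypothetical stand-in for a Layout with a single text segment."""
    segment = SimpleNamespace(start_index=start, end_index=end)
    return SimpleNamespace(text_anchor=SimpleNamespace(text_segments=[segment]))


def make_row(*spans: Tuple[int, int]) -> SimpleNamespace:
    """Hypothetical stand-in for a TableRow: one cell per (start, end) span."""
    cells = [SimpleNamespace(layout=make_layout(s, e)) for s, e in spans]
    return SimpleNamespace(cells=cells)


def layout_text(layout: SimpleNamespace, text: str) -> str:
    """Concatenate the slices of `text` referenced by the layout's segments."""
    return "".join(
        text[seg.start_index:seg.end_index]
        for seg in layout.text_anchor.text_segments
    )


def table_rows_sketch(table_rows, text: str) -> List[List[str]]:
    """Return one list of stripped cell strings per row."""
    return [
        [layout_text(cell.layout, text).strip() for cell in row.cells]
        for row in table_rows
    ]


document_text = "Item\nPrice\nPen\n2\n"
rows = [make_row((0, 5), (5, 11)), make_row((11, 15), (15, 17))]
result = table_rows_sketch(rows, document_text)
```

Stripping each cell is a choice made for this sketch; the library may normalize cell whitespace differently.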
_table_wrapper_from_documentai_table
_table_wrapper_from_documentai_table(
documentai_table: google.cloud.documentai_v1.types.document.Document.Page.Table,
text: str,
)
Returns a Table.
Parameters | |
---|---|
Name | Description |
documentai_table | documentai.Document.Page.Table: Required. A documentai.Document.Page.Table. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
Table | A Table. |
_text_from_layout
_text_from_layout(
layout: google.cloud.documentai_v1.types.document.Document.Page.Layout, text: str
)
Returns the text of a single layout element.
Parameters | |
---|---|
Name | Description |
layout | documentai.Document.Page.Layout: Required. An element with layout fields. |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
str | Text from a single element. |
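A rough sketch of the idea follows. In the Document AI schema, a layout's text_anchor holds text_segments, each with a start_index and end_index into the document text, so extracting an element's text amounts to slicing and joining. The dataclasses TextSegment, TextAnchor, and Layout below are simplified stand-ins defined for this example, and text_from_layout_sketch is a hypothetical name; the library's function may handle edge cases differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextSegment:
    """Stand-in for a text segment: a half-open [start_index, end_index) span."""
    start_index: int = 0
    end_index: int = 0


@dataclass
class TextAnchor:
    """Stand-in for a text anchor: the segments that make up one element."""
    text_segments: List[TextSegment] = field(default_factory=list)


@dataclass
class Layout:
    """Stand-in for a layout: only the text_anchor field is modeled here."""
    text_anchor: TextAnchor = field(default_factory=TextAnchor)


def text_from_layout_sketch(layout: Layout, text: str) -> str:
    """Join the slices of `text` referenced by each segment, in order."""
    return "".join(
        text[int(segment.start_index):int(segment.end_index)]
        for segment in layout.text_anchor.text_segments
    )


document_text = "Invoice 42\nTotal: 10 USD\n"
layout = Layout(TextAnchor([TextSegment(11, 25)]))
extracted = text_from_layout_sketch(layout, document_text)
empty = text_from_layout_sketch(Layout(), document_text)
```

A layout with no segments yields an empty string in this sketch, which mirrors how an element without a text anchor has no text to return.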
_trim_text
_trim_text(text: str)
Removes extra whitespace characters (spaces, newlines, tabs, etc.) from text.
Parameter | |
---|---|
Name | Description |
text | str: Required. UTF-8 encoded text in reading order from the document. |
Returns | |
---|---|
Type | Description |
str | Text without trailing spaces/newlines |
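One plausible reading of this behavior is sketched below. It assumes that trimming means removing leading and trailing whitespace and replacing interior newlines with spaces; that interpretation is an assumption, trim_text_sketch is a hypothetical name, and the library's function may treat tabs or repeated spaces differently.

```python
def trim_text_sketch(text: str) -> str:
    """Strip surrounding whitespace, then flatten interior newlines to spaces."""
    return text.strip().replace("\n", " ")


trimmed = trim_text_sketch("  Total:\n10 USD \n")
```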