Method: dataset.importDocuments

Full name: projects.locations.processors.dataset.importDocuments

Import documents into a dataset.

HTTP request

POST https://{endpoint}/v1beta3/{dataset}:importDocuments

Where {endpoint} is one of the supported service endpoints.

Path parameters

Parameters
`dataset`	`string` Required. The dataset resource name. Format: `projects/{project}/locations/{location}/processors/{processor}/dataset`.
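
As an illustration of how the path parameter fits into the request URL, the following sketch assembles the dataset resource name and the full POST URL. The helper name `import_documents_url` and all argument values (including the endpoint host) are placeholders chosen for this example, not values taken from this reference.

```python
def import_documents_url(endpoint: str, project: str, location: str, processor: str) -> str:
    """Return the POST URL for dataset.importDocuments."""
    # Dataset resource name, in the format given for the `dataset` path parameter.
    dataset = (
        f"projects/{project}/locations/{location}"
        f"/processors/{processor}/dataset"
    )
    return f"https://{endpoint}/v1beta3/{dataset}:importDocuments"


# Placeholder values, for illustration only.
url = import_documents_url(
    endpoint="us-documentai.googleapis.com",
    project="my-project",
    location="us",
    processor="my-processor-id",
)
print(url)
```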

Request body

The request body contains data with the following structure:

JSON representation
{
  "batchDocumentsImportConfigs": [
    {
      object (BatchDocumentsImportConfig)
    }
  ]
}

Fields
`batchDocumentsImportConfigs[]`	`object (BatchDocumentsImportConfig)` Required. One or more import configs, each identifying the Cloud Storage URIs of the raw documents to import.

Response body

If successful, the response body contains an instance of Operation.
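
A minimal sketch of inspecting the returned Operation once it has been parsed from JSON. It assumes the standard long-running Operation shape with `name`, `done`, and `error` fields; that shape is not spelled out in this section, and the helper name and sample payloads are hypothetical.

```python
from typing import Any


def summarize_operation(operation: dict[str, Any]) -> str:
    """Describe the state of a parsed Operation (assumed fields: name, done, error)."""
    name = operation.get("name", "<unnamed>")
    # An absent `done` field is treated as "not finished yet".
    if not operation.get("done", False):
        return f"{name}: still running"
    if "error" in operation:
        message = operation["error"].get("message", "unknown error")
        return f"{name}: failed ({message})"
    return f"{name}: finished"


# Hypothetical payloads, for illustration only.
running = {"name": "operations/example-1"}
failed = {"name": "operations/example-2", "done": True, "error": {"message": "bad input"}}
print(summarize_operation(running))
print(summarize_operation(failed))
```

Because the import runs as an operation rather than completing inline, a caller would typically fetch the operation again until `done` is true before reading the result.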

Authorization scopes

Requires the following OAuth scope:

https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.
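
The sketch below shows one way to attach an OAuth 2.0 access token to the request using only the Python standard library. It builds the request object without sending it. The helper name, URL, and token are placeholders; obtaining a token that carries the scope above is outside the scope of this example.

```python
import json
import urllib.request
from typing import Any


def build_import_request(url: str, body: dict[str, Any], access_token: str) -> urllib.request.Request:
    """Build (but do not send) an authorized importDocuments POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # OAuth 2.0 bearer token; it must carry the cloud-platform scope.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_import_request(
    url="https://us-documentai.googleapis.com/v1beta3/projects/my-project/locations/us/processors/my-processor-id/dataset:importDocuments",
    body={"batchDocumentsImportConfigs": []},
    access_token="PLACEHOLDER_TOKEN",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is omitted here so the example runs without credentials.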

IAM Permissions

Requires the following IAM permission on the dataset resource:

documentai.datasets.createDocuments

For more information, see the IAM documentation.

BatchDocumentsImportConfig

Config for importing documents. Each batch can have its own dataset split type.

JSON representation

{
  "batchInputConfig": {
    object (BatchDocumentsInputConfig)
  },

  // Union field split_type_config can be only one of the following:
  "datasetSplit": enum (DatasetSplitType),
  "autoSplitConfig": {
    object (AutoSplitConfig)
  }
  // End of list of possible types for union field split_type_config.
}

Fields
`batchInputConfig`	`object (BatchDocumentsInputConfig)` The common config to specify a set of documents used as input.
Union field `split_type_config`. `split_type_config` can be only one of the following:
`datasetSplit`	`enum (DatasetSplitType)` Target dataset split where the documents must be stored.
`autoSplitConfig`	`object (AutoSplitConfig)` If set, documents are automatically divided between the training and test splits according to the specified ratio.
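
To make the union field concrete, here is a sketch that builds a request body and rejects configs that set both `datasetSplit` and `autoSplitConfig`. The helper names are hypothetical. The `gcsPrefix`/`gcsUriPrefix` shape used for `batchInputConfig` is an assumption about `BatchDocumentsInputConfig`, which this section does not define, and the bucket path is a placeholder.

```python
import json
from typing import Any, Optional


def make_import_config(
    batch_input_config: dict[str, Any],
    dataset_split: Optional[str] = None,
    training_split_ratio: Optional[float] = None,
) -> dict[str, Any]:
    """Build one BatchDocumentsImportConfig, setting at most one union member."""
    # split_type_config is a union: datasetSplit and autoSplitConfig are exclusive.
    if dataset_split is not None and training_split_ratio is not None:
        raise ValueError("set datasetSplit or autoSplitConfig, not both")
    config: dict[str, Any] = {"batchInputConfig": batch_input_config}
    if dataset_split is not None:
        config["datasetSplit"] = dataset_split
    if training_split_ratio is not None:
        config["autoSplitConfig"] = {"trainingSplitRatio": training_split_ratio}
    return config


def make_request_body(configs: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap the configs in the importDocuments request body."""
    return {"batchDocumentsImportConfigs": list(configs)}


# ASSUMPTION: this input-config shape and the bucket path are illustrative only.
input_config = {"gcsPrefix": {"gcsUriPrefix": "gs://my-bucket/invoices/"}}
body = make_request_body([make_import_config(input_config, training_split_ratio=0.8)])
print(json.dumps(body, indent=2))
```

Each entry in the list can use a different split setting, which matches the statement that each batch can have its own dataset split type.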

AutoSplitConfig

The config for auto-split.

JSON representation
{ "trainingSplitRatio": number }

Fields
`trainingSplitRatio`	`number` Ratio of imported documents assigned to the training split; the remaining documents go to the test split.
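
As a rough worked example of what a ratio implies, the sketch below estimates how many documents would land in each split. It assumes the ratio is a fraction between 0 and 1 and that rounding is to the nearest whole document; the service's actual rounding and assignment behavior is not described here, and the helper name is hypothetical.

```python
def expected_split_counts(total_documents: int, training_split_ratio: float) -> tuple[int, int]:
    """Estimate (training, test) document counts for a given ratio."""
    # ASSUMPTION: the ratio is a fraction in [0, 1]; this range is not stated in the reference.
    if not 0.0 <= training_split_ratio <= 1.0:
        raise ValueError("training_split_ratio is assumed to lie in [0, 1]")
    training = round(total_documents * training_split_ratio)
    return training, total_documents - training


counts = expected_split_counts(total_documents=50, training_split_ratio=0.8)
print(counts)
```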
