Hello text data: Create a text classification dataset and import documents
Use the Vertex AI console to create a text classification dataset. After
the dataset is created, import your documents into it by using the CSV file
that you copied into your Cloud Storage bucket.
From the Get started with Vertex AI page, click Create
dataset.
Specify details about your dataset.
Specify a name for this dataset, such as text_classification_tutorial.
In the Select a data type and objective section, click Text and
then select Text classification (Single-label).
For the Region, select us-central1.
This tutorial uses us-central1, but Vertex AI
supports other regions, such as europe-west4.
Click Create to create the empty dataset and go to the import page.
On the import page, select Select import files from Cloud Storage and
specify the Cloud Storage location of your CSV file.
Tip: Click Browse, select the happiness.csv file
in the Select object dialog, and click Select.
For this tutorial, the CSV file is at:
gs://${BUCKET}/text/happiness.csv. The bucket for this tutorial
is in the same region as the dataset, but you can specify files that are in
buckets from any region.
Keep the Default data split.
Vertex AI automatically assigns documents to training,
validation, and test sets. For more information, see About data splits for
AutoML models.
Click Continue to start the import.
The import process takes a few minutes. When the import completes, you
can browse all of the imported documents and their associated labels on the
dataset's Browse tab.
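The console steps above can also be approximated in code. The following is a minimal sketch, not part of this tutorial, that assumes the Vertex AI SDK for Python (the google-cloud-aiplatform package) is installed and authenticated, and that the installed version still exposes TextDataset. The function names gcs_csv_uri and create_text_dataset, the project argument, and the placeholder bucket name are hypothetical and chosen only for illustration.

```python
def gcs_csv_uri(bucket: str, object_path: str = "text/happiness.csv") -> str:
    """Build the gs:// URI of the import CSV inside the given bucket."""
    # Expect the bare bucket name, for example the value of ${BUCKET}.
    if not bucket or "/" in bucket:
        raise ValueError("pass the bare bucket name, without gs:// or slashes")
    return f"gs://{bucket}/{object_path.lstrip('/')}"


def create_text_dataset(
    project: str,
    bucket: str,
    display_name: str = "text_classification_tutorial",
    location: str = "us-central1",
):
    """Create a single-label text classification dataset and import the CSV."""
    # Imported lazily so the URI helper above works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    # Creating the dataset with gcs_source also starts the document import.
    return aiplatform.TextDataset.create(
        display_name=display_name,
        gcs_source=gcs_csv_uri(bucket),
        import_schema_uri=(
            aiplatform.schema.dataset.ioformat.text.single_label_classification
        ),
    )


if __name__ == "__main__":
    # Hypothetical placeholder bucket; replace with your own bucket name.
    print(gcs_csv_uri("my-tutorial-bucket"))
```

Calling create_text_dataset with your own project ID and bucket name should return a dataset object once the import finishes. As in the console, the import can take a few minutes.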
Last updated 2025-02-03 UTC.