Resource: TuningJob
Represents a TuningJob that runs with Google-owned models.
name
string
Output only. Identifier. Resource name of a TuningJob. Format: projects/{project}/locations/{location}/tuningJobs/{tuningJob}
tunedModelDisplayName
string
Optional. The display name of the TunedModel. The name can be up to 128 characters long and can consist of any UTF-8 characters.
description
string
Optional. The description of the TuningJob.
Output only. The detailed state of the job.
Output only. Time when the TuningJob was created.
Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
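The timestamp format described above can be read with the standard library alone. The following is a minimal sketch, not part of the API: `parse_rfc3339` is a hypothetical helper name, and it assumes truncating to microseconds is acceptable, since Python's `datetime` cannot hold nine fractional digits.

```python
import re
from datetime import datetime, timezone

# Date-time, optional fractional seconds, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp into a timezone-aware UTC datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    # datetime stores microseconds only, so digits past the sixth are dropped.
    micros = (fraction or "")[:6].ljust(6, "0")
    if offset == "Z":
        offset = "+00:00"
    parsed = datetime.strptime(
        f"{base}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z"
    )
    # Normalize to UTC so values with different offsets compare directly.
    return parsed.astimezone(timezone.utc)
```

Normalizing every value to UTC mirrors the Z-normalized form that the service generates, so parsed input and output timestamps can be compared without further conversion.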
Output only. Time when the TuningJob first entered the JOB_STATE_RUNNING state.
Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
Output only. Time when the TuningJob entered any of the following JobStates: JOB_STATE_SUCCEEDED, JOB_STATE_FAILED, JOB_STATE_CANCELLED, JOB_STATE_EXPIRED.
Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
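The four states listed above are the ones that set the end time, so a client that polls a job can treat them as its stop condition. A minimal sketch, assuming the job has already been decoded from its JSON representation into a dict with a "state" key; `is_finished` is a hypothetical helper name.

```python
# States named in the end-time description above; any other value is
# treated as "still in progress" by this sketch.
TERMINAL_STATES = frozenset(
    {
        "JOB_STATE_SUCCEEDED",
        "JOB_STATE_FAILED",
        "JOB_STATE_CANCELLED",
        "JOB_STATE_EXPIRED",
    }
)


def is_finished(job: dict) -> bool:
    """Return True when the job dict reports one of the terminal states."""
    return job.get("state") in TERMINAL_STATES
```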
Output only. Time when the TuningJob was most recently updated.
Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than "Z" are also accepted. Examples: "2014-10-02T15:01:23Z", "2014-10-02T15:01:23.045123456Z" or "2014-10-02T15:01:23+05:30".
Output only. Only populated when the job's state is JOB_STATE_FAILED or JOB_STATE_CANCELLED.
labels
map (key: string, value: string)
Optional. The labels with user-defined metadata to organize TuningJob and generated resources such as Model and Endpoint.
Label keys and values can be no longer than 64 characters (Unicode codepoints) and can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed.
See https://goo.gl/xmQnxf for more information and examples of labels.
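The label rules above can be checked on the client before a request is sent. The sketch below is an approximation, not the service's own validation: `label_problems` is a hypothetical helper, and "lowercase letters" with international characters allowed is interpreted here as any Unicode letter that is not uppercase.

```python
MAX_LABEL_LENGTH = 64  # Unicode code points, per the description above


def _is_allowed_char(ch: str) -> bool:
    """Approximate the allowed set: letters that are not uppercase,
    digits, underscores and dashes."""
    if ch in "_-":
        return True
    if ch.isdigit():
        return True
    return ch.isalpha() and not ch.isupper()


def label_problems(labels: dict) -> list:
    """Return human-readable problems; an empty list means no rule
    checked by this sketch was violated."""
    problems = []
    for key, value in labels.items():
        for kind, text in (("key", key), ("value", value)):
            # len() on a Python str counts code points, matching the limit.
            if len(text) > MAX_LABEL_LENGTH:
                problems.append(
                    f"{kind} {text!r} is longer than {MAX_LABEL_LENGTH} characters"
                )
            if not all(_is_allowed_char(ch) for ch in text):
                problems.append(f"{kind} {text!r} contains a disallowed character")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every offending label in a single pass.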
experiment
string
Output only. The Experiment associated with this TuningJob.
Output only. The tuned model resources associated with this TuningJob.
Output only. The tuning data statistics associated with this TuningJob.
pipelineJob
(deprecated)
string
Output only. The resource name of the PipelineJob associated with the TuningJob. Format: projects/{project}/locations/{location}/pipelineJobs/{pipelineJob}.
Customer-managed encryption key options for a TuningJob. If this is set, then all resources created by the TuningJob will be encrypted with the provided encryption key.
serviceAccount
string
The service account that the TuningJob workload runs as. If not specified, the Vertex AI Secure Fine-Tuned Service Agent in the project will be used. See https://cloud.google.com/iam/docs/service-agents#vertex-ai-secure-fine-tuning-service-agent
Users starting the pipeline must have the iam.serviceAccounts.actAs permission on this service account.
source_model
Union type
source_model can be only one of the following:
baseModel
string
The base model that is being tuned, e.g., "gemini-1.0-pro-002".
tuning_spec
Union type
tuning_spec can be only one of the following:
Tuning Spec for Supervised Fine Tuning.
Tuning Spec for Distillation.
Tuning Spec for open-source and third-party Partner models.
JSON representation |
---|
{ "name": string, "tunedModelDisplayName": string, "description": string, "state": enum ( |
SupervisedTuningSpec
Tuning Spec for Supervised Tuning for first party models.
trainingDatasetUri
string
Required. Cloud Storage path to file containing training dataset for tuning. The dataset must be formatted as a JSONL file.
validationDatasetUri
string
Optional. Cloud Storage path to file containing validation dataset for tuning. The dataset must be formatted as a JSONL file.
Optional. Hyperparameters for SFT.
JSON representation |
---|
{
"trainingDatasetUri": string,
"validationDatasetUri": string,
"hyperParameters": {
object ( |
SupervisedHyperParameters
Hyperparameters for SFT.
Optional. Number of complete passes the model makes over the entire training dataset during training.
learningRateMultiplier
number
Optional. Multiplier for adjusting the default learning rate.
Optional. Adapter size for tuning.
JSON representation |
---|
{
"epochCount": string,
"learningRateMultiplier": number,
"adapterSize": enum ( |
AdapterSize
Supported adapter sizes for tuning.
Enums | |
---|---|
ADAPTER_SIZE_UNSPECIFIED |
Adapter size is unspecified. |
ADAPTER_SIZE_ONE |
Adapter size 1. |
ADAPTER_SIZE_FOUR |
Adapter size 4. |
ADAPTER_SIZE_EIGHT |
Adapter size 8. |
ADAPTER_SIZE_SIXTEEN |
Adapter size 16. |
ADAPTER_SIZE_THIRTY_TWO |
Adapter size 32. |
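The SupervisedTuningSpec, SupervisedHyperParameters and AdapterSize definitions above fit together as one nested JSON object. The following sketch assembles such an object under stated assumptions: `build_supervised_tuning_spec` is a hypothetical helper, the Cloud Storage paths in the usage note are placeholders, and epochCount is serialized as a string because the JSON representation lists it that way.

```python
# Map the numeric adapter sizes onto the enum names listed above.
ADAPTER_SIZES = {
    1: "ADAPTER_SIZE_ONE",
    4: "ADAPTER_SIZE_FOUR",
    8: "ADAPTER_SIZE_EIGHT",
    16: "ADAPTER_SIZE_SIXTEEN",
    32: "ADAPTER_SIZE_THIRTY_TWO",
}


def build_supervised_tuning_spec(
    training_uri,
    validation_uri=None,
    epoch_count=None,
    learning_rate_multiplier=None,
    adapter_size=None,
):
    """Assemble a dict shaped like the SupervisedTuningSpec JSON."""
    spec = {"trainingDatasetUri": training_uri}  # the only required field
    if validation_uri is not None:
        spec["validationDatasetUri"] = validation_uri

    hyper = {}
    if epoch_count is not None:
        # Listed as a string in the JSON representation above.
        hyper["epochCount"] = str(epoch_count)
    if learning_rate_multiplier is not None:
        hyper["learningRateMultiplier"] = learning_rate_multiplier
    if adapter_size is not None:
        if adapter_size not in ADAPTER_SIZES:
            raise ValueError(f"unsupported adapter size: {adapter_size}")
        hyper["adapterSize"] = ADAPTER_SIZES[adapter_size]

    # Hyperparameters are optional, so omit the key when none were given.
    if hyper:
        spec["hyperParameters"] = hyper
    return spec
```

For example, `build_supervised_tuning_spec("gs://example-bucket/train.jsonl", epoch_count=3, adapter_size=4)` yields a dict with `trainingDatasetUri` plus a `hyperParameters` object holding `"epochCount": "3"` and `"adapterSize": "ADAPTER_SIZE_FOUR"`.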
DistillationSpec
Tuning Spec for Distillation.
trainingDatasetUri
(deprecated)
string
Deprecated. Cloud Storage path to file containing training dataset for tuning. The dataset must be formatted as a JSONL file.
Optional. Hyperparameters for Distillation.
studentModel
(deprecated)
string
The student model that is being tuned, e.g., "google/gemma-2b-1.1-it". Deprecated. Use baseModel instead.
pipelineRootDirectory
(deprecated)
string
Deprecated. A path in a Cloud Storage bucket, which will be treated as the root output directory of the distillation pipeline. It is used by the system to generate the paths of output artifacts.
teacher_model
Union type
teacher_model can be only one of the following:
baseTeacherModel
string
The base teacher model that is being distilled, e.g., "gemini-1.0-pro-002".
tunedTeacherModelSource
string
The resource name of the Tuned teacher model. Format: projects/{project}/locations/{location}/models/{model}.
validationDatasetUri
string
Optional. Cloud Storage path to file containing validation dataset for tuning. The dataset must be formatted as a JSONL file.
JSON representation |
---|
{
"trainingDatasetUri": string,
"hyperParameters": {
object ( |
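Because teacher_model is a union, a DistillationSpec may carry baseTeacherModel or tunedTeacherModelSource but not both. A minimal client-side check, assuming the spec is held as a plain dict; `teacher_model_field` is a hypothetical helper and does not reflect how the service itself validates requests.

```python
TEACHER_MODEL_FIELDS = ("baseTeacherModel", "tunedTeacherModelSource")


def teacher_model_field(spec: dict):
    """Return the name of the teacher_model field that is set, or None.

    Raises ValueError when more than one member of the union is present.
    """
    present = [name for name in TEACHER_MODEL_FIELDS if spec.get(name)]
    if len(present) > 1:
        raise ValueError(
            "teacher_model can be only one of: " + ", ".join(TEACHER_MODEL_FIELDS)
        )
    return present[0] if present else None
```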
DistillationHyperParameters
Hyperparameters for Distillation.
Optional. Adapter size for distillation.
Optional. Number of complete passes the model makes over the entire training dataset during training.
learningRateMultiplier
number
Optional. Multiplier for adjusting the default learning rate.
JSON representation |
---|
{
"adapterSize": enum ( |
PartnerModelTuningSpec
Tuning spec for Partner models.
trainingDatasetUri
string
Required. Cloud Storage path to file containing training dataset for tuning. The dataset must be formatted as a JSONL file.
validationDatasetUri
string
Optional. Cloud Storage path to file containing validation dataset for tuning. The dataset must be formatted as a JSONL file.
Hyperparameters for tuning. The accepted hyperParameters and their valid range of values will differ depending on the base model.
JSON representation |
---|
{ "trainingDatasetUri": string, "validationDatasetUri": string, "hyperParameters": { string: value, ... } } |
TunedModel
The Model Registry Model and Online Prediction Endpoint associated with this TuningJob.
model
string
Output only. The resource name of the TunedModel. Format: projects/{project}/locations/{location}/models/{model}.
endpoint
string
Output only. A resource name of an Endpoint. Format: projects/{project}/locations/{location}/endpoints/{endpoint}.
JSON representation |
---|
{ "model": string, "endpoint": string } |
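Both TunedModel fields follow the projects/{project}/locations/{location}/{collection}/{id} pattern shown above, so one small parser can split either of them. This is a sketch under that assumption; `parse_resource_name` is a hypothetical helper, and the project and IDs in the usage note are placeholders.

```python
import re


def parse_resource_name(name: str, collection: str) -> dict:
    """Split a resource name of the form
    projects/{project}/locations/{location}/{collection}/{id}."""
    pattern = (
        r"^projects/(?P<project>[^/]+)"
        r"/locations/(?P<location>[^/]+)"
        rf"/{re.escape(collection)}/(?P<id>[^/]+)$"
    )
    match = re.match(pattern, name)
    if match is None:
        raise ValueError(f"{name!r} is not a {collection} resource name")
    return match.groupdict()
```

Calling `parse_resource_name("projects/my-project/locations/us-central1/models/123", "models")` returns the project, location and ID as separate dict entries; passing `"endpoints"` handles the endpoint field the same way.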
TuningDataStats
The tuning data statistic values for TuningJob.
tuning_data_stats
Union type
tuning_data_stats can be only one of the following:
The SFT Tuning data stats.
Output only. Statistics for distillation.
JSON representation |
---|
{ // tuning_data_stats "supervisedTuningDataStats": { object ( |
SupervisedTuningDataStats
Tuning data statistics for Supervised Tuning.
Output only. Number of examples in the tuning dataset.
Output only. Number of tuning characters in the tuning dataset.
Output only. Number of billable characters in the tuning dataset.
Output only. Number of billable tokens in the tuning dataset.
Output only. Number of tuning steps for this Tuning Job.
Output only. Dataset distributions for the user input tokens.
Output only. Dataset distributions for the user output tokens.
Output only. Dataset distributions for the messages per example.
Output only. Sample user messages in the training dataset URI.
Output only. The number of examples in the dataset that have been dropped. An example can be dropped for reasons including: too many tokens, contains an invalid image, contains too many images, etc.
Output only. A partial sample of the indices (starting from 1) of the dropped examples.
JSON representation |
---|
{ "tuningDatasetExampleCount": string, "totalTuningCharacterCount": string, "totalBillableCharacterCount": string, "totalBillableTokenCount": string, "tuningStepCount": string, "userInputTokenDistribution": { object ( |
SupervisedTuningDatasetDistribution
Dataset distribution for Supervised Tuning.
Output only. Sum of a given population of values.
Output only. Sum of a given population of values that are billable.
min
number
Output only. The minimum of the population values.
max
number
Output only. The maximum of the population values.
mean
number
Output only. The arithmetic mean of the values in the population.
median
number
Output only. The median of the values in the population.
p5
number
Output only. The 5th percentile of the values in the population.
p95
number
Output only. The 95th percentile of the values in the population.
Output only. Defines the histogram bucket.
JSON representation |
---|
{
"sum": string,
"billableSum": string,
"min": number,
"max": number,
"mean": number,
"median": number,
"p5": number,
"p95": number,
"buckets": [
{
object ( |
DatasetBucket
Dataset bucket used to create a histogram for the distribution given a population of values.
count
number
Output only. Number of values in the bucket.
left
number
Output only. Left bound of the bucket.
right
number
Output only. Right bound of the bucket.
JSON representation |
---|
{ "count": number, "left": number, "right": number } |
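The buckets in a distribution are easier to read once each count is expressed as a share of the whole population. A minimal sketch, assuming the buckets have been decoded into dicts with the "count", "left" and "right" keys shown above; `bucket_fractions` is a hypothetical helper, and it converts count with float() so that counts serialized as strings are handled too.

```python
def bucket_fractions(buckets: list) -> list:
    """Return (left, right, fraction) for each bucket, where fraction is
    the bucket's share of the total count."""
    counts = [float(bucket["count"]) for bucket in buckets]
    total = sum(counts)
    if total == 0:
        return []  # nothing to normalize against
    return [
        (bucket["left"], bucket["right"], count / total)
        for bucket, count in zip(buckets, counts)
    ]
```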
DistillationDataStats
Statistics computed for datasets used for distillation.
Output only. Statistics computed for the training dataset.
JSON representation |
---|
{
"trainingDatasetStats": {
object ( |
DatasetStats
Statistics computed over a tuning dataset.
Output only. Number of examples in the tuning dataset.
Output only. Number of tuning characters in the tuning dataset.
Output only. Number of billable characters in the tuning dataset.
Output only. Number of tuning steps for this Tuning Job.
Output only. Dataset distributions for the user input tokens.
Output only. Dataset distributions for the messages per example.
Output only. Sample user messages in the training dataset URI.
Output only. Dataset distributions for the user output tokens.
JSON representation |
---|
{ "tuningDatasetExampleCount": string, "totalTuningCharacterCount": string, "totalBillableCharacterCount": string, "tuningStepCount": string, "userInputTokenDistribution": { object ( |
DatasetDistribution
Distribution computed over a tuning dataset.
sum
number
Output only. Sum of a given population of values.
min
number
Output only. The minimum of the population values.
max
number
Output only. The maximum of the population values.
mean
number
Output only. The arithmetic mean of the values in the population.
median
number
Output only. The median of the values in the population.
p5
number
Output only. The 5th percentile of the values in the population.
p95
number
Output only. The 95th percentile of the values in the population.
Output only. Defines the histogram bucket.
JSON representation |
---|
{
"sum": number,
"min": number,
"max": number,
"mean": number,
"median": number,
"p5": number,
"p95": number,
"buckets": [
{
object ( |
DistributionBucket
Dataset bucket used to create a histogram for the distribution given a population of values.
Output only. Number of values in the bucket.
left
number
Output only. Left bound of the bucket.
right
number
Output only. Right bound of the bucket.
JSON representation |
---|
{ "count": string, "left": number, "right": number } |
Methods |
---|
Cancels a TuningJob. |
Creates a TuningJob. |
Gets a TuningJob. |
Lists TuningJobs in a Location. |
Rebases a TunedModel. |