Custom training jobs let you run your custom machine learning (ML) training code in Vertex AI.
CustomTrainingJobOp
The CustomTrainingJobOp component exposes the full functionality of the CustomJob resource, allowing both single-replica and distributed training using a ContainerSpec or PythonPackageSpec instance.
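As a minimal sketch of how the two spec types differ, the helpers below build worker pool entries as plain dictionaries. The helper names (container_worker_pool, python_package_worker_pool, build_pipeline), the image URI, and the module name are hypothetical; the dictionary keys follow the snake_case field names of the CustomJob worker pool spec as we understand them, and the pipeline function assumes that kfp and google-cloud-pipeline-components are installed and that CustomTrainingJobOp accepts project, location, display_name, and worker_pool_specs.

```python
from typing import Any, Dict, List, Optional


def container_worker_pool(
    image_uri: str,
    command: List[str],
    args: Optional[List[str]] = None,
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
) -> Dict[str, Any]:
    """Worker pool entry that runs a custom container (ContainerSpec)."""
    return {
        "machine_spec": {"machine_type": machine_type},
        "replica_count": replica_count,
        "container_spec": {
            "image_uri": image_uri,
            "command": list(command),
            "args": list(args or []),
        },
    }


def python_package_worker_pool(
    executor_image_uri: str,
    package_uris: List[str],
    python_module: str,
    args: Optional[List[str]] = None,
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
) -> Dict[str, Any]:
    """Worker pool entry that runs a Python package (PythonPackageSpec)."""
    return {
        "machine_spec": {"machine_type": machine_type},
        "replica_count": replica_count,
        "python_package_spec": {
            "executor_image_uri": executor_image_uri,
            "package_uris": list(package_uris),
            "python_module": python_module,
            "args": list(args or []),
        },
    }


def build_pipeline():
    """Return a pipeline that launches one custom training job.

    Imports are deferred so the helpers above work without the SDKs installed.
    """
    from kfp import dsl
    from google_cloud_pipeline_components.v1.custom_job import CustomTrainingJobOp

    @dsl.pipeline(name="custom-training-sketch")
    def pipeline(project: str, location: str = "us-central1"):
        CustomTrainingJobOp(
            project=project,
            location=location,
            display_name="custom-training-sketch",
            # Hypothetical image and entry point; replace with your own.
            worker_pool_specs=[
                container_worker_pool(
                    image_uri="gcr.io/my-project/trainer:latest",
                    command=["python", "-m", "trainer.task"],
                )
            ],
        )

    return pipeline
```

For distributed training, the worker_pool_specs list would hold more than one entry, for example a single-replica primary pool followed by a worker pool with its own machine type and replica count.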
create_custom_training_job_from_component function
The create_custom_training_job_from_component utility converts a given container or Python component into a component that runs a custom job in Vertex AI, which simplifies the creation of custom training jobs. All inputs and outputs of the supplied component are copied over to the constructed training job operator.
Note that this utility constructs a ClusterSpec where the primary and all the workers use the same specification, meaning all disk and machine specification-related parameters will apply to all replicas. This is suitable for use cases where, for example, you are training with MultiWorkerMirroredStrategy or MirroredStrategy.
This component does not support CustomJob Python package training or distributed training with different worker pool specs.
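The uniform specification described above can be pictured with a small plain-Python sketch. This is an illustration of the idea, not the utility's actual implementation: the function name uniform_worker_pool_specs and the sample values are hypothetical, and the split into one primary replica plus a pool of replica_count - 1 workers is an assumption about how the replicas are laid out.

```python
import copy
from typing import Any, Dict, List


def uniform_worker_pool_specs(
    base_spec: Dict[str, Any], replica_count: int
) -> List[Dict[str, Any]]:
    """Expand one spec into a primary pool and, if needed, a worker pool.

    Every pool is a deep copy of base_spec, so machine and disk settings
    are identical across replicas; only replica_count differs.
    """
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")

    primary = copy.deepcopy(base_spec)
    primary["replica_count"] = 1
    pools = [primary]

    if replica_count > 1:
        workers = copy.deepcopy(base_spec)
        workers["replica_count"] = replica_count - 1
        pools.append(workers)

    return pools


# Hypothetical base specification shared by all replicas.
base = {
    "machine_spec": {"machine_type": "n1-standard-8"},
    "disk_spec": {"boot_disk_type": "pd-ssd", "boot_disk_size_gb": 100},
    "container_spec": {"image_uri": "gcr.io/my-project/trainer:latest"},
}
pools = uniform_worker_pool_specs(base, replica_count=4)
```

Because every pool is copied from the same base, there is no place to give workers a different machine type or disk size, which is why jobs that need differing worker pool specs fall outside what this approach can express.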
API reference
- For component reference, see the Google Cloud Pipeline Components SDK reference for CustomJob components.
- For Vertex AI API reference, see the CustomJob resource page.
Version history and release notes
To learn more about the version history and changes to the Google Cloud Pipeline Components SDK, see the Google Cloud Pipeline Components SDK Release Notes.
Technical support contacts
If you have any questions, reach out to kubeflow-pipelines-components@google.com.