The following describes the consumption options that are supported for
AI Hypercomputer. Consumption options are the methods you use to request
capacity, while provisioning models are how you specify the type of
capacity to use when you create your VMs or clusters.
When deploying your VMs or clusters, you must specify a provisioning model
that matches your required consumption option. For more information about
provisioning models, see
Provisioning models.
Reservations
You request compute resources in advance for a specific amount of
time. These resources are dedicated to you for that period.
Reservations provide the highest level of assurance for capacity and are
cost effective because they are available at a much lower price than
an on-demand request.
Reservations are ideal for long-running training jobs and
inference workloads.
Supported machine types: All GPU machine types
For A3 Ultra, you must request reservations by using Hypercompute Cluster.
To make this request, see Request capacity.
Hypercompute Cluster reserves a dense allocation of A3 Ultra machines.
Dynamic Workload Scheduler (DWS)
You request compute resources for a specific amount of time, around 28 days.
Because these resources are delivered from a secured pool, their
availability is much higher than that of an on-demand request.
DWS is ideal for workloads that need to run at a
specific time. These include small model pre-training jobs,
model fine-tuning jobs, HPC simulation workloads, and short-term expected
increases in inference workloads.
Spot
You request compute resources, which are delivered based on
availability.
Spot resources might be easier to obtain than on-demand
resources, but the system can delete them at any time. These
resources are cost effective because they are available at a much lower
price than the standard model.
Spot is a good fit for scheduling lower-priority workloads, such as model
pre-training, model fine-tuning, and simulation jobs, that are
tolerant of availability disruptions.
Supported machine types: All GPU machine types except A3 Ultra
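The guidance above can be read as a rough decision rule. The following Python sketch is one possible way to encode it and is not part of any Google Cloud API: the `Workload` fields, the `suggest_option` function, and the order in which the checks run are all assumptions made for illustration. The only facts taken from this page are the ideal use cases for each option, the A3 Ultra exclusion for Spot, and the statement that reservations give the highest capacity assurance.

```python
from dataclasses import dataclass
from enum import Enum


class ConsumptionOption(Enum):
    RESERVATION = "Reservations"
    DWS = "Dynamic Workload Scheduler"
    SPOT = "Spot"


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; field names are illustrative only."""
    machine_type: str           # for example "A3 Ultra"
    long_running: bool          # long-running training or inference
    needs_fixed_window: bool    # must run at a specific time for a bounded period
    tolerates_disruption: bool  # can cope with resources being deleted


def suggest_option(workload: Workload) -> ConsumptionOption:
    """Suggest a consumption option using the use cases described on this page.

    The ordering of the checks is an assumption, not documented behavior.
    """
    # Long-running training and inference workloads map to reservations.
    if workload.long_running:
        return ConsumptionOption.RESERVATION
    # Workloads that need to run at a specific time map to DWS.
    if workload.needs_fixed_window:
        return ConsumptionOption.DWS
    # Spot suits disruption-tolerant work, but it excludes A3 Ultra.
    if workload.tolerates_disruption and workload.machine_type != "A3 Ultra":
        return ConsumptionOption.SPOT
    # Fall back to the option with the highest capacity assurance.
    return ConsumptionOption.RESERVATION


if __name__ == "__main__":
    # "other-gpu-type" is a placeholder, not a real machine type name.
    batch_job = Workload("other-gpu-type", False, False, True)
    print(suggest_option(batch_job).value)
```

In this sketch, a disruption-tolerant job on A3 Ultra falls through to reservations, because this page lists A3 Ultra as unsupported for Spot and says that A3 Ultra reservations are requested through Hypercompute Cluster.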
Pricing and discount
Accelerator-optimized machine types are billed for their attached
GPUs, predefined vCPU, memory, and bundled Local SSD (if applicable). For
more pricing information for accelerator-optimized VMs, see the
Accelerator-optimized machine type family
section on the VM instance pricing page.
Last updated 2025-03-05 UTC.