GPU Cloud Pricing
Thousands of GPUs across 30+ regions. Simple pricing plans for teams of all sizes, designed to scale with you.
Serverless Pricing
Cost-effective for every inference workload. Save 15% over other serverless cloud providers on flex workers alone.
GPU                      VRAM   Flex    Active  Description
H100                     80GB   $4.18   $3.35   Extreme throughput for big models.
A100                     80GB   $2.72   $2.17   High throughput GPU, yet still very cost-effective.
L40, L40S, 6000 Ada      48GB   $1.90   $1.33   Extreme inference throughput on LLMs like Llama 3 7B.
A6000, A40               48GB   $1.22   $0.85   A cost-effective option for running big models.
4090                     24GB   $1.10   $0.77   Extreme throughput for small-to-medium models.
L4, A5000, 3090          24GB   $0.69   $0.48   Great for small-to-medium sized inference workloads.
A4000, A4500, RTX 4000   16GB   $0.58   $0.40   The most cost-effective for small models.
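As a rough illustration of how the Flex and Active prices listed above compare, the following sketch computes the relative discount of the Active price for each GPU tier. The prices are copied from the listing. The names `PRICES` and `active_discount` are made up for this example and are not part of any provider API, and the billing unit is left as given on the page.

```python
# Flex and Active prices copied from the pricing listing
# (billing unit as given on the page; it is not assumed here).
# PRICES and active_discount are illustrative names, not a provider API.
PRICES = {
    "H100": (4.18, 3.35),
    "A100": (2.72, 2.17),
    "L40, L40S, 6000 Ada": (1.90, 1.33),
    "A6000, A40": (1.22, 0.85),
    "4090": (1.10, 0.77),
    "L4, A5000, 3090": (0.69, 0.48),
    "A4000, A4500, RTX 4000": (0.58, 0.40),
}


def active_discount(flex: float, active: float) -> float:
    """Return the fraction saved by the Active price relative to Flex."""
    return 1 - active / flex


if __name__ == "__main__":
    for gpu, (flex, active) in PRICES.items():
        print(f"{gpu:<24} {active_discount(flex, active):6.1%}")
```

On these figures the Active price is roughly 20% below Flex for the 80GB tiers and roughly 30% below Flex for the others.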