CentML
CentML provides an optimized AI platform for deploying LLMs on GPU clouds, with faster inference at lower cost than standard deployments.
Updated April 2026
Overview
- Website
- centml.ai
- Acquired by
- NVIDIA
- Segment
- Serverless Inference
Product overview
CentML offers the NVIDIA CCluster platform for serverless LLM endpoints, on-demand GPU compute instances, and flexible deployments of custom and open-source models on-premises, in cloud VPCs, or on its managed infrastructure. Customers like EquoAI use it to save up to $250K/year on LLM-based legal document summarization. It stands out with automated optimizations that deliver up to 2x faster inference and 30% lower costs via advanced compilers and GPU orchestration.
Revenue model
Serverless per-token pricing (e.g., $2.50/million tokens for Llama-405B); dedicated deployments billed per-minute for hardware usage; credit-based billing (1 credit = $1 USD); custom enterprise pricing.
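As a rough illustration, the serverless and credit-based pricing above can be combined into a simple cost estimate. This is a hedged sketch using only the figures stated in this profile ($2.50/million tokens, 1 credit = $1 USD); the function names are illustrative, not part of any CentML API.

```python
def estimate_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate serverless inference cost in USD for a given token volume."""
    return tokens / 1_000_000 * usd_per_million_tokens

def usd_to_credits(usd: float) -> float:
    """Convert USD to billing credits (1 credit = $1 USD per this profile)."""
    return usd

# Example: 4 million tokens through Llama-405B at $2.50 per million tokens
cost = estimate_cost_usd(4_000_000, 2.50)   # 10.0 USD
credits = usd_to_credits(cost)              # 10.0 credits
print(f"${cost:.2f} = {credits:.1f} credits")
```

Dedicated deployments are billed per-minute for hardware usage instead, so this per-token estimate applies only to the serverless tier.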
Moat
- Proprietary Technology
- Cost Advantages
- Scale Advantages
CentML's key competitive moat is its proprietary compiler technology that optimizes AI model training and inference, delivering up to 2x faster performance and 30% lower costs on GPUs compared to competitors.