CentML
CentML provides an optimized AI platform for deploying LLMs on GPU clouds, with faster inference at lower cost than standard deployments.
Updated April 2026
Overview
- Website
- centml.ai
- Acquired by
- NVIDIA
- Segment
- Serverless Inference
Product overview
CentML offers the NVIDIA CCluster platform for serverless LLM endpoints, on-demand GPU compute instances, and flexible deployments of custom and open-source models on-premises, in cloud VPCs, or on its managed infrastructure. Customers like EquoAI use it to save up to $250K/year on LLM-based legal document summarization. It stands out with automated optimizations that deliver up to 2x faster inference and 30% lower costs via advanced compilers and GPU orchestration.
Revenue model
Serverless per-token pricing (e.g., $2.50/million tokens for Llama-405B); dedicated deployments billed per-minute for hardware usage; credit-based billing (1 credit = $1 USD); custom enterprise pricing.
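As a rough illustration, the serverless and credit-based pricing above can be combined into a simple cost estimate. This is a hedged sketch using only the figures stated in this profile ($2.50/million tokens, 1 credit = $1 USD); the function names are illustrative, not part of any CentML API.

```python
def estimate_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate serverless inference cost in USD for a given token volume."""
    return tokens / 1_000_000 * usd_per_million_tokens

def usd_to_credits(usd: float) -> float:
    """Convert USD to billing credits (1 credit = $1 USD per this profile)."""
    return usd

# Example: 4 million tokens through Llama-405B at $2.50 per million tokens
cost = estimate_cost_usd(4_000_000, 2.50)   # 10.0 USD
credits = usd_to_credits(cost)              # 10.0 credits
print(f"${cost:.2f} = {credits:.1f} credits")
```

Dedicated deployments are billed per-minute for hardware usage instead, so this per-token estimate applies only to the serverless tier.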
Moat
- Proprietary Technology
- Cost Advantages
- Scale Advantages
CentML's key competitive moat is its proprietary compiler technology that optimizes AI model training and inference, delivering up to 2x faster performance and 30% lower costs on GPUs compared to competitors.