Pankaj Gupta

Co-Founder

Model performance

Benchmarking fast Mistral 7B inference

Abu Qader, Pankaj Gupta, Philip Kiely, and 1 other

Model performance

33% faster LLM inference with FP8 quantization

Pankaj Gupta and Philip Kiely

Model performance

FP8: Efficient model inference with 8-bit floating point numbers

Pankaj Gupta and Philip Kiely

Model performance

40% faster Stable Diffusion XL inference with NVIDIA TensorRT

Pankaj Gupta, Philip Kiely, and 1 other

Model performance

Unlocking the full power of NVIDIA H100 GPUs for ML inference with TensorRT

Pankaj Gupta and Philip Kiely

Model performance

Faster Mixtral inference with TensorRT-LLM and quantization

Pankaj Gupta, Philip Kiely, and 1 other

Infrastructure

Technical deep dive: Truss live reload

Pankaj Gupta
