Bryce Dubayah
Engineering
Model performance
How to run LLM performance benchmarks (and why you should)
Alex Ker
1 other
AI engineering
Tool Calling in Inference
Kenzie Amack
1 other
Model performance
How we run GPT OSS 120B at 500+ tokens per second on NVIDIA GPUs
Amir Haghighat
4 others
News
Introducing our Speculative Decoding Engine Builder integration for ultra-low-latency LLM inference
Justin Yi
3 others
Model performance
How to build function calling and JSON mode for open-source and fine-tuned LLMs
Bryce Dubayah
1 other
News
Introducing function calling and structured output for open-source and fine-tuned LLMs
Bryce Dubayah
1 other