Philip Kiely
Lead Developer Advocate
Infrastructure
Testing Llama 3.3 70B inference performance on NVIDIA GH200 in Lambda Cloud
Pankaj Gupta
1 other
AI engineering
Private, secure DeepSeek-R1 in production in US & EU data centers
Yineng Zhang
2 others
Model performance
How we built production-ready speculative decoding with TensorRT-LLM
Pankaj Gupta
2 others
Model performance
A quick introduction to speculative decoding
Pankaj Gupta
2 others
Infrastructure
Evaluating NVIDIA H200 Tensor Core GPUs for LLM inference
Pankaj Gupta
1 other
News
Export your model inference metrics to your favorite observability tool
Helen Yang
2 others
Community
Building high-performance compound AI applications with MongoDB Atlas and Baseten
Philip Kiely
Model performance
How to build function calling and JSON mode for open-source and fine-tuned LLMs
Bryce Dubayah
1 other
News
Introducing function calling and structured output for open-source and fine-tuned LLMs
Bryce Dubayah
1 other