Philip Kiely

Lead Developer Advocate

Model performance

A quick introduction to speculative decoding

Pankaj Gupta

Philip Kiely

Pankaj Gupta and 2 others

Infrastructure

Evaluating NVIDIA H200 Tensor Core GPUs for LLM inference

Pankaj Gupta

Philip Kiely

News

Export your model inference metrics to your favorite observability tool

Helen Yang

Nicolas Gere-lamaysouette

Philip Kiely

Community

Building high-performance compound AI applications with MongoDB Atlas and Baseten

Philip Kiely

Model performance

How to build function calling and JSON mode for open-source and fine-tuned LLMs

Bryce Dubayah

Philip Kiely

News

Introducing function calling and structured output for open-source and fine-tuned LLMs

Bryce Dubayah

Philip Kiely

AI engineering

The best open-source image generation model

Philip Kiely

Model performance

How to double tokens per second for Llama 3 with Medusa

Abu Qader

Philip Kiely

Community

SPC hackathon winners build with Llama 3.1 on Baseten

Philip Kiely
