Llama 3.1 70B Instruct

Formerly SOTA midsize LLM from Meta (try Llama 3.3 70B instead)

Model details

Developed by: Meta
Model family: Llama
Use case: large language
Version: 3.1
Variant: Instruct
Size: 70B
Optimization: TRT-LLM
Hardware: H100
License: Llama 3.1

Example usage

Llama 3.1 70B is served behind an OpenAI-compatible API, so it can be called with the OpenAI SDK in any language that has one.

Input

from openai import OpenAI
import os

model_url = "" # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Tell me a fun fact about cats."}
    ],
    temperature=0.3,
    max_tokens=100,
)
print(response_chat)

JSON output

{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
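
In practice you usually want the reply text and token counts rather than the whole response object. The following is a minimal sketch of pulling those fields out of a response shaped like the sample above. The helper name `summarize_response` and the trimmed `SAMPLE_RESPONSE` string are illustrative assumptions, not part of the Baseten or OpenAI APIs, and the sketch runs offline against the sample JSON instead of a live deployment.

```python
import json

# Trimmed copy of the sample JSON output above (null fields omitted).
SAMPLE_RESPONSE = """
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "[Model output here]", "role": "assistant"}
        }
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}
"""


def summarize_response(payload: dict) -> dict:
    """Hypothetical helper: extract reply text, finish reason, and token counts."""
    choice = payload["choices"][0]       # first (and here only) completion
    usage = payload.get("usage") or {}   # tolerate a missing or null usage block
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_response(json.loads(SAMPLE_RESPONSE))
print(summary["text"])
print(summary["total_tokens"])
```

With the SDK object returned by `client.chat.completions.create`, the same fields are available as attributes, for example `response_chat.choices[0].message.content` and `response_chat.usage.total_tokens`. Checking `finish_reason` is worthwhile: a value of `"length"` indicates the reply was cut off by `max_tokens`.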