The OpenAI Assistants API provides persistent AI agents with built-in memory, file search, and code execution. Unlike direct API calls, Assistants maintain conversation threads and can access files you upload. Here’s how to build with it.
Core Concepts
Assistant — A configured AI instance with instructions, tools, and model settings. Reusable across many conversations.
Thread — A conversation session. Persists messages and context indefinitely.
Message — A user or assistant message within a thread.
Run — The execution of an assistant against a thread. Creates a response.
Tool — Optional capabilities: file_search, code_interpreter, or custom functions.
Installation and Setup
pip install openai
export OPENAI_API_KEY="sk-..."
Create an Assistant
from openai import OpenAI
client = OpenAI()
assistant = client.beta.assistants.create(
name="Research Assistant",
instructions="""You are a research assistant that helps analyze documents
and answer questions based on their content. Always cite the specific
document and section when providing information.""",
model="gpt-4o",
tools=[{"type": "file_search"}]
)
print(f"Assistant ID: {assistant.id}")
# Save this ID — you'll reuse it, not create a new one each time
Upload Files
# Upload a file to the Assistants API
with open("report.pdf", "rb") as f:
file = client.files.create(
file=f,
purpose="assistants"
)
# Create a vector store with the file
vector_store = client.beta.vector_stores.create(
name="Research Documents",
file_ids=[file.id]
)
# Attach the vector store to your assistant
client.beta.assistants.update(
assistant.id,
tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}}
)
Have a Conversation
import time
def ask_assistant(assistant_id: str, thread_id: str | None, question: str) -> str:
# Create or reuse a thread
if thread_id is None:
thread = client.beta.threads.create()
thread_id = thread.id
# Add the user message
client.beta.threads.messages.create(
thread_id=thread_id,
role="user",
content=question
)
# Run the assistant
run = client.beta.threads.runs.create(
thread_id=thread_id,
assistant_id=assistant_id
)
# Wait for completion
while run.status in ["queued", "in_progress"]:
time.sleep(0.5)
run = client.beta.threads.runs.retrieve(
thread_id=thread_id,
run_id=run.id
)
if run.status == "failed":
raise Exception(f"Run failed: {run.last_error}")
# Get the latest message
messages = client.beta.threads.messages.list(thread_id=thread_id)
response = messages.data[0].content[0].text.value
return response, thread_id
# First question
answer, thread_id = ask_assistant(
assistant.id,
None,
"What are the key findings in the report?"
)
print(answer)
# Follow-up (same thread = remembers context)
answer, thread_id = ask_assistant(
assistant.id,
thread_id,
"Which finding is most actionable for Q3?"
)
print(answer)
Code Interpreter Tool
Let the assistant run Python to analyze data:
data_analyst = client.beta.assistants.create(
name="Data Analyst",
instructions="Analyze data and create visualizations. Show your Python code.",
model="gpt-4o",
tools=[{"type": "code_interpreter"}]
)
# Upload a CSV
with open("sales_data.csv", "rb") as f:
csv_file = client.files.create(file=f, purpose="assistants")
# Ask it to analyze
thread = client.beta.threads.create()
client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content="Analyze this sales data and create a trend chart.",
attachments=[{
"file_id": csv_file.id,
"tools": [{"type": "code_interpreter"}]
}]
)
run = client.beta.threads.runs.create_and_poll(
thread_id=thread.id,
assistant_id=data_analyst.id
)
# Get the response (may include generated images)
messages = client.beta.threads.messages.list(thread_id=thread.id)
for msg in messages.data:
for content in msg.content:
if content.type == "text":
print(content.text.value)
elif content.type == "image_file":
# Download the generated chart
file_data = client.files.content(content.image_file.file_id)
with open("chart.png", "wb") as f:
f.write(file_data.content)
Custom Function Tools
import json
# Define tools
tools = [
{
"type": "function",
"function": {
"name": "search_database",
"description": "Search the product database",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string"},
"category": {"type": "string", "enum": ["electronics", "clothing", "books"]}
},
"required": ["query"]
}
}
}
]
assistant = client.beta.assistants.create(
name="Product Assistant",
instructions="Help users find products.",
model="gpt-4o",
tools=tools
)
# Handle tool calls in the run loop
run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant.id
)
while run.status in ["queued", "in_progress", "requires_action"]:
if run.status == "requires_action":
tool_calls = run.required_action.submit_tool_outputs.tool_calls
outputs = []
for tool_call in tool_calls:
if tool_call.function.name == "search_database":
args = json.loads(tool_call.function.arguments)
# Execute your actual function
result = search_database(args["query"], args.get("category"))
outputs.append({
"tool_call_id": tool_call.id,
"output": json.dumps(result)
})
run = client.beta.threads.runs.submit_tool_outputs(
thread_id=thread.id,
run_id=run.id,
tool_outputs=outputs
)
else:
time.sleep(0.5)
run = client.beta.threads.runs.retrieve(
thread_id=thread.id,
run_id=run.id
)
Production Patterns
Reuse Assistants
Create assistants once; reuse by ID:
ASSISTANT_ID = "asst_..." # Store in environment variable
# Don't create a new assistant every time
assistant = client.beta.assistants.retrieve(ASSISTANT_ID)
Thread Management
Threads persist by default but accrue cost if you’re building context. For new conversations, create new threads. For ongoing conversations (like a chat app), reuse the same thread.
Streaming Runs
with client.beta.threads.runs.stream(
thread_id=thread.id,
assistant_id=assistant.id
) as stream:
for text in stream.text_deltas:
print(text, end="", flush=True)
When to Use Assistants API vs. Direct Messages API
Use Assistants API when:
- You need persistent conversation threads
- File search over uploaded documents is required
- Code Interpreter for data analysis is needed
- You want session persistence without managing history yourself
Use Messages API directly when:
- You manage conversation history yourself
- You want lower latency (Assistants has overhead)
- You need maximum control over the request
- You’re using Claude (Anthropic doesn’t have an Assistants equivalent)
The Assistants API trades latency and control for convenience features. For simple chat applications, the direct Messages API is often better.