The Claude API is straightforward to use with Python. This guide covers everything from initial setup to production patterns.
Prerequisites
- Python 3.8+
- An Anthropic API key (get one at console.anthropic.com)
- Basic Python knowledge
Installation
pip install anthropic
Authentication
Set your API key as an environment variable. This is the recommended approach; never hardcode the key in source files:
export ANTHROPIC_API_KEY="sk-ant-..."
Or pass it directly (only for local development):
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-...")
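To make a missing key fail loudly at startup rather than on the first request, the environment-variable approach can be wrapped in a small check. This is a minimal sketch; `load_api_key` is a hypothetical helper name and is not part of the `anthropic` package.

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message.

    Hypothetical helper for illustration; not part of the SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before starting the app."
        )
    return key
```

The result can then be passed explicitly, for example `anthropic.Anthropic(api_key=load_api_key())`, so a misconfigured deployment stops immediately with a readable error.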
Your First API Call
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain recursion in one paragraph."}
    ]
)
print(message.content[0].text)
Understanding the Response
print(message.id) # Unique message ID
print(message.model) # Model used
print(message.stop_reason) # "end_turn", "max_tokens", etc.
print(message.usage) # Input/output token counts
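The fields above can be combined into one summary that also flags truncated output. The sketch below is an illustration only: `summarize_response` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in with the same attribute names so the example runs without an API call.

```python
from types import SimpleNamespace


def summarize_response(message) -> dict:
    """Collect the text and flag truncation for a Messages API response.

    Hypothetical helper; it relies only on `content`, `stop_reason`, and `usage`.
    """
    # A response can hold several content blocks; keep only the text ones.
    text = "".join(
        block.text
        for block in message.content
        if getattr(block, "type", None) == "text"
    )
    return {
        "text": text,
        # "max_tokens" means the output was cut off at the max_tokens limit.
        "truncated": message.stop_reason == "max_tokens",
        "input_tokens": message.usage.input_tokens,
        "output_tokens": message.usage.output_tokens,
    }


# Stand-in response object (not a real API result) for an offline run.
fake = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="Recursion is ...")],
    stop_reason="max_tokens",
    usage=SimpleNamespace(input_tokens=12, output_tokens=1024),
)
summary = summarize_response(fake)
print(summary["truncated"])
```

When `truncated` is true, raising `max_tokens` or asking for a shorter answer are the usual remedies.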
System Prompts
System prompts set the context and behavior for the conversation:
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a senior Python developer. Respond with production-quality code only.",
    messages=[
        {"role": "user", "content": "Write a function that validates an email address."}
    ]
)
Multi-Turn Conversations
Pass the full message history for conversational applications:
messages = []

def chat(user_message: str) -> str:
    # Add the new user turn to the shared history
    messages.append({"role": "user", "content": user_message})
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=messages
    )
    assistant_message = response.content[0].text
    # Store the reply so the next call sees the whole conversation
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message

# Use it
print(chat("What is Python?"))
print(chat("How is it different from Java?"))
print(chat("Give me an example that shows the key difference."))
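A module-level `messages` list works for a single conversation. When an application handles several conversations at once, one option is to keep the history on an object instead. The following is a sketch under that assumption; `Conversation` is a hypothetical wrapper, and `client` is assumed to expose `messages.create` like the client used above.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Holds one message history per conversation (hypothetical wrapper)."""

    client: object
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: int = 1024
    messages: list = field(default_factory=list)

    def ask(self, user_message: str) -> str:
        self.messages.append({"role": "user", "content": user_message})
        try:
            response = self.client.messages.create(
                model=self.model,
                max_tokens=self.max_tokens,
                messages=self.messages,
            )
        except Exception:
            # Roll back so a failed call does not leave a dangling user turn.
            self.messages.pop()
            raise
        reply = response.content[0].text
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Usage would look like `conv = Conversation(client)` followed by `conv.ask("What is Python?")`; each instance carries its own history.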
Streaming
For long responses, stream the output instead of waiting for the complete response:
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=2048,
    messages=[{"role": "user", "content": "Write a comprehensive guide on Python decorators."}]
) as stream:
    # Print each text chunk as soon as it arrives
    for text in stream.text_stream:
        print(text, end="", flush=True)
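Applications often need the complete text after streaming it, for example to store it or append it to a conversation history. A minimal sketch of that pattern follows; `print_and_collect` is a hypothetical helper, and the list of strings stands in for `stream.text_stream` so the example runs offline.

```python
from typing import Iterable


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print text chunks as they arrive and return the full text.

    Hypothetical helper; in real use `chunks` would be `stream.text_stream`.
    """
    parts = []
    for text in chunks:
        print(text, end="", flush=True)
        parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Offline demonstration: a plain list stands in for the live stream.
full_text = print_and_collect(["Decorators ", "wrap ", "functions."])
```

Inside the `with` block above, the call would be `full_text = print_and_collect(stream.text_stream)`.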
Tool Use (Function Calling)
Define tools that Claude can call to extend its capabilities:
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["city"]
        }
    }
]
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)
# Handle tool calls
if response.stop_reason == "tool_use":
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name
            tool_input = block.input
            print(f"Claude wants to call: {tool_name} with {tool_input}")
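Detecting the tool call is only half of the exchange: the tool has to run locally, and its output goes back to Claude as a `tool_result` block in a new user message. The sketch below illustrates that step. `get_weather`, `TOOL_HANDLERS`, and `build_tool_results` are hypothetical names, the weather function is a placeholder that returns no real data, and the `SimpleNamespace` block stands in for a real `tool_use` block.

```python
import json
from types import SimpleNamespace


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Placeholder; a real implementation would query a weather service."""
    return {"city": city, "unit": unit, "temperature": None}


# Maps tool names from the `tools` list to local Python functions.
TOOL_HANDLERS = {"get_weather": get_weather}


def build_tool_results(content_blocks) -> dict:
    """Run every requested tool and package the outputs as one user message."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue
        handler = TOOL_HANDLERS.get(block.name)
        if handler is None:
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Unknown tool: {block.name}",
                "is_error": True,
            })
            continue
        output = handler(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,  # must match the id of the tool_use block
            "content": json.dumps(output),
        })
    return {"role": "user", "content": results}


# Offline demonstration with a stand-in tool_use block.
fake_block = SimpleNamespace(
    type="tool_use", id="toolu_example", name="get_weather", input={"city": "Tokyo"}
)
tool_message = build_tool_results([fake_block])
```

To finish the round trip, append `{"role": "assistant", "content": response.content}` and then `tool_message` to the message list, and call `client.messages.create` again with the same `tools`.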
Vision (Image Input)
Send images for Claude to analyze:
import base64
with open("screenshot.png", "rb") as f:
    # The API expects the image bytes as a base64-encoded string
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "What does this UI look like? Describe the layout."
                }
            ],
        }
    ],
)
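The `media_type` must match the actual file format, which is easy to get wrong when the file name is not hardcoded. One way to handle this is to derive it from the file extension. This is a sketch only: `image_block` is a hypothetical helper, and the set of supported media types is an assumption that should be checked against the current documentation.

```python
import base64
import mimetypes
from pathlib import Path

# Assumed set of accepted image media types; verify against the current docs.
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block(path: str) -> dict:
    """Build an image content block for a local file (hypothetical helper)."""
    # Guess the media type from the file extension, e.g. ".png" -> "image/png".
    media_type, _ = mimetypes.guess_type(path)
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported or unknown image type for {path!r}: {media_type}")
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```

The returned dictionary can replace the hand-written image entry in the `content` list, for example `"content": [image_block("screenshot.png"), {"type": "text", "text": "..."}]`.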
Prompt Caching
For long system prompts that don’t change, enable caching to save costs:
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert in Python. Here is our full codebase context: [... 50K tokens ...]",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[{"role": "user", "content": "How does the auth middleware work?"}]
)
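Tokens read from the cache are billed at about 10% of the normal input price, so the cached part of the prompt costs roughly 90% less on repeat calls. A cache entry lasts 5 minutes by default, and each cache hit refreshes that window. The first call, which writes the cache, is billed at a premium over normal input tokens.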
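The `usage` object on each response reports how many input tokens were written to or read from the cache, which makes the saving easy to check. The sketch below estimates input cost from those counts. `estimate_input_cost` is a hypothetical helper; the multipliers (1.25x for cache writes, 0.10x for cache reads) are assumptions to confirm against current pricing, and the usage numbers are made-up illustrations rather than measured values.

```python
from types import SimpleNamespace


def estimate_input_cost(usage, base_price_per_mtok: float,
                        write_multiplier: float = 1.25,
                        read_multiplier: float = 0.10) -> float:
    """Estimate the input-side cost in USD for one response (hypothetical helper).

    The multipliers are assumptions; check them against current pricing.
    """
    created = getattr(usage, "cache_creation_input_tokens", 0) or 0
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    uncached = usage.input_tokens  # tokens that were neither written nor read from cache
    weighted = uncached + created * write_multiplier + read * read_multiplier
    return weighted * base_price_per_mtok / 1_000_000


# Illustrative usage objects: first call writes the cache, second call reads it.
first_call = SimpleNamespace(
    input_tokens=20, cache_creation_input_tokens=50_000, cache_read_input_tokens=0
)
second_call = SimpleNamespace(
    input_tokens=20, cache_creation_input_tokens=0, cache_read_input_tokens=50_000
)
first_cost = estimate_input_cost(first_call, base_price_per_mtok=3.0)
second_cost = estimate_input_cost(second_call, base_price_per_mtok=3.0)
print(f"first: ${first_cost:.5f}, repeat: ${second_cost:.5f}")
```

A `cache_read_input_tokens` value of zero on a repeat call is a sign that the cache was missed, for instance because the prompt prefix changed or the entry expired.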
Batch Processing
For large volumes that do not need an immediate answer, use the Batch API, which processes requests asynchronously at a 50% cost reduction:
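# Example inputs; replace with your own data
texts = ["I love this product!", "The delivery was late.", "It works as described."]

requests = [
    {
        "custom_id": f"item-{i}",  # used to match each result to its input
        "params": {
            "model": "claude-3-haiku-20240307",
            "max_tokens": 100,
            "messages": [{"role": "user", "content": f"Classify sentiment: {text}"}]
        }
    }
    for i, text in enumerate(texts)
]
batch = client.messages.batches.create(requests=requests)
print(f"Batch ID: {batch.id}")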
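Creating a batch only submits the work; the results have to be fetched once processing has ended. The following sketch polls for completion and collects the outputs by `custom_id`. It assumes the SDK exposes `client.messages.batches.retrieve` and `client.messages.batches.results` with the attribute names used below, so verify these against the SDK version in use. `wait_for_batch` is a hypothetical helper.

```python
import time


def wait_for_batch(client, batch_id: str, poll_seconds: float = 60.0, sleep=time.sleep) -> dict:
    """Poll until the batch has ended, then return {custom_id: text or None}.

    Hypothetical helper; method and attribute names are assumptions about the SDK.
    """
    while True:
        batch = client.messages.batches.retrieve(batch_id)
        if batch.processing_status == "ended":
            break
        sleep(poll_seconds)

    outputs = {}
    for entry in client.messages.batches.results(batch_id):
        if entry.result.type == "succeeded":
            outputs[entry.custom_id] = entry.result.message.content[0].text
        else:
            # Any non-success outcome (for example an error or expiry)
            outputs[entry.custom_id] = None
    return outputs
```

A call such as `wait_for_batch(client, batch.id)` blocks until the batch is done; for long-running batches, a scheduled job that checks the status periodically may be a better fit than a blocking loop.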
Error Handling
from anthropic import APIError, RateLimitError, APIStatusError

# Catch the most specific exception first; each one subclasses the next
try:
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limited: back off and retry")
except APIStatusError as e:
    print(f"API error: {e.status_code}: {e.message}")
except APIError as e:
    print(f"General API error: {e}")
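The "back off and retry" advice for rate limits can be sketched as exponential backoff with jitter. `call_with_backoff` is a hypothetical helper written for illustration; the attempt count and delay values are arbitrary defaults, not recommendations from the API documentation.

```python
import random
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts: int = 5,
                      base_delay: float = 1.0, max_delay: float = 30.0,
                      sleep=time.sleep):
    """Call `fn()` and retry on the given exceptions with exponential backoff.

    Hypothetical helper; the last failure is re-raised once attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

In practice the call would look like `call_with_backoff(lambda: client.messages.create(...), retry_on=(RateLimitError,))`. The SDK client also retries some failures on its own, configurable through its `max_retries` option, so an explicit loop like this is mainly useful when more control is needed.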
Model Selection Guide
| Model | Use Case | Cost (input/output) |
|---|---|---|
| claude-3-5-sonnet-20241022 | Most tasks, best quality | $3/$15 per 1M tokens |
| claude-3-haiku-20240307 | High-volume, simple tasks | $0.25/$1.25 per 1M tokens |
| claude-3-opus-20240229 | Maximum reasoning | $15/$75 per 1M tokens |
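The prices in the table translate into a per-request cost with simple arithmetic: tokens times price, divided by one million. The sketch below uses the table's figures as given; `PRICES` and `estimate_cost` are hypothetical names, the token counts are illustrative, and prices change over time, so the numbers should be treated as a snapshot.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "claude-3-5-sonnet-20241022": (3.00, 15.00),
    "claude-3-haiku-20240307": (0.25, 1.25),
    "claude-3-opus-20240229": (15.00, 75.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Compare the same illustrative request (2,000 input, 500 output tokens) across models.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 2_000, 500):.6f}")
```

Feeding `message.usage.input_tokens` and `message.usage.output_tokens` from real responses into `estimate_cost` gives a running total for an application.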
Next Steps
- Add streaming to your application for better UX
- Implement tool use to extend Claude’s capabilities
- Use Batch API for cost savings on bulk processing
- Set up prompt caching for repeated long system prompts
Full documentation: docs.anthropic.com