
Pydantic AI 1.85: Type-Safe Agents That Survive Crashes
Summary
Type-safe AI agents in Python with durable execution, retries, and validated outputs.
Pydantic AI 1.85 landed last week with a simple promise: build AI agents that fail predictably and recover automatically. If you have ever watched a LangChain or CrewAI agent silently return a wrong-shaped dict at 3 a.m., this release is for you. Outputs are type-checked at runtime, transient model errors get retried with exponential backoff, and long-running workflows can be paused and resumed across crashes.
By the end of this guide you will have a working agent in roughly 50 lines of Python that returns a validated Order object, calls a typed tool, and survives a process restart mid-run. We will use OpenAI as the model provider, but Pydantic AI is model-agnostic, so the same code works with Anthropic, Gemini, Ollama, or Bedrock by swapping one string.
Prerequisites
- Python 3.10 or newer (3.12 recommended).
- An OpenAI API key in OPENAI_API_KEY. Anthropic, Gemini, or a local Ollama model also work.
- About 10 minutes and a coffee.
Step 1: Install Pydantic AI 1.85
Pin the version so the examples below stay reproducible:
python -m venv .venv && source .venv/bin/activate
pip install "pydantic-ai==1.85.1" "pydantic>=2.10"
# Optional, for the durable-execution example
pip install "pydantic-ai[durable]"
If you are on Windows PowerShell, use .venv\Scripts\Activate.ps1. The [durable] extra pulls in a small SQLite-backed checkpointer; you can swap it for Postgres or Redis later.
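To confirm the pin took effect before running the examples, a quick check along these lines can help. It is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, and the distribution name is assumed to be `pydantic-ai`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None when the package is absent.
print("pydantic-ai:", installed_version("pydantic-ai"))
```

If this prints None, the virtual environment is probably not activated in the current shell.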
Step 2: Define a typed agent with a structured output
An agent in Pydantic AI is just three things: a model name, an output schema, and a system prompt. The output schema is a Pydantic model, which means the SDK forces the LLM to return JSON that round-trips into your class — no manual parsing, no regex.
# agent.py
from pydantic import BaseModel, Field
from pydantic_ai import Agent

class Order(BaseModel):
    sku: str = Field(..., min_length=3, max_length=20)
    quantity: int = Field(..., ge=1, le=999)
    rush: bool = False

agent = Agent(
    model="openai:gpt-4o-mini",
    output_type=Order,
    system_prompt=(
        "You convert customer messages into Order objects. "
        "If the customer says 'urgent' or 'ASAP', set rush=True."
    ),
)

result = agent.run_sync("Hey, can I get 12 of SKU-7781 ASAP?")
print(result.output)
print(type(result.output))
Run it:
$ python agent.py
sku='SKU-7781' quantity=12 rush=True
<class '__main__.Order'>
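If the model returns output that fails validation, Pydantic AI re-prompts it with the validation error attached, up to the agent's retries limit (1 by default; pass retries=3 to Agent to allow more attempts). When the limit is exhausted the run raises UnexpectedModelBehavior, so a raw ValidationError almost never bubbles up to your code.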
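The re-prompt loop described above can be pictured roughly as follows. This is a conceptual sketch, not Pydantic AI's implementation: it uses only the standard library, hand-written checks stand in for Pydantic validation, and `parse_order`, `run_with_retries`, and `fake_model` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Order:
    sku: str
    quantity: int
    rush: bool = False


def parse_order(raw: str) -> Order:
    """Parse and validate a JSON string; raise ValueError describing the problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    sku = data.get("sku")
    quantity = data.get("quantity")
    if not isinstance(sku, str) or not 3 <= len(sku) <= 20:
        raise ValueError("sku must be a string of 3 to 20 characters")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or not 1 <= quantity <= 999:
        raise ValueError("quantity must be an integer between 1 and 999")
    return Order(sku=sku, quantity=quantity, rush=bool(data.get("rush", False)))


def run_with_retries(model: Callable[[str], str], prompt: str, retries: int = 1) -> Order:
    """Call the model; on a validation failure, re-prompt with the error attached."""
    current_prompt = prompt
    last_error: Optional[ValueError] = None
    for _ in range(retries + 1):
        raw = model(current_prompt)
        try:
            return parse_order(raw)
        except ValueError as exc:
            last_error = exc
            current_prompt = f"{prompt}\n\nYour previous reply was rejected: {exc}. Try again."
    raise RuntimeError(f"output still invalid after {retries} retries: {last_error}")


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: wrong on the first call, corrected once it sees feedback."""
    if "rejected" in prompt:
        return '{"sku": "SKU-7781", "quantity": 12, "rush": true}'
    return '{"sku": "SKU-7781", "quantity": 0}'


order = run_with_retries(fake_model, "12 of SKU-7781 ASAP", retries=1)
print(order)
```

The key design point is that the error text goes back into the prompt, so the second attempt has something concrete to correct rather than simply being a blind repeat.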
Step 3: Add a typed tool the agent can call
Tools are decorated functions. Type hints become the JSON schema the model sees, and the docstring becomes the tool description. No JSON Schema by hand.
from pydantic_ai import Agent, RunContext

agent = Agent(
    model="openai:gpt-4o-mini",
    output_type=Order,
    system_prompt="Look up the SKU before confirming the order.",
)

INVENTORY = {"SKU-7781": 50, "SKU-9001": 0}

@agent.tool
def stock_level(ctx: RunContext, sku: str) -> int:
    """Return how many units of `sku` are currently in stock."""
    return INVENTORY.get(sku, 0)

result = agent.run_sync("Order 12 of SKU-7781 if available, otherwise nothing.")
print(result.output)
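# Tool calls are parts inside each message, so walk message.parts rather than the messages themselves.
print("Tool calls:", [p.tool_name for m in result.all_messages() for p in m.parts if p.part_kind == "tool-call"])
Example output (the list can also include the internal structured-output call, typically named final_result):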
sku='SKU-7781' quantity=12 rush=False
Tool calls: ['stock_level']
Notice the RunContext argument — that is where you inject dependencies like a database session, a tenant ID, or a feature-flag client. The model never sees it; it is server-side only.
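To make the "type hints become the schema" idea concrete, here is a rough sketch of how a function signature and docstring could be turned into a tool description while the context argument is left out. It is an illustration under simplifying assumptions, not the library's actual schema generator: `tool_schema` and `JSON_TYPES` are hypothetical names, the `min_units` parameter is added only for the demo, and only a few primitive types are mapped.

```python
import inspect
from typing import Any, Dict, get_type_hints

# Simplified mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func, skip_first: bool = True) -> Dict[str, Any]:
    """Build a tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    params = list(inspect.signature(func).parameters.values())
    if skip_first:
        params = params[1:]  # drop the context argument; the model never sees it
    properties = {}
    required = []
    for p in params:
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[p.name] = {"type": JSON_TYPES.get(hints.get(p.name), "string")}
        if p.default is inspect.Parameter.empty:
            required.append(p.name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def stock_level(ctx: object, sku: str, min_units: int = 0) -> int:
    """Return how many units of `sku` are currently in stock."""
    return 0


schema = tool_schema(stock_level)
print(schema)
```

Parameters with defaults end up optional in the schema, which is why giving a tool argument a sensible default also changes what the model is required to supply.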
Step 4: Make the agent survive a crash
This is the headline feature of 1.85. Wrap your agent in a DurableAgent and every tool call, model call, and partial output is checkpointed. If the process dies, calling resume() with the same run ID picks up exactly where it left off — no duplicate tool calls, no lost state.
from pydantic_ai.durable import DurableAgent, SqliteCheckpointer
durable = DurableAgent(agent, checkpointer=SqliteCheckpointer("runs.db"))
run_id = "order-2026-04-28-001"
try:
    result = durable.run_sync("Order 12 of SKU-7781 ASAP", run_id=run_id)
except KeyboardInterrupt:
    # Simulate a crash mid-run
    pass
# Later, in a fresh process:
result = durable.resume(run_id=run_id)
print(result.output)
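Under the hood, every step writes a row to runs.db. Replay is deterministic: cached tool results are returned from the checkpoint instead of re-calling your code, so completed steps are not repeated. A tool that was interrupted before its result was recorded can still run again on resume, so keep side-effecting tools idempotent.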
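The checkpoint-and-replay idea can be sketched with nothing but the standard library's sqlite3 module. This is a conceptual illustration, not how DurableAgent is implemented: `StepCache`, `run_step`, and `check_stock` are hypothetical names, and results are assumed to be JSON-serializable.

```python
import json
import sqlite3
from typing import Any, Callable, List


class StepCache:
    """Store each step's result under (run_id, step) so a rerun can skip it."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS steps ("
            "run_id TEXT, step TEXT, result TEXT, PRIMARY KEY (run_id, step))"
        )

    def run_step(self, run_id: str, step: str, func: Callable[[], Any]) -> Any:
        row = self.conn.execute(
            "SELECT result FROM steps WHERE run_id = ? AND step = ?", (run_id, step)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # replay: reuse the recorded result
        result = func()  # first execution: do the real work
        self.conn.execute(
            "INSERT INTO steps (run_id, step, result) VALUES (?, ?, ?)",
            (run_id, step, json.dumps(result)),
        )
        self.conn.commit()
        return result


calls: List[str] = []


def check_stock() -> int:
    calls.append("check_stock")  # record that the real function ran
    return 50


cache = StepCache()
first = cache.run_step("order-001", "stock_level", check_stock)
second = cache.run_step("order-001", "stock_level", check_stock)  # served from the table
print(first, second, len(calls))
```

Note the gap between `func()` and the INSERT: in this sketch, a crash in that window means the step runs again on the next attempt, which is why side-effecting steps still benefit from an idempotency key.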
Common pitfalls
- Forgetting output_type. Without it, you get a plain string and lose the validation safety net. Always declare a schema, even if it is a single-field model.
- Mutable defaults in tool signatures. Pydantic AI introspects signatures at agent-creation time; a list = [] default leaks across runs. Use Field(default_factory=list) on the model side.
- Retrying non-deterministic tools. If your tool sends an email, mark it with @agent.tool(retries=0) or wrap the side effect in an idempotency key; durable replay will happily re-send otherwise.
- Mixing run_sync and run. The async run returns a coroutine; calling it inside FastAPI without await silently returns the coroutine object as your output. Linters catch this; tests catch it faster.
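The missing-await pitfall is easy to reproduce without any agent at all. The sketch below uses only asyncio; `run` is a hypothetical stand-in for an async agent call, and the two handlers are illustrative names rather than FastAPI code.

```python
import asyncio
import inspect


async def run(prompt: str) -> str:
    """Stand-in for an async agent call."""
    await asyncio.sleep(0)
    return f"handled: {prompt}"


async def handler_missing_await() -> bool:
    result = run("order 12")  # bug: no await, so this is a coroutine object
    is_coro = inspect.iscoroutine(result)
    result.close()  # close it to avoid a "never awaited" warning in this demo
    return is_coro


async def handler_with_await() -> str:
    return await run("order 12")  # correct: the coroutine is actually executed


print(asyncio.run(handler_missing_await()))
print(asyncio.run(handler_with_await()))
```

A one-line test asserting that the handler's return value is not a coroutine is usually enough to catch this before it ships.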
Quick reference
| Concept | API | Notes |
|---|---|---|
| Define agent | Agent(model, output_type, system_prompt) | output_type is any Pydantic model |
| Sync run | agent.run_sync(prompt) | Blocks; good for scripts |
| Async run | await agent.run(prompt) | Use inside FastAPI / asyncio |
| Tool | @agent.tool def name(ctx, arg: T) -> R | Type hints become JSON schema |
| Inject deps | RunContext.deps | Pass with agent.run_sync(..., deps=x) |
| Durable wrap | DurableAgent(agent, checkpointer) | Checkpoint = SQLite, Postgres, Redis |
| Resume | durable.resume(run_id=...) | Replays from last checkpoint |
Next steps
- Swap openai:gpt-4o-mini for anthropic:claude-sonnet-4-6 or ollama:llama3.1; the rest of your code does not change.
- Add streaming with agent.run_stream() to get tokens as they arrive while still validating the final object.
- Plug your DurableAgent behind a Celery or Temporal worker to get queue-level retries on top of step-level checkpoints.
- Read the 1.85 changelog for the new Capability primitive, which lets you bundle tools, hooks, and prompts into reusable units.
Tested on Python 3.12, pydantic-ai 1.85.1, pydantic 2.10.4. Code is MIT-licensed; copy freely.