Pydantic AI 1.85: Type-Safe Agents That Survive Crashes

Pydantic AI 1.85 landed last week with a simple promise: build AI agents that fail predictably and recover automatically. If you have ever watched a LangChain or CrewAI agent silently return a wrong-shaped dict at 3 a.m., this release is for you. Outputs are type-checked at runtime, transient model errors get retried with exponential backoff, and long-running workflows can be paused and resumed across crashes.

By the end of this guide you will have a working agent in roughly 50 lines of Python that returns a validated Order object, calls a typed tool, and survives a process restart mid-run. We will use OpenAI as the model provider, but Pydantic AI is model-agnostic, so the same code works with Anthropic, Gemini, Ollama, or Bedrock by swapping one string.

Prerequisites

Python 3.10 or newer (3.12 recommended).
An OpenAI API key in OPENAI_API_KEY. Anthropic, Gemini, or a local Ollama model also work.
About 10 minutes and a coffee.

Step 1: Install Pydantic AI 1.85

Pin the version so the examples below stay reproducible:

python -m venv .venv && source .venv/bin/activate
pip install "pydantic-ai==1.85.1" "pydantic>=2.10"

# Optional, for the durable-execution example
pip install "pydantic-ai[durable]==1.85.1"

If you are on Windows PowerShell, use .venv\Scripts\Activate.ps1. The [durable] extra pulls in a small SQLite-backed checkpointer; you can swap it for Postgres or Redis later.

Step 2: Define a typed agent with a structured output

An agent in Pydantic AI needs just three things: a model name, an output schema, and a system prompt. The output schema is a Pydantic model, so the SDK validates the LLM's reply against it and hands you an instance of your class. No manual parsing, no regex.

# agent.py
from pydantic import BaseModel, Field
from pydantic_ai import Agent

class Order(BaseModel):
    sku: str = Field(..., min_length=3, max_length=20)
    quantity: int = Field(..., ge=1, le=999)
    rush: bool = False

agent = Agent(
    model="openai:gpt-4o-mini",
    output_type=Order,
    system_prompt=(
        "You convert customer messages into Order objects. "
        "If the customer says 'urgent' or 'ASAP', set rush=True."
    ),
)

result = agent.run_sync("Hey, can I get 12 of SKU-7781 ASAP?")
print(result.output)
print(type(result.output))

Run it:

$ python agent.py
sku='SKU-7781' quantity=12 rush=True
<class '__main__.Order'>

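If the model returns malformed JSON or data that fails validation, Pydantic AI re-prompts it with the validation error attached. The default is a single retry; pass retries=3 to Agent to allow more. Once the retries are used up, the run raises UnexpectedModelBehavior, so a raw ValidationError rarely reaches your code.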
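The validate-and-re-prompt loop can be sketched in a few lines of plain Pydantic. This is an illustration of the idea, not Pydantic AI's actual implementation: fake_model is a hypothetical stand-in for the LLM, and run_with_retries and its retries parameter are names invented for this example.

```python
from pydantic import BaseModel, Field, ValidationError


class Order(BaseModel):
    sku: str = Field(..., min_length=3, max_length=20)
    quantity: int = Field(..., ge=1, le=999)
    rush: bool = False


def fake_model(prompt: str, attempt: int) -> str:
    """Hypothetical stand-in for an LLM: wrong on the first try, right on the second."""
    if attempt == 0:
        return '{"sku": "SKU-7781", "quantity": 0}'  # violates ge=1
    return '{"sku": "SKU-7781", "quantity": 12, "rush": true}'


def run_with_retries(prompt: str, retries: int = 1) -> Order:
    """Validate the reply; on failure, re-prompt with the error text attached."""
    for attempt in range(retries + 1):
        raw = fake_model(prompt, attempt)
        try:
            return Order.model_validate_json(raw)
        except ValidationError as exc:
            # Feed the validation errors back so the next attempt can correct them.
            prompt = f"{prompt}\n\nFix these validation errors:\n{exc}"
    raise RuntimeError(f"output still invalid after {retries} retries")


if __name__ == "__main__":
    print(run_with_retries("Hey, can I get 12 of SKU-7781 ASAP?"))
```

The point of the design is that the error message itself becomes part of the next prompt, which gives the model a concrete reason to change its answer.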

Step 3: Add a typed tool the agent can call

Tools are decorated functions. Type hints become the JSON schema the model sees, and the docstring becomes the tool description. No JSON Schema by hand.

# Continues agent.py from Step 2, so Order is already defined.
from pydantic_ai import Agent, RunContext

agent = Agent(
    model="openai:gpt-4o-mini",
    output_type=Order,
    system_prompt="Look up the SKU before confirming the order.",
)

INVENTORY = {"SKU-7781": 50, "SKU-9001": 0}

@agent.tool
def stock_level(ctx: RunContext, sku: str) -> int:
    """Return how many units of `sku` are currently in stock."""
    return INVENTORY.get(sku, 0)

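result = agent.run_sync("Order 12 of SKU-7781 if available, otherwise nothing.")
print(result.output)

# all_messages() returns requests and responses; tool calls are parts inside them.
tool_calls = [
    part.tool_name
    for message in result.all_messages()
    for part in message.parts
    if part.part_kind == "tool-call"
]
print("Tool calls:", tool_calls)

Example output:

sku='SKU-7781' quantity=12 rush=False
Tool calls: ['stock_level', 'final_result']

The final_result entry is the output tool: in the default tool-output mode, Pydantic AI typically delivers the structured Order through a tool call of its own.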

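Notice the RunContext argument: that is where you inject dependencies like a database session, a tenant ID, or a feature-flag client. Declare their type with deps_type on the Agent, pass the value as deps= when you run it, and read it from ctx.deps inside the tool. The model never sees the context; it is left out of the tool schema and stays in your process.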
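To make "type hints become the JSON schema" concrete, here is a deliberately simplified sketch built on the standard library. It is not how Pydantic AI generates schemas internally; tool_schema and JSON_TYPES are names invented for this example, and only four primitive types are handled. Note how the first parameter, the context, is skipped, so it never reaches the model.

```python
import inspect
from typing import Any, get_type_hints

# Simplified mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func) -> dict[str, Any]:
    """Build a minimal tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    params = list(inspect.signature(func).parameters.values())
    properties: dict[str, Any] = {}
    required: list[str] = []
    for param in params[1:]:  # skip the first parameter (the run context)
        properties[param.name] = {"type": JSON_TYPES[hints[param.name]]}
        if param.default is inspect.Parameter.empty:
            required.append(param.name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def stock_level(ctx: object, sku: str) -> int:
    """Return how many units of `sku` are currently in stock."""
    return 0


if __name__ == "__main__":
    print(tool_schema(stock_level))
```

The resulting dictionary has the function name, the docstring as the description, and a single required string parameter, sku.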

Step 4: Make the agent survive a crash

This is the headline feature of 1.85. Wrap your agent in a DurableAgent and every tool call, model call, and partial output is checkpointed. If the process dies, calling resume() with the same run ID picks up from the last checkpoint: completed steps are not repeated and their state is not lost.

from pydantic_ai.durable import DurableAgent, SqliteCheckpointer

durable = DurableAgent(agent, checkpointer=SqliteCheckpointer("runs.db"))

run_id = "order-2026-04-28-001"
try:
    result = durable.run_sync("Order 12 of SKU-7781 ASAP", run_id=run_id)
except KeyboardInterrupt:
    # Simulate a crash mid-run
    pass

# Later, in a fresh process:
result = durable.resume(run_id=run_id)
print(result.output)

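Under the hood, every step writes a row to runs.db. Replay is deterministic: cached tool results are returned from the checkpoint instead of re-calling your code. The one exception is a step that was still in flight when the process died. It has no checkpoint, so it runs again on resume, which means side-effecting tools still need to be idempotent.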
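The checkpoint-and-replay idea can be illustrated with a few lines of sqlite3. This is a conceptual sketch only, not the library's checkpointer: StepCache, run_step, and the steps table layout are assumptions made up for this example, and results are assumed to be JSON-serializable.

```python
import json
import sqlite3
from typing import Any, Callable


class StepCache:
    """Hypothetical step-level checkpoint store backed by SQLite."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS steps ("
            "run_id TEXT, step TEXT, result TEXT, PRIMARY KEY (run_id, step))"
        )

    def run_step(self, run_id: str, step: str, fn: Callable[[], Any]) -> Any:
        """Return the stored result for (run_id, step), or run fn and store it."""
        row = self.db.execute(
            "SELECT result FROM steps WHERE run_id = ? AND step = ?", (run_id, step)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # replay: fn is not called again
        result = fn()  # a crash here leaves no row, so fn runs again on resume
        self.db.execute(
            "INSERT INTO steps VALUES (?, ?, ?)", (run_id, step, json.dumps(result))
        )
        self.db.commit()
        return result


calls: list[str] = []


def stock_level() -> int:
    """Toy tool that records each real invocation."""
    calls.append("stock_level")
    return 50


cache = StepCache()
first = cache.run_step("order-001", "stock_level:SKU-7781", stock_level)
second = cache.run_step("order-001", "stock_level:SKU-7781", stock_level)
```

The second call returns the stored value without invoking stock_level again. The comment on the fn() line marks the window where a crash would cause a re-run, which is why side-effecting tools still need their own idempotency guard.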

Common pitfalls

Forgetting output_type. Without it, you get a plain string and lose the validation safety net. Always declare a schema, even if it is a single-field model.
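Mutable defaults in tool signatures. Python evaluates a default once, when the function is defined, so a tags: list = [] default is shared by every call and leaks across runs. Default to None and build the list inside the tool, or use Field(default_factory=list) when the argument is a field on a Pydantic model.
Retrying side-effecting tools. If your tool sends an email, mark it with @agent.tool(retries=0) and guard the side effect with an idempotency key. A call that was interrupted before its checkpoint was written runs again on resume and will happily re-send otherwise.
Mixing run_sync and run. The async run returns a coroutine; calling it inside FastAPI without await leaves you holding the coroutine object instead of a result. The reverse also fails: run_sync cannot be called while an event loop is already running, so use await agent.run(...) in async code. Linters catch the missing await; tests catch it faster.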

Quick reference

Concept	API	Notes
Define agent	Agent(model, output_type, system_prompt)	output_type is any type Pydantic can validate, typically a BaseModel
Sync run	agent.run_sync(prompt)	Blocks; good for scripts
Async run	await agent.run(prompt)	Use inside FastAPI / asyncio
Tool	@agent.tool def name(ctx, arg: T) -> R	Type hints become JSON schema
Inject deps	RunContext.deps	Declare deps_type on the Agent; pass with agent.run_sync(..., deps=x)
Durable wrap	DurableAgent(agent, checkpointer)	Checkpointer backends: SQLite, Postgres, Redis
Resume	durable.resume(run_id=...)	Replays from last checkpoint

Next steps

Swap openai:gpt-4o-mini for anthropic:claude-sonnet-4-6 or ollama:llama3.1 and set that provider's credentials; the rest of your code does not change.
Add streaming with agent.run_stream() to get tokens as they arrive while still validating the final object.
Plug your DurableAgent behind a Celery or Temporal worker to get queue-level retries on top of step-level checkpoints.
Read the 1.85 changelog for the new Capability primitive — it lets you bundle tools, hooks, and prompts into reusable units.

Tested on Python 3.12, pydantic-ai 1.85.1, pydantic 2.10.4. Code is MIT-licensed; copy freely.