
Gemini Interactions API: Build Stateful AI Agents
Summary
Use Gemini's new Interactions API and previous_interaction_id to build stateful agents in Python.
At Google I/O on May 19, 2026, Gemini 3.5 Flash went generally available and Google quietly changed how it wants you to build. The docs now open with a line that matters: for new projects, use the Interactions API (Beta). It is the new standard, optimized for agentic workflows, server-side state, and the capabilities Google says will ship exclusively on it going forward. The old generateContent API is still fully supported, but the center of gravity has moved.
The single biggest practical change is state. With generateContent you resend the entire conversation history on every turn. With Interactions, the server keeps the history for you and you reference it by ID. That one shift removes a whole class of bugs around dropped thought signatures, mismatched tool IDs, and ballooning request payloads.
This guide teaches you the Interactions API end to end in Python: your first call, reading the new steps timeline, stateful multi-turn with previous_interaction_id, a real tool-using agent that survives across turns, and the storage and background-task rules that trip people up. Every API detail here is checked against the official docs (last updated 2026-05-19).
What you'll build
By the end you will have a small command-line support agent that answers a user question, calls a custom function to look up an order status, feeds the result back to the model, and continues the conversation without you ever resending the chat history. You will also know exactly when to use stateful mode versus stateless mode, and how to offload long jobs to the background.
Prerequisites
- Python 3.9 or newer.
- A Gemini API key from Google AI Studio, exported as the GEMINI_API_KEY environment variable.
- The Google GenAI SDK, version 1.55.0 or newer (the Interactions API is not in older builds).
- Basic familiarity with JSON and function/tool calling concepts.
pip install -q -U "google-genai>=1.55.0"
export GEMINI_API_KEY="your_key_here"
Why the version floor matters: client.interactions simply does not exist on google-genai below 1.55.0 (JavaScript needs @google/genai 1.33.0+). If you get an AttributeError on client.interactions, your SDK is too old.
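If you would rather fail early than hit the AttributeError at call time, a startup check along these lines may help. It is only a sketch: `parse_version`, `meets_floor`, and `installed_sdk_version` are hypothetical helper names, and the 1.55.0 floor is simply the figure quoted above.

```python
from importlib import metadata
from typing import Optional

MIN_SDK_VERSION = "1.55.0"  # floor quoted in this guide, not read from the SDK


def parse_version(version: str) -> tuple:
    """Turn '1.55.0rc1' into (1, 55, 0), keeping only leading digits per part."""
    numbers = []
    for piece in version.split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        numbers.append(int(leading))
    return tuple(numbers)


def meets_floor(installed: str, floor: str = MIN_SDK_VERSION) -> bool:
    """Compare two dotted versions after padding them to the same length."""
    a, b = parse_version(installed), parse_version(floor)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_sdk_version(package: str = "google-genai") -> Optional[str]:
    """Return the installed package version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sdk_version()
    if found is None or not meets_floor(found):
        print(f"google-genai {found} is below {MIN_SDK_VERSION}; upgrade first")
```

The comparison deliberately ignores pre-release suffixes, so treat it as a coarse guard rather than a full version parser.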
Step 1: Your first interaction
An Interaction represents one complete turn. You create it with client.interactions.create(...) and pass input (a plain string is fine for the simplest case). The response is not a single text blob; it is a structured timeline of steps.
from google import genai
# Picks up GEMINI_API_KEY from the environment automatically
client = genai.Client()
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Explain what a stateful API is in one sentence.",
)
# The SDK gives you a convenience property for the final text:
print(interaction.output_text)
print("interaction id:", interaction.id)
Representative output:
A stateful API remembers context from previous requests so the
client does not have to resend it every time.
interaction id: interactions/abc123def456
output_text returns the last run of text blocks in the response and joins consecutive text together. It deliberately skips earlier text that is separated by thoughts, images, or tool calls, so for interleaved multimodal replies you iterate steps yourself instead.
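To make the "last contiguous text run" rule concrete, here is a small approximation that works on plain dicts shaped like the steps in this guide. It is not the SDK's implementation; `last_text_run` is a hypothetical name, and treating every non-text block and every non-model_output step as a break is an assumption drawn from the description above.

```python
from typing import Iterator, Optional


def _text_or_break(steps: list) -> Iterator[Optional[str]]:
    """Yield each text block's text, or None for anything that breaks a run."""
    for step in steps:
        if step.get("type") != "model_output":
            yield None  # thoughts, tool calls and tool results interrupt the text
            continue
        for block in step.get("content", []):
            yield block.get("text", "") if block.get("type") == "text" else None


def last_text_run(steps: list) -> str:
    """Join the last contiguous run of text blocks in a dict-shaped timeline."""
    current: list = []  # run being collected right now
    last: list = []     # most recent completed, non-empty run
    for item in _text_or_break(steps):
        if item is None:
            if current:
                last, current = current, []
        else:
            current.append(item)
    return "".join(current or last)


timeline = [
    {"type": "model_output", "content": [{"type": "text", "text": "Let me check. "}]},
    {"type": "function_call", "name": "get_order_status", "arguments": {}, "id": "call_01"},
    {"type": "model_output", "content": [
        {"type": "text", "text": "Your order "},
        {"type": "text", "text": "has shipped."},
    ]},
]
print(last_text_run(timeline))  # the earlier "Let me check. " is skipped
```

With real SDK objects you could feed it `[s.model_dump() for s in interaction.steps]`, assuming the dumped dicts keep the same field names.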
Step 2: Read the steps timeline
The steps array is the heart of the API. Instead of a flat output, you get a chronological list of typed events: model thoughts, function_call and function_result steps, and the final model_output. This is what makes complex agent flows debuggable, and it is the documented breaking change from the older outputs array.
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Explain how AI works in a few words",
)
for step in interaction.steps:
    if step.type == "model_output":
        for block in step.content:
            if block.type == "text":
                print(block.text)
Note that the create response only returns model-generated steps. If you later fetch the stored record with interactions.get, it also includes the user_input step for full context. Keep that asymmetry in mind when you reconstruct history by hand.
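When debugging, it can help to print the whole timeline one line per step rather than only the final text. The sketch below does that for dict-shaped steps; `describe_steps` is a hypothetical helper, and the field names (name, arguments, id, call_id, content) are the ones used elsewhere in this guide. Step types it does not recognize are listed by type only.

```python
def describe_steps(steps: list) -> list:
    """Return one short, human-readable line per step in a dict-shaped timeline."""
    lines = []
    for index, step in enumerate(steps):
        kind = step.get("type", "unknown")
        if kind == "function_call":
            detail = f"{step.get('name')}({step.get('arguments')}) id={step.get('id')}"
        elif kind == "function_result":
            detail = f"{step.get('name')} call_id={step.get('call_id')}"
        elif kind in ("model_output", "user_input"):
            texts = [
                block.get("text", "")
                for block in step.get("content", [])
                if block.get("type") == "text"
            ]
            detail = " ".join(texts)[:60]  # truncate long text for readability
        else:
            detail = ""  # e.g. thought steps: show the type only
        lines.append(f"{index:02d} {kind}: {detail}".rstrip())
    return lines


sample = [
    {"type": "thought"},
    {"type": "function_call", "name": "get_order_status",
     "arguments": {"order_id": "A-4417"}, "id": "call_01"},
    {"type": "model_output", "content": [{"type": "text", "text": "Shipped."}]},
]
for line in describe_steps(sample):
    print(line)
```

Because a fetched record also contains the user_input step, the same helper works on both the create response and the stored record, and the differing line counts make the asymmetry visible.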
Step 3: Stateful multi-turn with previous_interaction_id
Here is the feature you came for. To continue a conversation, pass the id of the previous interaction as previous_interaction_id. The server reloads the history for you. You never resend prior messages.
# Turn 1
first = client.interactions.create(
    model="gemini-3.5-flash",
    input="My favorite language is Python. Remember that.",
)
print("Turn 1:", first.output_text)
# Turn 2 - no history resent, just the pointer
second = client.interactions.create(
    model="gemini-3.5-flash",
    input="What did I say my favorite language was?",
    previous_interaction_id=first.id,
)
print("Turn 2:", second.output_text)
Representative output:
Turn 1: Got it, I'll remember that you like Python.
Turn 2: You said your favorite language was Python.
There is a critical subtlety that the docs spell out explicitly: previous_interaction_id carries forward only the conversation history (inputs and outputs). Everything else is interaction-scoped and must be re-specified on every turn:
- tools: your function declarations and built-in tools.
- system_instruction: your system prompt.
- generation_config: including thinking_level, and so on.
If your agent suddenly forgets it has tools on turn two, this is almost always why. Re-pass tools every single time.
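One way to avoid forgetting is to keep the interaction-scoped settings in a single object and build the keyword arguments for every turn from it. The following is a minimal sketch under that assumption; `ConversationSettings`, `kwargs_for`, and `remember` are hypothetical names, and the class only assembles arguments, it does not call the API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ConversationSettings:
    """Holds everything that must be re-sent on each turn, plus the last ID."""
    model: str
    tools: list = field(default_factory=list)
    system_instruction: Optional[str] = None
    generation_config: Optional[dict] = None
    last_interaction_id: Optional[str] = None

    def kwargs_for(self, user_input: Any) -> dict:
        """Build keyword arguments for one create call."""
        kwargs = {"model": self.model, "input": user_input}
        if self.tools:
            kwargs["tools"] = self.tools
        if self.system_instruction is not None:
            kwargs["system_instruction"] = self.system_instruction
        if self.generation_config is not None:
            kwargs["generation_config"] = self.generation_config
        if self.last_interaction_id is not None:
            kwargs["previous_interaction_id"] = self.last_interaction_id
        return kwargs

    def remember(self, interaction_id: str) -> None:
        """Record the ID returned by the latest turn."""
        self.last_interaction_id = interaction_id
```

Usage would look like `interaction = client.interactions.create(**settings.kwargs_for("..."))` followed by `settings.remember(interaction.id)`, so the tools and system prompt ride along on every turn without extra thought.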
Step 4: A tool-using agent across turns
Function calling in the Interactions API follows the familiar four-step loop, but the shapes changed. A function declaration uses a flat type: "function" object, the model returns a function_call step with name, arguments, and a generated id, and you send results back as a function_result step. The model never runs your code; you do, then hand it the result.
Define the tool:
import json
from google import genai
client = genai.Client()
get_order_status = {
    "type": "function",
    "name": "get_order_status",
    "description": "Look up the current status of a customer order by its ID.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "The order ID, e.g. 'A-4417'"},
        },
        "required": ["order_id"],
    },
}
# Your real implementation would hit a database or API.
def get_order_status_impl(order_id: str) -> dict:
    fake_db = {"A-4417": "shipped", "A-9001": "processing"}
    return {"order_id": order_id, "status": fake_db.get(order_id, "not_found")}
Turn 1 – the model decides to call the tool:
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Where is my order A-4417?",
    tools=[get_order_status],
)
fc = next(s for s in interaction.steps if s.type == "function_call")
print("Model wants to call:", fc.name, fc.arguments, "id:", fc.id)
Representative output:
Model wants to call: get_order_status {'order_id': 'A-4417'} id: call_01
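The `next(...)` lookup above raises StopIteration when the model answers directly without calling a tool, and it only handles one call. A slightly more defensive dispatcher might look like the sketch below. It works on dict-shaped steps; `run_function_calls` and the `registry` mapping are hypothetical, and reporting failures back to the model as an "error" field is a design choice of this sketch, not something the docs prescribe.

```python
import json


def run_function_calls(steps: list, registry: dict) -> list:
    """Execute every function_call step and return matching function_result dicts."""
    results = []
    for step in steps:
        if step.get("type") != "function_call":
            continue
        handler = registry.get(step["name"])
        if handler is None:
            payload = {"error": f"unknown tool: {step['name']}"}
        else:
            try:
                payload = handler(**step.get("arguments", {}))
            except Exception as exc:  # surface the failure to the model, do not crash
                payload = {"error": str(exc)}
        results.append({
            "type": "function_result",
            "name": step["name"],    # must equal the call's name
            "call_id": step["id"],   # must equal the call's id
            "result": [{"type": "text", "text": json.dumps(payload)}],
        })
    return results


def lookup(order_id: str) -> dict:
    """Stand-in tool implementation for the demo."""
    return {"order_id": order_id, "status": "shipped"}


demo_steps = [{"type": "function_call", "name": "get_order_status",
               "arguments": {"order_id": "A-4417"}, "id": "call_01"}]
print(run_function_calls(demo_steps, {"get_order_status": lookup}))
```

An empty return list means the model made no tool calls, in which case you can read the text answer straight away.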
Turn 2 – execute locally, then return a function_result. Pass previous_interaction_id so the server has the context, and crucially set call_id to the call's id and name to the call's name:
result = get_order_status_impl(**fc.arguments)
final = client.interactions.create(
    model="gemini-3.5-flash",
    previous_interaction_id=interaction.id,
    tools=[get_order_status],  # re-pass tools, always
    input=[
        {
            "type": "function_result",
            "name": fc.name,
            "call_id": fc.id,
            "result": [{"type": "text", "text": json.dumps(result)}],
        }
    ],
)
print(final.output_text)
Representative output:
Your order A-4417 has shipped and is on its way.
Two rules the docs flag as required for Gemini 3.x: every function_result must include the matching call_id, and name must match the call. If they do not match, the model tends to return an empty response with finish_reason: STOP rather than a clear error. The standard Python and Node SDKs handle the ID matching for you when you use their helpers, but if you build the step dicts by hand, match them yourself.
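Since a mismatch tends to fail silently, a cheap pre-flight check before sending hand-built results can save debugging time. The sketch below compares dict-shaped function_call steps with the function_result dicts you are about to send; `find_result_mismatches` is a hypothetical helper and only checks the two rules named above.

```python
def find_result_mismatches(calls: list, results: list) -> list:
    """Return human-readable problems; an empty list means ids and names line up."""
    problems = []
    expected = {call["id"]: call["name"] for call in calls}
    answered = set()
    for result in results:
        call_id = result.get("call_id")
        if call_id not in expected:
            problems.append(f"result has unknown call_id {call_id!r}")
            continue
        answered.add(call_id)
        if result.get("name") != expected[call_id]:
            problems.append(
                f"call {call_id!r}: name {result.get('name')!r} "
                f"does not match {expected[call_id]!r}"
            )
    for call_id in expected:
        if call_id not in answered:
            problems.append(f"call {call_id!r} has no function_result")
    return problems


calls = [{"type": "function_call", "name": "get_order_status",
          "arguments": {"order_id": "A-4417"}, "id": "call_01"}]
good = [{"type": "function_result", "name": "get_order_status", "call_id": "call_01"}]
print(find_result_mismatches(calls, good))  # no problems reported
```

Raising on a non-empty list in development turns the quiet empty reply into a loud local error.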
Step 5: store, stateless mode, and background jobs
By default every interaction is stored (store=true). Storage is what powers previous_interaction_id, background execution, and observability. Retention differs by tier: the paid tier keeps interactions for 55 days and the free tier for 1 day, after which they are deleted automatically.
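If you persist interaction IDs on your side (for example, to resume a chat later), it may be worth recording when each one was created and checking it against the retention window before reusing it. This is a rough sketch: `may_have_expired` and the `RETENTION` table are hypothetical, the windows are the figures quoted above, and the exact moment the server deletes a record is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention windows as quoted in this guide.
RETENTION = {"paid": timedelta(days=55), "free": timedelta(days=1)}


def may_have_expired(created_at: datetime, tier: str,
                     now: Optional[datetime] = None) -> bool:
    """Return True when a stored interaction is old enough to have been deleted."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= RETENTION[tier]


created = datetime(2026, 5, 19, 12, 0, tzinfo=timezone.utc)
print(may_have_expired(created, "free", now=created + timedelta(days=2)))
```

When the check returns True, starting a fresh conversation (or replaying a client-side transcript) is safer than passing a stale previous_interaction_id.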
If you do not want server-side storage, set store=False and manage history on the client by passing the full steps list back in input each turn. But know the trade-offs the docs call out: store=False is incompatible with background=true, and it prevents using previous_interaction_id for later turns.
# Stateless: you carry the history yourself
history = [
    {"type": "user_input",
     "content": [{"type": "text", "text": "Where is my order A-4417?"}]}
]
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    store=False,
    input=history,
    tools=[get_order_status],
)
# Append every model-generated step exactly as received
for step in interaction.steps:
    history.append(step.model_dump())
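For more than one stateless turn, the bookkeeping can be wrapped in a small helper that appends the model's steps and then the next user message. The sketch below assumes the user_input shape shown above and that SDK step objects expose `model_dump()`; `user_turn`, `to_plain_step`, and `extend_history` are hypothetical names.

```python
from typing import Optional


def user_turn(text: str) -> dict:
    """Build a user_input step in the shape used by the stateless example."""
    return {"type": "user_input", "content": [{"type": "text", "text": text}]}


def to_plain_step(step) -> dict:
    """Return a dict for a step, whether it is already a dict or an SDK model."""
    if isinstance(step, dict):
        return step
    return step.model_dump()


def extend_history(history: list, model_steps: list,
                   next_user_text: Optional[str] = None) -> list:
    """Return a new history with the model's steps and, optionally, the next user turn."""
    new_history = list(history)  # copy so the caller's list is not mutated
    new_history.extend(to_plain_step(step) for step in model_steps)
    if next_user_text is not None:
        new_history.append(user_turn(next_user_text))
    return new_history


history = [user_turn("Where is my order A-4417?")]
reply = [{"type": "model_output", "content": [{"type": "text", "text": "It shipped."}]}]
history = extend_history(history, reply, next_user_text="When will it arrive?")
print(len(history))
```

Returning a new list instead of mutating in place makes it easier to retry a failed turn with the previous history intact.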
For slow work like Deep Think or Deep Research, set background=True to offload the job to a background process instead of holding a long synchronous request open. You then poll or stream for completion. Background mode requires storage, so leave store at its default.
Worked example: combining a built-in tool with your own
A genuinely useful pattern is mixing a built-in tool (Google Search) with a custom function in one request. The model can ground itself on the web and call your code in the same flow. When you continue with previous_interaction_id, the built-in tool context carries over automatically.
get_weather = {
    "type": "function",
    "name": "get_weather",
    "description": "Gets the weather for a requested city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City and state, e.g. Utqiagvik, Alaska"},
        },
        "required": ["city"],
    },
}
tools = [{"type": "google_search"}, get_weather]
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="What is the northernmost city in the US, and what's the weather there?",
    tools=tools,
)
for step in interaction.steps:
    if step.type == "function_call":
        result = {"response": "Very cold. 22 degrees Fahrenheit."}
        follow = client.interactions.create(
            model="gemini-3.5-flash",
            previous_interaction_id=interaction.id,
            tools=tools,
            input=[{
                "type": "function_result",
                "name": step.name,
                "call_id": step.id,
                "result": [{"type": "text", "text": json.dumps(result)}],
            }],
        )
        print(follow.output_text)
The model uses Search to determine the city (Utqiagvik, Alaska), then calls your get_weather function for that city, and finally composes one grounded answer. To force a tool call rather than let the model decide, set generation_config={"tool_choice": "any"}; auto is the default and none forbids tool calls.
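The worked example handles a single round of tool calls, but a model may ask for tools again after seeing the first result. A bounded loop covers that case. The sketch below is written against an injected `create` callable that returns dict-shaped interactions, so it runs without network access; `run_tool_loop` and `registry` are hypothetical, and with the real SDK you would adapt the dict access to attributes and pass the model name as well.

```python
import json
from typing import Any, Callable


def run_tool_loop(create: Callable[..., dict], user_input: Any, tools: list,
                  registry: dict, max_rounds: int = 5) -> dict:
    """Keep answering function calls until the model stops asking or the budget ends."""
    interaction = create(input=user_input, tools=tools)
    rounds = 0
    while True:
        calls = [s for s in interaction["steps"] if s["type"] == "function_call"]
        if not calls:
            return interaction  # no more tool requests: this is the final turn
        if rounds == max_rounds:
            raise RuntimeError("tool loop did not finish within max_rounds")
        rounds += 1
        results = [
            {
                "type": "function_result",
                "name": call["name"],
                "call_id": call["id"],
                "result": [{"type": "text", "text": json.dumps(
                    registry[call["name"]](**call["arguments"]))}],
            }
            for call in calls
        ]
        interaction = create(
            input=results,
            tools=tools,  # re-pass tools on every turn
            previous_interaction_id=interaction["id"],
        )
```

All results from one round are sent together in a single follow-up turn, and the `max_rounds` cap guards against a model that never stops calling tools.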
Common pitfalls and gotchas
- Forgetting to re-pass tools, system_instruction, or generation_config. Only history travels with previous_interaction_id. Everything else is interaction-scoped and resets to defaults if you omit it.
- SDK too old. client.interactions needs google-genai>=1.55.0 (JS @google/genai>=1.33.0). An AttributeError here means upgrade, not a bug in your code.
- Mismatched function results. Each function_result must carry the matching call_id and the same name as the call. A mismatch usually yields an empty reply with finish_reason: STOP, not a loud error.
- Assuming output_text always has everything. It only returns the last contiguous text run. For replies interleaved with thoughts, images, or tool calls, iterate steps and read each content block.
- store=False expecting stateful behavior. Turning off storage disables previous_interaction_id and background=true. Pick stateful (default) or fully client-managed; do not mix.
- Free-tier retention surprise. Free-tier interactions vanish after 1 day, so a previous_interaction_id from yesterday may already be gone. Paid tier holds 55 days.
- Remote MCP on Gemini 3 models. The Interactions API supports remote MCP servers (Streamable HTTP only, not SSE), but Gemini 3 model support for remote MCP is not live yet. Use a 2.5 model if you need it today, and avoid '-' in MCP server names.
- Treating Beta as production-stable. The Interactions API is Beta and schemas can change. For stable production, Google still recommends generateContent. Pin the Api-Revision header on REST calls to avoid surprise breakage.
Quick reference
| Concept | What it does | Key detail |
|---|---|---|
| client.interactions.create | Creates one interaction (turn) | Needs google-genai 1.55.0+ |
| interaction.steps | Typed timeline of the turn | Replaces the old outputs array |
| interaction.output_text | Convenience: final text | Last contiguous text run only |
| previous_interaction_id | Continue a conversation | Carries history only, not tools/config |
| store (default true) | Server-side persistence | Required for background + previous_interaction_id |
| background=true | Offload long jobs | Incompatible with store=false |
| function_result step | Return tool output | Must match call_id and name |
| tool_choice | Control tool use | auto (default), any, none, validated |
Next steps
- Swap the fake order lookup for a real database or HTTP call and add error handling around tool execution.
- Add a system_instruction with an explicit tool budget (e.g. 'limited action budget of 5 tool calls') to curb over-calling.
- Try the Deep Research agent with background=true for long-running research tasks.
- Read the official Interactions API overview and the May 2026 breaking-changes guide before going to production.
Primary sources: Google AI for Developers — Interactions API overview, Interactions quickstart, and Interactions function calling (all last updated May 2026), plus the Gemini 3.5 Flash 'What's new' migration guide.