
Build Your First Local AI Agent with Google Gemma 4 + Ollama
Summary
A step-by-step guide to running Google's latest Gemma 4 model locally and building an AI agent with tool calling and agentic workflows.
What You'll Learn
- Install and run Gemma 4 locally using Ollama
- Use Gemma 4's built-in function calling for tool use
- Build a simple AI agent that can search the web and answer questions
- No cloud API costs — everything runs on your machine
Why Gemma 4?
Google released Gemma 4 in April 2026. It's their most capable open model yet, built specifically for agentic workflows.
Key specs:
- 128K context window
- Native function calling and structured JSON output
- Built-in reasoning mode (think step-by-step before answering)
- Available in 4 sizes: E2B, E4B, 26B MoE, 31B Dense
The E4B variant runs on most consumer laptops with 8GB+ RAM.
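For a quick feel for the structured-output support, the sketch below asks the model for JSON you can parse directly. It uses the format parameter of Ollama's Python client and assumes you have already finished Steps 1 to 3 below (Ollama installed, the gemma4:e4b tag pulled, and the Python client installed).

```python
import json

import ollama

# Ask for JSON-only output; Ollama's format parameter constrains the
# reply so it parses cleanly. Assumes the gemma4:e4b tag from Step 2.
response = ollama.chat(
    model="gemma4:e4b",
    messages=[{
        "role": "user",
        "content": "List three things an AI agent can do, as a JSON array of strings.",
    }],
    format="json",
)
print(json.loads(response["message"]["content"]))
```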
Prerequisites
- Python 3.10+
- 8GB+ RAM (for E4B model)
- macOS, Linux, or Windows with WSL
- Basic terminal/command line knowledge
Step 1: Install Ollama
Ollama lets you run LLMs locally with one command.
macOS/Linux:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
Windows: Download from ollama.com/download
Verify installation:
```bash
ollama --version
```
Expected output:
```
ollama version 0.6.x
```
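You can also confirm the Ollama server itself is running; on most installs it starts automatically and listens on port 11434:

```bash
# Quick health check; if this fails, start the server with: ollama serve
curl http://localhost:11434/api/version
```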
Step 2: Pull Gemma 4 Model
```bash
ollama pull gemma4:e4b
```
This downloads the E4B variant (~4GB). For smaller machines, use:
```bash
ollama pull gemma4:e2b
```
Check that it works:
```bash
ollama run gemma4:e4b "What is an AI agent?"
```
Expected output:
```
An AI agent is an autonomous system that can perceive its environment,
make decisions, and take actions to achieve specific goals — often
using tools like web search, code execution, or API calls.
```
Step 3: Install Python Dependencies
```bash
pip install ollama gradio duckduckgo-search
```
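Before wiring up tools, it is worth a quick check that the Python client can reach the local model. A minimal sketch, using the same gemma4:e4b tag pulled in Step 2:

```python
import ollama

# One round trip through the local Ollama server; assumes gemma4:e4b is pulled.
reply = ollama.chat(
    model="gemma4:e4b",
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(reply["message"]["content"])
```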
Step 4: Create the AI Agent
Create a file called agent.py:
```python
import ollama
from duckduckgo_search import DDGS


# Define the tool (web search)
def web_search(query: str) -> str:
    """Search the web and return the top results as plain text."""
    results = DDGS().text(query, max_results=3)
    return "\n".join(
        f"- {r['title']}: {r['body']}" for r in results
    )


# Tool definition for Gemma 4
tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]


def run_agent(user_question: str) -> str:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a helpful AI agent. Use the web_search tool "
                "when you need current information."
            ),
        },
        {"role": "user", "content": user_question},
    ]

    # Step 1: Ask Gemma 4 (with tools available)
    response = ollama.chat(
        model="gemma4:e4b",
        messages=messages,
        tools=tools
    )
    msg = response["message"]

    # Step 2: If the model wants to call a tool, execute it
    if msg.get("tool_calls"):
        # Keep the assistant's tool-call message in the conversation
        messages.append(msg)
        for tool_call in msg["tool_calls"]:
            func_name = tool_call["function"]["name"]
            func_args = tool_call["function"]["arguments"]
            if func_name == "web_search":
                result = web_search(func_args["query"])
                # Add the tool result back to the conversation
                messages.append({"role": "tool", "content": result})

        # Step 3: Get the final answer with the tool results included
        final = ollama.chat(
            model="gemma4:e4b",
            messages=messages
        )
        return final["message"]["content"]

    # No tool needed: return the model's direct answer
    return msg["content"]


# Test it
if __name__ == "__main__":
    question = "What are the latest AI developments this week?"
    print(f"Q: {question}\n")
    print(f"A: {run_agent(question)}")
```
Step 5: Run the Agent
```bash
python agent.py
```
Example input:
```
Q: What are the latest AI developments this week?
```
Example output:
```
A: Based on my search, here are this week's top AI developments:
- Google released Gemma 4, their most capable open model for agentic AI
- OpenAI reached 900 million weekly ChatGPT users
- Microsoft expanded multi-model collaboration in Copilot
```
Step 6: Add a Web UI (Optional)
Add this to the bottom of agent.py:
```python
import gradio as gr

demo = gr.Interface(
    fn=run_agent,
    inputs=gr.Textbox(
        label="Ask your AI Agent",
        placeholder="e.g., What's trending in AI today?"
    ),
    outputs=gr.Textbox(label="Agent Response"),
    title="Gemma 4 AI Agent",
    description="Local AI agent powered by Gemma 4 with web search"
)

demo.launch()
```
Run it:
```bash
python agent.py
```
Open http://localhost:7860 in your browser.
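By default the UI is only reachable on your own machine. If you want to open it from another device, Gradio can also create a temporary public link (optional; note that it routes traffic through Gradio's share servers):

```python
# Optional: replace demo.launch() with this to get a shareable public link
demo.launch(share=True)
```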
How It Works (Simple Diagram)
```
User Question
      ↓
Gemma 4 (thinks: do I need a tool?)
      ↓                        ↓
   No tool              Calls web_search()
      ↓                        ↓
Direct answer         Gets search results
                               ↓
                  Gemma 4 summarizes results
                               ↓
                    Final answer to user
```
Add More Tools
Extend your agent by adding more tool functions:
python
# Example: Calculator tool
def calculator(expression: str) -> str:
"""Evaluate a math expression."""
try:
return str(eval(expression))
except:
return "Error: invalid expression"
# Add to tools list
tools.append({
"type": "function",
"function": {
"name": "calculator",
"description": "Calculate a math expression",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "Math expression like '2+2' or '100*0.15'"
}
},
"required": ["expression"]
}
}
})
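Note that run_agent above only dispatches web_search, so each new tool also needs to be handled in its tool-call loop. One common way to keep that loop generic is a small name-to-function registry; the sketch below is illustrative (dispatch_tool_call is not part of the Ollama API), and run_agent's loop would call it instead of checking names by hand.

```python
# Map tool names (as declared in the tools list) to their Python functions
TOOL_FUNCTIONS = {
    "web_search": web_search,
    "calculator": calculator,
}

def dispatch_tool_call(tool_call) -> str:
    """Run whichever tool the model asked for and return its result."""
    name = tool_call["function"]["name"]
    args = tool_call["function"]["arguments"]
    func = TOOL_FUNCTIONS.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    return func(**args)
```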
Troubleshooting
| Problem | Fix |
|---|---|
| ollama: command not found | Restart terminal after install |
| Model too slow | Use gemma4:e2b (smaller) |
| Out of memory | Close other apps, or use E2B |
| Connection refused | Run ollama serve first |
What's Next?
- Add memory to your agent (store conversation history; see the sketch after this list)
- Connect to local files (RAG with your documents)
- Deploy on Android using Google AI Edge SDK
- Chain multiple agents together (multi-agent systems)
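For the first item, the simplest kind of memory is keeping the messages list alive between questions instead of rebuilding it on each call. A rough sketch (the ChatAgent class and ask method are illustrative names, and tool handling is omitted for brevity):

```python
import ollama

class ChatAgent:
    """Remembers the conversation across turns (illustrative, no tools)."""

    def __init__(self, model: str = "gemma4:e4b"):
        self.model = model
        self.messages = [
            {"role": "system", "content": "You are a helpful AI agent."}
        ]

    def ask(self, question: str) -> str:
        self.messages.append({"role": "user", "content": question})
        response = ollama.chat(model=self.model, messages=self.messages)
        answer = response["message"]["content"]
        self.messages.append({"role": "assistant", "content": answer})
        return answer

agent = ChatAgent()
print(agent.ask("Remember that my name is Sam."))
print(agent.ask("What's my name?"))
```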
Key Takeaways
- Gemma 4 is Google's best open model for building AI agents
- Ollama makes local deployment dead simple
- Function calling lets your agent use real tools
- No API keys or cloud costs required
- E4B runs on most laptops with 8GB+ RAM