Build Your First Local AI Agent with Google Gemma 4 + Ollama

Kodetra Technologies · April 14, 2026 · 4 min read · Beginner

Summary

A step-by-step guide to running Google's latest Gemma 4 model locally and building an AI agent with tool calling and agentic workflows.

What You'll Learn

  • Install and run Gemma 4 locally using Ollama
  • Use Gemma 4's built-in function calling for tool use
  • Build a simple AI agent that can search the web and answer questions
  • No cloud API costs — everything runs on your machine

Why Gemma 4?

Google released Gemma 4 in April 2026. It's their most capable open model yet, built specifically for agentic workflows.

Key specs:

  • 128K context window
  • Native function calling and structured JSON output (see the sketch after this list)
  • Built-in reasoning mode (think step-by-step before answering)
  • Available in 4 sizes: E2B, E4B, 26B MoE, 31B Dense

The E4B variant runs on most consumer laptops with 8GB+ RAM.
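
Since structured JSON output is one of the headline specs, here's a minimal sketch of requesting it through the Ollama Python client (which you'll set up in Steps 1–3 below). The format="json" parameter is part of the official client; the prompt is illustrative:

python

import ollama

# format="json" constrains the model to emit valid JSON
response = ollama.chat(
    model="gemma4:e4b",
    messages=[{
        "role": "user",
        "content": "List two uses of AI agents as a JSON object."
    }],
    format="json",
)
print(response["message"]["content"])  # a JSON string you can json.loads()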


Prerequisites

  • Python 3.10+
  • 8GB+ RAM (for E4B model)
  • macOS, Linux, or Windows with WSL
  • Basic terminal/command line knowledge

Step 1: Install Ollama

Ollama lets you run LLMs locally with one command.

macOS/Linux:

bash

curl -fsSL https://ollama.com/install.sh | sh

Windows: Download from ollama.com/download

Verify installation:

bash

ollama --version

Expected output:

ollama version 0.6.x

Step 2: Pull Gemma 4 Model

bash

ollama pull gemma4:e4b

This downloads the E4B variant (~4GB). For smaller machines, use:

bash

ollama pull gemma4:e2b

Test it works:

bash

ollama run gemma4:e4b "What is an AI agent?"

Expected output:

An AI agent is an autonomous system that can perceive its environment,
make decisions, and take actions to achieve specific goals — often
using tools like web search, code execution, or API calls.

Step 3: Install Python Dependencies

bash

pip install ollama gradio duckduckgo-search

Step 4: Create the AI Agent

Create a file called agent.py:

python

import ollama
from duckduckgo_search import DDGS

# Define the tool (web search)
def web_search(query: str) -> str:
    """Search the web and return top results."""
    results = DDGS().text(query, max_results=3)
    return "\n".join(
        f"- {r['title']}: {r['body']}" for r in results
    )

# Tool definition for Gemma 4
tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    }
]

def run_agent(user_question: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "You are a helpful AI agent. Use the web_search tool when you need current information."
        },
        {"role": "user", "content": user_question}
    ]

    # Step 1: Ask Gemma 4 (with tools available)
    response = ollama.chat(
        model="gemma4:e4b",
        messages=messages,
        tools=tools
    )

    msg = response["message"]

    # Step 2: If the model wants to call a tool, execute it
    if msg.get("tool_calls"):
        # Append the assistant's tool-call message once, before the loop
        messages.append(msg)

        # Dispatch table mapping tool names to Python functions
        available_functions = {"web_search": web_search}

        for tool_call in msg["tool_calls"]:
            func_name = tool_call["function"]["name"]
            func_args = tool_call["function"]["arguments"]

            func = available_functions.get(func_name)
            if func:
                result = func(**func_args)
            else:
                result = f"Error: unknown tool '{func_name}'"

            # Add each tool result back to the conversation
            messages.append({
                "role": "tool",
                "content": result
            })

        # Step 3: Get the final answer, now informed by the tool results
        final = ollama.chat(
            model="gemma4:e4b",
            messages=messages
        )
        return final["message"]["content"]

    return msg["content"]

# Test it
if __name__ == "__main__":
    question = "What are the latest AI developments this week?"
    print(f"Q: {question}\n")
    print(f"A: {run_agent(question)}")

Step 5: Run the Agent

bash

python agent.py

Example input:

Q: What are the latest AI developments this week?

Example output (yours will differ, since the search results are live):

A: Based on my search, here are this week's top AI developments:
- Google released Gemma 4, their most capable open model for agentic AI
- OpenAI reached 900 million weekly ChatGPT users
- Microsoft expanded multi-model collaboration in Copilot

Step 6: Add a Web UI (Optional)

Add this to the bottom of agent.py (and comment out the if __name__ == "__main__": test block, so that python agent.py launches only the UI):

python

import gradio as gr

demo = gr.Interface(
    fn=run_agent,
    inputs=gr.Textbox(
        label="Ask your AI Agent",
        placeholder="e.g., What's trending in AI today?"
    ),
    outputs=gr.Textbox(label="Agent Response"),
    title="Gemma 4 AI Agent",
    description="Local AI agent powered by Gemma 4 with web search"
)

demo.launch()

Run it:

bash

python agent.py

Open http://localhost:7860 in your browser.


How It Works (Simple Diagram)

User Question
     ↓
Gemma 4 (thinks: do I need a tool?)
     ↓              ↓
  No tool        Calls web_search()
     ↓              ↓
Direct answer   Gets search results
                    ↓
              Gemma 4 summarizes results
                    ↓
              Final answer to user

Add More Tools

Extend your agent by adding more tool functions:

python

# Example: Calculator tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    # Note: eval() executes arbitrary code. That's acceptable for a
    # local demo, but never expose it to untrusted input.
    try:
        return str(eval(expression))
    except Exception:
        return "Error: invalid expression"

# Add to tools list
tools.append({
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Calculate a math expression",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Math expression like '2+2' or '100*0.15'"
                }
            },
            "required": ["expression"]
        }
    }
})
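
One thing to remember: defining the schema only tells the model the tool exists. For the agent to actually execute it, also register the function in the available_functions dispatch table inside run_agent from Step 4:

python

# Inside run_agent, extend the dispatch table:
available_functions = {
    "web_search": web_search,
    "calculator": calculator,
}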

Troubleshooting

Problem                         Fix
ollama: command not found       Restart your terminal after installing
Model too slow                  Use gemma4:e2b (smaller)
Out of memory                   Close other apps, or use the E2B variant
Connection refused              Run ollama serve first
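
For the last two issues, you can sanity-check the local server from Python before running the agent. A minimal sketch (ollama.list() is part of the official client and raises if the server is unreachable):

python

import ollama

try:
    ollama.list()  # any response means the server is up
    print("Ollama server is running.")
except Exception:
    print("Can't reach Ollama; run 'ollama serve' in another terminal.")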

What's Next?

  • Add memory to your agent (store conversation history; see the sketch after this list)
  • Connect to local files (RAG with your documents)
  • Deploy on Android using Google AI Edge SDK
  • Chain multiple agents together (multi-agent systems)
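
As a starting point for the first item, here's a minimal sketch of conversational memory: keep one messages list alive across turns instead of rebuilding it on every call (the chat_with_memory helper is hypothetical, not part of agent.py):

python

import ollama

# Hypothetical helper: a persistent message list is the simplest memory
history = [{"role": "system", "content": "You are a helpful AI agent."}]

def chat_with_memory(user_question: str) -> str:
    history.append({"role": "user", "content": user_question})
    response = ollama.chat(model="gemma4:e4b", messages=history)
    answer = response["message"]["content"]
    # Store the reply so later turns can reference it
    history.append({"role": "assistant", "content": answer})
    return answer

print(chat_with_memory("My name is Sam. What is an AI agent?"))
print(chat_with_memory("What did I say my name was?"))  # should recall "Sam"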

Key Takeaways

  1. Gemma 4 is Google's best open model for building AI agents
  2. Ollama makes local deployment dead simple
  3. Function calling lets your agent use real tools
  4. No API keys or cloud costs required
  5. E4B runs on most laptops with 8GB+ RAM
