Gemma 4 Tool Calling: Build a Local AI Agent

Kodetra Technologies · June 11, 2026 · 10 min read · Intermediate

Summary

Run Google's open Gemma 4 locally with Ollama and wire up real function calling for an agent.

Google's Gemma 4 is the rare open model that is both small enough to run on your own laptop and capable enough to drive a real agent. The weights are Apache 2.0, the family ships in sizes from a 2B-effective edge model up to a 31B dense workstation model, and on Ollama alone it has already crossed 12.5 million downloads. What makes it interesting for builders is not the chat quality. It is the native function-calling support baked into the model and its chat template.

Function calling (also called tool calling) is the mechanism that turns a text generator into an agent. Instead of hallucinating today's weather or guessing the result of a calculation, the model emits a structured request: call get_weather with city="Tokyo". Your code runs that function, hands the result back, and the model writes a grounded answer. Everything stays on your machine: no API keys, no per-token billing, no data leaving your network.
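The request-execute-return cycle described above can be sketched without a model in the loop. The snippet below is a minimal illustration, not the article's implementation: the `get_weather` body and its canned data, the `TOOLS` registry, and the `run_tool_call` helper are all hypothetical, and the shape of the tool-call dictionary (`{"function": {"name": ..., "arguments": ...}}`) is an assumption modeled on the structure commonly used by local chat runtimes. The model's output is simulated with a hand-written dictionary.

```python
import json
from typing import Any, Callable


def get_weather(city: str) -> dict[str, Any]:
    """Hypothetical tool. A real version would query a weather service."""
    canned = {"Tokyo": {"temp_c": 22, "condition": "cloudy"}}  # made-up sample data
    return canned.get(city, {"error": f"no data for {city}"})


# Registry mapping the tool names the model may request to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_call(call: dict[str, Any]) -> dict[str, str]:
    """Execute one structured tool request and wrap the result as a 'tool' message."""
    spec = call["function"]
    name = spec["name"]
    args = spec.get("arguments", {})
    if isinstance(args, str):
        # Assumption: some runtimes deliver arguments as a JSON string, not a dict.
        args = json.loads(args)

    fn = TOOLS.get(name)
    if fn is None:
        # Report unknown tools back to the model instead of raising.
        result: Any = {"error": f"unknown tool: {name}"}
    else:
        result = fn(**args)

    return {"role": "tool", "name": name, "content": json.dumps(result)}


# Simulated model output: a structured request rather than free text.
model_call = {"function": {"name": "get_weather", "arguments": {"city": "Tokyo"}}}

tool_message = run_tool_call(model_call)
print(tool_message["content"])  # {"temp_c": 22, "condition": "cloudy"}
```

In a full agent, `tool_message` would be appended to the conversation and sent back to the model, which then writes its answer from the returned data instead of guessing. Returning an error payload for an unknown tool name, rather than raising, gives the model a chance to correct its own request.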
