AI Guides for Builders

How-to content for builders, indie hackers, and AI engineers. Less theory, more shipped code.

Gemini 3.5 Pro: Feed a 2M-Token Codebase in One Call

Intermediate

Load a whole repo into Gemini 3.5 Pro's 2M context, query it without RAG, and cache to cut cost.

8 min read·Kodetra Technologies

Today

Migrate to Claude Sonnet 5: Effort and the Tokenizer Trap

Intermediate

Hands-on Python guide to Sonnet 5's adaptive thinking, effort levels, and the 30% tokenizer trap.

10 min read·Kodetra Technologies

Today

Semantic Caching for LLMs: Cut Your Token Bill in Python

Intermediate

Build a semantic cache that reuses answers for similar prompts and slashes LLM API costs.

10 min read·Kodetra Technologies

Yesterday

Build an LLM Spend Governor: Budget Caps in Python

Intermediate

A runnable Python governor that caps LLM spend per user and auto-downgrades models.

10 min read·Kodetra Technologies

2d ago

Stream Gemini Thinking: Build a Show-Your-Work CLI

Intermediate

Stream Gemini's thought summaries live, control reasoning effort, and track thinking-token cost.

8 min read·Kodetra Technologies

4d ago

Gemini Thought Summaries: Audit Deep Think Reasoning

Intermediate

Surface, stream, and log Gemini 2.5 Pro Deep Think's reasoning chain with thought summaries.

10 min read·Kodetra Technologies

5d ago

Progressive Skill Loading: 40+ Agent Skills, No Bloat

Intermediate

Build a skill-manifest registry so an AI agent wields dozens of skills without context bloat.

9 min read·Kodetra Technologies

6d ago

Run Nemotron 3 Nano: Efficient Open Coding Agent

Intermediate

Build a frugal tool-calling coding agent on NVIDIA's open Nemotron 3 Nano via OpenRouter in Python.

10 min read·Kodetra Technologies

7d ago

Pydantic AI: Type-Safe Agents You Can Unit Test

Intermediate

Build agents with typed deps, validated output, and offline tests using Pydantic AI.

9 min read·Kodetra Technologies

7d ago

Nemotron 3 Ultra: Build a 550B Tool-Use Agent

Intermediate

Wire NVIDIA's open 550B MoE into a Python tool-calling loop for long-running agents.

10 min read·Kodetra Technologies

8d ago

Claude Advisor Tool: Pair Haiku With Opus in Python

Intermediate

Let a cheap executor model consult a stronger advisor mid-task in one Messages API call.

11 min read·Kodetra Technologies

9d ago

LLM Failover in Python: Survive a Model Going Dark

Intermediate

Build a provider-agnostic LLM failover client in Python that survives outages and model removals.

9 min read·Kodetra Technologies

9d ago
