AI Guides for Builders

How-to content for builders, indie hackers, and AI engineers. Less theory, more shipped code.

DeepSeek V4 Pro: Cheap 1M-Token Context in Python

Intermediate

Use DeepSeek V4 Pro's auto KV cache to run huge-context jobs for cents.

8 min read·Kodetra Technologies

2d ago

MiniMax M3: Master 1M-Token Long Context With MSA

Intermediate

MiniMax M3 hands-on: MSA sparse attention plus real 1M-token long context, with runnable Python.

10 min read·Kodetra Technologies

2d ago

Handle Fable 5 Refusals With Fallbacks in Python

Intermediate

Catch Claude Fable 5's stop_reason refusal and auto-retry on Opus 4.8 without breaking production.

10 min read·Kodetra Technologies

3d ago

Gemma 4 Tool Calling: Build a Local AI Agent

Intermediate

Run Google's open Gemma 4 locally with Ollama and wire up real function calling for an agent.

10 min read·Kodetra Technologies

4d ago

Claude Opus 4.8 Fast Mode: 2.5x Faster Output in Python

Intermediate

Use speed:"fast" on Claude Opus 4.8 for up to 2.5x faster output, with a safe rate-limit fallback.

9 min read·Kodetra Technologies

4d ago

Build an Apple-Style Multi-Model AI Router in Python

Intermediate

WWDC let iPhone users pick ChatGPT, Gemini, or Claude. Build the same model router in Python.

9 min read·Kodetra Technologies

5d ago

MiniMax M3 Tool Calling: Build an Agentic Loop in Python

Intermediate

Wire MiniMax M3's OpenAI-compatible API into a real tool-calling agent loop.

8 min read·Kodetra Technologies

5d ago

Run a Local LLM on Android with llama.cpp + Vulkan

Intermediate

Compile llama.cpp with Vulkan in Termux and run a quantized LLM on your Android GPU, no root.

9 min read·Kodetra Technologies

9d ago

MAI-Code-1-Flash in Python: 5B Coder Beats Haiku 4.5

Intermediate

Call Microsoft's June 2 coding model via OpenRouter for cheap, fast refactors.

10 min read·Kodetra Technologies

11d ago

Agent Harness in Python: Give LLMs Shell and File Access

Intermediate

Build a safe local agent harness with shell, files, approvals, and logs in Python.

10 min read·Kodetra Technologies

12d ago

Claude Opus 4.8 Effort Levels: A Hands-On Python Guide

Intermediate

Tune token spend on Opus 4.8 with the effort parameter. Runnable Python, real I/O, real numbers.

7 min read·Kodetra Technologies

13d ago

GLM-4.7 in Python: Build a Cheap Coding Assistant

Intermediate

Use Zhipu's GLM-4.7 through the OpenAI SDK to build a tool-calling coding assistant for pennies.

9 min read·Kodetra Technologies

18d ago
