CodeAct + Hyperlight: One Code Block, Dozens of Tool Calls

Kodetra Technologies·June 4, 2026·10 min read Intermediate

Summary

Collapse agent tool-call loops into one sandboxed Python program and cut latency in half.

Most production agents aren't slow because the model is dumb. They're slow because of the wiring. Every tool call is a separate model turn: the LLM picks a tool, you run it, you send the result back, the LLM picks the next tool. A task that needs forty small lookups burns forty round trips, and you pay for the full conversation history on every single one.

At Build 2026 this week, the Microsoft Agent Framework team put a number on the fix. CodeAct support, shipped in the new agent-framework-hyperlight package, lets the model write one short Python program that calls all of your tools via call_tool(...), runs it once in a sandbox, and returns a consolidated answer. On Microsoft's own benchmark (computing order totals across eight users, dozens of tool calls), the same task with the same model and the same tools dropped from 27.81s to 13.23s and from 6,890 tokens to 2,489. That's a 52.4% latency cut and 63.9% fewer tokens from changing nothing but the wiring.

Keep reading — it's free

Enter your email to keep reading — plus the best of AI & tech, daily. Free, forever.

CodeAct + Hyperlight: One Code Block, Dozens of Tool Calls

Keep reading — it's free

Comments