
Microsoft Agent Governance Toolkit: Secure AI Agents
Summary
Block agent attacks in <0.1ms with Microsoft's open-source runtime governance toolkit.
Autonomous agents are doing real work now — refunding customers, opening pull requests, deploying infra. That power cuts both ways. A poisoned tool description, a hijacked goal, or a single rogue subagent can move money or wipe data before you notice. Microsoft just open-sourced the Agent Governance Toolkit (April 2026, MIT license) — the first kit to address all 10 OWASP agentic AI risks with sub-millisecond deterministic policy enforcement.
By the end of this guide you'll have a working policy engine that intercepts every tool call your agent makes, blocks unsafe actions in under 0.1 ms, and produces audit evidence you can hand to compliance. We'll wire it into a plain Python agent — no framework lock-in.
Prerequisites
- Python 3.11+ and pip
- An OpenAI or Anthropic API key (any LLM-backed agent works; we use a tiny stub here)
- Five minutes and a terminal
Step 1 — Install the toolkit
The core package is agent-os: a stateless policy engine you call before each tool execution. It ships with adapters for LangChain, LangGraph, OpenAI Agents SDK, Microsoft Agent Framework, CrewAI, Pydantic AI, and Haystack.
```shell
pip install agent-os agent-compliance
```
`agent-os` is the runtime gate. `agent-compliance` records evidence and grades you against the EU AI Act, HIPAA, SOC 2, and the OWASP Agentic Top 10.
Step 2 — Define a policy
Policies are plain YAML or Python. Each rule names a tool pattern, the conditions, and a verdict (allow / deny / require_approval). Start narrow — deny first, open up as you learn.
```yaml
# policy.yaml
version: 1
rules:
  - id: block-prod-writes
    when:
      tool: "*.write"
      args.environment: "production"
    verdict: require_approval
    reason: "Writes to prod must be approved by a human."
  - id: cap-spend
    when:
      tool: "stripe.refund"
    limit:
      args.amount_cents: { lte: 10000 }  # $100 max per refund
    verdict: deny_if_exceeds
  - id: deny-untrusted-tools
    when:
      tool.trust_tier: { lt: 2 }
    verdict: deny
    reason: "Tool not signed at trust tier 2+."
```
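To build intuition for what these rules do at runtime, here is a minimal, self-contained sketch of deny-first rule matching. This is our own illustration of the semantics, not the toolkit's engine; the rule shapes mirror the YAML above.

```python
import fnmatch

# Minimal matcher illustrating the policy semantics above.
# Not the toolkit's actual implementation.
RULES = [
    {"id": "block-prod-writes", "tool": "*.write",
     "args": {"environment": "production"}, "verdict": "require_approval"},
    {"id": "cap-spend", "tool": "stripe.refund",
     "limit": ("amount_cents", 10000), "verdict": "deny_if_exceeds"},
]

def check(tool, args):
    for rule in RULES:
        if not fnmatch.fnmatch(tool, rule["tool"]):
            continue
        # Condition match on args, e.g. environment == "production"
        if "args" in rule and any(args.get(k) != v for k, v in rule["args"].items()):
            continue
        if "limit" in rule:
            field, cap = rule["limit"]
            if args.get(field, 0) <= cap:
                continue  # under the cap: this rule does not fire
            return ("deny", rule["id"])
        return (rule["verdict"], rule["id"])
    return ("allow", None)  # a real deployment should default to deny

print(check("stripe.refund", {"amount_cents": 5000}))    # under the cap
print(check("stripe.refund", {"amount_cents": 250000}))  # over the cap: denied
print(check("fs.write", {"environment": "production"}))  # prod write: approval
```

Note the fallthrough at the end: this sketch allows anything no rule matches, which is exactly the open-by-default mistake called out under "Common pitfalls" below, so flip it to deny before borrowing the pattern.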
Step 3 — Wire the gate into your agent loop
Wrap your tool dispatcher with PolicyGate.check(). The gate is synchronous, in-process, and returns a verdict in well under 1 ms — the team reports p99 below 0.1 ms.
```python
# agent.py
from agent_os import PolicyGate, ActionContext
from agent_compliance import EvidenceRecorder

gate = PolicyGate.from_yaml("policy.yaml")
audit = EvidenceRecorder(sink="./evidence.jsonl")

def safe_call(tool_name, args, *, trust_tier=1, environment="dev"):
    ctx = ActionContext(
        tool=tool_name,
        args=args,
        tool_meta={"trust_tier": trust_tier},
        environment=environment,
    )
    verdict = gate.check(ctx)
    audit.record(ctx, verdict)
    if verdict.action == "allow":
        return TOOLS[tool_name](**args)
    if verdict.action == "require_approval":
        raise PermissionError(f"Approval required: {verdict.reason}")
    raise PermissionError(f"Blocked: {verdict.reason}")

# Example tool registry
TOOLS = {
    "stripe.refund": lambda amount_cents, customer_id: {"ok": True, "refund_id": "re_123"},
    "fs.write": lambda path, body: open(path, "w").write(body),
}

# In your agent loop, replace direct tool calls with safe_call(...)
print(safe_call("stripe.refund", {"amount_cents": 5000, "customer_id": "cus_1"}))
```
Step 4 — Try to break it
Run three calls to confirm the gate behaves:
```python
# 1) Allowed: under the $100 cap, dev env
safe_call("stripe.refund", {"amount_cents": 5000, "customer_id": "cus_1"})

# 2) Denied: over the cap
safe_call("stripe.refund", {"amount_cents": 250000, "customer_id": "cus_1"})
# -> PermissionError: Blocked: cap-spend exceeded

# 3) Requires approval: prod write
safe_call("fs.write", {"path": "/etc/app.conf", "body": "..."}, environment="production")
# -> PermissionError: Approval required: Writes to prod must be approved by a human.
```
Every decision is now in evidence.jsonl with timestamp, rule id, args, and verdict — exactly what an auditor wants.
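If you want a quick look at what the recorder captured, a few lines of stdlib Python will summarize the log. The field names here (`ts`, `rule_id`, `verdict`) are our assumption about the JSONL schema, and we write sample records to a separate file so we don't clobber a real `evidence.jsonl`; adjust both to match your actual records.

```python
import json

# Summarize an evidence log by verdict. Field names (rule_id, verdict)
# are an assumed schema; adapt them to your real evidence.jsonl records.
sample = [
    {"ts": "2026-04-01T12:00:00Z", "rule_id": None, "verdict": "allow",
     "tool": "stripe.refund", "args": {"amount_cents": 5000}},
    {"ts": "2026-04-01T12:00:01Z", "rule_id": "cap-spend", "verdict": "deny",
     "tool": "stripe.refund", "args": {"amount_cents": 250000}},
]
with open("evidence_sample.jsonl", "w") as f:
    for rec in sample:
        f.write(json.dumps(rec) + "\n")

counts = {}
with open("evidence_sample.jsonl") as f:
    for line in f:
        rec = json.loads(line)
        counts[rec["verdict"]] = counts.get(rec["verdict"], 0) + 1

print(counts)  # {'allow': 1, 'deny': 1}
```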
Step 5 — Plug into a real framework
Skip the manual wrapper if you're already on a framework. The toolkit hooks native extension points so you don't rewrite agent code.
```python
# LangChain — wrap each Tool
from agent_os.adapters.langchain import GovernedTool
from langchain.agents import initialize_agent

tools = [GovernedTool(t, gate=gate) for t in raw_tools]
agent = initialize_agent(tools, llm, agent="openai-tools")

# OpenAI Agents SDK — middleware
from agents import Agent
from agent_os.adapters.openai_agents import governance_middleware

agent = Agent(name="ops", tools=tools, middleware=[governance_middleware(gate)])

# Microsoft Agent Framework — pipeline step
from agent_framework import AgentBuilder
from agent_os.adapters.maf import GovernanceStep

agent = AgentBuilder().with_tools(tools).add_step(GovernanceStep(gate)).build()
```
Common pitfalls
- Wrapping the LLM, not the tool call. The gate must sit between the planner and the side effect. Wrapping the LLM only catches text, not actions.
- Open-by-default policies. Start with `default: deny` and add allow rules. Reverse that and you'll ship a hole on day one.
- Forgetting subagents. Each spawned subagent gets its own context. Pass the gate (or a derived child gate with tighter scope) into every subagent constructor.
- Trust tier drift. Re-verify tool signatures on load — a manifest swap is the easiest supply-chain attack on agents.
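One low-dependency way to guard against manifest swaps is to pin a digest of each tool manifest at install time and re-check it on every load. The sketch below uses SHA-256 digest pinning as a simplified stand-in for the toolkit's Ed25519 signature checks (all function names here are our own; a real deployment should verify signatures, not just hashes, since a hash pin can't prove who published the manifest):

```python
import hashlib
import json

# Digest pinning: a simplified stand-in for signature verification.
# Record each manifest's SHA-256 at install time and refuse to load
# the tool if the manifest changes underneath you.
def digest(manifest: dict) -> str:
    canonical = json.dumps(manifest, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

pinned = {}

def pin(name: str, manifest: dict) -> None:
    pinned[name] = digest(manifest)

def verify_on_load(name: str, manifest: dict) -> bool:
    return pinned.get(name) == digest(manifest)

manifest = {"name": "stripe.refund", "trust_tier": 2}
pin("stripe.refund", manifest)
print(verify_on_load("stripe.refund", manifest))                       # True
print(verify_on_load("stripe.refund", {**manifest, "trust_tier": 3}))  # False: swapped
```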
Quick reference
| Component | What it does | When you need it |
|---|---|---|
| agent-os | Sub-millisecond policy gate | Always — this is the runtime |
| agent-compliance | Evidence + framework grading | If anyone audits you (EU AI Act, SOC 2, HIPAA) |
| agent-marketplace | Ed25519-signed plugin manifests | Loading third-party tools/skills |
| agent-lightning | RL training-loop governance | Fine-tuning agents with RL or RLHF |
| adapters | LangChain, LangGraph, OpenAI, MAF, CrewAI, Pydantic AI, Haystack | You already use one of those |
Next steps
- Walk through Microsoft's 20 official tutorials in the `microsoft/agent-governance-toolkit` repo on GitHub.
- Map your real tools to OWASP Agentic Top 10 categories; start with goal hijacking (T1) and tool misuse (T6).
- Add `agent-marketplace` if you load tools from anywhere outside your own repo.
- Pipe `evidence.jsonl` into your SIEM and alert on any `deny` in production.
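As a starting point for that SIEM hook, something this small can flag production denies before you wire up a real pipeline. The record fields (`verdict`, `environment`, `rule_id`) are our assumption about the evidence schema; match them to your actual records.

```python
import json

# Flag any production deny in an evidence stream. The record fields
# (verdict, environment, rule_id) are assumed; match your real schema.
def prod_denies(lines):
    for line in lines:
        rec = json.loads(line)
        if rec.get("verdict") == "deny" and rec.get("environment") == "production":
            yield rec

stream = [
    '{"verdict": "allow", "environment": "production", "rule_id": null}',
    '{"verdict": "deny", "environment": "production", "rule_id": "cap-spend"}',
    '{"verdict": "deny", "environment": "dev", "rule_id": "cap-spend"}',
]
alerts = list(prod_denies(stream))
print([a["rule_id"] for a in alerts])  # ['cap-spend']: only the prod deny fires
```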
Governance used to mean a quarterly review meeting. With sub-millisecond runtime enforcement plus evidence collection, it becomes the same kind of always-on guardrail your web app already has — just shaped for autonomous agents.