Microsoft Agent Governance Toolkit: Secure AI Agents

Kodetra Technologies · 4 min read · Intermediate

Summary

Block agent attacks in <0.1 ms with Microsoft's open-source runtime governance toolkit.

Autonomous agents are doing real work now — refunding customers, opening pull requests, deploying infra. That power cuts both ways. A poisoned tool description, a hijacked goal, or a single rogue subagent can move money or wipe data before you notice. Microsoft just open-sourced the Agent Governance Toolkit (April 2026, MIT license) — the first toolkit to address all ten risks in the OWASP Agentic Top 10 with sub-millisecond, deterministic policy enforcement.

By the end of this guide you'll have a working policy engine that intercepts every tool call your agent makes, blocks unsafe actions in under 0.1 ms, and produces audit evidence you can hand to compliance. We'll wire it into a plain Python agent — no framework lock-in.

Prerequisites

  • Python 3.11+ and pip
  • An OpenAI or Anthropic API key (any LLM-backed agent works — we use a tiny stub here)
  • Five minutes and a terminal

Step 1 — Install the toolkit

The core package is agent-os: a stateless policy engine you call before each tool execution. It ships with adapters for LangChain, LangGraph, OpenAI Agents SDK, Microsoft Agent Framework, CrewAI, Pydantic AI, and Haystack.

pip install agent-os agent-compliance

agent-os is the runtime gate. agent-compliance records evidence and grades you against EU AI Act, HIPAA, SOC 2, and the OWASP Agentic Top 10.
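
You'll see EvidenceRecorder in action in Step 3. The grading side isn't demonstrated in this guide, so treat the sketch below as hypothetical: the grade() method, framework id, and report fields are assumptions, not confirmed API.

from agent_compliance import EvidenceRecorder

audit = EvidenceRecorder(sink="./evidence.jsonl")

# Hypothetical grading call -- method name, framework id, and report
# fields below are assumptions, not confirmed API:
# report = audit.grade(framework="owasp-agentic-top-10")
# print(report.score, report.gaps)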

Step 2 — Define a policy

Policies are plain YAML or Python. Each rule names a tool pattern, the conditions, and a verdict (allow / deny / require_approval). Start narrow — deny first, open up as you learn.

# policy.yaml
version: 1
rules:
  - id: block-prod-writes
    when:
      tool: "*.write"
      environment: "production"   # matches ActionContext.environment in Step 3
    verdict: require_approval
    reason: "Writes to prod must be approved by a human."

  - id: cap-spend
    when:
      tool: "stripe.refund"
    limit:
      args.amount_cents: { lte: 10000 }   # $100 max per refund
    verdict: deny_if_exceeds

  - id: deny-untrusted-tools
    when:
      tool.trust_tier: { lt: 2 }
    verdict: deny
    reason: "Tool not signed at trust tier 2+."

Step 3 — Wire the gate into your agent loop

Wrap your tool dispatcher with PolicyGate.check(). The gate is synchronous, in-process, and returns a verdict in well under 1 ms — the team reports p99 below 0.1 ms.

# agent.py
from pathlib import Path

from agent_os import PolicyGate, ActionContext
from agent_compliance import EvidenceRecorder

gate = PolicyGate.from_yaml("policy.yaml")
audit = EvidenceRecorder(sink="./evidence.jsonl")

def safe_call(tool_name, args, *, trust_tier=1, environment="dev"):
    ctx = ActionContext(
        tool=tool_name,
        args=args,
        tool_meta={"trust_tier": trust_tier},
        environment=environment,
    )
    verdict = gate.check(ctx)
    audit.record(ctx, verdict)

    if verdict.action == "allow":
        return TOOLS[tool_name](**args)
    if verdict.action == "require_approval":
        raise PermissionError(f"Approval required: {verdict.reason}")
    raise PermissionError(f"Blocked: {verdict.reason}")

# Example tool registry
TOOLS = {
    "stripe.refund": lambda amount_cents, customer_id: {"ok": True, "refund_id": "re_123"},
    "fs.write":      lambda path, body: open(path, "w").write(body),
}

# In your agent loop, replace direct tool calls with safe_call(...)
print(safe_call("stripe.refund", {"amount_cents": 5000, "customer_id": "cus_1"}))

Step 4 — Try to break it

Run three calls to confirm the gate behaves:

# 1) Allowed: under the $100 cap, dev env
print(safe_call("stripe.refund", {"amount_cents": 5000, "customer_id": "cus_1"}))

# 2) Denied: over the cap
try:
    safe_call("stripe.refund", {"amount_cents": 250000, "customer_id": "cus_1"})
except PermissionError as e:
    print(e)  # -> Blocked: cap-spend exceeded

# 3) Requires approval: prod write
try:
    safe_call("fs.write", {"path": "/etc/app.conf", "body": "..."}, environment="production")
except PermissionError as e:
    print(e)  # -> Approval required: Writes to prod must be approved by a human.

Every decision is now in evidence.jsonl with timestamp, rule id, args, and verdict — exactly what an auditor wants.
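
For reference, a single line of evidence.jsonl might look like this. The exact field names are illustrative (the guide only specifies that timestamp, rule id, args, and verdict are captured):

{"ts": "2026-04-12T09:31:02Z", "rule_id": "cap-spend", "tool": "stripe.refund", "args": {"amount_cents": 250000, "customer_id": "cus_1"}, "environment": "dev", "verdict": "deny", "reason": "cap-spend exceeded"}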

Step 5 — Plug into a real framework

Skip the manual wrapper if you're already on a framework. The toolkit hooks native extension points so you don't rewrite agent code.

# LangChain — wrap each Tool
from agent_os.adapters.langchain import GovernedTool
from langchain.agents import initialize_agent

tools = [GovernedTool(t, gate=gate) for t in raw_tools]
agent = initialize_agent(tools, llm, agent="openai-tools")

# OpenAI Agents SDK — middleware
from agents import Agent
from agent_os.adapters.openai_agents import governance_middleware
agent = Agent(name="ops", tools=tools, middleware=[governance_middleware(gate)])

# Microsoft Agent Framework — pipeline step
from agent_framework import AgentBuilder
from agent_os.adapters.maf import GovernanceStep
agent = AgentBuilder().with_tools(tools).add_step(GovernanceStep(gate)).build()

Common pitfalls

  • Wrapping the LLM, not the tool call. The gate must sit between the planner and the side effect. Wrapping the LLM only catches text, not actions.
  • Open-by-default policies. Start with default: deny and add allow rules. Reverse and you'll ship a hole on day one.
  • Forgetting subagents. Each spawned subagent gets its own context. Pass the gate (or a derived child gate with tighter scope) into every subagent constructor; a sketch follows this list.
  • Trust tier drift. Re-verify tool signatures on load — a manifest swap is the easiest supply-chain attack on agents.
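
On the subagent pitfall, here is a minimal sketch of that wiring, continuing from agent.py in Step 3. The derive_child() call is a hypothetical name (the scoping API isn't shown in this guide); the structural point is that a subagent only ever receives a gated dispatcher, never the raw TOOLS registry.

# Hypothetical subagent wiring -- derive_child() is an assumed name,
# not confirmed API. The child never sees TOOLS directly.
def spawn_subagent_dispatcher(parent_gate, allowed_tools):
    child_gate = parent_gate  # ideally: parent_gate.derive_child(tools=allowed_tools)

    def child_safe_call(tool_name, args, *, trust_tier=1, environment="dev"):
        # Enforce the narrowed scope before the policy gate even runs.
        if tool_name not in allowed_tools:
            raise PermissionError(f"Subagent may not call {tool_name}")
        ctx = ActionContext(
            tool=tool_name,
            args=args,
            tool_meta={"trust_tier": trust_tier},
            environment=environment,
        )
        verdict = child_gate.check(ctx)
        audit.record(ctx, verdict)
        if verdict.action != "allow":
            raise PermissionError(f"{verdict.action}: {verdict.reason}")
        return TOOLS[tool_name](**args)

    return child_safe_call  # hand this to the subagent, not TOOLS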

Quick reference

Component         | What it does                     | When you need it
agent-os          | Sub-millisecond policy gate      | Always — this is the runtime
agent-compliance  | Evidence + framework grading     | If anyone audits you (EU AI Act, SOC 2, HIPAA)
agent-marketplace | Ed25519-signed plugin manifests  | Loading third-party tools/skills
agent-lightning   | RL training-loop governance      | Fine-tuning agents with RL or RLHF
adapters          | LangChain, LangGraph, OpenAI, MAF, CrewAI, Pydantic AI, Haystack | You already use one of those

Next steps

  • Walk through Microsoft's 20 official tutorials in the microsoft/agent-governance-toolkit repo on GitHub.
  • Map your real tools to OWASP Agentic Top 10 categories — start with goal hijacking (T1) and tool misuse (T6).
  • Add agent-marketplace if you load tools from anywhere outside your own repo.
  • Pipe evidence.jsonl into your SIEM and alert on any deny in production; a minimal stand-in script follows this list.
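
If you don't have a SIEM pipeline handy yet, a few lines of standard-library Python can stand in for the alert rule. This follows evidence.jsonl like tail -f and flags production denies; the field names match the illustrative record in Step 4, so adjust them to whatever the recorder actually emits.

# watch_denies.py -- minimal stand-in for a SIEM alert rule.
import json
import time

def follow(path):
    # Follow the file like `tail -f`: start at the end, yield new JSON lines.
    with open(path) as f:
        f.seek(0, 2)
        while True:
            line = f.readline()
            if not line:
                time.sleep(0.5)
                continue
            yield json.loads(line)

for event in follow("evidence.jsonl"):
    if event.get("verdict") == "deny" and event.get("environment") == "production":
        print(f"ALERT: rule {event.get('rule_id')} blocked {event.get('tool')}")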

Governance used to mean a quarterly review meeting. With sub-millisecond runtime enforcement plus evidence collection, it becomes the same kind of always-on guardrail your web app already has — just shaped for autonomous agents.
