Extend the policy prover to reason about MCP (Model Context Protocol) tool permissions alongside existing L4/L7 binary and endpoint modeling.
Agents increasingly use MCP servers to interact with external services (GitHub, databases, file systems, etc.). Today OPP models which binaries can reach which endpoints. MCP adds a layer: the binary (e.g., node) reaches the MCP server, but the security-relevant question is what tools the MCP server exposes and what those tools can do.
Problem
A sandbox policy might allow node to reach api.github.com, and an MCP server running inside the sandbox provides tools like create_issue, push_code, read_repo. The prover should be able to reason about:
- Which MCP tools are available in this sandbox?
- Do the exposed tools + credentials allow write operations despite read-only policy intent?
- Can an MCP tool chain create an exfiltration path (e.g., read file via one tool, send via another)?
Scope
- Model MCP server tool inventories as a new registry dimension alongside binaries and APIs
- Map MCP tools to the underlying API actions they perform (e.g., GitHub MCP create_pull_request -> POST /repos/*/pulls)
- Integrate with existing exfiltration and write-bypass queries
- Start with GitHub MCP as the first modeled server, expand to others
Context
This bridges the gap between "which binaries can reach which endpoints" (current model) and "what can the agent actually do through its tool interfaces" (what users care about). Tracks under OS-43.
Extend the policy prover to reason about MCP (Model Context Protocol) tool permissions alongside existing L4/L7 binary and endpoint modeling.
Agents increasingly use MCP servers to interact with external services (GitHub, databases, file systems, etc.). Today OPP models which binaries can reach which endpoints. MCP adds a layer: the binary (e.g., node) reaches the MCP server, but the security-relevant question is what tools the MCP server exposes and what those tools can do.
Problem
A sandbox policy might allow node to reach api.github.com, and an MCP server running inside the sandbox provides tools like create_issue, push_code, read_repo. The prover should be able to reason about:
Scope
Context
This bridges the gap between "which binaries can reach which endpoints" (current model) and "what can the agent actually do through its tool interfaces" (what users care about). Tracks under OS-43.