Problem Statement
The sandbox today rolls its own LLM proxy: crates/openshell-sandbox/src/l7/inference.rs does pattern matching, crates/openshell-router does request rewriting and streaming, and crates/openshell-sandbox/src/proxy.rs::process_inference_keepalive glues them together. This works for the three providers we ship, but every new capability — guardrails, retries, OTel tracing, budget controls, additional providers, MCP federation, A2A — is custom code we have to write and maintain.
agentgateway is a Rust LLM/MCP/A2A gateway that already implements all of those. Embedding it in-process inside openshell-sandbox lets us delete the duplicated logic and inherit those features instead of building them.
Proposed Design
Scope
- In: replace the inference.local interception path; add mcp.local as a new listener.
- Out: A2A, the L4 OPA proxy, forward HTTP proxy, secret resolver, and any openshell-server change.
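The scope above amounts to a host-based dispatch at the point where the sandbox proxy decides what to do with an intercepted connection. A minimal sketch of that decision, assuming a hypothetical `Route` enum and `route_for_host` helper (neither is an existing OpenShell name), and assuming the migration feature flag is a plain boolean that gates only the inference path:

```rust
/// Where an intercepted connection is handed off.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// Existing in-house inference path, kept while the flag is off.
    LegacyInference,
    /// LLM traffic handed to the embedded agentgateway.
    AgwInference,
    /// MCP traffic handed to the embedded agentgateway (new listener).
    AgwMcp,
    /// Out of scope: keeps whatever handling it has today.
    Unchanged,
}

/// Pick a route from the requested host (optionally `host:port`).
/// `agw_inference_enabled` models the migration feature flag (assumption).
fn route_for_host(host: &str, agw_inference_enabled: bool) -> Route {
    // Drop a trailing `:port` if present; hostnames compare case-insensitively.
    let name = host.rsplit_once(':').map_or(host, |(h, _)| h);

    if name.eq_ignore_ascii_case("inference.local") {
        if agw_inference_enabled {
            Route::AgwInference
        } else {
            Route::LegacyInference
        }
    } else if name.eq_ignore_ascii_case("mcp.local") {
        Route::AgwMcp
    } else {
        Route::Unchanged
    }
}
```

Keeping the legacy variant in the enum makes the cutover a one-line flag flip, and removing it later is a compile-checked cleanup.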
Architecture
- Embedding: link agentgateway as an in-process library dependency. Construct agentgateway::ProxyInputs and call Gateway::proxy_bind(...) directly with the post-OPA TcpStream. No subprocess, no loopback hop.
- TLS: agentgateway terminates TLS using a per-SNI cert resolver backed by the existing sandbox CA in ProxyTlsState. Cert-resolver hook to be contributed (code drop pending).
- Config: in-process. The supervisor builds an agentgateway Stores directly from GetInferenceBundle results (and a sandbox-settings-driven MCP target list as a Phase 1 stop-gap) and hot-swaps it on bundle revision changes, replacing the existing spawn_route_refresh loop.
- Observability: an OCSF adapter consumes agentgateway access logs and emits HttpActivityBuilder events; guardrail blocks dual-emit DetectionFinding. Per-binary identity from resolve_process_identity is propagated via header injection at handoff.
- What stays: the L4 CONNECT lifecycle, OPA decision, process-identity resolution, denial aggregator, forward HTTP proxy, secret resolver, and openshell-router::system_inference (still used for in-process system calls).
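The hot-swap in the Config bullet can be illustrated with standard-library primitives. This is a sketch under stated assumptions, not agentgateway's API: `GatewayConfig` is a hypothetical stand-in for the Stores built from a GetInferenceBundle result, `ConfigHandle` is an invented name, and the bundle revision is assumed to be an opaque string where any difference means "rebuild".

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the config built from one inference bundle.
#[derive(Debug)]
struct GatewayConfig {
    /// Bundle revision, treated as an opaque token (assumption).
    revision: String,
    /// Illustrative payload: names of the configured routes.
    routes: Vec<String>,
}

/// Owns the active config. Requests take a cheap `Arc` snapshot, so a swap
/// never invalidates a config that an in-flight request is still using.
struct ConfigHandle {
    current: RwLock<Arc<GatewayConfig>>,
}

impl ConfigHandle {
    fn new(initial: GatewayConfig) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Snapshot of the config that is active right now.
    fn load(&self) -> Arc<GatewayConfig> {
        let guard = self.current.read().expect("config lock poisoned");
        Arc::clone(&*guard)
    }

    /// Install `next` only when its revision differs from the active one.
    /// Returns true when a swap happened.
    fn swap_if_changed(&self, next: GatewayConfig) -> bool {
        let mut guard = self.current.write().expect("config lock poisoned");
        if guard.revision == next.revision {
            return false;
        }
        *guard = Arc::new(next);
        true
    }
}
```

A caller that polls for bundles would call `swap_if_changed` on each result; unchanged revisions become a no-op, which is the behavior the route-refresh loop would otherwise have to implement itself.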
Benefits
- Delete duplicated proxy logic in openshell-sandbox/l7/inference.rs, openshell-router::backend, and the route-refresh loop.
- Gain agentgateway-native features without writing them: guardrails, retries, OTel tracing, budget controls, broader provider coverage.
- MCP gateway as a new product capability with no custom-built MCP code in OpenShell.
- Aligns OpenShell's L7 surface with the broader agent-gateway ecosystem; future agentgateway upgrades land features automatically.
What's involved
Upstream / fork work on agentgateway:
- Cargo features to slim the build (no XDS client, no kube controller, no OIDC by default).
- Programmatic Stores builder API for in-memory config.
- Per-SNI cert resolver hook on ServerTLSConfig.
- Pluggable access-log sink (so OpenShell can adapt logs to OCSF).
OpenShell-side work:
- New crates/openshell-sandbox/src/agw/ module (config translator, lifetime owner, handoff functions).
- Wire handle_inference_interception to the agw handoff behind a feature flag for safe migration.
- Add handle_mcp_interception for mcp.local and a sandbox-settings-driven MCP target list.
- OCSF access-log adapter.
- Cut over and shrink openshell-router to system_inference only.
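As a rough picture of the OCSF access-log adapter, the sketch below maps one access-log record to one or two events, dual-emitting when a guardrail blocked the request. Everything here is hypothetical: `AccessLogRecord` is an assumed record shape (agentgateway's real log fields are not modeled), and `OcsfEvent` is a simplified stand-in for the events OpenShell would build with HttpActivityBuilder and DetectionFinding.

```rust
/// Assumed shape of one agentgateway access-log record (hypothetical).
#[derive(Debug, Clone)]
struct AccessLogRecord {
    host: String,
    status: u16,
    /// Name of the guardrail rule that blocked the request, if any.
    guardrail_block: Option<String>,
    /// Per-binary identity carried in the header injected at handoff.
    process_identity: Option<String>,
}

/// Simplified stand-in for the OCSF events the adapter would emit.
#[derive(Debug, PartialEq, Eq)]
enum OcsfEvent {
    HttpActivity { host: String, status: u16, actor: String },
    DetectionFinding { host: String, rule: String, actor: String },
}

/// Every record yields an HTTP activity event; a guardrail block
/// additionally yields a detection finding for the same request.
fn adapt(record: &AccessLogRecord) -> Vec<OcsfEvent> {
    let actor = record
        .process_identity
        .clone()
        .unwrap_or_else(|| "unknown".to_string());

    let mut events = vec![OcsfEvent::HttpActivity {
        host: record.host.clone(),
        status: record.status,
        actor: actor.clone(),
    }];

    if let Some(rule) = &record.guardrail_block {
        events.push(OcsfEvent::DetectionFinding {
            host: record.host.clone(),
            rule: rule.clone(),
            actor,
        });
    }
    events
}
```

Returning a `Vec` keeps the dual-emit rule in one place, so the sink that forwards events does not need to know about guardrails.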
Alternatives Considered
- Sidecar container/process. Breaks the "all egress through the sandbox proxy" model and would need extra netns rules to constrain it. Rejected.
- Subprocess managed by the supervisor. Workable, but the in-process path is cleaner and avoids a localhost TLS hop given that we want to terminate TLS in agentgateway anyway.
- Keep building it ourselves. Each new capability (guardrails, MCP, A2A, more providers) becomes custom code to write and maintain, where embedding agentgateway would let us inherit it.
- Extend openshell-router to cover MCP/guardrails/etc. Same maintenance burden as today, multiplied.
Risks / Open Questions
- agentgateway is shaped as a binary, not a polished library; the upstream work above is a real chunk of effort. A short spike should validate the embedding approach before we commit.
- The size of the MITM cert-resolver hook work depends on the pending code drop.
- No GetMcpBundle proto exists; Phase 1 reads MCP targets from sandbox settings as a stop-gap.
Agent Investigation
- Reviewed architecture/gateway.md, architecture/inference-routing.md, crates/openshell-sandbox/src/proxy.rs, crates/openshell-sandbox/src/l7/, and crates/openshell-router/.
- Reviewed agentgateway at ~/src/agentgateway/agentgateway; confirmed ProxyInputs::new, Gateway::proxy_bind, in-memory Stores, AI/MCP/A2A backend types, and that HTTPS/TLS/SSH listener protocols exist.
- Confirmed the L4 OPA proxy, process-identity correlation, and forward HTTP proxy in openshell-sandbox have no agentgateway analogue and must remain.
- Confirmed openshell-router::system_inference is still needed post-cutover for in-process system calls.