The default agent escape hatch is bash. Every command is a free-form string — no schema, no validation, no guardrails. MCPs let you replace that surface area with structured tools that give the agent capability while giving you control.
When you hand an agent a bash tool, you hand it god mode. It can run any command, pipe anything to anything, modify any file, hit any endpoint. The agent manual says "use bash for git, file operations, and system commands." But that's like handing someone the root password and asking them to be careful.
Every bash command is a raw string. There's no type checking, no required parameters, no enum constraints. The agent constructs commands from vibes and pattern-matching against its training data.
You can't reliably scope bash to "only git commands" or "only read operations." Allowlists that match on the command string are easy to sidestep with pipes, chaining, and subshells, so in practice permissions are binary: the agent has bash or it doesn't.
A bash call returns stdout, stderr, and an exit code. You don't get structured errors, you don't get metadata about what happened, and you can't tell from the output alone whether a command did nothing or did something dangerous.
The deeper problem is invisible coupling. When the agent uses git commit -m "fix: thing" via bash, that command encodes assumptions about branch state, staging, hooks, and message format — none of which are validated. When it uses cd /some/path && cat file.go, it's guessing at paths based on conversation context that may be stale. Bash rewards the agent for being clever and punishes you when it's wrong.
The MCP pattern is straightforward: identify every shell command the agent reaches for, then build a typed tool that does the same thing better. Not a wrapper around the command — a purpose-built tool that understands the domain, validates inputs, and returns structured outputs.
Before, through bash:
cd /Users/me/source/subs-api && git add -A && git commit -m "fix: update handler"
The agent constructs a path from memory, stages everything blindly, and writes a message with no format enforcement. If the pre-commit hook fails, it gets a wall of text and guesses at the fix.
After, through a typed tool:
git-ops_commit(cwd, category=CORE, message="update handler")
The tool resolves the path from context, runs debug-statement detection, validates the commit category against the branch's allowed categories, auto-prefixes the message, and returns structured success/failure with actionable remediation.
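The checks a commit tool like this performs can be sketched in a few lines. This is a minimal illustration, not the git-ops implementation: plan_commit, CommitResult, the DEBUG_MARKERS list, and the "CATEGORY: message" prefix format are all assumptions, and the sketch only validates the request; it never calls git.

```python
from dataclasses import dataclass, field

# Illustrative markers only; a real tool would use a per-language list.
DEBUG_MARKERS = ("fmt.Println(", "console.log(", "breakpoint()")


@dataclass
class CommitResult:
    ok: bool
    message: str = ""
    errors: list[str] = field(default_factory=list)
    remediation: list[str] = field(default_factory=list)


def plan_commit(category, message, added_lines, allowed_categories):
    """Validate a commit request and build the final message (hypothetical)."""
    errors, remediation = [], []
    if category not in allowed_categories:
        errors.append(f"category {category!r} is not allowed on this branch")
        remediation.append(f"use one of: {sorted(allowed_categories)}")
    for line in added_lines:
        if any(marker in line for marker in DEBUG_MARKERS):
            errors.append(f"debug statement in staged change: {line.strip()}")
            remediation.append("remove the debug statement and retry")
    if not message.strip():
        errors.append("message is empty")
        remediation.append("describe the change in a short phrase")
    if errors:
        return CommitResult(ok=False, errors=errors, remediation=remediation)
    # Assumed prefix format; the article only says the message is auto-prefixed.
    return CommitResult(ok=True, message=f"{category}: {message.strip()}")
```

The point of the shape is that a failure comes back as a list of errors paired with remediation steps, so the agent reads a fix instead of guessing at one from hook output.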
The key insight is that you're not just wrapping commands; you're encoding decisions. Every MCP tool embodies a policy: what's allowed, what's validated, what's automatic. The commit tool enforces commit categories. The context tools enforce path resolution through hints instead of absolute paths. The wiki tools enforce that content goes through XHTML validation before publish. Through bash, these policies could only be requested in instructions, never enforced.
Each tool is a named capability you can allow, deny, or require confirmation for. Flow phases can restrict which tools are available — the agent in ANALYZE mode literally cannot call write tools. Bash can't be scoped this way.
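As a rough sketch of what "allow, deny, or require confirmation" looks like once tools have names, consider a lookup table keyed by tool name. The POLICY table, the Decision enum, and the deny-by-default rule are assumptions for illustration, not the configuration format of any particular agent runtime.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # require human confirmation before running


# Hypothetical policy table; tool names follow the ones used in this article.
POLICY = {
    "context_read": Decision.ALLOW,
    "context_write": Decision.ASK,
    "git-ops_commit": Decision.ASK,
    "bash": Decision.DENY,
}


def decide(tool_name, policy=POLICY):
    """Return the decision for a tool; unknown tools are denied by default."""
    return policy.get(tool_name, Decision.DENY)
```

Denying unknown names by default means a newly added tool has to be opted in explicitly, which is the opposite of how a bash tool behaves.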
Every MCP call is logged with typed inputs and outputs. You can audit which tools an agent used, what it passed, and what it got back. Analytics become trivial — call volume, failure rates, duration by tool.
A typed tool can validate before executing. Wrong enum value? Rejected with the valid options. Missing required field? Error before anything runs. Conflicting parameters? Caught at the schema level, not after the damage.
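The three rejections described above (bad enum value, missing field, conflicting parameters) can be shown with a small hand-rolled validator. MCP tools normally declare their inputs as JSON Schema; the dictionary format below is a simplified stand-in, and COMMIT_SCHEMA with its amend/fixup conflict is a hypothetical example.

```python
def validate_call(args, schema):
    """Return a list of validation errors; an empty list means the call may run."""
    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, options in schema.get("enums", {}).items():
        if name in args and args[name] not in options:
            errors.append(
                f"{name}={args[name]!r} is invalid; valid options: {sorted(options)}"
            )
    for first, second in schema.get("conflicts", []):
        if first in args and second in args:
            errors.append(f"{first} and {second} cannot be used together")
    return errors


# Hypothetical schema for a commit tool.
COMMIT_SCHEMA = {
    "required": ["cwd", "category", "message"],
    "enums": {"category": {"CORE", "DOCS", "TEST"}},
    "conflicts": [("amend", "fixup")],
}
```

Because validation returns before any handler runs, a rejected call has no side effects to clean up.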
MCPs compose into flows. A publish flow constrains the agent through draft → review → publish phases, each with different tool access. You can't build phase-gated workflows when everything goes through a single bash escape hatch. (The theoretical basis — why narrowing the surface changes how the model thinks — is the argument of The Shape of Safety.)
Most agent shell usage falls into a handful of surface areas. Each one maps cleanly to an MCP server with a clear domain boundary.
File reads, edits, and searches are replaced by context: hint-based path resolution, batch read/edit, glob, grep. Paths are always relative to a resolved context root. No absolute paths, no directory traversal, no guessing.
Git commands are replaced by git-ops: commit with category enforcement and debug detection, fixup with autosquash, rebase with fetch, MR creation. Every operation runs safety checks that bash git never would.
Wiki fetching and publishing are replaced by wiki + xhtml-tools: structured page fetch, XHTML generation with palette/scheme systems, DOM-based editing, validation, publish-from-disk. Content never enters conversation context unnecessarily.
Config file edits are replaced by opencode-config: typed setters for every config surface (servers, agents, permissions, plugins). Validates structure, prevents conflicts, maintains schema invariants.
Scaffolding for new tools and plugins is replaced by mcp-builder + plugin-builder: template-based scaffolding with auto-registration. New tools get the right structure, the right hook signatures, and the right config entries automatically.
Log queries are replaced by logdash + ES tools: domain-aware KQL building, structured log search, identifier resolution. The agent doesn't need to know Elasticsearch query syntax or log field names.
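To make one of these boundaries concrete, here is a minimal sketch of the "always relative to a resolved context root" rule from the file-operations server. The function name resolve_in_root is hypothetical, and the hint lookup that picks the root is out of scope; the sketch only shows how absolute paths and traversal get rejected once a root is known.

```python
from pathlib import Path


def resolve_in_root(root: Path, relative: str) -> Path:
    """Resolve a relative path inside root, rejecting anything that escapes it."""
    if Path(relative).is_absolute():
        raise ValueError("absolute paths are not accepted")
    root = root.resolve()
    candidate = (root / relative).resolve()
    # is_relative_to requires Python 3.9 or newer.
    if not candidate.is_relative_to(root):
        raise ValueError(f"{relative!r} escapes the context root")
    return candidate
```

Resolving before comparing matters: a path such as src/../../secrets looks relative but normalizes to a location outside the root.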
You don't build all of this on day one. The progression is organic and driven by friction.
Stage 1: Bash everything. The agent uses bash for all system interaction. You notice it constructs wrong paths, forgets to stage files, writes bad commit messages, runs destructive commands. You add instructions saying "always do X before Y" — and the agent sometimes follows them.
Stage 2: Notice patterns. You realize 80% of bash usage is the same five operations: read a file, edit a file, search for a pattern, commit changes, push to remote. The other 20% is long-tail commands you rarely need.
Stage 3: Extract to MCP. You build tools for the common operations. context_read replaces cat. git-ops_commit replaces git add && git commit. Each tool does one thing with validated inputs and structured outputs. Bash becomes the fallback for the long tail.
Stage 4: Add validation. Now that operations go through typed tools, you add domain logic. Commits require categories. File writes reject identical content. Publish requires XHTML lint pass. The tools enforce policies that instructions never could.
Stage 5: Add flow constraints. With named tools, you can build flows that restrict which tools are available in each phase. The agent in research mode can read but not write. The agent in publish mode can write wiki pages but not code. Capability becomes contextual.
Each MCP tool you build makes the next one easier and the overall system more capable. This isn't linear improvement — it compounds.
Once everything flows through MCPs, you get session analytics automatically. Which tools get used most? Which fail? How long do they take? The grounder server answers these questions by reading MCP call logs — no instrumentation required, because the protocol is the instrumentation.
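A sketch of how little code that kind of analytics needs, assuming each logged call is a record with tool, ok, and duration_ms fields. That record shape is an assumption for illustration, not the grounder server's actual log format.

```python
from collections import defaultdict


def summarize(calls):
    """Aggregate call volume, failure rate, and average duration per tool."""
    stats = defaultdict(lambda: {"calls": 0, "failures": 0, "total_ms": 0})
    for call in calls:
        entry = stats[call["tool"]]
        entry["calls"] += 1
        entry["failures"] += 0 if call["ok"] else 1
        entry["total_ms"] += call["duration_ms"]
    return {
        tool: {
            "calls": entry["calls"],
            "failure_rate": entry["failures"] / entry["calls"],
            "avg_ms": entry["total_ms"] / entry["calls"],
        }
        for tool, entry in stats.items()
    }
```

The same question asked of bash history would first require parsing free-form command strings to work out which operation each one was.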
Once tools have names, flows become possible. A YAML file listing phase names and allowed tools creates a constraint system that guides the agent through multi-step work without losing focus. You can't constrain bash by phase, but you can constrain context_read vs context_write vs git-ops_commit.
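A phase-gated flow can be sketched as a mapping from phase name to allowed tools. The dictionary below stands in for the parsed YAML file; the phase contents, the Flow class, and the wiki_publish tool name are assumptions for illustration.

```python
# Stand-in for a parsed flow file: phase name -> tools allowed in that phase.
FLOW = {
    "draft": ["context_read", "context_write"],
    "review": ["context_read"],
    "publish": ["context_read", "wiki_publish"],
}


class Flow:
    """Tracks the current phase and answers whether a tool is callable in it."""

    def __init__(self, phases):
        self.phases = phases
        self.order = list(phases)  # dicts preserve insertion order
        self.index = 0

    @property
    def phase(self):
        return self.order[self.index]

    def allowed(self, tool):
        return tool in self.phases[self.phase]

    def advance(self):
        if self.index + 1 >= len(self.order):
            raise RuntimeError("flow is already in its final phase")
        self.index += 1
```

In the review phase of this sketch, allowed("context_write") is False, so a runtime that consults the flow before dispatching never exposes the write tool to the agent at all.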
MCP tools emit events that plugins can intercept. A tool-observer plugin can log every wiki publish. A tool-guard plugin can block dangerous operations. These hooks exist because the protocol surfaces tool calls as structured events — bash gives you nothing to hook into.
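The observer and guard roles can be sketched as hooks around a single dispatch function. Everything here is hypothetical: call_tool, the Blocked exception, and the wiki_publish and git-ops_rebase tool names illustrate the pattern rather than any real plugin API.

```python
class Blocked(Exception):
    """Raised by a guard hook to stop a tool call before it runs."""


def call_tool(name, args, handler, before=(), after=()):
    """Run before-hooks, then the tool, then after-hooks."""
    for hook in before:
        hook(name, args)  # a guard vetoes the call by raising Blocked
    result = handler(**args)
    for hook in after:
        hook(name, args, result)  # an observer only records
    return result


published = []


def tool_observer(name, args, result):
    # Record every wiki publish; other tools pass through unrecorded.
    if name == "wiki_publish":
        published.append(args["page"])


def tool_guard(name, args):
    # Example rule: never rebase the main branch.
    if name == "git-ops_rebase" and args.get("branch") == "main":
        raise Blocked("rebasing main is not allowed")
```

Both hooks work on the tool name and a dictionary of arguments. With bash, the equivalent guard would have to parse a shell string and hope it recognized every spelling of the same operation.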
When tools write to disk instead of returning large content inline, conversation context stays clean. The agent works with file references instead of megabytes of XHTML. This isn't possible when bash dumps stdout directly into the conversation.
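A minimal sketch of the write-to-disk pattern: the tool stores the content and returns a small reference instead of the content itself. The function name store_artifact and the fields in the returned reference are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def store_artifact(content: str, out_dir: Path, name: str) -> dict:
    """Write content to disk and return a compact reference to it."""
    out_dir.mkdir(parents=True, exist_ok=True)
    data = content.encode("utf-8")
    path = out_dir / name
    path.write_bytes(data)
    return {
        "path": str(path),
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    # Usage: the conversation sees only the reference, not the page body.
    with tempfile.TemporaryDirectory() as tmp:
        ref = store_artifact("<p>hello</p>", Path(tmp) / "pages", "home.xhtml")
        print(ref["path"], ref["bytes"])
```

The reference stays the same size whether the page is a few bytes or several megabytes, and the hash lets a later tool confirm it is reading the same content.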
This is more work upfront. Building an MCP server for git operations takes longer than typing git commit in bash. Writing a context resolution system takes longer than using cd and ls.
Every MCP server is a project. Schema design, error handling, type safety, tests. You could have shipped the feature in bash in an afternoon. The investment feels extravagant.
The first time a commit tool catches a debug statement, it's paid for itself. The first time a flow constraint blocks the wrong edit, it's paid for itself. The first time you audit a week of sessions and see exactly what happened — you realize you built a control plane.
Bash is capability without control. MCPs are capability with control. The shell surface area doesn't disappear — it gets replaced by something you can see, constrain, audit, and evolve.
Don't fight the agent's need to interact with the system. Replace the channel it uses. Every bash command you extract into an MCP tool is a policy you can enforce, a metric you can track, and a capability you can scope. Cover the shell. Control the agent.