What is the Model Context Protocol (MCP) and why does it matter for AI agents?

MCP is an open protocol that standardises how AI agents connect to external tools and data sources. Launched by Anthropic in November 2024, it reached 97 million monthly SDK downloads within 16 months and gained native adoption from OpenAI, Google, Microsoft, and AWS. It matters because it replaced one-off custom integrations with a shared standard — giving agents reliable, repeatable access to real systems.

Why do 88% of AI agent pilots fail to reach production?

The failure gap isn't a model problem or a protocol problem. Forrester and Anaconda's 2026 research points to five root causes: integration complexity, inconsistent output quality at volume, absence of monitoring tooling, unclear organisational ownership, and insufficient domain training data. Three of those five (integration complexity, monitoring absence, and unclear ownership) are product design decisions, not engineering ones. Agents fail when nobody decides what the agent should do with its access, how it communicates what it did, and who is accountable for its behaviour.

What is progressive tool discovery in MCP, and why does it improve agent performance?

Progressive tool discovery means exposing a minimal set of tools at initialisation — often just a single 'find tools' entry point — and loading only three to four context-relevant tools per task. Amazon Prime Video and Speakeasy independently found that comprehensive tool access degraded performance: one team reduced context from 405,000 tokens (400 tools loaded at once) to 1,600–2,500 tokens with selective loading — a 160× reduction. Fewer tools mean less hallucination and lower cost.

How should AI agents handle permissions transparency with users?

The design challenge is not enforcement — MCP adopted OAuth 2.1 for authorisation — but legibility. Users need to understand what an agent can access before it acts, not after it fails. Best-practice implementations surface which tools are about to be called, which credentials they use, and what the agent's options are when it hits a permission boundary. This is the difference between an agent users tolerate and one they rely on.

What do enterprise teams that successfully deploy AI agents have in common?

Gartner and Forrester 2026 data point to four shared attributes: pre-deployment infrastructure investment, governance documentation before deployment, baseline metrics established before pilots begin, and dedicated business ownership of the agent's behaviour. Three of those four are product design decisions — someone has to define what success looks like, what the agent is accountable for, and who governs it. Engineering is necessary, but not sufficient.

MCP Is Exposing AI's Real Bottleneck: Product Design

MAY 29, 2026 · BY SASHA SHUMYLO

The model layer is solved. The orchestration layer is catching up. The context layer — how agents discover, access, and act on external tools and data — is where the work is now. And it turns out that work is not engineering. It is product design.

The stack is almost complete

For years, building an AI product meant solving three problems at once: which model to use, how to orchestrate it, and how to connect it to real data. Each layer had its own bottleneck.

The model layer matured first. GPT-4, Claude, Gemini — frontier models now ship with reasoning, tool use, and instruction-following that would have seemed unrealistic two years ago. The orchestration layer followed. LangChain, CrewAI, AutoGen, and a dozen other frameworks turned multi-step agent workflows from research projects into something a product team could ship in a sprint.

The context layer was the last open problem. Before MCP, every model-to-tool connection was a custom integration. Build once, for one model, maintain forever. Anthropic launched MCP in November 2024 to standardise that connection. By March 2026, it had 97 million monthly SDK downloads, 10,000+ public servers, and native adoption from OpenAI, Google, Microsoft, and AWS. The Linux Foundation took over governance in December 2025. 78% of enterprise AI teams now report at least one MCP-backed agent in production.

The protocol works. The stack is nearly complete. And yet 88% of AI agent pilots still never reach production.

That gap is not a model problem. It is not an orchestration problem. And it is not, strictly speaking, a protocol problem. It is a product design problem — hiding inside infrastructure decisions that nobody on the product team made deliberately.

By “product design,” we mean a specific set of decisions: what capabilities to expose to the agent and when, how to structure tool discovery so the agent can navigate without drowning in context, how to make permissions visible so users understand what the agent does on their behalf, and who in the organisation owns the agent’s behaviour. These are not UX decisions in the traditional sense — they don’t involve screens or layouts. But they shape whether the product works for real people.

The 88% figure comes from Forrester and Anaconda’s 2026 surveys. The 12% that succeed share four attributes: pre-deployment infrastructure investment, governance documentation before deployment, baseline metrics before pilots, and dedicated business ownership. Three of those four are product design decisions.

MCP in numbers:

97M monthly SDK downloads (March 2026)
10,000+ public MCP servers
78% of enterprise AI teams with at least one MCP-backed agent in production
Adopted by OpenAI, Google, Microsoft, AWS
Linux Foundation governance — December 2025

What MCP actually asks of product teams

MCP defines three primitives: tools (actions an agent can take), resources (context data it can read), and prompts (reusable templates). The protocol handles the plumbing. What it does not handle is the set of decisions that determine whether any of this works for a human being.
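To make the three primitives concrete, here is a deliberately simplified data model. It is a sketch for discussion, not the MCP SDK: the class names, the `ServerSurface` container, and the example entries are all hypothetical, and the real protocol carries more fields than shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    """An action the agent can take."""
    name: str
    description: str
    input_schema: dict = field(default_factory=dict)  # JSON Schema for the arguments


@dataclass
class Resource:
    """Context data the agent can read."""
    uri: str
    name: str


@dataclass
class Prompt:
    """A reusable template."""
    name: str
    template: str


@dataclass
class ServerSurface:
    """Hypothetical container for everything one server exposes to an agent."""
    tools: list[Tool] = field(default_factory=list)
    resources: list[Resource] = field(default_factory=list)
    prompts: list[Prompt] = field(default_factory=list)

    def summary(self) -> dict[str, int]:
        # Count of each primitive: a first, crude measure of surface size.
        return {
            "tools": len(self.tools),
            "resources": len(self.resources),
            "prompts": len(self.prompts),
        }


# Illustrative entries only; none of these names come from a real server.
surface = ServerSurface(
    tools=[Tool("issue_refund", "Issue a refund for an order", {"type": "object"})],
    resources=[Resource("orders://recent", "Recent orders")],
    prompts=[Prompt("summarise_order", "Summarise order {order_id} for a support agent.")],
)
print(surface.summary())
```

Nothing in this structure says which of these entries a user should see, or when. That is the gap the rest of this section is about.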

Three design problems surface the moment a team moves from “we have an MCP server” to “our users trust this agent.”

Tool discovery: what does the agent see, and when?

The naive approach is to expose every tool at once. An e-commerce platform with 300+ API operations loads over 36,000 tokens in tool definitions before the agent processes a single user request — roughly a fifth of Claude’s context window, consumed before any work begins.
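The "roughly a fifth" figure is easy to check. The sketch below assumes a 200,000-token context window and exactly 300 operations; both numbers are assumptions made for the arithmetic, since the text above gives only "300+" and does not state the window size.

```python
def context_share(tool_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window consumed by tool definitions."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return tool_tokens / window_tokens


TOOL_DEFINITION_TOKENS = 36_000  # figure quoted in the paragraph above
ASSUMED_WINDOW_TOKENS = 200_000  # assumption: window size is not stated in the text
ASSUMED_OPERATIONS = 300         # assumption: the text says "300+"

share = context_share(TOOL_DEFINITION_TOKENS, ASSUMED_WINDOW_TOKENS)
per_tool = TOOL_DEFINITION_TOKENS / ASSUMED_OPERATIONS

print(f"{share:.0%} of the window, about {per_tool:.0f} tokens per tool definition")
```

Under those assumptions the tool definitions take 18% of the window, at an average of about 120 tokens each, before the agent has read a single word from the user.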

Amazon Prime Video hit this wall in production. Their engineering team found that giving agents comprehensive tool access through centralised MCP servers actually degraded performance. More tools meant more confusion, more hallucination, worse outcomes. Their solution was progressive tool discovery — exposing a single “find tools” capability at initialisation, then loading only three or four context-appropriate tools per task. The Speakeasy team measured the difference: from 405,000 tokens for 400 tools down to 1,600–2,500 tokens. A 160× reduction.

Progressive discovery is, at its origin, an engineering optimisation. Teams adopt it to reduce cost and fight hallucination. But once the engineering pattern exists, a layer of product decisions sits on top. Which tools are in the default set? How does the agent communicate to the user that its current capabilities are limited? What happens when a request falls outside the loaded tool set — does the agent fail silently, escalate, or explain? These are not engineering questions. They are the same questions a product designer asks when structuring a navigation system — except the user is an AI agent, and the navigation happens in a context window instead of on a screen.
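A minimal sketch of the pattern, including the "what happens outside the loaded set" decision, might look like this. Everything here is hypothetical: `ToolCatalogue`, `find_tools`, and the example tools are illustrative names, and the keyword-overlap ranking is a stand-in for whatever retrieval method a real system would use.

```python
from dataclasses import dataclass

# Small stop-word list so that filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "for", "on", "in", "by", "or", "this", "to", "of"}


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str


def keywords(text: str) -> set[str]:
    return {word for word in text.lower().split() if word not in STOPWORDS}


class ToolCatalogue:
    """Hypothetical registry that exposes one entry point at initialisation."""

    def __init__(self, tools: list[ToolSpec], max_loaded: int = 4):
        self._tools = tools
        self._max_loaded = max_loaded

    def initial_surface(self) -> list[str]:
        # The agent starts with a single capability instead of the full catalogue.
        return ["find_tools"]

    def find_tools(self, request: str) -> list[ToolSpec]:
        wanted = keywords(request)
        scored = []
        for tool in self._tools:
            overlap = len(wanted & keywords(tool.description))
            if overlap:
                scored.append((overlap, tool))
        # Highest overlap first; ties keep catalogue order because the sort is stable.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [tool for _, tool in scored[: self._max_loaded]]


def plan(catalogue: ToolCatalogue, request: str) -> str:
    tools = catalogue.find_tools(request)
    if not tools:
        # Product decision made explicit: explain and offer options, never fail silently.
        return "I don't have a tool for that. I can escalate this, or you can rephrase."
    return "Loading: " + ", ".join(tool.name for tool in tools)


catalogue = ToolCatalogue([
    ToolSpec("search_orders", "search customer orders by date or status"),
    ToolSpec("issue_refund", "issue a refund for an order"),
    ToolSpec("update_address", "update the shipping address on an order"),
    ToolSpec("list_products", "list products in the catalogue"),
    ToolSpec("create_coupon", "create a discount coupon"),
])

print(plan(catalogue, "refund the order for this customer"))
print(plan(catalogue, "book a flight"))
```

The ranking logic is the engineering part. The fallback string in `plan` and the choice of `max_loaded` are the product part: someone has to decide what the agent says when it comes up empty, and how much it is allowed to load.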

Speakeasy: the token math

400 tools = 405,000 tokens
Progressive loading = 1,600–2,500 tokens
Reduction: 160×
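The arithmetic behind that headline number can be checked directly from the figures quoted above. The variable and function names below are illustrative; the inputs are the article's own numbers.

```python
FULL_LOAD_TOKENS = 405_000               # 400 tools loaded at once
PROGRESSIVE_TOKENS_LOW = 1_600           # lower end of the selective-loading range
PROGRESSIVE_TOKENS_HIGH = 2_500          # upper end of the selective-loading range
TOOL_COUNT = 400


def reduction_factor(before: int, after: int) -> float:
    """How many times smaller `after` is than `before`."""
    return before / after


smallest_reduction = reduction_factor(FULL_LOAD_TOKENS, PROGRESSIVE_TOKENS_HIGH)
largest_reduction = reduction_factor(FULL_LOAD_TOKENS, PROGRESSIVE_TOKENS_LOW)
tokens_per_tool = FULL_LOAD_TOKENS / TOOL_COUNT

print(f"Reduction: {smallest_reduction:.0f}x to {largest_reduction:.0f}x")
print(f"Average definition size: {tokens_per_tool:,.1f} tokens per tool")
```

The 160× figure corresponds to the conservative end of the range: 405,000 divided by 2,500 is 162. At 1,600 tokens the reduction is closer to 253×.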

Permissions: who controls what the agent can do?

Traditional access control was built for humans who log in, do work, and log out. Agents don’t follow that pattern. They run continuously, spin up dynamically, and chain actions across multiple systems in a single workflow. A user asks an agent to “pull last quarter’s sales data and draft a board update.” That single request might touch a data warehouse, a CRM, and a document editor — each with its own credential requirements, each with its own scope of what’s allowed.

The security implications are well documented. 88% of organisations reported confirmed or suspected security incidents related to AI agents in 2026. When ten agents share one credential set, there is no way to link a specific tool call to a specific user, which is a direct compliance failure under HIPAA, SOC 2, and GDPR.
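The attribution problem can be illustrated with a small check. This is a sketch, not a compliance tool: the `ToolCall` record, the credential names, and the rule "one credential should map to one user" are assumptions made for the example. The idea is that a downstream system sees only the credential, so any credential used on behalf of more than one user cannot be traced back to a person.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    user_id: str        # who the agent was acting for
    agent_id: str
    credential_id: str  # what the downstream system actually sees
    tool: str


def users_per_credential(calls: list[ToolCall]) -> dict[str, set[str]]:
    """Group the users that each credential has acted on behalf of."""
    seen: dict[str, set[str]] = defaultdict(set)
    for call in calls:
        seen[call.credential_id].add(call.user_id)
    return dict(seen)


def unattributable(calls: list[ToolCall]) -> list[str]:
    """Credentials shared across users, where a call cannot be tied to one person."""
    return sorted(
        credential
        for credential, users in users_per_credential(calls).items()
        if len(users) > 1
    )


# Hypothetical call log: two agents share a key, one uses a per-user token.
calls = [
    ToolCall("alice", "agent-1", "shared-key", "query_warehouse"),
    ToolCall("bob", "agent-2", "shared-key", "query_warehouse"),
    ToolCall("carol", "agent-3", "carol-token", "update_crm"),
]

print(unattributable(calls))  # ['shared-key']
```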

But the product design implication is less discussed and equally critical: how does the user understand what their agent can and cannot do? In a traditional application, permissions are invisible until you hit a wall. For an agent that acts autonomously, invisible permissions create invisible risks. Designing permission transparency for agent-mediated work is an open UX problem that most teams haven’t started solving.

MCP adopted OAuth 2.1 as its authorisation standard. The protocol answers “can this agent do this thing?” It does not answer “does the user understand what this agent is doing on their behalf?” — which is the question that builds or breaks trust.

Context structure: what does the agent know about the user’s world?

An agent that connects to six tools has access to six different data contexts — each with its own schema, its own freshness, its own level of reliability. The agent needs to decide which context matters for a given request, how to reconcile conflicting data across sources, and how to communicate the limits of what it knows.
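One way to picture those three decisions is a small reconciliation step. The sketch below is an assumption-heavy illustration: the `Reading` record, the `reliability` score, the freshness limit, and the rule "prefer the most reliable fresh source" are all invented for the example, and a real system would need a policy agreed by whoever owns the data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Reading:
    source: str
    value: float
    fetched_at: datetime
    reliability: float  # hypothetical score between 0 and 1


def reconcile(
    readings: list[Reading], now: datetime, max_age: timedelta
) -> tuple[Optional[Reading], str]:
    """Pick one reading and produce a note describing the limits of that choice."""
    fresh = [r for r in readings if now - r.fetched_at <= max_age]
    if not fresh:
        return None, "No source is fresh enough to answer."
    # Prefer the most reliable source; break ties with the most recent fetch.
    best = max(fresh, key=lambda r: (r.reliability, r.fetched_at))
    note = f"Using {best.source}."
    conflicting = sorted(r.source for r in fresh if r.value != best.value)
    if conflicting:
        note += " Conflicting values from: " + ", ".join(conflicting) + "."
    ignored = sorted(r.source for r in readings if r not in fresh)
    if ignored:
        note += " Ignored as stale: " + ", ".join(ignored) + "."
    return best, note


now = datetime(2026, 5, 29, 12, 0, tzinfo=timezone.utc)
readings = [
    Reading("warehouse", 1_200_000, now - timedelta(hours=1), 0.9),
    Reading("crm", 1_150_000, now - timedelta(minutes=10), 0.7),
    Reading("spreadsheet", 900_000, now - timedelta(days=30), 0.4),
]

best, note = reconcile(readings, now, max_age=timedelta(days=1))
print(note)
```

The note is the part users see. Choosing a value is a data question; telling the user which sources disagreed and which were ignored is a design question.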

Block, the parent company of Square and Cash App, runs MCP across 12,000 employees with default servers connecting to Snowflake, GitHub, Jira, Slack, Google Drive, and internal APIs. Employees report 50–75% time savings on common tasks. But that result didn’t come from plugging in servers. It came from structuring which contexts the agent loads by default, what it has to discover on demand, and how it communicates the boundaries of its knowledge to the user.

This is not a data engineering problem. It is a product design problem: what does the agent’s working memory look like, and who decides what goes in it?

Where design is absent, agents fail

The five root causes cited most frequently in agent scaling failures are integration complexity, inconsistent output quality at volume, absence of monitoring tooling, unclear organisational ownership, and insufficient domain training data. Three of those five — integration complexity, monitoring absence, and unclear ownership — are design problems wearing engineering clothes.

Integration complexity is not about whether the API works. It is about whether the tool surface is structured so the agent can navigate it without drowning in context. Monitoring absence is not about whether logs exist. It is about whether someone designed what gets surfaced to the user, how the system explains what it did, and what the user can do when something goes wrong. Unclear ownership means nobody decided who is responsible for the agent’s behaviour — a product accountability decision, not an org-chart problem.

This pattern connects to what we explored in Where AI Agents Fail: Designing for Control and From UX to AX: Why Interaction Design Breaks When Systems Start Acting. The failure modes described there — agents that automate too aggressively, systems that expose too much process complexity, trust breakdowns when users can’t trace what happened — map directly onto what MCP adoption is revealing at enterprise scale.

Gartner predicts that over 40% of agentic AI projects will be cancelled by 2027 due to escalating costs, unclear business value, or inadequate risk controls. “Inadequate risk controls” is a governance design problem. “Unclear business value” is a product scoping problem. The technology works. The products around it don’t.

The five failure modes

Integration complexity
Inconsistent output quality at volume
Absence of monitoring tooling
Unclear organisational ownership
Insufficient domain training data

What good looks like

The teams that ship agents to production share a set of design decisions that have nothing to do with which model they chose or which framework they used.

They design tool surfaces, not just tool integrations. Amazon Prime Video’s progressive discovery approach — starting with a single “find tools” entry point and loading three to four tools per task — is an information architecture pattern, not a performance hack. It treats the agent’s context window as a design surface with the same constraints as a mobile screen: limited real estate, high cost of distraction, zero tolerance for irrelevance.

They make permissions visible, not just enforced. The FactSet approach to enterprise MCP governance uses scope-based permissions that adapt to each interaction. More importantly, these teams surface what the agent can and cannot do as part of the interaction, not as a wall the user hits after the fact. The agent tells the user what it’s about to access before it accesses it. When it hits a permission boundary, it doesn’t fail silently: it tells the user what happened and what their options are.
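That behaviour can be sketched as a preflight step that runs before any tool is called. This is an illustration under stated assumptions, not FactSet's implementation or part of the MCP specification: `PlannedCall`, `preflight`, the scope strings, and the wording of the messages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedCall:
    tool: str
    required_scope: str  # hypothetical scope string, e.g. "crm:read"


def preflight(planned: list[PlannedCall], granted_scopes: set[str]) -> dict:
    """Describe what the agent is about to do, and where it will be blocked."""
    messages = []
    blocked = []
    for call in planned:
        if call.required_scope in granted_scopes:
            messages.append(f"About to call {call.tool} using scope {call.required_scope}.")
        else:
            blocked.append(call)
            messages.append(
                f"Cannot call {call.tool}: missing scope {call.required_scope}. "
                "Options: grant access, skip this step, or cancel."
            )
    return {"can_proceed": not blocked, "messages": messages}


# The board-update request from earlier in the article, as three planned calls.
planned = [
    PlannedCall("query_warehouse", "warehouse:read"),
    PlannedCall("fetch_accounts", "crm:read"),
    PlannedCall("create_document", "docs:write"),
]
result = preflight(planned, granted_scopes={"warehouse:read", "crm:read"})

for line in result["messages"]:
    print(line)
```

The enforcement decision (allowed or not) is one line. Most of the function is about wording: what the user is told before the agent acts, and which options they are given at the boundary.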

They design the feedback loop, not just the output. 96% of organisations have adopted agents, but only 12% have a centralised platform to manage them. The teams that close the operational gap build observability into the user experience, not as a backend dashboard that only engineers see.

At The Gradient, we’ve been building with MCP in our own content operations — connecting AI agents to Sanity CMS for editorial workflows that span drafting, metadata, and publishing. The pattern we’ve seen firsthand mirrors what the enterprise data shows: the technical integration is the easy part. The hard part is designing what the agent should do with the access it has, how it communicates what it did, and what the user’s recourse is when the output isn’t right.

The context layer is the next frontier

For the past decade, the interface meant screens. Layouts, components, navigation patterns, responsive breakpoints. Product designers spent their time deciding where buttons go, how flows connect, and what the user sees at each step.

MCP is one protocol within a broader context layer that also includes retrieval systems, memory architectures, prompt assembly, and intent routing. But MCP’s rapid adoption is making visible a set of problems that exist across that entire layer. When agents mediate the interaction between users and systems, the product decisions shift — from layouts and flows to behaviour, trust, and accountability.

This is the argument we’ve been building across this series since we first made the case that UI in the classical sense is losing its central role. MCP makes that shift tangible. It gives teams the plumbing to connect agents to real systems. It does not give them a framework for deciding how those connections should behave from the user’s perspective.

The product decisions — what to expose, how to structure discovery, how to make permissions legible, how to build trust through transparency — are where the work actually happens. Product teams that treat MCP as backend infrastructure will build agents that work in demos and stall in production. The ones that treat it as a design surface will build the products that earn the 171% ROI the data promises.

The models are ready. The orchestration is ready. The protocols are ready. What’s still missing is the product thinking that turns working infrastructure into products people trust.

BY Sasha Shumylo, Partner and Product Strategist

A Product Strategist with over 13 years of experience in marketing, product strategy, and branding. His love for analytics, funnels, and a structured approach ensures that the digital products we craft aren't just functional—they impress.