
Skills vs MCP: when 50 tool calls is the smell

45 MCP tools and the parameter fatigue to match — forever tweaking arguments to do something simple. The fix was a skill that read my grudging docs.


The Bureau — the multi-agent orchestration engine I’ve been building — exposes its whole control surface as MCP tools. One afternoon I went to nudge a stuck worker and caught myself staring at the screen trying to remember whether the parameter was directive, or taskId, or both, and in what order. I’d written the tool. I still couldn’t drive it.

I’d already pruned it hard, from seventy-five tools down to forty-five, and forty-five is where it stayed: declare a task graph, approve a gate, reject for rework, merge two graphs with ID remapping, inject a steering directive into a running pod, lock files, post a discovery, read a handoff. Each one a perfectly reasonable verb. Together, a control panel with forty-five unlabelled switches that I was operating by memory.

The symptom is tool sprawl. The smell is somewhere else.

The obvious read is “too many tools, cut them.” I tried that: seventy-five became forty-five. It didn’t fix the actual problem, because the problem was never the count. It was that I was asking a human (me) to hold the entire parameter surface of forty-five tools in working memory and assemble the right call, with the right arguments, in the right order, every time.

That’s not a tool problem. That’s a missing layer. I’d built the actions and skipped the part that knows how to use them.

It took me longer than I’d like to admit to see that MCP and Skills aren’t competing answers to the same question. They answer different questions:

  • MCP is for connectivity and actions — thin, well-described verbs that do one thing and say exactly what they need.
  • A Skill is for procedure and judgement — it knows which verbs to call, in what order, with what arguments, to get a job done.

When your tool count explodes, the instinct is to collapse the actions. The fix is usually to add the orchestration layer you never wrote.

What MCP is actually for

A good MCP tool is boring. It does one thing and it documents itself. Here’s the message-passing tool, more or less verbatim:

src/tools/send-message.ts
registerInstrumentedTool(server, "send_message", {
  title: "Send Message",
  description: "Send a message to another Claude session by ID or role name. " +
    "The message is delivered to their inbox; they receive it on their next check_messages.",
  inputSchema: z.object({
    to: z.string().describe("Session ID or role name of the recipient"),
    type: z.enum(["task", "message", "status", "directive"])
      .default("message").describe("Message type"),
    body: z.string().describe("The message content"),
  }),
});

Nothing clever. But notice that every field carries a .describe(), the enum spells out its legal values, and the description says when the recipient actually sees it. That last sentence isn’t for the human reading the source — it’s for the model reading the tool catalog at runtime. The schema is the documentation, and the documentation ships with the tool.
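That property is easy to check mechanically. Here is a minimal sketch, assuming each tool exposes its input as a JSON Schema object with a `properties` map. The `ToolDef` type and `findUndocumented` are illustrative names, not part of the Bureau or of any MCP SDK, and the sample deliberately leaves one parameter undescribed.

```typescript
// Hypothetical, pared-down view of a tool definition.
interface PropertySchema {
  type?: string;
  description?: string;
  enum?: string[];
  default?: unknown;
}

interface ToolDef {
  name: string;
  description?: string;
  inputSchema: {
    properties?: Record<string, PropertySchema>;
    required?: string[];
  };
}

// Returns "tool" or "tool.param" for every missing or blank description.
function findUndocumented(tools: ToolDef[]): string[] {
  const gaps: string[] = [];
  for (const tool of tools) {
    if (!tool.description?.trim()) gaps.push(tool.name);
    const props = tool.inputSchema.properties ?? {};
    for (const param of Object.keys(props)) {
      if (!props[param].description?.trim()) gaps.push(`${tool.name}.${param}`);
    }
  }
  return gaps;
}

const sample: ToolDef[] = [
  {
    name: "send_message",
    description: "Send a message to another Claude session by ID or role name.",
    inputSchema: {
      properties: {
        to: { type: "string", description: "Session ID or role name of the recipient" },
        // Deliberately left without a description for the example.
        type: { type: "string", enum: ["task", "message", "status", "directive"] },
        body: { type: "string", description: "The message content" },
      },
      required: ["to", "body"],
    },
  },
];

console.log(findUndocumented(sample)); // reports the undescribed "type" parameter
```

A check like this could run in CI, so a parameter without a description fails the build instead of quietly shipping to the model.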

The interesting tools go further and put a sliver of workflow in the prose:

src/tools/declare-intent.ts
description:
  "Declare which files you plan to modify and what you intend to do. Enables " +
  "workspace conflict detection — other agents will be warned before touching " +
  "the same files. Call this early, before starting implementation. Returns any " +
  "conflicts detected immediately so you can coordinate before work begins. " +
  "TIP: Call this right after set_status(\"implementing\", ...) and before writing any code.",

Read that again. “Call this early.” “TIP: right after set_status, before writing any code.” That’s not a description of what the tool does — it’s a description of where it sits in a workflow. The orchestration knowledge was already leaking into the tool definitions. I just hadn’t noticed I had a place to put it.
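One rough way to see how much procedure is already living in a catalog is to scan the descriptions for ordering language. The sketch below is a heuristic, not a parser: the phrase list is an assumption, the function names are made up for illustration, and the declare_intent description is abridged.

```typescript
// Heuristic only. The phrase list is an assumption; tune it to your own catalog.
const ORDERING_HINTS: RegExp[] = [/\bcall this\b/i, /\btip:/i, /\balways\b/i];

interface ToolSummary {
  name: string;
  description: string;
}

// True when a description talks about when to call a tool, not just what it does.
function hasWorkflowHint(description: string): boolean {
  return ORDERING_HINTS.some((re) => re.test(description));
}

// Names of the tools whose descriptions are carrying procedure.
function toolsCarryingProcedure(tools: ToolSummary[]): string[] {
  return tools.filter((t) => hasWorkflowHint(t.description)).map((t) => t.name);
}

const catalog: ToolSummary[] = [
  {
    name: "send_message",
    description:
      "Send a message to another Claude session by ID or role name. " +
      "The message is delivered to their inbox; they receive it on their next check_messages.",
  },
  {
    name: "declare_intent",
    // Abridged from the description quoted above.
    description:
      "Declare which files you plan to modify and what you intend to do. " +
      "Call this early, before starting implementation.",
  },
  {
    name: "get_handoff",
    description:
      "Read the structured handoff context from a completed task. " +
      "Always check predecessor handoffs at the start of your task for context.",
  },
];

console.log(toolsCarryingProcedure(catalog)); // declare_intent and get_handoff
```

Every tool this flags is a candidate for moving a sentence or two of "when" out of the description and into a skill.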

The skill that read its own catalog

MCP servers are introspectable by design. A client can call tools/list and get back every tool’s name, description, and input schema as structured JSON. That’s the whole protocol-level point of self-description — and it’s the hook everything else hangs on.
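For reference, the discovery call is an ordinary JSON-RPC 2.0 request whose method is `tools/list`, and a server may page the result with a `nextCursor`. This is a minimal sketch of a paging loop. The `Send` transport type is hypothetical and stands in for whatever client connection you actually have; it is assumed to resolve with the `result` member of the response.

```typescript
interface ToolDef {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

interface ListRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/list";
  params?: { cursor: string };
}

interface ListResult {
  tools: ToolDef[];
  nextCursor?: string;
}

// Hypothetical transport: sends one request, resolves with the response's `result`.
type Send = (req: ListRequest) => Promise<ListResult>;

// Builds the request; the cursor is only included when continuing a listing.
function buildListRequest(id: number, cursor?: string): ListRequest {
  const req: ListRequest = { jsonrpc: "2.0", id, method: "tools/list" };
  if (cursor !== undefined) req.params = { cursor };
  return req;
}

// Follows nextCursor until the server stops returning one.
async function listAllTools(send: Send): Promise<ToolDef[]> {
  const tools: ToolDef[] = [];
  let cursor: string | undefined;
  let id = 1;
  do {
    const page = await send(buildListRequest(id++, cursor));
    tools.push(...page.tools);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return tools;
}
```

In practice an MCP client SDK would normally handle the framing for you; the point is only that the whole surface arrives as data.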

So the skill I wrote — /tb — was thin. It took me about five minutes to put together, and it went on to save me hours, probably days, over the life of the project. On invocation it did one discovery call against the server and got back the live surface:

tools/list (excerpt)
{
  "tools": [
    {
      "name": "send_message",
      "description": "Send a message to another Claude session by ID or role name. The message is delivered to their inbox; they receive it on their next check_messages.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "to": { "type": "string", "description": "Session ID or role name of the recipient" },
          "type": { "type": "string", "enum": ["task", "message", "status", "directive"], "default": "message", "description": "Message type" },
          "body": { "type": "string", "description": "The message content" }
        },
        "required": ["to", "body"]
      }
    },
    {
      "name": "get_handoff",
      "description": "Read the structured handoff context from a completed task. Always check predecessor handoffs at the start of your task for context.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "graphId": { "type": "string", "description": "The graph ID" },
          "taskId": { "type": "string", "description": "The task ID whose handoff to read" }
        },
        "required": ["graphId", "taskId"]
      }
    }
  ]
}

That’s all the skill needed. It didn’t hardcode a single tool name or argument. It read the catalog, understood the surface from the descriptions, and routed plain English onto it:

/tb send a message to the coder agent to also check function X
/tb show me the handoff between the architect and the api designer
/tb where is graph 0f7483 stuck

The first one resolves to send_message with to: "coder", type: "message", body: "...". The second walks the graph, finds the two roles, and calls get_handoff with the right IDs. I never typed graphId. I never remembered that type defaults to message. The skill knew, because the tools told it.
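The mapping from English to a tool is the model's job, but the step after it is mechanical: take the arguments the model proposed, fill in schema defaults, and reject anything the schema does not allow. This is a minimal sketch of that step, assuming the schema shape shown in the excerpt. `prepareCall` is an illustrative name, not the skill's actual code, and the returned object mirrors the `name` and `arguments` fields of a `tools/call` request.

```typescript
interface PropertySchema {
  type?: string;
  description?: string;
  enum?: string[];
  default?: unknown;
}

interface ToolDef {
  name: string;
  inputSchema: {
    properties?: Record<string, PropertySchema>;
    required?: string[];
  };
}

type Args = Record<string, unknown>;

// Fills defaults and validates proposed arguments against the tool's own schema.
function prepareCall(tool: ToolDef, given: Args): { name: string; arguments: Args } {
  const props = tool.inputSchema.properties ?? {};
  for (const key of Object.keys(given)) {
    if (!(key in props)) throw new Error(`${tool.name}: unknown parameter "${key}"`);
  }
  const args: Args = {};
  for (const key of Object.keys(props)) {
    const schema = props[key];
    const value = key in given ? given[key] : schema.default;
    if (value === undefined) continue;
    if (schema.enum && (typeof value !== "string" || schema.enum.indexOf(value) === -1)) {
      throw new Error(`${tool.name}: "${key}" must be one of ${schema.enum.join(", ")}`);
    }
    args[key] = value;
  }
  for (const key of tool.inputSchema.required ?? []) {
    if (!(key in args)) throw new Error(`${tool.name}: missing required parameter "${key}"`);
  }
  return { name: tool.name, arguments: args };
}

const sendMessage: ToolDef = {
  name: "send_message",
  inputSchema: {
    properties: {
      to: { type: "string" },
      type: { type: "string", enum: ["task", "message", "status", "directive"], default: "message" },
      body: { type: "string" },
    },
    required: ["to", "body"],
  },
};

// "type" is never supplied; it comes from the schema default.
console.log(prepareCall(sendMessage, { to: "coder", body: "also check function X" }));
```

Nothing in `prepareCall` knows about send_message specifically, which is why the same code keeps working when the tool changes.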

And because the skill re-discovered the surface every run, it never went stale. I was changing tools underneath it daily — renaming, adding params, splitting one verb into two — and the skill just picked up the new reality on the next call. No version drift between the orchestration layer and the actions, because the orchestration layer didn’t hold a copy of the actions. It held a pointer.
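The copy-versus-pointer distinction is small enough to show in a few lines. This is a toy sketch under stated assumptions: discovery is synchronous here only to keep it short, and the tool names `approve` and `approve_gate` are invented for the example.

```typescript
interface ToolDef {
  name: string;
  description: string;
}

type Discover = () => ToolDef[];

// Holds a copy: the catalog is captured once, when the skill is built.
function snapshotSkill(discover: Discover): () => string[] {
  const catalog = discover();
  return () => catalog.map((t) => t.name);
}

// Holds a pointer: the catalog is read again on every invocation.
function liveSkill(discover: Discover): () => string[] {
  return () => discover().map((t) => t.name);
}

// Stand-in for the server's current tool surface.
let server: ToolDef[] = [{ name: "approve", description: "Approve a gate" }];

const stale = snapshotSkill(() => server);
const live = liveSkill(() => server);

// Rename the tool underneath both skills.
server = [{ name: "approve_gate", description: "Approve a gate" }];

console.log(stale()); // still reports the old name
console.log(live()); // reports the renamed tool
```

The cost of the live version is one extra discovery call per run, which is cheap next to a skill that silently calls a tool that no longer exists.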

The line I’d staple to the wall is “Fix the switches first, then label them.” It is really an argument for taking your functional definitions seriously (the name and the single, honest job each tool does) before you spend a sentence describing them. A tool called process_data that quietly does three things is a bug you’ve decided to document instead of fix. And because the description is the contract every agent and skill reads at runtime, a stale or flattering one isn’t a cosmetic problem: it actively steers callers wrong. Treat tool definitions like code, because they are: name first, description second, and the muddled verb gets fixed before you tidy its prose.

The part nobody mentions: it’s a development tool

Everything above is about user experience, and it’s real. But it buries the lede on what actually made the skill worth it.

The skill paid for itself during construction, not after. When you’re building a forty-five-tool MCP server, the bottleneck isn’t writing the tools — it’s the inner loop of checking whether the agents are doing what you think they’re doing. Did the handoff actually carry the decisions? Is the steering directive landing on the next turn? Did the rework agent get the rejection reason? Answering each of those by hand-assembling the right MCP call, with the right IDs, is exactly the friction that started this whole article.

With the skill, I could ask in English and get the answer. /tb show me the last handoff. /tb is graph X stuck. It turned a thirty-second “wait, what’s that param” detour into a reflex, and over a debugging session that adds up to hours. I stopped thinking about the tool surface and started thinking about the actual bug.

The boring documentation is the part that compounds

Here’s the thing I’d tell past-me. Writing a .describe() on every field of forty-five tools is tedious, unglamorous work. It feels like the chore you do after the interesting part. It is, in fact, the part with the hidden payoff.

Because I’d done it, the skill cost me almost nothing — it was a discovery call and a router. The expensive thing was already paid for, in small boring increments, while I was building the tools anyway. Later, when I wrapped the same surface in a plain REST control plane — no AI in that layer, just switches — that was cheap for the same reason: the descriptions, the schemas, the enums were all already there to build a UI against. The documentation didn’t pay out once. It kept paying.
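To make the "build a UI against the schemas" point concrete, here is a minimal sketch that turns a tool's input schema into form-field descriptors: enums become selects, everything else becomes a text input, and the description becomes the label. The `Field` type and `toFields` are assumptions made for illustration, not the control plane's actual code.

```typescript
interface PropertySchema {
  type?: string;
  description?: string;
  enum?: string[];
  default?: unknown;
}

interface ToolDef {
  name: string;
  inputSchema: {
    properties?: Record<string, PropertySchema>;
    required?: string[];
  };
}

type Field =
  | { kind: "select"; name: string; label: string; required: boolean; options: string[]; initial: string | null }
  | { kind: "text"; name: string; label: string; required: boolean };

// One field per parameter, driven entirely by what the schema already says.
function toFields(tool: ToolDef): Field[] {
  const props = tool.inputSchema.properties ?? {};
  const required = tool.inputSchema.required ?? [];
  return Object.keys(props).map((name): Field => {
    const schema = props[name];
    const label = schema.description ?? name; // fall back to the raw name
    const isRequired = required.indexOf(name) !== -1;
    if (schema.enum) {
      const initial = typeof schema.default === "string" ? schema.default : null;
      return { kind: "select", name, label, required: isRequired, options: schema.enum, initial };
    }
    return { kind: "text", name, label, required: isRequired };
  });
}

const sendMessage: ToolDef = {
  name: "send_message",
  inputSchema: {
    properties: {
      to: { type: "string", description: "Session ID or role name of the recipient" },
      type: { type: "string", enum: ["task", "message", "status", "directive"], default: "message" },
      body: { type: "string", description: "The message content" },
    },
    required: ["to", "body"],
  },
};

console.log(toFields(sendMessage));
```

A parameter with no description falls back to its raw name as the label, which is exactly the unlabelled-switch problem showing up in a UI.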

That’s the inversion. I’d always treated tool docs as a tax on shipping. They’re closer to an investment that quietly funds everything you build on top of the server next — the skill, the UI, the next agent, the you-in-six-months who’s forgotten how any of it works.

Where the line sits

So, the rule of thumb I went looking for:

  1. MCP tools are thin actions that describe themselves. One verb, one job, every parameter documented, every legal value enumerated. Write them as if the only reader is a model seeing them for the first time at runtime — because it is.
  2. Skills are the orchestration knowledge — the order, the judgement, the “call this before that.” If you’re tempted to write a tool whose job is to call three other tools in sequence, that sequence probably wants to be a skill, not a tool.
  3. The handoff between them is tools/list. Keep the skill thin by having it read the live catalog instead of hardcoding the surface. Then your two layers can’t drift, because one of them doesn’t hold a copy of the other.
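When a skill does need to pin an order, the second and third rules can still work together: write the sequence down in the skill, then check it against the live catalog on every run so drift shows up immediately. This is a hedged sketch. The `Skill` and `Step` types and the `start-implementation` skill are hypothetical, and unlike /tb this one names tools explicitly; only the ordering itself comes from the declare_intent description.

```typescript
interface Step {
  tool: string;
  why: string;
}

interface Skill {
  name: string;
  steps: Step[];
}

// Hypothetical skill: the order follows the TIP in the declare_intent description.
const startImplementation: Skill = {
  name: "start-implementation",
  steps: [
    { tool: "set_status", why: "announce that implementation is starting" },
    { tool: "declare_intent", why: "register planned file changes before writing code" },
  ],
};

// Steps whose tool is absent from the freshly discovered catalog.
function missingTools(skill: Skill, catalog: { name: string }[]): string[] {
  const known = catalog.map((t) => t.name);
  return skill.steps.map((s) => s.tool).filter((tool) => known.indexOf(tool) === -1);
}

// Pretend declare_intent was renamed on the server since the skill was written.
const liveCatalog = [{ name: "set_status" }, { name: "declare_file_intent" }];

console.log(missingTools(startImplementation, liveCatalog)); // declare_intent is reported
```

The sequence lives in the skill, the verbs stay in the server, and the check turns silent drift into a visible error.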

If you want the protocol-level details, the MCP spec on tools is the authoritative description of the discovery contract, and there are decent implementation and best-practice guides for writing the descriptions themselves.

Fifty tool calls to drive a workflow isn’t a sign you have too many tools. It’s a sign you’ve made a human do a skill’s job by hand. Write the boring docs, let a thin skill read them, and the sprawl collapses on its own.

Jacques Bronkhorst
Principal engineer who ships across the stack — enterprise .NET by day, an over-engineered home lab by night. Writes it all down at jcqb.dev.