Building MCPs is not hard
MCP servers aren't hard to build. Here's how I approach building new ones.
MCP servers are not hard to build. Most open-source MCP servers we see online are half-baked REST wrappers, and that’s where people miss the point. MCP tools are AI-native APIs: they exist to enhance a model’s capabilities, not to mirror a human-oriented REST API one endpoint at a time. Building high-quality MCP servers is straightforward enough if you know what you’re optimizing for.
On Tools
High-quality tools are essential if an agent is going to perform an action well. Poorly designed tools (e.g. REST wrappers) are not ideal because they’re designed for humans to use, not AI. I recently made yet another Unsplash MCP server because I was unsatisfied with the ones in the wild. Here’s what I look for when building one.
Dual-transport
I support both stdio and stateless HTTP. Few MCP servers offer both. Often for local development we’re forced into stdio, with no way to use HTTP unless we fork and modify the source.
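One way to offer both transports from a single entry point is to pick one at startup. The sketch below is dependency-free and only covers the selection logic. The `--transport` flag, the `MCP_TRANSPORT` and `PORT` variables, and the `pickTransport` name are assumptions for illustration, not part of any SDK or of the Unsplash server.

```typescript
// Hypothetical startup config: which transport to serve, and on which port.
type TransportConfig =
  | { kind: "stdio" }
  | { kind: "http"; port: number };

// Assumed conventions (not from any SDK): a `--transport=<name>` flag wins
// over the MCP_TRANSPORT variable, and stdio is the default.
function pickTransport(
  argv: string[],
  env: Record<string, string | undefined>,
): TransportConfig {
  const flag = argv.find((arg) => arg.startsWith("--transport="));
  const choice = (flag?.split("=")[1] ?? env["MCP_TRANSPORT"] ?? "stdio").toLowerCase();

  if (choice === "stdio") return { kind: "stdio" };

  if (choice === "http") {
    // Assumed default port; any free port would do.
    const port = Number(env["PORT"] ?? "3000");
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`Invalid PORT: ${env["PORT"]}`);
    }
    return { kind: "http", port };
  }

  throw new Error(`Unknown transport "${choice}" (expected "stdio" or "http")`);
}
```

In a real server, each branch would then construct the matching transport from whichever MCP SDK is in use. Keeping the choice in one small function means nobody has to fork the source to switch.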
Write for the model
Tools should be self-explanatory: clear descriptions, clear input schemas. If you can easily understand what the inputs are, the model can too. I validate inputs (and outputs, when it makes sense) with zod via a typed SDK. Adds some package weight, but worth it imo 😅.
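As a rough illustration of a self-explanatory tool, here is a dependency-free sketch. At the protocol level an MCP tool carries a name, a description, and a JSON Schema for its input. The tool name, field descriptions, and limits are hypothetical, and the hand-rolled `parseSearchInput` is only a stand-in for the check a schema library such as zod would perform.

```typescript
// Hypothetical tool definition. The name, fields, and limits are illustrative,
// not taken from the Unsplash server.
const searchPhotosTool = {
  name: "unsplash_search_photos",
  description:
    "Search Unsplash photos by keyword. Use this when the user describes the " +
    "subject of a photo. Returns at most `per_page` results.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Keywords, e.g. 'foggy forest'" },
      per_page: {
        type: "integer",
        minimum: 1,
        maximum: 30,
        default: 10,
        description: "How many photos to return (1-30)",
      },
    },
    required: ["query"],
  },
} as const;

type SearchInput = { query: string; per_page: number };

// Hand-rolled stand-in for schema validation: rejects bad input early with a
// message the model can act on, and fills in the default for per_page.
function parseSearchInput(raw: unknown): SearchInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("input must be an object");
  }
  const { query, per_page = 10 } = raw as Record<string, unknown>;
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  if (
    typeof per_page !== "number" ||
    !Number.isInteger(per_page) ||
    per_page < 1 ||
    per_page > 30
  ) {
    throw new Error("per_page must be an integer between 1 and 30");
  }
  return { query, per_page };
}
```

The description says when to use the tool, not just what it does, and every field states its own constraints, so the model does not have to guess.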
Normalize before you return
Filter out any extra cruft not needed for the LLM and normalize the response so that it’s obvious to
the LLM where the critical data is. Here’s a truncated look at Unsplash’s GET /photos/random
response:
    {
      "id": "Dwu85P9SOIk",
      "created_at": "2016-05-03T11:00:28-04:00",
      "updated_at": "2016-07-10T11:00:01-05:00",
      "width": 2448,
      "height": 3264,
      "color": "#6E633A",
      "blur_hash": "LFC$yHwc8^$yIAS$%M%00KxukYIp",
      "downloads": 1345,
      "description": "A man drinking a coffee.",
      "exif": {
        // ... photo metadata
      },
      "location": {
        // ... location data
      },
      "current_user_collections": [
        // The *current user's* collections that this photo belongs to.
        // ... more collections
      ],
      "urls": {
        "raw": "https://images.unsplash.com/photo-1417325384643-aac51acc9e5d"
        // ... more urls
      },
      "links": {
        "self": "https://api.unsplash.com/photos/Dwu85P9SOIk",
        "html": "https://unsplash.com/photos/Dwu85P9SOIk",
        "download": "https://api.unsplash.com/photos/Dwu85P9SOIk/download",
        "download_location": "https://api.unsplash.com/photos/Dwu85P9SOIk/download"
      },
      "user": {
        // ... user data
      }
    }

Lengthy, no? There’s a lot of extra cruft in here that we likely don’t need. Field-level trimming keeps extra junk out of the agent context. Workflows go further, and that’s where the real leverage is.
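Before moving on to workflows, here is a minimal sketch of the field-level trimming described above. The `RawPhoto` type covers only the fields of the response that the sketch reads. The `SlimPhoto` shape and the `normalizePhoto` name are assumptions for illustration, not the server's actual output format.

```typescript
// Subset of the raw Unsplash photo fields that this sketch reads.
interface RawPhoto {
  id: string;
  width: number;
  height: number;
  description?: string | null;
  downloads?: number;
  urls?: { raw?: string };
  links?: { html?: string };
  [extra: string]: unknown; // exif, blur_hash, location, ... are ignored
}

// Hypothetical trimmed shape: only what a model is likely to use.
interface SlimPhoto {
  id: string;
  description: string | null;
  dimensions: string;
  downloads: number | null;
  image_url: string | null;
  web_url: string | null;
}

// Copies the useful fields, flattens nested ones, and turns missing values
// into explicit nulls so the shape is the same for every photo.
function normalizePhoto(raw: RawPhoto): SlimPhoto {
  return {
    id: raw.id,
    description: raw.description ?? null,
    dimensions: `${raw.width}x${raw.height}`,
    downloads: raw.downloads ?? null,
    image_url: raw.urls?.raw ?? null,
    web_url: raw.links?.html ?? null,
  };
}
```

Flattening `urls.raw` and `links.html` into top-level fields means the model does not have to dig through nested objects to find the two links it usually wants.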
On Workflows
I call these workflows in my MCP skills. A workflow stitches together multiple API calls into a single tool, not unlike a process you’d follow in real life. Consider installing a new faucet in your kitchen. You’d start by turning off the water, then removing the old faucet, then installing the new one, then turning the water back on, and finally checking for leaks.
How to design one
Creating AI-native tools takes a similar approach:
- Discover common patterns and use cases for the API
- Identify clusters of API calls we could compress into a single tool
- Build the tool that orchestrates the calls, in parallel when they’re independent
Agents are really good at recognizing patterns and doing research for us. This is exactly why I made a skill to do just that.
What it looks like in code
Consider prompting an agent using our Unsplash MCP to find information about a photographer. There’s
no specific API to do this, and without a dedicated tool an agent needs to discover and call
get_users_by_username, get_users_by_username_photos, get_users_by_username_collections, and
get_users_by_username_statistics. Four tool calls, four chances to get the order wrong. With a
new workflow tool (unsplash_photographer_portfolio), one call does it in parallel and returns
something like this:
    {
      "summary": "Dennis Thompson (@djthoms) — 42 photos, 3 collections, 1,204 total downloads.",
      "profile": {
        "username": "djthoms",
        "name": "Dennis Thompson",
        "bio": "...",
        "location": "San Francisco, CA",
        "total_photos": 42,
        "total_collections": 3,
        "downloads": 1204,
        "followers": 89,
        "web_url": "https://unsplash.com/@djthoms"
      },
      "top_photos": [
        {
          "id": "Dwu85P9SOIk",
          "description": "A man drinking a coffee.",
          "dimensions": "2448x3264",
          "likes": 1345,
          "urls": {
            "regular": "https://images.unsplash.com/...",
            "small": "https://images.unsplash.com/...",
            "thumb": "https://images.unsplash.com/..."
          },
          "web_url": "https://unsplash.com/photos/Dwu85P9SOIk"
        }
      ],
      "collections": [{ "id": "1234", "title": "Street", "total_photos": 12 }],
      "statistics": {
        // ... download/view trends
      }
    }

Same underlying API. One tool call, a summary the model can quote immediately, and no exif,
blur_hash, or current_user_collections noise. If one upstream call fails, the response still
includes what loaded plus a partial_failure array, so the agent can work with what’s there instead
of bailing entirely.
Here’s roughly how it’s built: registration, parallel fetch, normalize, summarize (full source):
    registerReadTool(
      server,
      "unsplash_photographer_portfolio",
      {
        title: "Photographer Portfolio",
        description:
          "Get a complete 360-degree view of an Unsplash photographer: profile, top photos, " +
          "collections, and download/view statistics. Combines 4 API calls in parallel. " +
          "Use this instead of making separate calls for profile, photos, collections, and stats.",
        inputSchema: {
          username: z.string().describe("Unsplash username"),
          photos_per_page: z.number().int().min(1).max(30).default(10).optional(),
        },
      },
      async (params) => {
        const [profile, photos, collections, statistics] = await callApiAll(
          getUsersByUsername({ path: { username: params.username } }),
          getUsersByUsernamePhotos({
            path: { username: params.username },
            query: { /* ... */ },
          }),
          getUsersByUsernameCollections({ path: { username: params.username } }),
          getUsersByUsernameStatistics({ path: { username: params.username } }),
        );

        const formattedPhotos = photos?.map((p) => ({
          id: p.id,
          description: p.description ?? p.alt_description,
          dimensions: `${p.width}x${p.height}`,
          urls: { regular: p.urls?.regular, small: p.urls?.small, thumb: p.urls?.thumb },
          web_url: p.links?.html,
        }));

        return jsonResponse({
          summary: `${profile.name} (@${profile.username}) — ${profile.total_photos} photos, ...`,
          profile: { /* trimmed profile fields */ },
          top_photos: formattedPhotos,
          collections: collections?.map(/* id, title, total_photos */),
          statistics,
          // partial_failure if any callApiAll request fails
        });
      },
    );

The interesting parts are the description (tells the model when to reach for this tool),
callApiAll (parallel + fault-tolerant), and the shape of what comes back (summary first,
then structured sections).
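The source of callApiAll is not shown here, so the following is only a sketch of how a "parallel + fault-tolerant" helper could behave. The `settleAll` name, the labelled-request shape, and the exact `partial_failure` entries are assumptions, not the author's implementation.

```typescript
// Hypothetical helper, not the author's callApiAll: runs labelled requests in
// parallel and records failures instead of throwing.
type Labelled<T> = { label: string; request: Promise<T> };
type Failure = { label: string; error: string };

async function settleAll<T>(
  calls: Labelled<T>[],
): Promise<{ results: (T | undefined)[]; partial_failure: Failure[] }> {
  const partial_failure: Failure[] = [];

  // Each request gets its own try/catch, so one rejection cannot reject the
  // surrounding Promise.all. Results keep the same order as `calls`.
  const results = await Promise.all(
    calls.map(async ({ label, request }): Promise<T | undefined> => {
      try {
        return await request;
      } catch (err) {
        partial_failure.push({
          label,
          error: err instanceof Error ? err.message : String(err),
        });
        return undefined; // the caller sees a hole, not an exception
      }
    }),
  );

  return { results, partial_failure };
}
```

A failed request shows up as `undefined` in its slot, which is why optional chaining such as `photos?.map(...)` is a natural fit on the calling side. The `partial_failure` list can be passed straight through in the tool response so the agent knows which section is missing and why.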
Putting it to work
For our Unsplash MCP server, we have two workflow tools:
- unsplash_photographer_portfolio: get a 360-degree view of a photographer’s portfolio
- unsplash_collection_overview: get a 360-degree view of a collection
Sure, we can let the agent figure out which tools to use and in what order, or we can create a curated tool that does the right thing. Workflows, I believe, are the missing piece in most MCP servers. They create a more natural flow for an agent to interact with.
With curated tools we can ask an agent questions like:
For the Unsplash profile @djthoms review their profile and give me a sense of their overall vibe.
And in a single tool call we collect relevant info for an agent to summarize.
The plumbing isn’t the hard part. Wiring up transport and registering tools takes an afternoon, or 15 minutes with an agent. The whole game is thinking like the agent: what does it actually need, and can you deliver that in one shot? Pick the task you keep asking agents to do, bundle the API calls, trim the response, and write a description that tells the model when to reach for it. Auth and rate limits still matter, but they’re not where most MCP servers go wrong.
If you want to see the full approach in the wild, the code is in unsplash-mcp and the workflow pattern lives in mcp-skills.