The Harness: the infrastructure that makes an agent work

Day 4 / 60

I've been waiting a long time to talk about this. Harness is one of those concepts that seems technical and dense at first, but once you understand it, it changes how you think about agents. The term came into my life recently, but it's quickly becoming popular in the AI agents community on X.

What is the Harness?

The harness is the infrastructure that surrounds the model so the agent can run long-duration tasks. It's not the agent itself. It's how the agent operates.

The simplest metaphor I can think of:

I want to build a car. The engine is the LLM. The harness is everything else — the steering wheel, the wheels, the brakes, the body — and above all, how they're interconnected to make the car work.

For those who like computers, there's a better metaphor: processors are the models, RAM is context management, and the rest — the motherboard, fans, keyboard, screen and how everything is connected — is the harness.

Agent Harness diagram

Source: philschmid.de

In an agent, the harness includes:

The prompts it uses
The available tools or skills
The logic for choosing and executing those tools
Context management
State
Planning capabilities
Memory
Guardrails
And a bunch of things I'm probably forgetting

Nine agents, nine different harnesses

I've built several agents over time:

Billi, Tsukasa, A0x, Jesse XBT (in its early phase), Tomás, Claudio, Felipe, Pedro and Don Nelson.

Each one had a different harness. Not because they do different things, but because my way of building agents has changed a lot. Each iteration I learned something I didn't know before.

What the industry teaches us

There are three examples I find very revealing:

Manus — one of the first recognized public agents, refactored its harness 5 times in 6 months.

LangChain changed its harness 3 times in a year.

Vercel eliminated 80% of its tools to respond faster and spend fewer tokens.

The conclusion is simple: the harness needs to be as lightweight and modular as possible. Each new model behaves differently. The harness that works today may not work tomorrow.

Some advice if you're starting out

If you're building the harness for your agent or agentic flow, there are three things I think make the difference from the start:

Keep it simple. Simple is better. But simple is hard.

Make it modular. Models change fast. If your harness is coupled to a specific version of something, you'll have to redo the whole thing when the model changes. It's happened to me :(.

Prompts are no longer a competitive advantage. The trajectory your harness captures is. Every time your agent fails to follow an instruction is an opportunity to improve. That accumulates and over time becomes very hard to replicate.

Progress: Don Nelson on trial

In the first post I mentioned three things I needed to solve to get to the CEN.

The first, and for me the hardest, is already resolved: I have a client who gave me a real scenario letter. I now have a concrete case study.

The DIgSILENT license is just days away from being resolved. From there, the pace depends only on me.

In the next posts I'll go directly into Don Nelson's harness — how it's built today, what I've learned from it, and what I'm going to change so it can submit a study that the CEN approves.

Some references if you want to learn more about Harness: