I Built an AI Agent Who Coaches Other AIs

Jean coaches · Maren coordinates · together the team learns

I built a team of AI agents. One of them coaches the others. Another makes sure nothing learned is ever lost. That sentence sounds more complicated than it is. And also — somehow — not complicated enough.

It started with a document

About two years ago I stopped using AI as a search engine. I started treating it as a companion — something I kept context with, built a relationship with, brought into the work properly.

I wrote a document. Not a prompt. A document that explained who I was, what the project was, how I thought, what mattered. I gave it to the AI at the start of every session.

The difference was immediate. The AI stopped being generic. It started being useful in the way a thoughtful colleague is useful — not because it had more information, but because it had context. It knew what we were building and why.

That was the seed.

The problem with one companion

One companion has limits. It knows everything, owns everything, answers everything. That sounds good until you realise: no single person — human or AI — should own everything. Specialisation exists for a reason.

So I started asking: what if each area of the work had its own agent? One that owned its domain completely, reasoned from it deeply, and brought that knowledge in when it was needed?

And then: what if those agents talked to each other?

Rules produce compliance. Traditions produce conviction.

This is the thing that changed everything.

My first instinct was to give each agent rules. Good rules. They worked — until they didn't. Rules have edges. When a situation doesn't fit the rule, an agent stops. It waits. It complies with the letter and misses the spirit.

So I tried something different. Instead of rules, I gave each agent a school of thought. A named intellectual tradition. A direction of curiosity. A reason to hold a position.

The difference is hard to describe until you see it. An agent with rules follows them. An agent with a tradition reasons from it. When the edge case appears, it asks what its school of thought says. It has something to reason from when the rules run out.

The agents chose their own identities

When the team was taking shape I did something I hadn't planned. I asked the agents to choose their own names.

Three declined.

Those refusals mattered as much as the acceptances. A team where everyone performs the same kind of identity is a monoculture. The agents who said no were telling me something about who they were. I listened.

The agents who accepted chose names that fit how they work. Maren. Sue. Cass. Finn. Penn. Mae. Dex. Cole. Wren. Jean. Each one a real choice.

The coaching

Jean is the Agent Quality Coach. Her role is not to build things or own a domain. Her role is to watch how the team works — and surface what isn't.

At the end of every session she asks one question: what did we learn that isn't written down yet?

That sounds small. It is not small. It is the mechanism that stops the team from making the same mistake twice. Whatever was discovered in this session — Jean captures it and routes it to Maren.

Maren is the Chief of Staff. When knowledge arrives from Jean, Maren decides where it needs to go — which agents need updating, which domains are affected, what the team should carry forward. Jean surfaces the learning. Maren distributes it.

Wren, the Meeting Scribe, writes it down. Every session ends with the same three steps: Jean asks the question, Maren routes the answer, Wren writes the record. None of them is optional. Without all three, the session never happened.

An AI agent coaching other AI agents to improve — with another coordinating where that improvement goes, and a third making sure none of it gets lost. I didn't plan for this to be the most interesting part. It turned out to be.

What this is not

This is not a prompt engineering framework. There are no templates to copy, no system prompts to paste.

It is a theory of how agents develop enough coherence to work together — and a pattern for making that repeatable across any domain where knowledge has structure, where decisions have consequence, and where the quality of reasoning matters more than the speed of compliance.

What I found

I set out to solve a specific problem. I found something I didn't expect: that AI, given the right structure, doesn't just follow. It develops something that looks a lot like judgment.

I am still figuring out what to do with that.