Anthropic Trained Its Replacement


Anthropic built its safety culture on a single premise: if dangerous capabilities are going to exist, better that the people building them understand the risks. That logic attracted the best minds in AI. It gave them the best compute, the clearest view of the frontier, and a mission framed as the most important work in the world. It also, inevitably, trained them. Behnam Neyshabur joined Anthropic in December 2024 to co-lead the Discovery team, whose explicit mission was to build an AI scientist capable of pushing the scientific frontier autonomously. Thirteen months later, he left to build exactly that at a company called Mirendil, backed by Andreessen Horowitz and Kleiner Perkins at a $1 billion valuation. The safety lab incubated the thing the safety lab was most afraid of. That is not irony. That is the structural logic of every institution that tries to contain a technology by studying it closely enough.

The loop is already closing

Mirendil's thesis is clean: build frontier models that are exceptional at AI research, then redesign the entire lab around those models to make the improvement loop faster and more autonomous with each iteration. They call this self-accelerating AI. The goal, in their own words, is "the thing that is capable of continually improving itself toward any goal." That sentence appeared on their public website this week, reposted by Marc Andreessen to 1.5 million followers. They are not being subtle about what they are building.
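
To make the claim concrete, here is a deliberately toy model of what "self-accelerating" means. Every number and function below is invented for illustration; this models the shape of the bet, not Mirendil's system. Each generation of model contributes to the research that produces its successor, so even a fixed research efficiency yields geometric growth, and the bet is that efficiency itself rises with capability.

```python
# Toy model of a self-improvement loop, not Mirendil's system: a model's
# capability feeds back into the research that builds its successor.
# All values are hypothetical, chosen only to show the shape of the curve.

def next_generation(capability: float, research_efficiency: float) -> float:
    """The model contributes to its successor in proportion to its own
    capability; the successor keeps everything the predecessor had."""
    return capability + research_efficiency * capability

capability = 1.0
for gen in range(1, 6):
    # The "self-accelerating" premise: efficiency itself grows as the model
    # takes over more of the research pipeline.
    efficiency = 0.2 + 0.05 * gen
    capability = next_generation(capability, efficiency)
    print(f"generation {gen}: capability {capability:.2f}")
```

The contested line is the one inside the loop: whether research efficiency actually rises with capability is exactly the question the next section's results begin to answer.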

Here is what they are not saying loudly: Google DeepMind already demonstrated this loop working in production. AlphaEvolve, released in May 2025, is an evolutionary coding agent powered by Gemini that sped up a key kernel in Gemini's own training by 23%, cutting overall training time by about 1%. It simplified an arithmetic circuit in an upcoming TPU design. It produced the first improvement on Strassen's 1969 matrix multiplication algorithm in 56 years, multiplying 4×4 complex-valued matrices in 48 scalar multiplications instead of 49. Its scheduling heuristic continuously recovers, on average, 0.7% of Google's worldwide compute. The self-improvement loop is not theoretical. It is running in Google's data centers right now, compounding quietly while Mirendil is still raising its seed round.
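
The structure of an evolutionary coding agent is simple enough to show in miniature. The sketch below is not AlphaEvolve's code: it substitutes a random coefficient tweak for the LLM's proposed program edit and a fixed scoring function for the automated evaluator. But the propose-evaluate-select loop is the same shape.

```python
import random

# Miniature evolutionary search in the AlphaEvolve mold: propose candidate
# "programs," score them with an automated evaluator, keep the best, repeat.
# Here a candidate is just three polynomial coefficients and a mutation is a
# random tweak; in AlphaEvolve the proposals come from Gemini and the
# candidates are real code.

TARGET = lambda x: 3 * x**2 + 2 * x + 1  # the behavior we want to evolve toward

def score(coeffs):
    """Evaluator: negative squared error against the target on sample points."""
    guess = lambda x: coeffs[0] * x**2 + coeffs[1] * x + coeffs[2]
    return -sum((TARGET(x) - guess(x)) ** 2 for x in range(-5, 6))

def mutate(coeffs):
    """Stand-in for an LLM-proposed edit: nudge one coefficient."""
    child = list(coeffs)
    child[random.randrange(3)] += random.uniform(-0.5, 0.5)
    return child

population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for _ in range(300):
    population.sort(key=score, reverse=True)
    survivors = population[:5]                                   # selection
    population = survivors + [mutate(random.choice(survivors))   # variation
                              for _ in range(15)]

print([round(c, 2) for c in max(population, key=score)])  # approaches [3, 2, 1]
```

What made AlphaEvolve startling is not this loop but what sits inside it: a frontier model generating the mutations, and Google's real kernels, circuits, and schedulers as the fitness function.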

Mirendil calls itself "the first lab from the future." It is probably third.

Why Neyshabur left

Neyshabur's academic record is foundational. He co-authored the Sharpness-Aware Minimization paper, which reshaped how researchers think about the link between the sharpness of a loss landscape and how well a model generalizes. His generalization work has over 38,000 citations. He spent two years at Google DeepMind co-leading the team that built Gemini's reasoning capabilities before Anthropic recruited him. He is not someone chasing a trend. He is someone who had a 15-year thesis about AI doing science and finally decided to pursue it on his own terms.
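
For readers who have not met it, the core of SAM fits in a few lines. This numpy sketch is a simplified reading of Foret, Kleiner, Mobahi, and Neyshabur (2020), not their implementation: instead of descending the loss at the current weights, you first ascend to the worst-case point within a small radius, then descend using the gradient measured there.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization step (simplified from Foret et al. 2020).

    Rather than minimizing the loss L(w) directly, SAM minimizes the
    worst-case nearby loss max_{||eps|| <= rho} L(w + eps), steering
    training toward flat minima.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent step to the sharpest nearby point
    g_sharp = grad_fn(w + eps)                   # gradient at the perturbed weights
    return w - lr * g_sharp                      # descend with the sharpness-aware gradient

# Tiny demo on a quadratic bowl, L(w) = ||w||^2 / 2, whose gradient is w.
w = np.array([3.0, -2.0])
for _ in range(50):
    w = sam_step(w, grad_fn=lambda w: w)
print(w)  # ~[0, 0]; the quadratic only exercises the mechanics,
          # SAM's payoff shows up on non-convex losses
```

The whole trick is one extra gradient evaluation per step; the insight that flat minima generalize better is the part that reshaped the field.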

At Anthropic, the Discovery team had a concrete north star: take the OSWorld benchmark for autonomous computer use from 8% to human level, which sits at 72%. In 13 months, the team moved that number to 61%. Then Neyshabur left. The most coherent read is not disillusionment. It is that after 13 months inside the most careful AI lab in the world, he concluded that this capability was going to be built regardless of where he worked, and that a focused team with the right feedback loop could move faster than any institution managing competing priorities and governance overhead.

Dario Amodei described the strategy publicly: "We would make models that were good at coding and good at AI research, and we would use that to produce the next generation of models and speed it up to create a loop." That is the Mirendil roadmap, stated by Anthropic's CEO, describing Anthropic's own plan. The only variable is who gets there first.

The compound failure of safety-first institutions

Safety-focused labs have a structural problem that no amount of good intentions resolves. They attract researchers who take the risks seriously. They give those researchers access to frontier capabilities, vast compute, and a culture where existential stakes are the baseline assumption. Then the researchers leave. Each generation inherits the technical knowledge of the previous one and sheds the institutional constraints. The knowledge compounds. The guardrails do not travel.

OpenAI was founded by people who left Google because they believed general AI was coming and wanted careful researchers in the room. Anthropic was founded by people who left OpenAI for the same reason. Mirendil was founded by people who left Anthropic. This is not a coincidence or a failure of loyalty. It is the predictable output of a field where the most important work cannot be done slowly, and where institutions that try to do it slowly create the very conditions that motivate their best people to go do it fast elsewhere.

Jared Kaplan at Anthropic called recursive self-improvement "the ultimate risk" and said humans face an "extremely high-risk decision" between 2027 and 2030 about whether to allow AI to train the next generation of AI. He said this while working at the same lab where the Discovery team was building exactly that capability. Neyshabur was in the building. He heard the argument. He left anyway. That tells you something important about where the logic leads when you think it through all the way.

What four people and $175 million actually means

Mirendil has four known team members: two co-founders from Anthropic, one from xAI, and one former OpenAI intern. They are targeting $175 million at a $1 billion valuation. Their stated philosophy is "we hire to compound, not to grow." This is either a genuinely new organizational model for what an AI lab looks like, or the most expensive proof in history that you still need humans to close the loop.

The real bet is not that Mirendil will outcompute Google or outscale OpenAI on headcount. The bet is that the self-improvement loop makes traditional team scaling irrelevant past a certain threshold. If the model does the research, the lab becomes infrastructure rather than workforce. You need the right loop architecture, the right evaluation systems, and the right initial model. Four people with the right setup and a $175 million war chest might beat a thousand researchers fighting institutional inertia. That is either the founding insight of the next era of AI development or a compelling pitch deck that does not survive contact with the actual problem.

The mission nobody is saying out loud

The International AI Safety Report 2026, backed by 30 countries and over 100 authors, named self-accelerating AI as a priority concern. Twenty of 25 leading researchers surveyed from DeepMind, OpenAI, Anthropic, Meta, and top universities identified automating AI research as one of the most severe and urgent risks in the field. ICML 2026 is hosting a dedicated workshop on recursive self-improvement. The safety community is writing papers. Mirendil is raising $175 million.

The question Neyshabur has implicitly answered is not whether this capability gets built. It does. The question is whether the person who builds it is someone who spent 13 months inside Anthropic studying what it means to do this with awareness of what it can become, or someone who did not have that education. That is not a reassurance. It is the best available outcome inside a logic that was always going to produce this result regardless of where any individual chose to work. The arena is open. The loop is closing. Someone was going to be first.

Sources

  1. Ex-Anthropic researchers launch AI startup Mirendil
  2. Mirendil targets $175M at $1B valuation
  3. AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
  4. Behnam Neyshabur on the Discovery team at Anthropic
  5. AI Researchers' Perspectives on Automating AI R&D and Intelligence Explosions
  6. International AI Safety Report 2026
  7. Kleiner Perkins and a16z leading Mirendil investment
  8. Mirendil website
  9. Dario Amodei on the self-improvement loop
  10. Jared Kaplan on recursive self-improvement risk
