
Anthropic Co-Founder Puts 60% Odds on AI Training Its Own Successor by 2028
On May 4, 2026, Anthropic co-founder Jack Clark published what may be one of the most consequential forecasts in AI history. In Issue 455 of his Import AI newsletter, Clark argued that the building blocks for fully automated AI research and development (systems capable of training their own successors without any human involvement) are largely already in place. He put the probability of this occurring by the end of 2028 at 60% or higher, and at 30% by the end of 2027. The essay arrives at a pivotal moment for Anthropic, a company that has staked its identity on responsible AI development, and it is drawing significant debate across the AI research community.
The Case for Automated AI R&D: What the Benchmarks Show
Clark's forecast is not built on intuition alone. It is grounded in a detailed reading of publicly available benchmark data that paints a striking picture of how quickly AI systems have improved at tasks that form the core of AI research itself.
Perhaps the most dramatic internal data point involves Anthropic's own CPU optimization benchmark, which tasks AI models with making a CPU-only small language model training implementation run as fast as possible. On this benchmark, Claude Opus 4 achieved a 2.9× mean speedup in May 2025. By November 2025, Opus 4.5 had pushed that figure to 16.5×. Opus 4.6 reached 30× in February 2026. Then, in April 2026, Claude Mythos Preview (a model Anthropic has declined to release publicly due to its cybersecurity capabilities) hit a 52× speedup. For context, Import AI notes that a human researcher typically needs four to eight hours of work to achieve a 4× speedup on the same task.
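Taken at face value, those four results trace a roughly exponential curve. The sketch below is a back-of-envelope calculation, not anything from Anthropic's analysis; it estimates the implied month-over-month growth between the first and last measurements:

```python
# Reported mean speedups on Anthropic's CPU optimization benchmark,
# as (model, months after May 2025, speedup); dates per the article.
points = [("Opus 4", 0, 2.9), ("Opus 4.5", 6, 16.5),
          ("Opus 4.6", 9, 30.0), ("Mythos Preview", 11, 52.0)]

# Implied average monthly growth factor, assuming (purely for
# illustration) a steady exponential trend across the whole period.
months = points[-1][1] - points[0][1]
factor = (points[-1][2] / points[0][2]) ** (1 / months)
print(f"~{factor:.2f}x speedup growth per month")  # ~1.30x per month
```

On these figures, the benchmark speedup grew by roughly 30% per month for eleven straight months.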
The gains are not limited to Anthropic's internal tests. On SWE-Bench, a widely used evaluation that measures whether AI can resolve real GitHub issues, the best model scored roughly 2% when the benchmark launched in 2023. Claude Mythos Preview reached 93.9% as of April 2026. On MLE-Bench, which tests performance in Kaggle-style machine learning competitions, the top AI score rose from 16.9% to 64.4%. CORE-Bench, which asks AI systems to reproduce the results of published research papers, was declared solved by one of its authors at 95.5% accuracy.
Clark also cited data from METR showing that the time horizon over which AI systems can reliably work without human intervention has grown from roughly 30 seconds in 2022, when GPT-3.5 was the frontier model, to approximately 12 hours in 2026 with Opus 4.6. That progression — from half a minute to half a workday of autonomous operation in four years — is central to his argument that the remaining gap to full AI R&D automation is narrowing rapidly.
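Those two endpoints imply a doubling time that can be checked with a quick calculation. A minimal sketch, assuming steady exponential growth across the whole period (a simplification; the underlying METR data is noisier than two endpoints suggest):

```python
import math

# Time horizons cited by Clark: ~30 seconds in 2022 (GPT-3.5)
# to ~12 hours in 2026 (Opus 4.6).
start_s = 30
end_s = 12 * 3600
years = 4

growth = end_s / start_s  # 1440x overall
doubling_months = 12 * years * math.log(2) / math.log(growth)
print(f"{growth:.0f}x over {years} years: doubling every "
      f"{doubling_months:.1f} months")  # ~4.6 months
```

On these numbers, the autonomous-task horizon doubles roughly every four and a half months.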
Separately, Anthropic published a proof-of-concept for Automated Alignment Research, described in Import AI Issue 454, in which a team of AI agents, given a research direction by a human researcher, autonomously tackled a safety research problem in scalable oversight. The AI team produced techniques that beat the Anthropic-designed human baseline.

Where the Gaps Remain — and Why Clark Is Still Reluctant
Clark is explicit that his forecast is not a confident declaration of inevitability. He acknowledges meaningful limits in current AI capabilities, and his choice of words throughout the essay reflects genuine unease with the implications of his own analysis.
One significant gap he identifies is paradigm-shifting creativity. Clark concedes that AI systems have not yet demonstrated the kind of foundational invention that produced architectures like the transformer or mixture-of-experts models. This is the primary reason he does not forecast automated AI R&D arriving in 2026. Generating novel research directions — as opposed to executing on directions already identified by humans — remains an open problem.
Performance on PostTrainBench, which measures how well AI systems can fine-tune open-weight models when scored against human-built instruct versions, also illustrates the remaining distance. As of early 2026, the best AI systems reached only about half the human score on that benchmark.
Clark also raises a sobering mathematical concern about recursive self-improvement and AI safety. According to his essay, a technique with 99.9% alignment accuracy degrades to roughly 95% accuracy after 50 recursive generations, and to around 60% after 500 generations. In a scenario where AI systems are iteratively building and training their successors, even a very high per-generation alignment success rate compounds into meaningful risk over time.
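The arithmetic behind those figures is simple compounding. A one-line check, with the key modeling assumption being that alignment failures are independent across generations:

```python
# Per-generation alignment success rate from Clark's essay.
p = 0.999

# Probability that every generation in a chain stays aligned.
for n in (50, 500):
    print(f"after {n} generations: {p ** n:.1%}")
# after 50 generations: 95.1%
# after 500 generations: 60.6%
```

The printed values match the essay's roughly 95% and roughly 60% figures.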
The forecast has not gone unchallenged. Skeptics, including the author of the Hash Collision Substack, put the probability of human-free recursive self-improvement by 2028 at under 10%, arguing that choosing research directions, designing experiments, and managing distributed training represent challenges far beyond what current benchmarks measure. The debate reflects a genuine divide in the AI research community about how much the discrete task improvements measured on benchmarks actually tell us about the full-stack autonomy required for genuine recursive self-improvement.
The Anthropic Institute and the Governance Question
Clark's essay does not exist in isolation. It is closely connected to Anthropic's launch of The Anthropic Institute on March 11, 2026, a new research arm that consolidates three existing Anthropic teams: the Frontier Red Team, Societal Impacts, and Economic Research. Clark now leads the Institute in his new role as Head of Public Benefit.
The Institute's stated mandate explicitly includes studying what governance mechanisms would be needed if recursive self-improvement in AI systems begins to occur. Founding hires include Matt Botvinick, a Resident Fellow at Yale Law School and former Senior Director of Research at Google DeepMind; Anton Korinek, a Professor of Economics currently on leave from the University of Virginia; and Zoë Hitzig, who previously studied AI's social and economic impacts at OpenAI.
Anthropic described the pace of its own progress in a corporate statement tied to the Institute's launch: "It took us two years to release our first commercial model, and just three more to develop models that can discover severe cybersecurity vulnerabilities, take on a wide range of real work, and even begin to accelerate the pace of AI development itself."
The Institute's formation signals that Anthropic is treating the governance challenge as urgent — not a future-state problem to be addressed if and when recursive self-improvement emerges, but a present-day research priority given how quickly the technical capabilities are advancing.

Industry Reactions and Independent Forecasts
Clark's essay has prompted at least one public update from within the AI forecasting community. AI researcher and forecaster Ryan Greenblatt doubled his own probability estimate for full AI R&D automation by the end of 2028, from 15% to 30%, in response to Clark's analysis, according to Import AI Issue 455.
Clark himself is careful to frame his position as reluctant rather than enthusiastic. In the newsletter, he wrote: "I'm writing this post because when I look at all the publicly available information I reluctantly come to the view that there's a likely chance (60%+) that no-human-involved AI R&D — an AI system powerful enough that it could plausibly autonomously build its own successor — happens by the end of 2028."
He elaborated on the weight of that conclusion: "It's a reluctant view because the implications are so large that I feel dwarfed by them, and I'm not sure society is ready for the kinds of changes implied by achieving automated AI R&D."
On what crossing that threshold would mean, Clark did not soften his language: "If that happens, we will cross a Rubicon into a nearly-impossible-to-forecast future."
He also described what a failure to reach that milestone by 2028 would imply: "If we don't see it by the end of 2028, then I think we will have revealed some fundamental deficiency within the current technological paradigm and it'll require human invention to move things forward."
Anthropic's Resource Position
The forecast comes as Anthropic is rapidly expanding its ability to train and run frontier models. On February 12, 2026, the company announced a $30 billion Series G funding round, bringing its post-money valuation to $380 billion. Court documents cited in press reports revealed that the company brought in over $5 billion in total commercial revenue while investing $10 billion in model training and inference.
On May 6, 2026, two days after Clark's essay was published, CNBC reported that Anthropic had signed a deal with SpaceX to use all compute capacity at SpaceX's Colossus 1 data center in Memphis, Tennessee: more than 300 megawatts in total. Greater compute access would directly support the kind of large-scale model training runs that recursive self-improvement scenarios depend on.

What Comes Next
The next two and a half years will be the empirical test of Clark's thesis. The benchmarks he cites — SWE-Bench, MLE-Bench, CORE-Bench, PostTrainBench, and METR's autonomous task horizon — will continue to be updated, and the trajectory of each will either support or undercut the case for automated AI R&D arriving by 2028.
The paradigm-shift question Clark raises — whether AI systems can identify genuinely novel research directions rather than executing on human-specified ones — may prove to be the decisive variable. If that capability emerges in measurable form before 2028, the probability estimates Clark and Greenblatt have put forward will look conservative in retrospect. If it does not, skeptics who place the probability under 10% may have identified the binding constraint.
The Anthropic Institute's research agenda will also be worth watching. Its explicit focus on governance for recursive self-improvement scenarios means that, whether or not the 2028 timeline holds, the policy and societal infrastructure questions Clark is raising are now an active area of institutional research rather than a speculative exercise.
For now, a co-founder of one of the world's most prominent AI safety organizations has put his name on a 60% probability that the most consequential threshold in AI history will be crossed within two and a half years. That alone is a signal worth taking seriously, and worth examining critically.