AdvantageWorks Team May 20, 2026 6 min read

Andrej Karpathy Joins Anthropic: New Pretraining Role Details

Andrej Karpathy Joins Anthropic: What the "Vibe Coding" Inventor's Move Means for the AI Frontier

Senior pretraining researchers are scarce. The same handful of people show up on every shortlist, and three or four labs are bidding for them. If your AI roadmap has stalled on hiring, you are not the outlier — Anthropic, OpenAI, and Google DeepMind are competing for the same names.

Andrej Karpathy is an AI researcher and founding member of OpenAI who joined Anthropic's pretraining team on May 19, 2026 (TechCrunch, 2026). He reports to team lead Nick Joseph and works on using Claude to accelerate the training of the next generation of Claude.

The hire matters for two reasons. It pulls another senior name into Anthropic at a moment when OpenAI is still losing executives, and it puts a researcher with strong public opinions about minimalist engineering to work on one of the most resource-intensive workloads in the industry: frontier pretraining.

What changed on May 19, 2026

Karpathy joined Anthropic's Pretraining Research team under Nick Joseph after pausing his education startup, Eureka Labs.
The team's stated mandate is to use Claude itself to accelerate the next generation of Claude — a recursive-self-improvement loop applied to model training.
His move follows departures from OpenAI by Ilya Sutskever (May 2024), John Schulman (August 2024, also to Anthropic), and Mira Murati (September 2024).
Karpathy's 2025 "vibe coding" framing — natural-language prompting as a first-class engineering primitive — continues to shape how teams describe AI-assisted development.
Anthropic intends to apply Karpathy's "Software 2.0" thinking to how Claude is used to train its successors.

Who is Andrej Karpathy? A Career Built on "Software 2.0"

You have probably used a model, read a paper, or watched a tutorial that traces back to Karpathy's work. The thread running through it is a preference for first principles: build the smallest thing that actually works, then explain why it works in public.

His career started in computer vision and NLP. As a PhD student at Stanford under Fei-Fei Li, he designed CS231n, the course that trained much of the first deep-learning cohort. He then joined OpenAI as a founding member.

In June 2017 he moved to Tesla as Senior Director of AI. That is where he wrote the original "Software 2.0" essay (Karpathy, 2017), arguing that neural networks were not just better algorithms — they were a different kind of software, compiled from data rather than written by hand. Under his leadership, Tesla's Autopilot vision stack moved from hand-tuned heuristics to a single learned model.

The Karpathy Career Timeline

Years	Organization	Primary Focus
2011 – 2015	Stanford University	PhD, Computer Vision & NLP (CS231n)
2015 – 2017	OpenAI	Founding Member, Research Scientist
2017 – 2022	Tesla	Sr. Director of AI (Autopilot Vision)
2023 – 2024	OpenAI	Midtraining & Synthetic Data
2024 – 2026	Eureka Labs	AI-Native Education
2026 – Pres.	Anthropic	Pretraining Research

Why the Anthropic Move Matters: The Strategic Analysis

Anthropic is not hiring a researcher to ship a feature. They are hiring a system architect to redesign the machines that build the models. The hire lands while Anthropic is in talks for a $30 billion funding round at a roughly $900 billion pre-money valuation (Bloomberg, 2026), on the back of the Claude 4 family of models.

The strategic frame is recursive self-improvement. A frontier model, in this case Claude, generates synthetic data, evaluates candidate architectures, and helps decide what to train next. Karpathy's open-source nanoGPT project was an exercise in stripping a GPT-2-class trainer down to roughly 600 lines split between model.py and train.py (GitHub, 2026). The same instinct (keep the loop legible, then make it faster) is what Anthropic now wants applied to a multibillion-dollar training run.

The flip side of this concentration is the rest of the market. As the strongest researchers cluster inside three or four labs, mid-market and enterprise teams find themselves priced out of the senior-architect tier. Most companies cannot hire an OpenAI co-founder. They can, however, plug a senior AI architect into an existing team through engagements like the fractional agentic team, which is built for exactly that gap.

The "OpenAI Exodus" and the Competition for Elite Talent

When the people who built a technology start leaving for a direct competitor, it is usually a signal about culture or strategy, not just compensation.

Since mid-2024, OpenAI has lost a string of senior figures (Fortune, 2024): Ilya Sutskever in May 2024, John Schulman in August 2024 (to Anthropic), and Mira Murati in September 2024. Some of them started their own ventures. Others landed at Anthropic, which has stayed publicly committed to Constitutional AI, steerability, and a safety-first research agenda. That posture appears to be a draw for researchers who want to work on frontier capability without giving up the safety story.

If recursive training and natural-language engineering are now where the research is headed, the question is whether your stack can keep up — your evaluation harness, your data pipeline, your deployment story.

AI Transformation Discovery Sprint — A short engagement that maps your current AI stack against current frontier patterns and produces a concrete production roadmap.

Education, nanoGPT, and the "Teeth Over Education" Philosophy

Karpathy's work outside the labs is just as influential as his work inside them. The "Zero to Hero" YouTube lectures and the nanoGPT repository changed how a generation of engineers learned what a transformer actually does.

He often talks about "teeth": building a small, working thing is worth more than reading a long survey paper. nanoGPT is the example — a GPT-2-class trainer in roughly 600 lines of Python, split between the model and the training loop (GitHub, 2026), that you can read end-to-end in a morning. (As of late 2025, nanoGPT is officially deprecated in favour of his newer nanochat project, but the original repository remains the cleanest starting point for understanding pretraining.)
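
To make "what a pretraining loop actually does" concrete, here is a minimal sketch of the same forward, loss, backward, update cycle on a toy problem. It is not nanoGPT's code: nanoGPT trains a transformer with PyTorch, while this example assumes a character-level bigram model written in plain Python so it runs anywhere. The names train_bigram and predict_next, the learning rate, and the toy text are all illustrative choices.

```python
import math
import random


def train_bigram(text, steps=200, lr=1.0, seed=0):
    """Fit a character-level bigram model by full-batch gradient descent.

    Illustrative stand-in for a pretraining loop; not nanoGPT's implementation.
    Returns (vocab, logits, losses); logits[i][j] scores char j following char i.
    """
    rng = random.Random(seed)
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    ids = [index[ch] for ch in text]
    # Each training example is (current token, next token).
    pairs = list(zip(ids[:-1], ids[1:]))
    n = len(vocab)
    # The "model": one row of next-token logits per current token.
    logits = [[rng.uniform(-0.01, 0.01) for _ in range(n)] for _ in range(n)]
    losses = []
    for _ in range(steps):
        grads = [[0.0] * n for _ in range(n)]
        total_loss = 0.0
        for x, y in pairs:
            # Forward pass: softmax over the logits for the current token.
            row = logits[x]
            peak = max(row)
            exps = [math.exp(v - peak) for v in row]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            # Cross-entropy loss on the true next token.
            total_loss += -math.log(probs[y])
            # Backward pass: d(loss)/d(logit_j) = p_j - 1[j == y].
            for j in range(n):
                grads[x][j] += probs[j] - (1.0 if j == y else 0.0)
        # Update step: plain gradient descent on the mean loss.
        for i in range(n):
            for j in range(n):
                logits[i][j] -= lr * grads[i][j] / len(pairs)
        losses.append(total_loss / len(pairs))
    return vocab, logits, losses


def predict_next(vocab, logits, ch):
    """Return the most likely next character after ch."""
    row = logits[vocab.index(ch)]
    return vocab[row.index(max(row))]


if __name__ == "__main__":
    vocab, logits, losses = train_bigram("hello world hello world ")
    print(f"mean loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
    print("most likely after 'h':", repr(predict_next(vocab, logits, "h")))
```

A real trainer replaces the lookup table with a transformer, the hand-written gradient with automatic differentiation, and the full-batch step with minibatches and an optimizer such as AdamW, but the shape of the loop stays the same.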

His February 2025 "vibe coding" post argued that the dominant programming skill is shifting from writing syntax to writing intent — prompting a model in natural language, then critically evaluating whether the generated code matches what you actually wanted. The discipline he is bringing to Anthropic is the same one: keep the loop small enough to reason about, then scale only the parts that matter.

Conclusion: The Future of Pretraining

Frontier AI research is moving past prompt engineering and into systems engineering. The Karpathy hire is one more sign that the next phase of model progress will come from smarter training pipelines, not just bigger ones.

For most enterprises, the practical question is no longer "should we adopt LLMs." It is "are our internal systems ready to consume what the frontier labs ship next quarter, or are we going to spend the next year catching up?"

AI Readiness Snapshot — A free 30-minute assessment of your current AI maturity, with a concrete next-step plan for adopting frontier model patterns.

Frequently asked questions

As of May 19, 2026, Andrej Karpathy is a researcher on Anthropic's pretraining team, reporting to team lead Nick Joseph. His mandate is to use Claude itself to accelerate the next generation of Claude — a recursive-self-improvement loop applied to frontier model training.

This is Karpathy's third stint in frontier-lab research, after his 2015–2017 founding stint at OpenAI and a 2023–2024 return there focused on midtraining and synthetic data. He confirmed the move on X, calling the next few years at the frontier of large language models especially formative. Anthropic told TechCrunch the team will focus specifically on using Claude to accelerate pretraining research, distinct from the lab's existing model-product teams.

Karpathy did not leave OpenAI for Anthropic directly. He left OpenAI for a second time in early 2024 and founded Eureka Labs, an AI-native education startup, in mid-2024. He then paused Eureka Labs in May 2026 to join Anthropic. He has stated publicly that he remains deeply passionate about education and plans to resume that work in time.

His Anthropic move fits a broader pattern: several senior researchers — including John Schulman, Ilya Sutskever, and Mira Murati — have departed OpenAI since 2024 for other frontier labs or new ventures. Anthropic's published focus on Constitutional AI, steerability, and safety research has made it an attractive landing spot for researchers prioritising that agenda.

Vibe coding is a term Andrej Karpathy coined in February 2025 to describe a programming workflow where a developer prompts a large language model in natural language, accepts most of its code suggestions without editing, and pastes any errors back to the model rather than reading the code line by line. Karpathy's original post called it embracing the exponentials of LLM-assisted development and forgetting that the code even exists.

In practice, vibe coding works for prototypes, internal tools, and throwaway projects where speed beats long-term maintainability. Karpathy himself flagged it as suitable for weekend projects rather than production-grade systems. Enterprise teams that adopt the workflow typically pair it with human review, automated tests, and explicit architectural guardrails for any code that will be deployed.
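
The workflow described above, with an automated test acting as the guardrail, can be sketched as a short loop. This is a hedged illustration, not a real tool: vibe_loop, run_tests, and fake_model are hypothetical names, and fake_model is a hard-coded stand-in for an LLM call so the example runs offline. The check uses Python's built-in exec, which is only acceptable here because the "generated" code is a fixed string; real model output would need a sandbox.

```python
from typing import Callable, Optional, Tuple


def vibe_loop(
    prompt: str,
    generate: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Ask for code, run the checks, and paste any error back to the model.

    Returns (accepted_code, rounds_used). Raises if no attempt passes.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        code = generate(prompt + feedback)
        error = check(code)
        if error is None:
            return code, round_no
        # Feed the failure back instead of reading the code line by line.
        feedback = f"\n\nThe previous attempt failed with:\n{error}\nPlease fix it."
    raise RuntimeError(f"no passing attempt after {max_rounds} rounds")


def run_tests(code: str) -> Optional[str]:
    """Guardrail: execute the candidate and assert on its behaviour.

    Returns None on success, or an error string to send back to the model.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)  # assumption: trusted input; sandbox real model output
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: wrong first, right after feedback."""
    if "failed with" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"


if __name__ == "__main__":
    code, rounds = vibe_loop("Write add(a, b).", fake_model, run_tests)
    print(f"accepted after {rounds} round(s):")
    print(code)
```

The design point is that the test suite, not the developer's reading of the code, decides when the loop stops. That is why teams deploying this workflow invest in the checks first.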

No. Karpathy paused Eureka Labs in May 2026 to join Anthropic's pretraining team. In his announcement, he was clear that Eureka Labs was paused — not shut down — and that he intends to return to the education work in time.

Eureka Labs was founded in mid-2024 as an AI-native school that combines a domain expert teacher with an AI teaching assistant, starting with the LLM101n course on building language models from scratch. Whether the pause shifts that roadmap, and how long it lasts, has not been publicly disclosed.

Q: Where can I find the nanoGPT repository?

The nanoGPT repository is at github.com/karpathy/nanoGPT. It contains a roughly 600-line implementation of a GPT-2-style model, about 300 lines for the model definition (model.py) and 300 lines for the training loop (train.py), and it reproduces GPT-2 (124M) on the OpenWebText dataset.

As of late 2025 Karpathy marked nanoGPT as deprecated in favour of nanochat, his newer end-to-end ChatGPT for $100 reference implementation. The original nanoGPT repository remains online for reference and is still the cleanest starting point for understanding what a transformer pretraining loop actually does.

For most enterprises, Karpathy's hire is a market signal, not a hiring playbook. The same handful of researchers capable of leading frontier pretraining work are concentrated inside three or four labs, and Anthropic's hire of Karpathy underlines how scarce that talent has become. Mid-market and enterprise teams trying to build proprietary AI systems cannot recruit at that tier and cannot match the compensation.

The practical path forward is two-step: borrow senior architecture decisions and let internal staff own execution. Engagements such as Advantage Works' Fractional Agentic Team plug a senior AI architect into an existing team for the system-design work — pretraining strategy, retrieval architecture, evaluation harness — without committing to a full-time research hire. For organisations earlier in the journey, a short AI Transformation Discovery Sprint maps which frontier-lab patterns actually apply to your stack before any hiring decision.