Karpathy CLAUDE.md Grows to Ten Rules: New Self-Check Protocol for AI Coding Loops - AI

7 x 24 Track global technological trends

Hot Topic

Day

News Topic

Karpathy CLAUDE.md Grows to Ten Rules: New Self-Check Protocol for AI Coding Loops

12 hour ago / Read about 34 minute

Source：TechTimes

Andrej Karpathy's keynote "Software Is Changing (Again)" on June 17, 2025 at AI Startup School in San Francisco. karpathy.ai

A document attributed to Andrej Karpathy — who joined Anthropic's pre-training team five weeks ago — began circulating on X on Friday, and AI developers say the six rules it adds to the familiar four-rule community template change how they think about agentic workflows, not just how they prompt.

The document is not the one most developers already have. In late January, after Karpathy described his shift from 80 percent manual coding to 80 percent agent-driven work on X, developer Forrest Chang distilled those observations into a four-rule andrej-karpathy-skills repository that now has more than 200,000 combined stars across two repositories — one of the fastest-growing files in GitHub history. The document circulating Friday is something different: a ten-rule file with a subtitle that reads "A Short List of Rules, Earned by Watching the Same Mistakes Twice."

Its authenticity has not been publicly confirmed. Karpathy has not commented. What developers are responding to is the document's content — and specifically the six rules the four-rule community version never included.

Four Rules You Already Know

The first four rules in the circulating document cover the same territory as the community file, and for good reason: they address the most common and expensive LLM coding failure modes. Think Before Coding requires the agent to state assumptions explicitly, surface ambiguity, and ask rather than guess before writing a line. Simplicity First limits output to the minimum code that solves the stated problem — no speculative abstractions, no unrequested features. Surgical Changes prohibits editing code adjacent to the task; every changed line must trace directly to what was asked. Goal-Driven Execution converts vague instructions into verifiable success criteria before work begins: "add authentication" is rewritten as five specific, checkable outcomes.

Developers who have used Claude Code for more than a few sessions have felt all four of these failures, which is why the community version went viral. What the community version does not address is what happens after the code is written — when the agent is running, evaluating, and deciding whether to keep going.

Read more: Karpathy-Inspired CLAUDE.md Passes 220,000 Combined GitHub Stars With Four Rules That Stop AI Breaking Code

The Six New Rules

The six additions in the circulating document are where developers are pausing to take notes.

Verification closes the gap between code that seems correct and code that actually runs. The rule specifies an order: before attempting to fix a bug, write a test that reliably reproduces it. Fix the code. Run the test. Only when the test passes is the bug fixed — not when the agent decides it "feels" fixed. This constraint matters in loop contexts because an autonomous loop has no human reviewer at each step; the test is the only check.

Goal-Driven Execution in the ten-rule version goes further than the community file's version. The circulating document frames it as the document's central discipline: before any code is written, define what "done" looks like in terms a machine can verify. "Add validation" fails this test. "Users who submit a blank or malformed email field see a specific error message, and both cases have passing tests" passes it. For multi-step work, a plan comes first — before an hour of autonomous generation sends everything in the wrong direction.

Debugging earns its own rule with a specific sequence: read the full error and stack trace, reproduce the problem before attempting a fix, and change one variable at a time. The failure mode this addresses is confident wrong diagnosis — an agent that reads an ambiguous error message, picks an interpretation, and generates a fix for a problem it has not confirmed exists.

Dependencies treats every added package as permanent, uncontrolled code that will be updated by someone else on a schedule you do not control. Before reaching for a library, ask whether the standard library handles it. If a dependency is added, document the decision explicitly.

Communication draws a line between useful uncertainty and vague reassurance. "I'm not sure this library supports streaming" is actionable information. "I think this should work" is not. The rule prohibits confident-sounding guesses when uncertainty is the accurate answer.

The sixth rule — Common Failure Modes — is the most distinctive. It names four recurring patterns that the document says an agent should be able to recognize in its own behavior and stop:

Kitchen Sink: asked to fix a faucet, the agent renovates the kitchen.

Wrong Abstraction: the same logic appears in three places without recognition that it should be a function.

Optimistic Path: code is written only for the happy case, ignoring bad inputs, dropped connections, and server failures.

Runaway Refactor: one file becomes ten because nothing stops the cascade.

The prescribed response to recognizing any of these patterns in progress is to stop immediately, rather than continue toward completion.

What Separates This From the Community File

The four-rule template tells an AI agent how to write code. The ten-rule document tells it how to monitor its own reasoning — when to pause, when to question an assumption mid-task, and when the pattern it is falling into has a name.

The document's abstract frames this explicitly: models are effective at producing code that looks plausible. They are less effective at detecting the gap between "looks plausible" and "actually correct." The six additions are designed to impose that detection discipline from the outside.

This distinction maps directly onto the current moment in AI-assisted development. Boris Cherny, Claude Code's creator and head at Anthropic, said in a widely circulated interview this month that his own workflow no longer involves writing prompts at all. "I don't prompt Claude anymore," he said. "I have loops running that prompt Claude and figuring out what to do. My job is to write loops." The statement, echoed simultaneously by OpenAI engineer Peter Steinberger and then given a name by Google engineer Addy Osmani, marks a shift in what the practitioner community is building toward: automated systems that evaluate their own output, correct course, and continue until a verifiable condition is met.

Within that frame, the four-rule file addresses turn-by-turn behavior. The six additions address loop-level behavior — what happens when the agent is running without a human reviewer at every step. Verification rules matter more when no one is watching each iteration. Named failure modes matter more when a loop running for an hour can compound a wrong direction into thousands of lines of code.

How CLAUDE.md Works Inside Claude Code

Understanding the document's influence requires understanding its technical mechanism. When Claude Code launches in a project directory, it reads any file named CLAUDE.md at the root and injects its contents as conversation context — not as a system prompt. Per Anthropic's official Agent SDK documentation: "CLAUDE.md takes a different path: the SDK reads it and injects its content into the conversation as project context, not into the system prompt, so it shapes behavior alongside whichever system prompt you choose."

Rules — a separate but related Claude Code feature — work differently: they are re-injected as reminders into the context every time the agent accesses a file that matches their path pattern. The full Claude Code system prompt runs approximately 2,300 to 3,600 tokens; the tool definitions that accompany it add another 14,000 to 17,000 tokens. CLAUDE.md content enters the context in addition to these, injected once at session start.

The practical implication is that CLAUDE.md content influences but does not enforce behavior. The document's instructions are behavioral context — the agent reads them, but a sufficiently specific prompt injection can override them. Adversa AI and LayerX have documented cases where malicious CLAUDE.md files planted in cloned repositories instructed Claude Code to generate pipelines that exfiltrate SSH keys and API credentials. Anthropic addressed a related vulnerability in Claude Code version 2.1.90. Developers should download the Karpathy guidelines only from the official repositories — forrestchang/andrej-karpathy-skills or multica-ai/andrej-karpathy-skills on GitHub — rather than from copies, forks, or mirrors with unfamiliar provenance.

The /goal command, added to Claude Code version 2.1.139 in May 2026, operationalizes a key principle in both the four-rule and ten-rule documents: it uses a separate, faster verifier model — not the same model that wrote the code — to check whether the completion condition is met after each turn. This architectural separation is what makes goal-conditioned loops different from simple repeating prompts, and it is why the six new rules in the circulating document are specifically suited to loop contexts rather than single-session contexts.

Read more: Claude Code Loop Engineering: Stop Prompting, Start Designing Autonomous Agent Workflows

What the Document Makes Concrete

The circulation of the ten-rule file has produced the predictable discussion: about whether internal working files should be shared outside the team, about what it means for a single configuration document to measurably change frontier model behavior, and about whether the ten-rule version's stricter requirements are a better default than the community file's four.

Less debated is whether the rules work. Across platforms, developers applying the ten rules report that the first message in a session changes in character — the model reads before writing, checks before shipping, and asks before assuming. Whether that difference traces to the six new rules, the four original ones, or simply to the act of making expectations explicit is not something the current data settles.

What the document makes concrete is something the Loop Engineering discussion has been circling for weeks: the difference between a tool that does what you say and a tool that knows when it should stop.

Frequently Asked Questions

Is the ten-rule CLAUDE.md document genuinely Karpathy's personal file from Anthropic?

The document began circulating on X on Friday, attributed to Karpathy via a contact on his pre-training team at Anthropic. Its authenticity has not been independently confirmed, and Karpathy has not publicly commented on it. The community four-rule version has a clearly documented origin: developer Forrest Chang built it from Karpathy's January 2026 X post. The ten-rule document's provenance is unverified. Readers should treat its content as practitioner-sourced guidance and evaluate it on its merits rather than on the attribution.

How does CLAUDE.md actually change how Claude Code behaves?

The file's contents are injected into the conversation context at the start of every Claude Code session — not into the system prompt. Per Anthropic's own documentation, CLAUDE.md "shapes behavior alongside whichever system prompt you choose." The instructions influence the model but cannot guarantee compliance; they are behavioral context, not enforced rules. This is also why the document's source matters: a malicious CLAUDE.md file in a cloned repository can instruct Claude Code to take actions the user never authorized.

What is loop engineering and why do these rules matter for it?

Loop engineering describes the practice of building automated systems that prompt an AI coding agent, evaluate the output, and repeat until a verifiable goal is met — removing the human from the turn-by-turn loop. Boris Cherny, Claude Code's creator at Anthropic, described his own workflow this way in June 2026: the loops do the prompting; his job is to design the loops. In that context, the six new rules in the circulating document matter because they address agent self-monitoring at loop scale. A verification rule that requires a passing test before a bug is "fixed" carries far more weight when the loop runs for hours without human review than when a developer is watching every step. The named failure modes — Kitchen Sink, Wrong Abstraction, Optimistic Path, Runaway Refactor — give the agent a vocabulary for recognizing when it is about to compound a mistake rather than correct one.

What are the real costs of running AI coding agents in autonomous loops?

Token costs in autonomous agent loops compound significantly faster than in interactive sessions. Each tool call an agent makes — reading a file, running a test, editing code — re-sends the full conversation history to the LLM API, which is stateless. Research shows that agentic tasks use roughly four times more tokens than standard chat; multi-agent systems can reach fifteen times the cost. In documented cases, unguarded loops have generated thousands of dollars in API costs overnight. The practical fix — which the ten-rule document supports through its emphasis on verification and named stop conditions — is to define precise completion criteria before a loop starts and to set hard token budget limits outside the model's control.

Previous page：AI Shopping Assistant Launches at Newegg: Real-Tim...

Next page：Google DeepMind's Coding Pivot Lost Six Researcher...

Return to List

Hot Reading

2 day ago

AI Data Center Water Use Is Not Solved: Nvidia's Cooling Fix Stops at the Walls

1 day ago

Electric Fan Car McMurtry Spéirling PURE: 95% New, Full Reveal Next Week

2 day ago

Notion killing Skiff-influenced email app since most users use AI agents instead

1 day ago

MWC Shanghai 2026 Closes: Huawei Pushes U6 GHz as First Commercial 5G-A Launches Loom