AGENTS.md: Less is More

A recent paper from ETH Zurich changed how I think about CLAUDE.md and AGENTS.md files. Its conclusion: these files often make coding agents worse.

The paper — “Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?” by Gloaguen, Mündler, Müller, Raychev, and Vechev — is the first rigorous empirical study of whether context files help. Short answer: usually not.

What the Research Found

The team evaluated three production coding agents in four agent-model configurations (Claude Code with Sonnet 4.5, Codex with GPT-5.2 and GPT-5.1 mini, and Qwen Code with Qwen3-30b-coder) across two benchmarks: SWE-bench Lite (300 tasks, 11 repositories) and AGENTbench, a new benchmark they built (138 tasks, 12 repositories with real developer-written context files).

Key findings:

LLM-generated context files cut success rates by 0.5–2% while raising inference costs by over 20%. They made things worse.
Developer-written context files improved success rates by only ~4%, while still adding cost and extra tool steps to every task.
Agents followed the instructions in the context files, and that was the problem. The extra constraints made tasks harder. Agents explored more files, ran more tests, and burned more tokens without solving problems any better.
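To see why a small drop in success plus a 20% cost bump matters, it helps to look at cost per solved task. The sketch below is my own arithmetic, not the paper's. The 40% baseline success rate and the 1.00 per-task cost are made-up placeholders, and I am reading the 2% drop as percentage points, which is an assumption.

```python
def cost_per_solved_task(success_rate: float, cost_per_task: float) -> float:
    """Average spend needed to obtain one solved task."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


# Hypothetical baseline, NOT taken from the paper.
BASELINE_SUCCESS = 0.40
BASELINE_COST = 1.00

# Shifts reported for LLM-generated context files:
# up to 2 points lower success (assumed to mean percentage points),
# at least 20% higher inference cost.
with_file_success = BASELINE_SUCCESS - 0.02
with_file_cost = BASELINE_COST * 1.20

baseline = cost_per_solved_task(BASELINE_SUCCESS, BASELINE_COST)
with_file = cost_per_solved_task(with_file_success, with_file_cost)
increase = with_file / baseline - 1

print(f"baseline:  {baseline:.2f} per solved task")   # 2.50
print(f"with file: {with_file:.2f} per solved task")  # 3.16
print(f"increase:  {increase:.0%}")                   # 26%
```

Under these placeholder numbers the two effects compound: a 20% cost increase becomes roughly a 26% increase per solved task.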

The most striking result: when the team removed all standard documentation (README, docs/) from repositories, LLM-generated context files suddenly helped. Most generated context files just parrot what the repo already contains. Noise, not signal.

The paper’s core recommendation: context files should describe only minimal requirements. Start with nothing. Add only what addresses real friction.

The Obviousness Problem

Over 60,000 GitHub repositories now contain CLAUDE.md or AGENTS.md files. Agents generated many via /init commands; developers wrote others following early best-practice guides. Either way, they fill up with generic boilerplate: standard language conventions, default tool behavior, linter-enforced style rules, obvious project structure descriptions. All of it wastes context tokens on what the agent already knows.

You’ve seen this pattern. A developer stuffs a context file with instructions like “use TypeScript strict mode” (already in tsconfig.json), “run tests with pytest” (already in pyproject.toml), or “follow PEP 8 style” (enforced by the linter). The agent reads every line, treats each as a constraint, and burns extra time and tokens complying — yet it would have done the right thing anyway.
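A rough way to spot this kind of redundancy is to check each instruction against the config that already encodes it. Here is a minimal Python sketch. The RULES table, the trigger phrases, and the helper names are all hypothetical, and treating a `[tool.ruff]` section as "the linter" is an assumption about the project. Real tsconfig.json files may also contain comments, which `json.loads` rejects, so the sketch treats those as "unknown" rather than guessing.

```python
import json


def tsconfig_is_strict(files: dict[str, str]) -> bool:
    """True if tsconfig.json sets compilerOptions.strict to true."""
    try:
        config = json.loads(files.get("tsconfig.json", "{}"))
    except json.JSONDecodeError:
        return False  # commented tsconfig files are not plain JSON; treat as unknown
    return config.get("compilerOptions", {}).get("strict") is True


def pyproject_has(section: str):
    """Build a check for a literal section header in pyproject.toml."""
    return lambda files: section in files.get("pyproject.toml", "")


# (phrase to look for in the instruction, check that the repo already covers it)
RULES = [
    ("strict mode", tsconfig_is_strict),
    ("pytest", pyproject_has("[tool.pytest")),
    ("pep 8", pyproject_has("[tool.ruff")),
]


def find_redundant(instructions: list[str], files: dict[str, str]) -> list[str]:
    """Return the instructions that restate something the config already says."""
    redundant = []
    for line in instructions:
        lowered = line.lower()
        for phrase, already_configured in RULES:
            if phrase in lowered and already_configured(files):
                redundant.append(line)
                break
    return redundant


# Hypothetical repo contents and context-file lines.
files = {
    "tsconfig.json": '{"compilerOptions": {"strict": true}}',
    "pyproject.toml": "[tool.pytest.ini_options]\n[tool.ruff]\n",
}
instructions = [
    "Use TypeScript strict mode",
    "Run tests with pytest",
    "Follow PEP 8 style",
    "Never edit files under vendor/; they are synced from another repo",
]
redundant = find_redundant(instructions, files)
print(redundant)
```

Only the last instruction survives, because nothing in the config files could have told the agent about it.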

A Skill to Fix It

I built pare-claude-md, a Claude Code skill that trims CLAUDE.md and AGENTS.md files to their bare essentials.

The skill applies the Obviousness Principle: for each instruction, ask “Would a staff-level engineer or a capable AI coding assistant already know this?” If yes, cut it.

How It Works

Six steps:

1. Read the target file. Either the path you provide or the CLAUDE.md/AGENTS.md in the current directory.
2. Explore the repository. Survey the project type, stack, config files (package.json, tsconfig.json, pyproject.toml, Makefile, etc.), architecture, and tooling to learn what counts as “obvious” for this project.
3. Prepend core engineering principles. A small set of universal principles (clarity over cleverness, explicit over implicit, fail fast, etc.), originally shared by Daniel Bernal, that prime productive behavior without project-specific noise.
4. Apply the Obviousness Principle. Remove standard language conventions, generic best practices, default tool behavior, obvious project descriptions, content that restates config files, and style rules enforced by formatters or linters. Keep unexpected architectural decisions, non-obvious gotchas, custom conventions that differ from community norms, surprising dependency choices, environment-specific quirks, and historical context that prevents repeated mistakes.
5. Tighten the prose. Short sentences, bullets over paragraphs, no filler, one idea per bullet.
6. Show a before/after diff so you can review exactly what changed before writing anything.
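The last three steps can be pictured as a filter, a diff, and a confirmation gate. The skill itself ships as a SKILL.md file, so the obviousness judgment is presumably made by the model rather than by code. The keyword list below is a crude stand-in for that judgment, and every name in it (`OBVIOUS_MARKERS`, `pare`, `preview_diff`, `apply_if_confirmed`) is hypothetical.

```python
import difflib
from pathlib import Path

# Stand-in for the model's judgment: phrases a capable agent already knows.
OBVIOUS_MARKERS = ("pep 8", "strict mode", "meaningful names")


def is_obvious(line: str) -> bool:
    lowered = line.lower()
    return any(marker in lowered for marker in OBVIOUS_MARKERS)


def pare(lines: list[str]) -> list[str]:
    """Keep only the lines that are not flagged as obvious."""
    return [line for line in lines if not is_obvious(line)]


def preview_diff(before: list[str], after: list[str], name: str = "CLAUDE.md") -> str:
    """Render a unified diff so the change can be reviewed before writing."""
    return "\n".join(
        difflib.unified_diff(
            before,
            after,
            fromfile=f"{name} (before)",
            tofile=f"{name} (after)",
            lineterm="",
        )
    )


def apply_if_confirmed(path: Path, after: list[str], confirmed: bool) -> bool:
    """Write the pared file only when the reviewer has said yes."""
    if not confirmed:
        return False
    path.write_text("\n".join(after) + "\n", encoding="utf-8")
    return True


# Hypothetical context-file lines.
before = [
    "Follow PEP 8 style.",
    "Use meaningful names.",
    "Migrations live in db/legacy/ because the ORM cannot express them.",
]
after = pare(before)
diff = preview_diff(before, after)
print(diff)
```

Nothing touches disk until `apply_if_confirmed` is called with `confirmed=True`, which mirrors the review-then-write order of the steps.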

Installation

# From the plugin marketplace
/plugin marketplace add wooters/pare-claude-md
/plugin install pare-claude-md

Or manually copy skills/pare-claude-md/SKILL.md to ~/.claude/skills/pare-claude-md/.

Usage

/pare-claude-md                    # pare CLAUDE.md or AGENTS.md in current directory
/pare-claude-md ./path/to/file.md  # pare a specific file

The skill always shows the diff and asks for confirmation before writing.

How the Skill Maps to the Research

I built pare-claude-md before the paper came out, but the overlap between the skill’s design and the paper’s findings is striking.

“Context files should describe only minimal requirements.” The Obviousness Principle strips everything a capable agent already knows and leaves only non-obvious instructions. The paper found that extra context mostly adds cost without improving results, so the skill drives toward that minimal state while preserving what genuinely helps.

“LLM-generated context files mostly replicate existing documentation.” The skill targets this directly: it removes content that restates config files and obvious project descriptions. Before deciding what’s redundant, it reads the repo’s config files (package.json, tsconfig.json, pyproject.toml, etc.). This matches the paper’s finding that generated context files help only when existing documentation is missing.

“Unnecessary requirements make tasks harder because agents follow them faithfully.” One of the paper’s more surprising findings is that agents obey bloated context files. They burn tokens on constraints they would have satisfied anyway. The skill removes generic best practices, standard language conventions, and style rules already enforced by linters, which are exactly the instructions the paper found cause agents to spend 2–4 extra tool steps per task.

“Signal-to-noise ratio matters more than comprehensiveness.” Step 5 (tightening prose to short sentences, bullets, no filler) raises the signal-to-noise ratio of the content that survives the cut. The paper showed a 20%+ increase in inference costs from context files, so trimming surviving instructions to their leanest form matters.

One Open Question

Step 3 goes beyond what the paper tested: prepending core engineering principles (“clarity over cleverness”, “explicit over implicit”, “fail fast”). The paper says nothing about whether high-level principles help or hurt. A capable model may already internalize them, making the principles pure noise. I’m watching this closely — it would make a great A/B test with a future AGENTbench.
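If someone does run that A/B test, the natural design is paired: the same tasks, once with the principles block and once without. Below is a small sketch of an exact McNemar test, which looks only at the tasks where the two runs disagree. The pass/fail lists are invented for illustration. A real comparison would use per-task results from the benchmark, and far more than ten tasks.

```python
from math import comb


def mcnemar_exact(with_principles: list[bool], without_principles: list[bool]):
    """Return (wins_with, wins_without, two-sided exact p-value) for paired runs."""
    if len(with_principles) != len(without_principles):
        raise ValueError("both runs must cover the same tasks")
    pairs = list(zip(with_principles, without_principles))
    wins_with = sum(a and not b for a, b in pairs)      # solved only with principles
    wins_without = sum(b and not a for a, b in pairs)   # solved only without them
    n = wins_with + wins_without
    if n == 0:
        return wins_with, wins_without, 1.0
    k = min(wins_with, wins_without)
    # Two-sided binomial tail under the null that each discordant task is a coin flip.
    p_value = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n)
    return wins_with, wins_without, p_value


# Invented per-task outcomes (True = task solved).
run_with = [True, True, False, True, False, True, True, False, True, True]
run_without = [True, False, False, True, True, True, False, False, True, False]

wins_with, wins_without, p_value = mcnemar_exact(run_with, run_without)
print(wins_with, wins_without, p_value)  # 3 1 0.625
```

With only four discordant tasks, a 3-to-1 split is nowhere near significant, which is a reminder that a question like this needs a benchmark-sized sample to settle.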

Less Is More

The paper confirms what most engineers already know: the best code is the code you don’t write, and the best instructions are the ones you don’t give. A five-line context file addressing your project’s actual quirks should outperform a 2,000-word generated overview that restates the README.

If you use CLAUDE.md or AGENTS.md, scrutinize what’s in there. Better yet, let pare-claude-md do it for you.