tags: [codemap, tooling, sync, symbols, line-numbers, documentation, pre-commit, trust]
related:
- tools/codemap/symbol-extractor.js
- tools/codemap/codemap-patcher.js
- tools/codemap/smell-detector.js
- packages/cli/commands/codemap.js
- docs/CODEMAP.md
- responses/RESPONSE_2026-05-16_codemap-sync-tool.md
status: current
023 — The Living CODEMAP
Every codebase that works with AI tools eventually builds a symbol index. A file that maps function names to line numbers, that describes what each piece does, that tells the AI where to look. The index pays for itself on the first session where it saves three re-reads.
Then someone moves a function. Or refactors a class. Or adds a new method that the index doesn’t know about.
A stale line number is worse than no line number. It sends the AI to the wrong place. The AI reads the wrong context, gives a wrong answer, costs an extra round trip. You would have been better off without the index entry.
This is the CODEMAP trust problem. And it has an obvious solution — keep the index accurate — that turns out to need a tool.
Why Manual Updates Fail
The CODEMAP update rule says: when you edit a function, update its line number. The rule is correct and almost never followed.
Not because developers are careless. Because the update happens at the end, after the function is written and tested and the commit is being staged. At that moment, the CODEMAP is the last thing on your mind. You skip it. Or you update the one function you changed and miss the three other functions in the same file that shifted by four lines when you added an import at the top.
The line number problem is specifically insidious because it’s invisible. A wrong function name in the CODEMAP is obvious — you look for getUser and find fetchUser. A wrong line number looks right until you navigate to it and find yourself in the wrong function.
The Tool
dar codemap has three modes:
dar codemap --check # exit 1 if anything is stale or missing
dar codemap --sync # patch stale line numbers, append missing symbols
dar codemap --generate # write a fresh section for a file from scratch
--check runs in the pre-commit hook. If you edited packages/platform-api/app.js and the createApp entry in CODEMAP says line 13 but createApp is now at line 15, the commit is blocked. The error is one line:
✗ platform-api/app.js:createApp — CODEMAP says 13, actual 15
--sync fixes it. One command, every stale entry updated, every missing symbol appended.
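The comparison that --check performs, as described above, can be sketched roughly like this. It is a minimal illustration, not the tool's actual code: the function name findProblems, the entry shape { file, name, line }, and the startServer symbol are all assumptions made for the example.

```javascript
// Hypothetical sketch of the comparison behind `dar codemap --check`.
// `indexed` holds entries read from the CODEMAP, `actual` holds symbols
// found in the source. Both are arrays of { file, name, line } (assumed shape).
function findProblems(indexed, actual) {
  const key = (e) => `${e.file}:${e.name}`;
  const indexedByKey = new Map(indexed.map((e) => [key(e), e]));
  const problems = [];

  for (const sym of actual) {
    const entry = indexedByKey.get(key(sym));
    if (!entry) {
      // Symbol exists in source but has no CODEMAP row.
      problems.push({ kind: 'missing', file: sym.file, name: sym.name, actual: sym.line });
    } else if (entry.line !== sym.line) {
      // CODEMAP row exists but points at the wrong line.
      problems.push({
        kind: 'stale', file: sym.file, name: sym.name,
        indexed: entry.line, actual: sym.line,
      });
    }
  }
  return problems;
}

const problems = findProblems(
  [{ file: 'platform-api/app.js', name: 'createApp', line: 13 }],
  [
    { file: 'platform-api/app.js', name: 'createApp', line: 15 },
    { file: 'platform-api/app.js', name: 'startServer', line: 40 }, // hypothetical symbol
  ],
);

// Any problem at all is what would translate into exit code 1.
const exitCode = problems.length > 0 ? 1 : 0;
console.log(problems, exitCode);
```

Returning structured records rather than printing directly keeps the same comparison usable by both modes: --check only needs to know whether the list is empty, while --sync needs the details.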
How It Works
The symbol extractor (tools/codemap/symbol-extractor.js) reads source files with a small set of regex patterns — no AST, no parser:
const DECL_PATTERNS = [
  { re: /^(?:async\s+)?function\s+(\w+)/, type: 'function' },
  { re: /^class\s+(\w+)/, type: 'class' },
  // const: only matches assignments that begin with "(" or "async (",
  // i.e. parenthesized arrow functions
  { re: /^const\s+(\w+)\s*=\s*(?:async\s+)?\(/, type: 'const' },
];
Only top-level declarations are extracted — lines with no leading whitespace. This avoids capturing local variables, nested callbacks, and private helpers. A const bind = (id, cmd) => inside a page.evaluate() callback is indented; it’s skipped.
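To make the top-level rule concrete, here is a minimal sketch of how those patterns could be applied line by line. The patterns are copied from the excerpt above; the extractSymbols helper, its return shape, and the sample source are assumptions for illustration, not the extractor's real code.

```javascript
// Patterns as quoted in the article.
const DECL_PATTERNS = [
  { re: /^(?:async\s+)?function\s+(\w+)/, type: 'function' },
  { re: /^class\s+(\w+)/, type: 'class' },
  { re: /^const\s+(\w+)\s*=\s*(?:async\s+)?\(/, type: 'const' },
];

// Hypothetical helper: returns [{ name, type, line }] with 1-based line numbers.
function extractSymbols(source) {
  const symbols = [];
  source.split('\n').forEach((text, index) => {
    // Every pattern is anchored with ^, so an indented line can never match.
    for (const { re, type } of DECL_PATTERNS) {
      const match = re.exec(text);
      if (match) {
        symbols.push({ name: match[1], type, line: index + 1 });
        break; // first matching pattern wins
      }
    }
  });
  return symbols;
}

const sample = [
  "const express = require('express');", // not a function assignment: skipped
  '',
  'async function createApp(config) {',
  '  const bind = (id, cmd) => id;',     // indented: skipped
  '  return bind;',
  '}',
  '',
  'class Router {}',
  'const handler = async (req, res) => {};',
].join('\n');

console.log(extractSymbols(sample));
```

In this sample the sketch reports createApp at line 3, Router at line 8, and handler at line 9. The require call and the indented bind are both ignored, which is the behavior the paragraph above describes.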
The patcher (tools/codemap/codemap-patcher.js) reads the CODEMAP markdown, parses the symbol tables, matches each row to an extracted symbol, and updates line numbers in place. It appends new symbols at the bottom of their table rather than re-sorting the rows, so the human-curated order and annotations are preserved.
One detail makes the matching work: every symbol name read from a CODEMAP table is cleaned before it is compared with the extracted symbols:
function cleanSymbolName(sym) {
  return sym.replace(/`/g, '').split('(')[0].split(/\s/)[0].trim();
}
createApp(config, template, ...) in the CODEMAP table becomes createApp before the match. This tolerates full signatures, partial signatures, and backtick-quoted names all in the same table.
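The patching step described above might look something like the following. This is a sketch under stated assumptions: the three-column row layout (symbol, line, notes), the ROW regex, the patchTable name, and the startServer symbol are hypothetical, and the sketch handles a single table only. Only cleanSymbolName comes from the article.

```javascript
// From the article: normalizes a table cell to a bare symbol name.
function cleanSymbolName(sym) {
  return sym.replace(/`/g, '').split('(')[0].split(/\s/)[0].trim();
}

// Assumed row layout (hypothetical): | symbol | line | notes |
const ROW = /^\|\s*(.+?)\s*\|\s*(\d+)\s*\|(.*)\|\s*$/;

// Hypothetical patcher: updates stale line numbers in place and appends
// symbols that have no row yet. `symbols` is [{ name, line }].
function patchTable(markdown, symbols) {
  const lineByName = new Map(symbols.map((s) => [s.name, s.line]));
  const seen = new Set();
  let lastRow = -1;

  const patched = markdown.split('\n').map((text, i) => {
    const m = ROW.exec(text);
    if (!m) return text; // header, separator, or prose: keep as-is
    lastRow = i;
    const name = cleanSymbolName(m[1]);
    seen.add(name);
    const actual = lineByName.get(name);
    // Leave the row untouched if the symbol is unknown or already accurate.
    if (actual === undefined || actual === Number(m[2])) return text;
    // Keep the original symbol cell and notes; only the number changes.
    return `| ${m[1]} | ${actual} |${m[3]}|`;
  });

  // Append missing symbols after the last row instead of re-sorting.
  const rows = symbols
    .filter((s) => !seen.has(s.name))
    .map((s) => '| `' + s.name + '` | ' + s.line + ' | |');
  if (lastRow >= 0) patched.splice(lastRow + 1, 0, ...rows);
  return patched.join('\n');
}

const table = [
  '| Symbol | Line | Notes |',
  '|---|---|---|',
  '| `createApp(config, template, ...)` | 13 | builds the app |',
].join('\n');

const patchedTable = patchTable(table, [
  { name: 'createApp', line: 15 },
  { name: 'startServer', line: 40 }, // hypothetical symbol with no row yet
]);
console.log(patchedTable);
```

Because the original symbol cell is written back verbatim, a row keeps its full signature and backticks. The cleaned name is used only as the lookup key.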
The Smell Detector
The codemap tool ships with a smell detector (tools/codemap/smell-detector.js) that flags architectural violations as it scans files:
⚠ direct collectFields call — should go through SchemaAdapter
⚠ fragile ModelClass derivation — record?.constructor is unreliable
⚠ hardcoded route prefix — /ui/${...} should use a route builder
⚠ should be peerDependency — require('express') inside packages/
Smells appear in the CODEMAP as annotated rows. They’re not blocking — the commit goes through. But they’re visible: the CODEMAP is the document you read before working on a file, and a ⚠ row is a signal that something in this function needs attention before it compounds.
The smell rules follow the same pattern as tools/lint-ui.js — a RULES array where each rule is { re, message }. Adding a new rule is one line. The pattern is already established; the cost of catching a new category of smell is near zero.
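A RULES array in that { re, message } shape could be scanned as sketched below. The rule shape comes from the description above. The specific regexes, the shortened messages, and the detectSmells helper are guesses for illustration and are not taken from smell-detector.js. The sketch also ignores file paths, so it does not reproduce the "inside packages/" condition of the real express rule.

```javascript
// Rule shape { re, message } as described in the article; contents are illustrative.
const RULES = [
  { re: /\bcollectFields\s*\(/, message: 'direct collectFields call' },
  { re: /require\(\s*['"]express['"]\s*\)/, message: 'express required directly' },
];

// Hypothetical scanner: returns [{ line, message }] with 1-based line numbers.
function detectSmells(source) {
  const smells = [];
  source.split('\n').forEach((text, index) => {
    for (const { re, message } of RULES) {
      // No `g` flag on the regexes, so test() carries no state between lines.
      if (re.test(text)) smells.push({ line: index + 1, message });
    }
  });
  return smells;
}

const smells = detectSmells([
  "const express = require('express');",
  'function build(model) {',
  '  return collectFields(model);',
  '}',
].join('\n'));

console.log(smells);
```

With this structure, catching a new category of smell really is one more object in the array. The scanning loop never changes.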
The Pre-commit Hook
dar codemap --check runs in the pre-commit hook alongside contracts-sync. The combination catches two failure modes:
- contracts-sync catches: public function exists in source but not in contracts.json
- codemap --check catches: public function exists in CODEMAP but at the wrong line
Together they enforce: every public function is indexed, and every index entry is accurate.
The first time this hook blocked a commit was the session where two import statements were added to app.js, shifting createApp from line 13 to line 15. The hook caught it, --sync fixed it, the commit went through. Total extra time: fifteen seconds.
What “Living” Means
A living document isn’t one that gets manually updated when someone remembers. It’s one where the update is mechanically enforced as part of the work that makes it stale.
The CODEMAP becomes living when --check is in the pre-commit hook. Not because developers are now more careful, but because the update is required. You cannot commit a change that makes the CODEMAP stale without also fixing the CODEMAP. The cost of keeping it accurate is close to zero (one --sync run, about fifteen seconds in the case above) because the tool does the work.
The same pattern applies to contracts.json, which is regenerated and checked before every commit. To adapter-spec.json, which drives code generation so the adapter and its spec can never drift. The principle is consistent: every derived artifact that can be validated mechanically should be validated mechanically, before the commit, not after.
Manual accuracy is fragile. Enforced accuracy is infrastructure.
The Token Economics
The CODEMAP is not a token-saving device in the naive sense. A large CODEMAP costs more tokens to load than a grep -n call costs to run. Loading CODEMAP.md to find one function and then reading the file anyway is strictly worse than just running the grep.
The value is orientation, not lookup. When a session starts cold on a complex framework, the CODEMAP tells you: here are the public functions in each package, here are the known smells, here are the gotchas. That context shapes the next ten decisions. The cost is paid once; the benefit compounds across the session.
But that value only exists when the CODEMAP is accurate. Stale entries erode trust. Once you navigate to a wrong line number twice, you stop trusting the CODEMAP and start verifying every entry with a grep. At that point the CODEMAP has negative value — it costs tokens to load and provides no reliable information.
The tool’s purpose is not to make the CODEMAP faster to query. It’s to make it trustworthy enough to load without verification.