A series for developers who take their craft seriously and want to work effectively with AI — not as a shortcut, but as a force multiplier for deep knowledge.
Each piece is grounded in real work on a real project (DarJS — a mixin-based business framework built over 11 phases). The patterns are transferable to any serious software project.
The Series
000 — Before You Start: The AI Concepts That Actually Matter for Builders
File: 000_preface.md
Status: ✅ Written
001 — Instructions as Design Patterns
How to write AI instructions that transfer intent reliably across sessions
and models. The three-part structure (trigger + why + application scope)
that makes a rule scale.
File: 001_instructions_as_design_patterns.md
Status: ✅ Written
002 — The Sentence That Takes a Paragraph to Explain
Domain knowledge compressed into single sentences. What they look like,
how they’re earned, and why the developer who can write them becomes the
steering layer when AI handles execution.
File: 002_domain_compression.md
Status: ✅ Written
003 — How to Write a Framework Prompt
How to extract just the public surface of a system and describe it
precisely enough that any AI can build against it without knowing the
internals. What to include, what to deliberately omit, how to encode
design decisions as constraints.
File: 003_framework_prompt.md
Status: ✅ Written
004 — Right Model, Right Layer
Different AI models have different strengths. A well-structured project
lets you route work to the right model: architecture to the deep thinker,
implementation to the fast builder, templates to any model given a tight
surface prompt.
File: 004_right_model_right_layer.md
Status: ✅ Written
005 — Contract-Based Architecture Is Agent-Ready Architecture
The centerpiece. Everything you do to make a system composable also makes
it agent-composable. The contracts that let a junior developer build without
knowing internals are the same contracts that let an AI agent do it.
Good software boundaries and good agent boundaries are the same thing.
File: 005_contracts_and_agents.md
Status: ✅ Written
006 — Writing Art Direction, Not Image Prompts
The difference between describing an image and specifying the conditions that produce a consistent family of images. The four-part structure — persona, world anchor, specific asset, technical constraints — and why it transfers to every domain where you prompt AI to produce outputs that must work together.
File: 006_art_direction_not_image_prompts.md
Status: ✅ Written
007 — The Diagram That Pulses While the Code Runs
Most architecture diagrams are dead the moment they’re drawn. A live diagram driven by runtime signals is always accurate because it doesn’t describe the code — it is the code, made visible. The implementation, the principle, and why it matters more for AI-assisted development than for any other kind.
File: 007_the_diagram_that_pulses.md
Status: ✅ Written
008 — Debug First
The conventional order — build the features, add observability later — is backwards. Every invisible problem in a framework is an instrumentation problem. The four instruments that make a system legible while it runs, why to build them before the features, and how they change what AI-assisted development can be.
File: 008_debug_first_framework_design.md
Status: ✅ Written
009 — How to Make Your App AI-Testable
Most apps can theoretically be AI-tested but practically can’t. The reason DarJS worked is that its DOM is deterministically derived from model metadata. The design principle: route contracts + field contracts + selector contracts must all flow from the same source of truth.
File: 009_ai_testable_apps.md
Status: ✅ Written
010 — The DSL Layer Between AI and Your App
The mistake is asking AI to write Playwright. The right pattern: design a semantic JSON DSL, have AI emit that, have a thin runner translate DSL → browser actions using domain knowledge. AI never touches the DOM. The DSL is the contract between AI capability and your runner.
File: 010_dsl_layer.md
Status: ✅ Written
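A minimal sketch of the DSL idea described in 010. The step shape, the `toBrowserAction` helper, and the `data-*` selector scheme are assumptions for illustration, not the article's actual DSL. The point is only that the AI emits semantic steps and the runner alone knows about the DOM.

```javascript
// Hypothetical DSL steps an AI might emit; field names are illustrative only.
const steps = [
  { action: 'fill', field: 'customer', value: 'Acme' },
  { action: 'transition', to: 'approve' },
];

// The runner owns all DOM knowledge: it maps semantic steps to selectors.
// The data-* attribute names are an assumed selector contract.
function toBrowserAction(step) {
  switch (step.action) {
    case 'fill':
      return { op: 'fill', selector: `[data-field="${step.field}"]`, value: step.value };
    case 'transition':
      return { op: 'click', selector: `[data-transition="${step.to}"]` };
    default:
      // Unknown steps fail loudly instead of guessing at the DOM.
      throw new Error(`Unknown DSL action: ${step.action}`);
  }
}

const browserActions = steps.map(toBrowserAction);
console.log(browserActions);
```

A real runner would pass each resulting action to a browser driver. Keeping the translation in one function means a markup change touches the runner, not the AI-facing DSL.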
011 — The NLP-First Codebase: Replacing LLM Calls with Retrieval
Most of what developers ask an LLM is a retrieval problem wearing a generation costume. Build an indexed contract corpus with @reuse-when fields, run TF-IDF over it, route to nlp-reuse / nlp-verify / llm-generate. The hit rate for retrieval on simple contracts: above 80%. The LLM handles what the index can’t — which turns out to be less than you’d expect.
File: 011_nlp_first_codebase.md
Status: ✅ Written
012 — Writing Code for Machines, Not Just Humans
Documentation is written from the implementor’s perspective. @reuse-when is written from the caller’s perspective — the words someone would type before they know the function exists. That gap between implementation language and caller language is exactly what makes code hard to find without an LLM. Closing it with structured annotations makes the codebase directly queryable by any retrieval system.
File: 012_code_for_machines.md
Status: ✅ Written
013 — Your Directory Layout Is Now a Routing Table
An agent navigating a codebase doesn’t carry internalized context across sessions. It reads what’s present and infers what’s absent. A file placement table in a local CLAUDE.md is a routing instruction that executes in zero tokens. An undocumented convention is a reasoning problem the agent solves from scratch every session — and may solve differently each time.
File: 013_directory_as_routing_table.md
Status: ✅ Written
014 — Your Framework Needs a dar inspect
AI agents working on a codebase face an introspection gap: the runtime shape of an app — fields after mixin composition, transitions, registered pages — doesn’t exist anywhere a static reader can find in one place. dar inspect closes that gap with a live CLI that answers structural questions in one call. The pattern: CLI first, MCP wrapper second, same interface for both.
File: 014_framework_needs_inspect.md
Status: ✅ Written
015 — 80% Without the LLM: PageDef Autofill and What It Proves
For a well-specified business UI, roughly 80% of the interface definition can be generated deterministically from model structure — columns from scalar fields, filters from enum fields, widgets from mixin lookup. The remaining 20% is genuinely undecidable and gets flagged for human/LLM judgment. The split isn’t a shortcut — it’s the correct division of labor between retrieval and reasoning.
File: 015_pagedef_autofill.md
Status: ✅ Written
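The deterministic split in 015 can be sketched roughly as follows. The model shape, the type vocabulary, and the `autofillPageDef` name are assumptions made for this example, and the widget-from-mixin lookup is left out. Anything the rules cannot decide lands in a review list rather than being guessed.

```javascript
// Assumed type vocabulary; a real framework would define its own.
const SCALAR_TYPES = new Set(['string', 'number', 'boolean', 'date']);

// Hypothetical autofill: derive what is decidable, flag what is not.
function autofillPageDef(model) {
  const pageDef = { columns: [], filters: [], needsReview: [] };
  for (const [name, def] of Object.entries(model.fields)) {
    if (SCALAR_TYPES.has(def.type)) {
      pageDef.columns.push(name); // scalar fields become table columns
    } else if (def.type === 'enum') {
      pageDef.filters.push({ field: name, options: def.values }); // enums become filters
    } else {
      pageDef.needsReview.push(name); // left for human or LLM judgment
    }
  }
  return pageDef;
}

const invoiceModel = {
  fields: {
    number: { type: 'string' },
    total: { type: 'number' },
    status: { type: 'enum', values: ['draft', 'paid'] },
    customer: { type: 'relation' },
  },
};

const draftPageDef = autofillPageDef(invoiceModel);
console.log(draftPageDef);
```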
016 — One Config Object, Five Form Screens
A twelve-line PageDef wizard declaration generates a full multi-step form: step navigation, skipWhen conditions, widget steps, summary screen, accumulated form data on submit. The pattern: declarative config expresses what the UI does, the framework handles how. Every AI agent working on the app gets a stable interface for expressing complex form behavior without touching Alpine state management.
File: 016_wizard_from_config.md
Status: ✅ Written
017 — The Stable Adapter Layer: Building AI Tools That Don’t Break When You Refactor
When AI tools wrap a changing codebase, every direct import is a coupling that silently rots on the next refactor. A single adapter class reduces the surface area to one change point. A spec-driven generator makes the read method section derived, not maintained. A drift detection script catches export renames before they cause partial failures that look like data problems.
File: 017_stable_adapter_layer.md
Status: ✅ Written
018 — From Oracle to Builder: Write-Capable AI Tools and the Scaffold Workflow
Read-only AI tools answer questions. Write-capable tools change the shape of collaboration: the AI scaffolds, verifies with health checks, corrects mistakes in its own tool-call loop — before the human sees the result. The dry_run gate solves the confirmation problem in stdio MCP servers. Six tool calls build a runnable app: suggest mixins, scaffold, generate PageDef, verify health, fix locale.
File: 018_write_capable_mcp.md
Status: ✅ Written
019 — The Design Layer: DESIGN.md, Token Systems, and Closing the AI Visual Loop
DESIGN.md (Google Labs, April 2026) gives AI agents a persistent, structured understanding of your visual identity — tokens for exact values, prose for design rationale. Combined with DOM contracts (data-field, data-action attributes) and screenshot vision, the AI build loop closes at the UI layer: scaffold → inspect → critique against spec → verify. Stagehand replaces brittle CSS selectors with AI-native browser actions that survive layout changes.
File: 019_design_layer.md
Status: 🔲 Planned
020 — DOM Contracts: The Attributes That Make Your UI Testable Without an LLM
The data-field, data-action, data-page, data-record, data-transition attributes added to PageDef templates are a DOM contract layer — machine-readable selectors that survive layout changes, work with Playwright, Stagehand, and NL test runners equally, and cost nothing to add since the renderer already knows every identifier. The pattern: annotate once at the framework level, get testability everywhere.
File: 020_dom_contracts.md
Status: 🔲 Planned
021 — Living Documentation: CODEMAP as a Synced Artifact
Most codebases have documentation that was accurate when written and wrong six months later. A CODEMAP that’s generated from source — symbols, line numbers, smell markers — is always accurate because it can’t drift. The dar codemap --sync pattern: extract symbols from source, detect known smells, patch the markdown table. The byproduct discipline: every file read updates the map. The result: cold-start sessions navigate by CODEMAP without verification reads.
File: 021_living_codemap.md
Status: 🔲 Planned
022 — The Confidence Gap as a Safety Gate
When retrieval drives automation, the difference between the top match and the second-best is more informative than the score itself. A 56% match with a 53% runner-up is more dangerous than a 30% match with no competitors — the first is ambiguous, the second is just uncertain. The pattern: ambiguous = (second >= best * 0.90) applied in ui-resolver.js prevents silent wrong-element clicks in NL test runners. The principle generalises to any system where a retrieval result triggers a side effect.
File: 022_confidence_gap_safety_gate.md
Status: ✅ Written
Source material:
- `darjs/packages/nlp/ui-resolver.js:156`: `ambiguous` flag implementation (`second_best >= best * 0.90`)
- `darjs/packages/nlp/__tests__/ui-resolver.test.js`: ambiguity detection tests (TF-IDF score distribution, 56 tests)
- `darjs/packages/testing/nl-runner.js`: `NlAmbiguousError` thrown by translateNlStep; safety gate in action
- `darjs/responses/RESPONSE_2026-05-16_nl-testing-p2-p3.md`: score distribution table (0.85–0.99 specific, 0.53–0.56 ambiguous region)
- `darjs/responses/RESPONSE_2026-05-16_nl-testing-p5.md`: NlAmbiguousError design + test failure analysis
- `darjs/decisions/nl-testing.md`: full NL testing design; Layer 5 runner uses confidence threshold
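The gap check in 022 is small enough to sketch. The 0.90 ratio and the 56% / 53% / 30% figures come from the summary above. The `pickMatch` helper and the candidate shape are hypothetical and are not the code in `ui-resolver.js`.

```javascript
// Hypothetical helper: returns the best candidate plus an ambiguity flag.
function pickMatch(candidates, ratio = 0.90) {
  // Sort a copy so the caller's array is left untouched; highest score first.
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const best = ranked[0];
  if (!best) return { match: null, ambiguous: false };
  const second = ranked[1];
  // Ambiguous when the runner-up scores within 10% of the winner.
  const ambiguous = second !== undefined && second.score >= best.score * ratio;
  return { match: best, ambiguous };
}

// 0.53 >= 0.56 * 0.90 (0.504), so this result is flagged as ambiguous.
const close = pickMatch([
  { id: 'save', score: 0.56 },
  { id: 'save-draft', score: 0.53 },
]);

// A weak match with no competitor is uncertain, but not ambiguous.
const alone = pickMatch([{ id: 'archive', score: 0.30 }]);

console.log(close.ambiguous, alone.ambiguous);
```

A caller that triggers side effects would refuse to act when `ambiguous` is true, which is the role the summary gives to the error thrown by the NL runner.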
023 — The Living CODEMAP: a Symbol Index That Stays Accurate
Most codebases have documentation that was accurate when written and wrong six months later. A CODEMAP generated from source — symbols, line numbers, smell markers — is always accurate because it can’t drift. The dar codemap --sync pattern: extract top-level symbols via regex, detect known smells, patch the markdown table. The byproduct discipline: every file edit re-runs --check in the pre-commit hook. The result: cold-start sessions navigate without verification reads.
File: 023_living_codemap.md
Status: ✅ Written
Source material:
- `darjs/tools/codemap/symbol-extractor.js`: top-level-only regex extraction; const filter (function assignments only, not imports)
- `darjs/tools/codemap/smell-detector.js`: RULES array pattern; 5 smell rules
- `darjs/tools/codemap/codemap-patcher.js`: `parseCodemapSections`, `checkSection`, `applyPatches`; cleanSymbolName bug (params bleed)
- `darjs/packages/cli/commands/codemap.js`: `dar codemap --check` / `--sync`
- `darjs/docs/CODEMAP.md`: live example (25 stale patched, 17 missing appended on first run)
- `darjs/responses/RESPONSE_2026-05-16_codemap-sync-tool.md`: design doc, token ROI argument
024 — Your Test Suite Doesn’t Test Your Browser Code
The test suite passed. 1505 tests, all green. The Studio UI was completely broken. A function called uid was declared twice — fine in Node’s module scope, a parse-time SyntaxError in a browser script tag that silences every event handler. node --check in the pre-commit hook catches the exact category of error the tests cannot see: global scope redeclarations, the gap between the Node execution model and the browser execution model.
File: 024_test_suite_doesnt_test_browser.md
Status: ✅ Written
Source material:
- `darjs/packages/studio/public/studio.js`: uid redeclaration incident; browser global scope
- `darjs/tools/hooks/check-js-syntax.js`: `node --check` pre-commit gate
- `darjs/.git/hooks/pre-commit`: blocking hook suite
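A rough sketch of a `node --check` gate like the one 024 describes. It is not the contents of `check-js-syntax.js`. The `parsesCleanly` helper and the temp-file demo are assumptions, and the duplicate `const` below is a stand-in for the uid incident, whose exact form is not shown here.

```javascript
const { spawnSync } = require('node:child_process');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// True when `node --check` can parse the file; the file is never executed.
function parsesCleanly(file) {
  const result = spawnSync(process.execPath, ['--check', file], { encoding: 'utf8' });
  return result.status === 0;
}

// Demo files written to a temp directory so the sketch is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'syntax-check-'));
const goodFile = path.join(dir, 'good.js');
const badFile = path.join(dir, 'bad.js');
fs.writeFileSync(goodFile, 'const uid = () => 1;\n');
// Two lexical declarations of the same name: a parse-time SyntaxError.
fs.writeFileSync(badFile, 'const uid = () => 1;\nconst uid = () => 2;\n');

const goodOk = parsesCleanly(goodFile);
const badOk = parsesCleanly(badFile);
console.log({ goodOk, badOk });
```

A pre-commit hook built on this would run the check over staged browser scripts and exit non-zero on the first failure, so code that no test ever loads is still parsed before it ships.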
025 — One Artifact, Three Consumers
A scenario JSON file is read by three systems: dar test reads the structural fields (action, model, data, to) and makes HTTP requests; the NL Playwright runner reads nl and drives the browser; a business owner reading the Studio panel reads name, actor, and nl. One file, no conversion. The nl label is the hinge — it simultaneously describes intent for the human and drives automation for the NL runner. The design question for any data artifact: who reads this, and can I include all their fields without conflict?
File: 025_one_artifact_three_consumers.md
Status: ✅ Written
Source material:
- `darjs/packages/studio/server.js`: scenarios-write endpoint (writes the unified format)
- `darjs/packages/studio/public/studio.js`: Studio scenario designer (business owner view)
- `darjs/packages/testing/nl-runner.js`: `executeNlStep` (NL consumer)
- `darjs/packages/studio/__tests__/scenarios.test.js`: format validation
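To make 025 concrete, here is a sketch of one scenario object read three ways. The field names (`name`, `actor`, `nl`, `action`, `model`, `data`, `to`) come from the summary above. Nesting them under a `steps` array and the three reader functions are assumptions for illustration.

```javascript
// One artifact: each step carries both structural fields and an nl label.
const scenario = {
  name: 'Approve an invoice',
  actor: 'accountant',
  steps: [
    { nl: 'Create a draft invoice for Acme', action: 'create', model: 'invoice', data: { customer: 'Acme' } },
    { nl: 'Approve the invoice', action: 'transition', model: 'invoice', to: 'approved' },
  ],
};

// Consumer 1: an HTTP-style test runner reads only the structural fields.
function structuralSteps(s) {
  return s.steps.map(({ action, model, data, to }) => ({ action, model, data, to }));
}

// Consumer 2: an NL browser runner reads only the nl labels.
function nlSteps(s) {
  return s.steps.map((step) => step.nl);
}

// Consumer 3: a human-facing panel reads name, actor, and the nl labels.
function ownerSummary(s) {
  return `${s.name} (as ${s.actor}): ${nlSteps(s).join('; ')}`;
}

console.log(structuralSteps(scenario));
console.log(nlSteps(scenario));
console.log(ownerSummary(scenario));
```

None of the readers needs a conversion step, and none of them trips over fields meant for another consumer.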
026 — The Test Seam for Heavy Dependencies
vi.mock() doesn’t intercept dynamic import() inside CommonJS modules. When a module lazy-loads a 23MB ML model via await import(...), module-level mocking hangs the test suite. The solution is a _setPipelineForTest(fn) export — a private seam that bypasses the real model loader entirely. The _ prefix marks it as test infrastructure, not production API. The pattern applies to any lazy-loaded heavy dependency: database connections, HTTP clients, ML models — anything where “acquiring the resource” can be separated from “using the resource.”
File: 026_test_seam_for_heavy_deps.md
Status: ✅ Written
Source material:
- `darjs/packages/nlp/semantic-resolver.js`: `_setPipelineForTest`, `_pipelineOverride`, `getPipeline()` seam
- `darjs/packages/nlp/__tests__/semantic-resolver.test.js`: `fakePipeline` + `beforeAll`/`_setPipelineForTest` pattern
- `darjs/packages/testing/__tests__/nl-runner.test.js`: same seam propagated up to nl-runner tests
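The seam in 026 might look roughly like this. The names `_setPipelineForTest`, `_pipelineOverride`, and `getPipeline` appear in the source list above. The function bodies, the `embed` consumer, and the fake pipeline are assumptions, and the real model loader is replaced by a stub that throws.

```javascript
// Set only by tests; the underscore marks it as test infrastructure.
let _pipelineOverride = null;
let cachedPipeline = null;

function _setPipelineForTest(fn) {
  _pipelineOverride = fn;
}

// Stand-in for the expensive lazy load; the real loader is omitted here.
async function loadRealPipeline() {
  throw new Error('real model loading is not part of this sketch');
}

// Acquiring the resource is separated from using it.
async function getPipeline() {
  if (_pipelineOverride) return _pipelineOverride;
  if (!cachedPipeline) cachedPipeline = await loadRealPipeline();
  return cachedPipeline;
}

// Code under test: only ever reaches the model through getPipeline().
async function embed(text) {
  const pipeline = await getPipeline();
  return pipeline(text);
}

// In a test file: install a cheap fake before exercising embed().
_setPipelineForTest(async (text) => [text.length, 0, 0]);
embed('abc').then((vector) => console.log(vector));
```

Because the override is checked before the loader runs, the heavy dependency is never touched in tests, and no module-level mocking is needed.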
027 — Parse, Don’t Run: Schema Introspection Without a Live Process
The PrismaSchemaAdapter pattern: a text-only implementation of your introspection interface that works without a running process. Why requiring execution to answer a metadata question is the wrong dependency. How the three-level SchemaAdapter hierarchy (abstract / DarJS live / Prisma text) decouples every consumer — renderer, router, generator — from ModelClass internals. The test that proves an interface is real: swap implementations, consumer tests are identical.
File: 027_parse_dont_run.md
Status: ✅ Written
Source material:
- `darjs/packages/core/adapters/SchemaAdapter.js`: abstract base, throw-on-all-methods pattern
- `darjs/packages/core/adapters/DarJSSchemaAdapter.js`: live ModelClass wrapper, fromManifest factory
- `darjs/packages/core/adapters/PrismaSchemaAdapter.js`: text parser, regex extraction, nested-paren @default fix
- `darjs/packages/platform-api/app.js`: `app.locals.schema` built at startup, passed to PageDefRouter
- `darjs/packages/platform-api/renderer/PageDefRouter.js`: schema threaded to all renderer calls + coerceBody + buildWhere
- `darjs/packages/cli/commands/generate.js`: DarJSSchemaAdapter built inline before fromManifest
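A compact sketch of the adapter hierarchy in 027, under stated assumptions: the method names (`listModels`, `getFields`), the `LiveSchemaAdapter` and `TextSchemaAdapter` class names, and the deliberately naive regex parser are illustrative and do not reproduce the DarJS adapters. It shows the swap test: one consumer, two implementations, identical output.

```javascript
// Abstract base: every method throws until a subclass implements it.
class SchemaAdapter {
  listModels() { throw new Error('listModels() not implemented'); }
  getFields(modelName) { throw new Error('getFields() not implemented'); }
}

// "Live" variant: wraps in-memory model definitions (shape is assumed).
class LiveSchemaAdapter extends SchemaAdapter {
  constructor(models) { super(); this.models = models; }
  listModels() { return Object.keys(this.models); }
  getFields(modelName) {
    return Object.entries(this.models[modelName].fields)
      .map(([name, def]) => ({ name, type: def.type }));
  }
}

// Text variant: parses a Prisma-style schema string; nothing is executed.
// Naive on purpose: ignores comments, block attributes, and nested braces.
class TextSchemaAdapter extends SchemaAdapter {
  constructor(schemaText) {
    super();
    this.models = {};
    for (const [, name, body] of schemaText.matchAll(/model\s+(\w+)\s*\{([^}]*)\}/g)) {
      this.models[name] = body.split('\n')
        .map((line) => line.trim())
        .filter(Boolean)
        .map((line) => {
          const [fieldName, type] = line.split(/\s+/);
          return { name: fieldName, type };
        });
    }
  }
  listModels() { return Object.keys(this.models); }
  getFields(modelName) { return this.models[modelName]; }
}

// A consumer that knows only the interface, never the source of the data.
function columnNames(schema, modelName) {
  return schema.getFields(modelName).map((field) => field.name);
}

const live = new LiveSchemaAdapter({
  Invoice: { fields: { id: { type: 'Int' }, total: { type: 'Float' } } },
});
const text = new TextSchemaAdapter('model Invoice {\n  id Int @id\n  total Float\n}');

console.log(columnNames(live, 'Invoice'), columnNames(text, 'Invoice'));
```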
028 — The Contract Corpus Has Two Layers
Developer questions divide into two categories: “what exists that does X?” (answered by function contracts) and “how do I accomplish X?” (answered by procedure contracts). A corpus with only function contracts silently routes every “how do I” query to the LLM regardless of coverage. The procedure contract — @role: coordinator, CLI sequences in @example, task language in @reuse-when — closes the gap.
File: 028_two_layers.md
Status: ✅ Written
029 — The Token Cost Is in the Discovery
Every question about a framework involves a discovery phase — the LLM reading source to find which thing to reach for. Eight real queries, eight documented discovery paths, actual file byte sizes. dar find avoids 24,046 tokens of file reading across a single session of eight queries. The more important case: two scenarios where the LLM generates technically working but framework-incorrect code, and the token count doesn’t capture that at all.
File: 029_path_comparison.md
Status: ✅ Written
030 — Build Debug Tools Your AI Can See
You built a debug panel so humans could see browser state. Then you had to figure out how to give that same visibility to the AI working alongside you. The gap — tools built for human eyes vs. tools an AI can actually use — is the core agentic tooling problem most developers haven’t hit yet. Three options (manual relay, REST endpoints, CDP), why Chrome’s built-in remote debugging protocol is the right answer for browser state, and how to wire it into your project CLI in one script with zero dependencies.
File: 030_ai-visible-debug-tooling.md
Status: ✅ Written
019 — The Three Token Debts — and the One Architecture That Pays Them
Cold start tokens aren’t all the same thing. They break into three distinct debts: orientation tokens (where do I look?), interface tokens (what does each module take and return?), and reuse-discovery tokens (does something like this already exist?). The full retrieval stack — file map, CODEMAP, contracts.js, NLP index — replaces 4,000–8,000 tokens of cold-start reading with ~400 tokens of structured loading. Built from the chesswar modular rebuild session.
File: 019_three_token_debts.md
Status: ✅ Written
031 — The Debug Panel as an npm Package (TODO)
The debug-panel.js plugin architecture — register any plugin, card system, mobile-first bottom sheet — is genuinely reusable across any site. There’s no lightweight mobile-first debug overlay with a clean plugin API in the npm ecosystem. This article covers extracting it, the plugin contract, the built-in plugins (push, SW, storage, PWA, network) as optional add-ons, and why the AI-visibility angle (wiring it to CDP) makes it more than just another devtools panel.
File: not yet written
Status: 📋 TODO — context saved in ahmedbouchefra2 session 2026-05-18
032 — Pre-Index Your Codebase Before the Agent Needs It
How CodeGraph eliminates 94% of agent file-read tool calls by building a local SQLite knowledge graph from tree-sitter AST parsing — zero LLM tokens, real-time sync, 8 MCP tools. The general pattern: build a queryable index once so every session starts with a map instead of a blank filesystem.
File: 032_pre-index-your-codebase.md
Status: ✅ Written — 2026-05-20
033 — Structure Is Not Intent: Two Layers Every Code Intelligence System Needs
Structural indexes (CodeGraph) answer where things are and what touches them. Semantic contracts answer why you’d use them and when to reuse. Neither is complete without the other. Shows the combined workflow, the gap each system leaves, and how to build both layers as a byproduct of normal work.
File: 033_structure-vs-intent.md
Status: ✅ Written — 2026-05-20
034 — Five Principles Behind Every Good AI Code Search Tool
The underlying design decisions that separate AI-assisted code search from expensive file-reading sessions: static analysis at zero LLM cost, annotations as query targets, field weighting by intent, routing as a first-class dispatch decision, and graph relationships as queryable data.
File: 034_five-principles-ai-code-search.md
Status: ✅ Written — 2026-05-20
035 — Six Habits That Make Your Codebase More AI-Readable
Practical habits — not tools — that reduce the cost of every AI session: write intent at definition, use closed vocabularies, name callers and dependencies, group work into pipelines, extract metadata as a byproduct, pre-index before sessions. Each applies independently.
File: 035_six-habits-ai-readable-codebase.md
Status: ✅ Written — 2026-05-20
036 — Kill Your CODEMAP: When Structural Tools Make Manual Symbol Indexes Obsolete
The CODEMAP pre-commit hook blocked commits twice in one session — both times because lines shifted after adding a function. The pattern: manually maintained indexes are gap-fillers for missing structural tooling. Once CodeGraph is in place, CODEMAP is redundant on every dimension (symbol location, line numbers, caller/callee hints) and becomes pure maintenance cost. The decision to retire it, what replaced each part, and the general rule for when to kill any manual index.
File: 036_kill-your-codemap.md
Status: ✅ Written — 2026-05-21
037 — Two Layers of Intelligence: Merging Semantic Contracts with Structural Call Graphs
DarJS contracts had 263 nodes and 0 resolved edges. CodeGraph had 2,598 nodes and 3,941 real call edges. Neither was complete alone. The MergedAdapter uses CodeGraph nodes as the base, enriches them with DarJS semantic fields (role, domain, does, reuse-when) by name match, and uses CodeGraph edges exclusively. The result: a graph that answers both semantic questions (what is this for?) and structural ones (what breaks if this changes?). The general principle: documentation systems and AST tools answer different questions, and most projects have at most one of them.
File: 037_two-layers-semantic-structural.md
Status: ✅ Written — 2026-05-21
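The merge rule in 037 (structural nodes as the base, semantic fields joined by name, structural edges only) can be sketched as below. The `mergeGraphs` function, the `reuseWhen` spelling, and the data shapes are assumptions for illustration, not the MergedAdapter itself.

```javascript
// Hypothetical merge: enrich structural nodes with semantic fields by name.
function mergeGraphs(structural, contracts) {
  const contractByName = new Map(contracts.map((c) => [c.name, c]));
  const nodes = structural.nodes.map((node) => {
    const contract = contractByName.get(node.name);
    if (!contract) return node; // no contract: keep the structural node as-is
    const { role, domain, does, reuseWhen } = contract;
    return { ...node, role, domain, does, reuseWhen };
  });
  // Edges come from the structural graph only.
  return { nodes, edges: structural.edges };
}

const structuralGraph = {
  nodes: [
    { name: 'resolveField', file: 'resolver.js' },
    { name: 'formatDate', file: 'util.js' },
  ],
  edges: [{ from: 'resolveField', to: 'formatDate' }],
};

const semanticContracts = [
  {
    name: 'resolveField',
    role: 'resolver',
    domain: 'ui',
    does: 'maps a label to a field',
    reuseWhen: 'you need to find a field from user-facing text',
  },
];

const merged = mergeGraphs(structuralGraph, semanticContracts);
console.log(merged.nodes);
```

Each merged node can then answer both kinds of question: what it is for (from the contract) and what depends on it (from the edges).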
038 — codeview: How a Code Intelligence UI Gets Built as a Byproduct
codeview started as “can we have a visual for this?” and ended as an open-sourceable standalone tool. Three architectural decisions made it so: a four-method adapter interface (any data source that implements it works), auto-detection of available sources (no config, just point at a directory), and no build step (Alpine + D3 from CDN, plain http.createServer). The graph design problem — 2,042 nodes is a hairball — and three solutions: node size by reference count, labels hidden until zoom threshold, ego mode for one-click focus. What “byproduct” actually means: scoped by the question, not by a PRD.
File: 038_codeview-byproduct-ui.md
Status: ✅ Written — 2026-05-21
039 — Rules Don’t Route, Tools Do
You write Rule 0 in CLAUDE.md. The agent greps anyway. This isn’t disobedience — it’s execution momentum. Symbol lookup → grep is the dominant pattern for that intent across a decade of training data. Rules fire at instruction time; execution happens somewhere else entirely. Three 2026 mechanisms that actually enforce correct tool routing: pre-task routing with plan_turn (intercepts intent before momentum builds), Claude Code hooks at the Bash layer (enforcement at the tool call layer, not the instruction layer), and session waste auditing with get_optimization_report (makes the pattern visible as data). The practical takeaway: numbered protocols beat rule sentences because protocols execute; rules get interpreted.
File: 039_rules-dont-route-tools-do.md
Status: ✅ Written — 2026-05-22
Source Material
All pieces are grounded in the DarJS project:
- Repo: `/home/ahmed/antigravityapps/autonomous/darjs/`
- Decisions: `darjs/decisions/phase1.md` → `phase11.md`
- Engineering patterns tutorial: `darjs/docs/tutorial-engineering-patterns.md`
- Framework prompt: `darjs/responses/PROMPT_dashboard-tailwind.md`
- Bonus essays (Claude series): `/home/ahmed/antigravityapps/autonomous/Claude/BONUS_instructions_as_design_patterns.md`
Series location: /home/ahmed/antigravityapps/autonomous/agents-series/
Latest update: 2026-05-22 — 039 written; sourced from fix_locale → fill_labels rename session (tool-habit inertia, protocol vs rule, 2026 enforcement mechanisms)