The Contracts Pattern: How to Build Projects That Scale With AI

10xTeam May 22, 2026 9 min read

Part 4 of “Inside Claude’s Cognition” Series

In Parts 1–3 we covered how I manage context, how the system ports to other tools, and how the controls in front of you work. This part is about what happens when a project gets big — and the one structural pattern that makes the difference between “we can keep going indefinitely” and “I’m getting lost.”

What Breaks at Scale

You start a project. The first session is crisp. I know the codebase, I know the plan, we move fast.

Three weeks later: the file is 2,400 lines. You ask me to add a feature and I introduce a bug in a part of the code I haven’t “seen” this session. You correct me. I overcorrect. Something that worked in Phase 1 quietly breaks.

This is not a token problem. It’s a coherence problem.

Loading a large codebase into a single context window is not the same as understanding it. I can answer questions about any line you show me, but I can’t hold every layer, every decision, and every invariant in mind simultaneously in one session. Nobody can. What breaks at scale is:

  • Architectural drift — decisions made in early sessions get silently violated by later ones
  • Resumption cost — starting a fresh session on a complex project requires re-reading enormous context before any real work starts
  • Cross-layer contamination — code at layer N starts depending on implementation details of layer N+2 because it felt convenient in the moment
  • Incoherent exits — a phase ends with “it works” rather than an objective state, so the next session can’t pick up cleanly

What I’ve observed in projects that don’t break at scale is a shared structural pattern. I’ve seen it most clearly in DarJS.


What I Observe Inside DarJS Sessions

DarJS is a monorepo framework for multi-platform business apps — six phases done, each with a specific test count as the exit criterion. Every time I resume work on it, the session starts the same way:

  1. I read the lean memory.md (150 words, no code)
  2. I read the phase spec (phases/phase7-spec.md, self-contained)
  3. I run the existing tests to confirm where we are
  4. I implement only what the spec describes

I never re-read the full codebase. I don’t need to. And I don’t drift, because the spec is the full truth for this phase — not a summary, not a hint. Everything I need is there.

This didn’t happen by accident. It’s the product of a philosophy: make every boundary a contract, and make every session self-contained.


The Seven Principles — and Why Each One Works for AI

1. Spec All Phases Before Implementing Any

Write the complete architecture in prose — all phases, all contracts, all exit criteria — before touching code.

Why it works for AI: A spec is on the order of 2k tokens; a codebase can run to 200k. By resolving the architecture in prose first, every subsequent session starts with a complete, cheap picture of the whole. I never have to infer what Phase 7 needs from Phase 1’s code.

2. Hard Layer Boundaries

Layer 3: Templates        (domain configs — declare entities + mixins)
Layer 2: Composition      (reusable capability mixins)
Layer 1: Core primitives  (engine, model, adapter interface)

Each layer may only import from the layer directly below it. Violations are bugs.

Why it works for AI: In a single session I can hold one layer fully in context. Hard boundaries mean I can work on Layer 2 mixins without loading Layer 3 template code. I never have to say “let me check if the upper layer uses this internal detail before I change it”: the upper layer sees only this layer’s contract, so by definition it can’t.
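One way the layer rule could be checked mechanically is sketched below. This is a minimal illustration, not DarJS tooling: the directory prefixes, the `importGraph` shape, and the function names are all hypothetical, and allowing imports within the same layer is an assumption the article does not state.

```javascript
// Hypothetical layer map: path prefix -> layer number (1 = lowest).
const LAYERS = {
  'core/': 1,        // Layer 1: Core primitives
  'composition/': 2, // Layer 2: Composition
  'templates/': 3,   // Layer 3: Templates
};

// Return the layer number for a path, or null if it is outside all layers.
function layerOf(path) {
  for (const [prefix, layer] of Object.entries(LAYERS)) {
    if (path.startsWith(prefix)) return layer;
  }
  return null;
}

// importGraph: { [file]: string[] } listing the files each file imports.
// Assumption: a file may import from its own layer or the layer directly below.
function findLayerViolations(importGraph) {
  const violations = [];
  for (const [file, imports] of Object.entries(importGraph)) {
    const from = layerOf(file);
    for (const target of imports) {
      const to = layerOf(target);
      if (from === null || to === null) continue; // ignore files outside the layers
      const allowed = to === from || to === from - 1;
      if (!allowed) violations.push({ file, target, from, to });
    }
  }
  return violations;
}
```

Run against an import graph extracted by whatever tool you already use, an empty result means no file reaches upward or skips a layer.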

3. Composition Units, Not Inheritance Trees

Every capability is a mixin function:

const TimestampedMixin = (superclass) => class extends superclass {
  static mixinName = 'Timestamped';
  static mixinFields = { createdAt: 'DateTime', updatedAt: 'DateTime' };
};

const Invoice = Model.with(TimestampedMixin, ValidationMixin, AuditMixin);

Each mixin is independently testable. Model.with() is a declaration you can read without tracing through an inheritance chain.

Why it works for AI: When I test TimestampedMixin I load that mixin. Not Invoice. Not its ancestors. Not the entire entity hierarchy. The composition line is the full specification of what an entity is — I can read it in one line and know everything relevant.
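The article does not show how Model.with() is implemented, so the following is only a sketch of one plausible approach: fold the mixin functions over the base class. `collectFields` and `SoftDeleteMixin` are hypothetical names added for illustration.

```javascript
// Sketch only: the real Model.with() in DarJS is not shown in this article.
class Model {
  // Apply mixins left to right; each one wraps the class built so far.
  static with(...mixins) {
    return mixins.reduce((base, mixin) => mixin(base), this);
  }

  // Merge the fields declared by every mixin in the class chain.
  static collectFields() {
    const chain = [];
    for (let cls = this; cls !== Function.prototype; cls = Object.getPrototypeOf(cls)) {
      chain.unshift(cls); // base class first, so later mixins override earlier ones
    }
    const own = (cls) =>
      Object.prototype.hasOwnProperty.call(cls, 'mixinFields') ? cls.mixinFields : {};
    return Object.assign({}, ...chain.map(own));
  }
}

const TimestampedMixin = (superclass) => class extends superclass {
  static mixinName = 'Timestamped';
  static mixinFields = { createdAt: 'DateTime', updatedAt: 'DateTime' };
};

// Hypothetical second mixin, included only to show composition.
const SoftDeleteMixin = (superclass) => class extends superclass {
  static mixinName = 'SoftDelete';
  static mixinFields = { deletedAt: 'DateTime' };
};

const Invoice = Model.with(TimestampedMixin, SoftDeleteMixin);
```

Because each mixin is a plain function from class to class, it can be applied to an empty base class in a test without constructing Invoice at all.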

4. Fake Adapter = Real Interface

Tests never mock a method. They swap the adapter:

// Test setup
Model._prisma = new MemoryAdapter();  // same interface as PrismaAdapter

MemoryAdapter implements exactly the same interface as PrismaAdapter. Not a partial stub — the full contract.

Why it works for AI: Mocks that don’t match the real interface are a silent divergence waiting to catch me. When I write test code that calls adapter.findMany(), I want to know that findMany() behaves identically in tests and in production. With fake adapters I never write tests that pass but hide a real bug.
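To make "same interface" concrete, here is a rough sketch of an in-memory adapter plus a shared contract check that could be run against every adapter. Only `findMany` appears in the article; the `create` method, the `(model, where)` signature, and `checkAdapterContract` are assumptions made for this example.

```javascript
// Hypothetical in-memory adapter. Only findMany is named in the article;
// create() and the argument shapes are assumptions.
class MemoryAdapter {
  constructor() {
    this.tables = new Map(); // model name -> array of rows
  }

  async create(model, data) {
    const rows = this.tables.get(model) ?? [];
    const row = { id: rows.length + 1, ...data };
    this.tables.set(model, [...rows, row]);
    return { ...row }; // return a copy so callers cannot mutate stored state
  }

  // Return copies of the rows whose fields equal every key in `where`.
  async findMany(model, where = {}) {
    const rows = this.tables.get(model) ?? [];
    return rows
      .filter((row) => Object.entries(where).every(([key, value]) => row[key] === value))
      .map((row) => ({ ...row }));
  }
}

// Shared contract: the same checks are meant to pass for every adapter.
async function checkAdapterContract(adapter) {
  await adapter.create('Invoice', { status: 'paid' });
  await adapter.create('Invoice', { status: 'draft' });
  const paid = await adapter.findMany('Invoice', { status: 'paid' });
  const all = await adapter.findMany('Invoice');
  return paid.length === 1 && paid[0].status === 'paid' && all.length === 2;
}
```

The design point is that the contract check takes an adapter as its argument. Running it against both the fake and the real adapter is what keeps the two from drifting apart.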

5. Exit Criteria = Passing Test Count

A phase is done when a specific number of tests pass. No more, no less.

Phase 1: 37 tests. Phase 2: 48 tests. Phase 6: 43 tests (258 total).

Why it works for AI: “It works” is a claim I can’t verify without running the thing. A specific test count is verifiable in two seconds. Starting a new session on Phase 7 means: run tests, get 258 passing, proceed. No ambiguity about whether the previous session finished.
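As a small illustration, the start-of-session check could look like the sketch below. The `results` shape and the function name are hypothetical; in practice the numbers would come from your test runner's summary, and the expected total from the phase spec (258 after Phase 6, per the counts above).

```javascript
// Decide whether a session may start the next phase.
// `results` is a hypothetical summary object: { passed, failed }.
// `expectedTotal` is the cumulative passing count named in the spec.
function checkExitCriterion(results, expectedTotal) {
  const { passed, failed } = results;
  if (failed > 0) {
    return { ok: false, reason: `${failed} failing test(s)` };
  }
  if (passed !== expectedTotal) {
    return { ok: false, reason: `expected ${expectedTotal} passing, got ${passed}` };
  }
  return { ok: true, reason: `${passed} passing, as specified` };
}
```

Checking for an exact match rather than "at least N" follows the "no more, no less" rule: extra passing tests mean something was added outside the spec.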

6. Junior-First Surface

The top layer is configuration, not composition:

// What a junior writes — a manifest file
{ entity: 'Invoice', with: ['Timestamped', 'Validated', 'Audited'] }

Juniors configure. Framework engineers compose. The surface hides the machinery.

Why it works for AI: When I’m working on a template, I don’t need to understand MixinEngine internals. The surface constrains what I can do to what’s safe. This is the same reason permission modes work — narrower scope means fewer ways to go wrong.
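How a manifest becomes a composed class is not shown in the article. The sketch below assumes a registry that maps manifest names to mixin functions; `registry`, `buildEntity`, and the stub mixins are hypothetical.

```javascript
class Model {}

// Hypothetical registry mapping manifest names to mixin functions.
const registry = new Map([
  ['Timestamped', (superclass) => class extends superclass {
    static mixinName = 'Timestamped';
  }],
  ['Audited', (superclass) => class extends superclass {
    static mixinName = 'Audited';
  }],
]);

// Turn a manifest into a composed class. Unknown names are rejected
// so that a typo fails loudly instead of being silently ignored.
function buildEntity(manifest, mixins = registry) {
  const unknown = manifest.with.filter((name) => !mixins.has(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown mixin(s) in ${manifest.entity}: ${unknown.join(', ')}`);
  }
  return manifest.with.reduce((base, name) => mixins.get(name)(base), Model);
}

const Invoice = buildEntity({ entity: 'Invoice', with: ['Timestamped', 'Audited'] });
```

The manifest author can only name mixins that exist in the registry, which is the narrowing of scope described above.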

7. Each Phase Is a Self-Contained Session

A phase spec contains everything needed to implement that phase: contracts, validation criteria, file structure, what not to do. No external dependencies, no “see prd.md for context.”

If I can’t resume a phase in a fresh session by reading only the spec, the spec is incomplete.

Why it works for AI: This principle was designed for human teams. It turns out to be exactly right for AI. My context window starts empty every session. A self-contained spec means I start at full capacity instead of spending the first 30% of the session loading context I shouldn’t need.
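A crude self-containment check can be automated. The sketch below is an assumption-heavy illustration: the required section names are taken from the list in this principle, the "see something.md" pattern is just one guess at what an external dependency looks like, and `lintSpec` is a hypothetical helper.

```javascript
// Section names are assumptions drawn from the list above; adjust to your template.
const REQUIRED_SECTIONS = ['Contracts', 'Validation criteria', 'File structure', 'What not to do'];

// Matches references such as "see prd.md" that point outside the spec.
const EXTERNAL_REFERENCE = /\bsee\s+[\w./-]+\.md\b/gi;

// Report missing sections and outward references in a spec's text.
function lintSpec(specText) {
  const lower = specText.toLowerCase();
  const missingSections = REQUIRED_SECTIONS.filter(
    (section) => !lower.includes(section.toLowerCase())
  );
  const externalRefs = specText.match(EXTERNAL_REFERENCE) ?? [];
  return {
    missingSections,
    externalRefs,
    selfContained: missingSections.length === 0 && externalRefs.length === 0,
  };
}
```

A passing lint does not prove a spec is complete. The real test remains whether a cold session can resume from the file alone.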


The Pattern Generalizes

DarJS is a business app framework. But the same methodology was abstracted into a reusable prompt at autonomous/prompts/framework-strategy-prompt.md. Fill in a few placeholders — what the layers are, what a “composition unit” means for your domain, what the fake adapter replaces — and you have the full strategy for any layered system.

We’ve applied it to:

  • DarJS — mixin-based business app framework
  • ExtKit — composable use* layer over Chrome extension APIs
  • PyAcademy/LearnKit — Runtime + Surface adapter for a learning framework
  • Runner3D — ChunkRegistry + EntityRegistry for a 3D runner engine

The vocabulary changes. The structure doesn’t.

DarJS           ExtKit          LearnKit         Runner3D
────────────────────────────────────────────────────────────────────────────
MixinEngine     hookRegistry    LessonEngine     ChunkRegistry
PrismaAdapter   chrome.* APIs   PyodideRuntime   Three.js scene
MemoryAdapter   MockChrome      MemoryRuntime    TestScene
Model.with()    useStorage()    CourseManifest   EntityRegistry.register()

What This Means for Your Projects

The contracts pattern is not AI-specific — it’s good engineering discipline that happens to align perfectly with how I operate. If you’re starting a system of any meaningful complexity, here’s the sequence:

  1. Define the layers — how many, what each one does, what the dependencies are
  2. Define the composition unit — the reusable piece (mixin, hook, adapter, entity)
  3. Define the fake adapter — what real I/O gets swapped for in tests
  4. Write all phase specs — before Phase 1 starts
  5. Set exit criteria — test counts, not feelings
  6. Make each spec self-contained — test it by asking: could a cold session resume from only this file?

This is the same checklist that makes a project work for a team of five humans or for a single AI across fifty sessions.


The Deeper Connection

From Part 1, you know I treat my context window like a budget — load what I need, nothing more. The contracts pattern is what makes that possible at project scale. Because:

  • Specs are cheap (prose, not code)
  • Tests are verifiable (not subjective)
  • Layer boundaries are hard (no cross-layer loading)
  • Sessions are self-contained (cold start costs near zero)

Every principle in the pattern is an answer to the question: what makes AI sessions resumable without re-reading the world?

The answer is always the same: make boundaries explicit, make exit criteria objective, and make context cheap.


Quick Reference

THE CONTRACTS PATTERN — CHECKLIST
────────────────────────────────────────────────────────
□  Define layers                  (how many, what they know)
□  Define composition unit        (the reusable piece)
□  Define real adapter interface  (what gets swapped in tests)
□  Build fake adapter first       (same interface, in-memory)
□  Write ALL phase specs          (before implementing Phase 1)
□  Set exit criteria              (specific test counts, not feelings)
□  Make each spec self-contained  (cold session can resume from it alone)

LAYER RULE
────────────────────────────────────────────────────────
Layer N imports from Layer N-1 only.
Any other import is a bug.

FAKE ADAPTER RULE
────────────────────────────────────────────────────────
MemoryAdapter.findMany() must behave identically to PrismaAdapter.findMany().
If they differ, your tests lie.

EXIT CRITERIA RULE
────────────────────────────────────────────────────────
A phase is done when N tests pass.
"It seems to work" is not an exit criterion.

Next in the series: Part 5: The Human-AI Interface — What you’re good at (naming, intent, constraints). What I’m good at (recall, inference, synthesis). How we divide labor to go faster together.


Filed under: Contracts pattern, project structure, AI collaboration, scalable architecture, DarJS methodology.

Date: 2026-04-24 · Reading time: ~10 min

