The PRP Framework: A Deep Dive into Context Engineering for AI Assistants
Hello everyone. Over the last couple of days, the context engineering framework PRP (Product Requirement Prompts) has gotten a lot of exposure. This is where we get into two of my favorite things for context engineering: Claude Code commands and PRPs.
PRPs are similar to Product Requirements Documents (PRDs), which you've probably heard of if you've been diving into AI coding. However, they are specifically designed to instruct an AI coding assistant. Today, I'm planning to show you around a little bit in the repository.
There are many great resources available that provide a deeper understanding of context engineering principles. I highly recommend checking those out to get a solid foundation. Many interesting discussions are happening in online communities around Claude Code, agentic engineering, and agent building in general.
What is the PRP Framework?
Let's take a look at the repo. The PRP framework, which stands for Product Requirement Prompts, is a derivative of PRDs (Product Requirement Documents). With a long background in business analysis and product management, writing PRDs has been a significant part of my work. When I started coding with AI, it was natural to use PRDs and other business analysis techniques to improve my prompting and the context I was giving to the AI.
In short, a PRP is a PRD combined with curated codebase intelligence and an agent runbook. It aims to be the minimum viable packet an AI needs to plausibly ship production-ready code on the first pass. It's a planning framework for how you curate context, an implementation plan, and references to documentation both outside and inside your codebase. It's very much designed for working on existing, mature codebases as well; that was the original use case, since I needed something that could work on existing projects.
I've been working on this and have had several versions before this one, starting in the summer of 2024. The concept has remained very much the same. The `readme` file contains a lot of instructions on how to get started, so please check that out.
A Deep Dive into the Repository
If you open up the `PRPs` repository, you will see a `readme` file. This `readme` is actually for feeding to the AI when the PRP is created, so we won't go too deep into that.
In the `templates` directory, you'll find all the PRP templates. These are basically prompt templates that the AI will fill in when we execute our commands.
Core Commands and Structure
This is where it all starts. I have a lot of commands in the `commands` directory that I use in my day-to-day work. I have generalized them slightly. In every case where I use them in a real codebase, they are very much adapted for that specific project. This is the base I would use for most things, and I would add specific things for my use case. Please take a look at these; I use many of them daily, especially `review-stage` and `review-general`, which are very useful for getting help with code reviews. There are also some development helpers for git and other things.
Now, let's go through the `PRP` directory. There are many versions of the PRP, and each serves a different use case. I will use `PRP-base` as our example today.
This command takes your idea as an argument. Say you have an idea for a product you want to build, or a large task you want to do on your existing codebase: you pass that in here as an argument. It's worth spending a lot of time upfront preparing the feature request you feed into the PRP. The more work you do before you add it as an argument, the better your result will be. You can use a Jira task, an epic, or a traditional PRD to provide more context.
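For illustration, a prepared feature request might look something like this (a hypothetical sketch; the feature, paths, and ticket reference are all invented):

```markdown
FEATURE: Add rate limiting to the public API

WHY: Unauthenticated clients can currently exhaust backend capacity.

WHAT:
- Token-bucket rate limiting on all /api/v1/* routes
- Configurable per-key limits, defaulting to 100 requests/minute
- Return 429 with a Retry-After header when the limit is exceeded

EXAMPLES AND DOCS:
- Mirror the middleware pattern in src/middleware/auth.py
- The Jira ticket (PLAT-123, invented here) holds the full acceptance criteria
```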
Here are some instructions for the AI on what it will do:
- It will generate a complete PRP for general feature implementation with thorough research.
- It ensures all the context is passed to the AI to enable self-validation and iterative refinement.
An important instruction here is that the AI only gets the context you are appending to the PRP, so it's important that all of it is there. There is some Chain of Thought prompting here to get it to work in this research process. It will go into your codebase, do a codebase analysis, and gather all the context it will need. It will also do external research online.
Then, it will search for similar patterns and features that you already have, as well as libraries, implementation examples from GitHub, blogs, best practices, and common pitfalls. It will also ask you for clarifications if it feels it doesn't have enough context from your argument.
The PRP Generation Process
Next comes the PRP generation itself. It will use the `PRP-base` template, which we'll look at soon, and fill it in once it has all the context. There is some iteration here telling it that it must provide context from documentation, code examples, gotchas, patterns, and an implementation blueprint. It will add pseudo-code, reference real files, and, for patterns, include error handling and so on.
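As a rough illustration, the context references in a generated PRP might end up looking something like this (a hypothetical sketch; the URLs, files, and notes are invented):

```yaml
# Documentation and references (hypothetical sketch)
- url: https://docs.pydantic.dev/latest/concepts/models/
  why: v2 model validation patterns used by the new schemas
- file: src/providers/existing_provider.py
  why: pattern to mirror for the new provider, including error handling
- gotcha: the project uses uv, so tests must be run with `uv run pytest`
```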
It will also set up validation gates. The two main ones are syntax and style checks, and unit tests. There are also many other validation techniques it can use.
There are also some prompt techniques, like adding critical markers and "ultrathink," a keyword for the AI assistant that you can add to activate additional thinking tokens. I'm going to assume that a lot of you know how these tools work. It will then save the PRP into a new file in our `PRPs` directory. It also has a quality checklist at the end.
The PRP Template Explained in 5 Minutes
Let's take a look at the PRP template it will be filling. This is a predefined template into which all the context gathered during the research phase is placed.
There is some front matter here: the purpose and the core principles, explaining to the AI what this is and why. It also references the `readme` that we looked at before.
It starts with three core things about the goal of what you're building:
1. The value of what you're building.
2. What exactly you're building.
3. The success criteria.
The more context you bring for these core things, the better your result will be. If you don't provide it, the AI will make something up, so when you read the PRP back after it's created, be really sure it fits your goals and your needs. I don't recommend trying to "wing it" through this. Really spend time planning and preparing; otherwise, you're not going to get a good result.
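To make that concrete, a filled-in goal section might look something like this (a hypothetical sketch, not taken from the repo):

```markdown
## Goal
Integrate provider X as a drop-in voice option alongside the existing providers.

## Why
- Customers have requested provider X; there is churn risk without it.
- It reuses the existing provider abstraction, keeping maintenance cost low.

## Success Criteria
- [ ] The new provider passes the same contract tests as existing providers
- [ ] No changes are required in calling code (same interface)
- [ ] All linters and the full test suite pass
```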
Then we have the context. It will give you all the context it has gathered and all the context it thinks it's going to need during the implementation, in various formats. It will also run `tree` to give you the desired codebase tree structure after the implementation (it generates one before and one after).
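For example, the "after" tree for a vertical slice refactor might look something like this (a hypothetical layout, not from the actual project):

```
src/
  features/
    billing/
      models.py
      service.py
      routes.py
      tests/
        test_service.py
  shared/
    config.py
```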
It will provide some known gotchas and the implementation blueprint, including data models and structures. This is always worth going through after you have run the PRP. The tree structure especially deserves a look: does it fit your patterns? Is it the architecture you want for your codebase? If not, just change it manually. The same goes for the blueprint, especially the models and schemas.
Then you have the list of tasks. Depending on how large your request is, it will create quite a large list of tasks. It will use information-dense keywords like `find`, `inject`, `preserve`, `modify`, and `mirror`. These are all designed to keep the task description as concise and information-dense as possible. It will also add pseudo-code when it thinks it's needed.
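A task entry using those keywords might look something like this (a hypothetical sketch; the file paths are invented):

```markdown
Task 1:
MODIFY src/providers/registry.py:
  - FIND the provider registration block
  - INJECT an entry for the new provider
  - PRESERVE the existing error-handling pattern

Task 2:
CREATE src/providers/new_provider.py:
  - MIRROR the structure of src/providers/existing_provider.py
```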
It will give you the integration points as well. This is a very important part where you need to review what it has created. Sometimes no changes are needed at all; other times you may have to change an entire route or a migration file.
The Critical Role of Validation
Here we come to a really important part: the validation itself. A PRP has a few core principles: context, of course, but validation is another. You want the AI to run in loops until it has self-validated that what it has built actually works. The more ways you give your AI to validate its work, the better your result is going to be.
- Linters and Unit Tests: These are two of the easiest validation loops to set up. You can define your own rules in the linters, which helps a lot. The AI will run until the tests pass and the linters pass.
- Integration Testing: If you're building an API, for example, it's very powerful to do integration testing with `curl`, because the assistant can run its own bash commands.
- Advanced Validation: For more complex tests, like those involving deployments or Docker, you can use mock servers. For example, you can set up a level-four validation where you prompt it to check the implementation through the mock servers and fix it until it works.
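As a rough sketch, those validation levels might translate into commands like these (assuming ruff and mypy for linting and type checks, pytest for tests, and an invented local endpoint for the curl call):

```bash
# Level 1: syntax and style checks
ruff check src/ --fix
mypy src/

# Level 2: unit tests (run through uv so the project environment is used)
uv run pytest tests/ -v

# Level 3: integration test against a locally running API (hypothetical endpoint)
curl -s -X POST http://localhost:8000/items \
  -H "Content-Type: application/json" \
  -d '{"name": "demo"}'
```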
It has a final validation checklist, making sure that all tests pass, there are no linting errors, no type errors, and manual tests are successful. There's also a list of anti-patterns to avoid at the end.
An Example in Action
Here's an example of what this can look like when it's executed. I ran one of these PRPs from the base template on a project to refactor the structure of the codebase. There weren't a lot of functional changes; it was mostly about making the codebase less monolithic by splitting it up into files and modules.
This is more or less what I wrote as my argument. I didn't get very specific, as it was just an example, but it still filled out a lot of things. The goal was to refactor the current monolithic codebase, which had a couple of files with around a thousand lines each, into a well-structured vertical slice architecture. It added the "why" itself. The success criteria were part of my `claude.md` file, which I will show later.
For context, it used my `ai_docs` directory, where I put library documentation for libraries that I know will be used during the implementation. Here I used `uv`. Part of the refactoring was to go from `pip` to `uv`, and I know the AI sometimes struggles with some `uv` setups, so I added the docs here to give it easy access.
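For reference, the rough `pip`-to-`uv` mapping looks like this (a sketch; check the `uv` documentation for your specific setup):

```bash
uv venv                              # create a virtual environment
uv pip install -r requirements.txt   # install dependencies into it
uv run pytest                        # run a command inside the project environment
```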
You can see it even references `claude.md` for the coding standards. It found documentation for Pydantic and `pytest` through web searching. The refactoring plan was in a markdown file as well.
Then we have the tree structure before and after the refactoring. It did quite an extensive modularization. When I ran this, it actually ran for a very long time, something like 1 hour and 40 minutes. It created hundreds of tests as well. It was working almost perfectly out of the box, with just one iteration to fix some import errors.
It gives you the models, the tasks (quite a long list), and some pseudo-code where it was needed. It also provides integration points and the validation loops (unit tests and linting).
For front-end development, you can use something like Puppeteer or Playwright. You can add a validation level to use the Puppeteer tool to test the end-to-end user flow, click all the buttons, and so on. It will actually open a browser, click around, and make sure everything works.
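A browser-based validation level might be prompted along these lines (a hypothetical snippet; it assumes a Puppeteer or Playwright tool is available to the assistant, and the URL is invented):

```markdown
## Level 4: End-to-end UI validation
- Open http://localhost:3000 with the browser tool
- Complete the signup flow: fill the form, submit, confirm the success page
- Click through every primary navigation item and check for console errors
- If anything fails, fix the implementation and repeat until the flow passes
```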
Executing the PRP
When you have your PRP ready and you're happy with it, you run the `prp-base-execute` command. You just pass the file path to the command:
```
/prp base-execute path/to/your.prp
```
The instructions for the AI are to implement the feature using the PRP file. It follows a Chain of Thought process: load the PRP, gather context, "ultrathink," create a comprehensive plan, execute the plan, validate, and complete.
There is some repetition in the instructions across the files, which was more necessary with older models. With newer models, you can probably get away with less of that. This process ensures that the AI references back to the PRP, especially when it goes across context windows. It's quite powerful how well it can work across multiple context windows on the same task.
Handling Different Task Sizes
The base PRP is intended for quite large pieces of work. For smaller tasks like a Jira ticket or a bug fix, I recommend using the `task` version. It's a slimmed-down version of the base PRP but still has a lot of detail. It uses a very similar process and has a `PRP-task` template. It will make your Jira ticket complete with a full implementation plan, because it will go and read and understand existing codebase patterns.
I have a bunch of experimental ones as well that I will be adding, for building mock servers, using specific libraries, and different languages like TypeScript with Next.js, React, and Astro.
Planning with PRPs
The `planning` one is also very interesting. You can give your feature idea to the planning template, and it will create your PRD. It's basically a PRD generator. It's quite powerful for getting your first draft of a PRD. It will give you an executive summary, problem statement, solution overview, success metrics, user stories, and primary user flows with Mermaid graphs. It will also include system architecture, technical specifications with sequence diagrams, and API contracts.
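For instance, a primary user flow in the generated PRD might be rendered as a Mermaid diagram along these lines (a hypothetical example of the output format):

```mermaid
flowchart TD
    A[User submits feature idea] --> B[AI researches codebase and docs]
    B --> C[Draft PRD with diagrams and API contracts]
    C --> D{Human review}
    D -->|approve| E[Feed the PRD into PRP creation]
    D -->|revise| B
```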
This is to help you plan things. I use this a lot when I plan a new piece of work. I send out planning agents to come back with reports that I can use to create my first PRD. Don't just run it blindly; use it as a research base.
The Importance of the `claude.md` File
Now for the last thing: the `claude.md` file. I have a library of `claude.md` files for different frameworks like Astro, Python, and Next.js. My core Python one has a lot of things specific to my use case, like my architecture and patterns. I use `uv`, so you might need to change that to `poetry` or `pip`.
As your project matures, you really want to make sure your `claude.md` file matures with it. Put project-specific instructions in your `claude.md`, and you will see dramatically better results.
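As an illustration, a project-specific `claude.md` excerpt might look like this (a hypothetical example; adapt the conventions to your own project):

```markdown
## Project conventions
- Python 3.12, dependencies managed with uv (never call pip directly)
- Vertical slice architecture: each feature lives in src/features/<name>/
- Every new module gets unit tests in a sibling tests/ directory

## Validation
- Run the linter and the full test suite before marking a task complete
- Never finish a task with failing type checks
```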
A lot of people say that AI coding assistants are not good at working on large, existing codebases. I highly disagree. In a large, mature codebase, you have existing patterns. If your codebase has a good structure, is modular, and is easy to read, your AI assistant is going to do extremely well. It's only if your codebase is a mess that the AI is going to be a mess. Set up your codebase in a way that's easy for both you and your AI to understand.
Kicking Off a PRP
This is how I usually run things. I have my terminal with multiple sessions. I usually have my core development workflow running in my IDE.
Here, I'll show you how I kick off a PRP on a mature, large codebase for a voice AI agent.
```
/prp execute-base-prp prps/11-labs.prp
```
The idea is to integrate a new voice provider. I already have several voice providers in this codebase, so one of the core things it's doing is mirroring existing voice pipelines. All I have to do is execute the command. I have already prompted it through my PRP.
It will start gathering context, reading everything, and then start executing. One thing I highly recommend is that between the creation of the PRP and the execution, you should use a new instance of your AI assistant or at least clear the context. If you do planning and execution in the same context window, you will run out of context quickly, and it can lead to hallucinations.
You can see it's starting to work now. It has created its to-do list and is ready to start.
That's about it for the PRP framework. There is a lot more to come: I will be updating the templates and adding more things as I discover them. The next thing I'm going to look into is the hooks feature in the AI assistant. When I do, those things will also be added to this repo.
Thank you for reading, and I hope this gives you some value.