Align, Plan, Ship: From Ideas to Iterations with PRD-Driven AI Agents
20 Jun 2025

After my last post about how the PRD->Tasklist->Work process I’ve been using for the past month has been instrumental in letting me effectively context-switch between a dozen different projects, I received a number of requests for the details of the process. To be honest, I’ve gotten so used to it over these weeks of daily use that I did not realize how new it was for me and how many people may never have heard of it. Additionally, the fact that I don’t have a good name for the process did not make it easier for people to find (if you have a good name for it, please let me know).

Sorry about that. Today I will try to describe the process and how I have been using it. But first, a story.


As I mentioned in my post on the history of my relationship with AI agents, I originally started with Tab-completions in Cursor and eventually ended up with a more and more sophisticated setup for my daily coding. The most recent dramatic shift in my approach happened about a month ago, at the end of May 2025. Several developers told me that they were using a structured approach to prompting their AI agents that led to much more reliable results. I did not immediately follow up on those ideas, but I started trying to center my conversations with the agent around a single large file where I would ask the model to keep the context of the project at hand, work with the model to create a rough plan, mark things as completed, and so on.

It worked and definitely helped keep the agent on track a lot better.

Then I stumbled upon a YouTube video from the podcast How To AI in which Ryan Carson explained an approach to using AI agents in a very structured multi-step process: it uses the amazing planning capabilities of powerful LLMs (like o3, Gemini 2.5, Opus 4) to create a detailed plan that LLM agents can execute much more reliably, and then systematically uses those agents to keep track of progress.

As I was just ramping up yet another hobby project, I decided to use the process there to plan a pretty complicated feature that would have taken me at least a few weeks of weekend coding to get into any reasonable shape.

And holy shit! That weekend evening for sure was my “feel the AGI” moment as the future suddenly felt a lot closer.

The model asked a ton of extremely insightful questions that made me think deeply about many aspects of the project that I would otherwise never have considered, or would have only discovered late in the project, leading to costly fixes, rewrites, or having to live with bad decisions.

After about an hour with that process I ended up with an artifact containing insane amounts of very dense context about the project including a clear and detailed plan of action from start to completion of the feature I wanted to build. If I were responsible for creating that plan, I probably would not have done a better job.

Since then, the process has truly transformed how I view my interactions with AI agents and how I approach any even remotely non-trivial work. I have used it on a dozen different projects at home and at work, slightly improved a bunch of its aspects, and I don’t see myself ever going back to the previous life of naive attempts at one-shotting a solution or “vibe coding” my way to a completed feature.

Below is my attempt at defining the process as of June 19, 2025. I am fairly certain it will improve and change over time, but it may act as an example for anybody who wants to attempt it for their work.

Note: If you want to see what the end results look like, I’ve uploaded a few example PRDs and task lists to GitHub that show the actual artifacts this process generates for real projects.


Process Overview

The goal of this process is to help models deal with their limited context window (the amount of text they can “remember” in a single conversation) and to work around the unpredictable nature of trying to prompt your way to a working application with an unguided LLM agent.

There are three pillars to the process:

  • PRD (aka Product Requirements Document) – a document containing as much detail as possible about the problem at hand. It should give the reader a clear understanding of why we are working on the problem, what the problem is, what kind of solution we are hoping to get at the end, the success criteria for the project, etc.
  • Task list – a separate document containing a very detailed multi-level plan for implementing the PRD. It also serves as the “persistent storage” where the agent keeps track of low-level implementation details of the solution (which files we touched, any unexpected findings from the implementation so far, links to useful documentation or other sources of context, etc.).
  • A step-by-step process of executing small sub-sections of the task list (often down to a single item) that always starts with a clean agent that knows about the PRD and the task list and is required to focus on a single simple step. This helps ground the agent and significantly reduces the scope of what the agent is required to understand.
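For a sense of what this looks like on disk: these artifacts are just a handful of markdown files. The locations and names below are illustrative (the only hard constraint is that Cursor rules live under .cursor/rules); use whatever convention fits your project:

```text
.cursor/rules/create-prd.mdc          # the "Create a PRD" prompt
.cursor/rules/generate-task-list.mdc  # the "Generate Task List" prompt
.cursor/rules/process-task-list.mdc   # the "Process Task List" prompt
tasks/prd-my-feature.md               # the PRD for a given feature
tasks/tasks-prd-my-feature.md         # the task list / the agent's working memory
```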

PRD Creation Process

For any new project/feature/problem – anything non-trivial that may take me more than a couple of hours of work – I go through the following steps, which rely on a set of Cursor rules I have added to the system and can reference in my chats.

Initial Prompt

I use a special “Create a PRD” prompt to generate my PRD by opening a new chat, referencing the prompt by name and then asking the agent to create a PRD for me.

Note: I always use the biggest/smartest model I have access to (Gemini 2.5, o3, Opus 4, always in MAX mode); this is one step where there is absolutely no reason to try to save on tokens.

I often spend up to 20-30 minutes talking into my microphone with MacWhisper to brain-dump every single piece of context I have: the reasoning for the project, the context around it, my preferred technical details of the solution, and any links to relevant resources (docs, project-related Cursor rules, references to source code, online URLs for articles, etc.).

The more context I give the model at this step, the smoother everything goes later on.
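For those who have not seen this kind of rule before, a minimal “Create a PRD” rule might look something like the sketch below. This is an illustration of the shape, not the exact text of my rule; the file path and wording are placeholders:

```markdown
# Rule: Create a PRD

When asked to create a PRD:

1. Do NOT start implementing anything yet.
2. First ask clarifying questions about the problem being solved, the target
   user, the desired outcome, technical constraints, and what is explicitly
   out of scope.
3. Once the questions are answered, write the PRD to `tasks/prd-<feature>.md`.
4. Write for a junior engineer: be explicit, avoid ambiguity, and capture
   links to any referenced docs or code.
```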

Follow-up Questions and Final Results Tuning

After submitting the “Create a PRD” prompt, the model comes back with up to a dozen clarifying questions, which I copy into a file and answer one by one by talking into the microphone for a while. There is no structure to it, just a bunch of thoughts (including “I don’t know, you make the call”). I always try to answer as much as possible, often including links to more resources I feel may be useful for the model.

Then I respond to the model with something like “here are my answers @answers.md”. At this point the model will think for a while and come back with a detailed PRD document for the project. I often do not accept the first draft right away and instead carefully work through it with the model to improve or clarify it.
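A PRD produced this way typically ends up with sections along these lines. The feature here is made up and your sections will vary, but it gives a sense of the artifact:

```markdown
# PRD: Offline Export (made-up example)

## Overview
One or two paragraphs on what we are building and why it matters.

## Goals
- Let users export their data without a network connection.

## Functional Requirements
1. The export runs entirely client-side.
2. The output is a single ZIP archive with a documented layout.

## Non-Goals
- No import functionality in this iteration.

## Technical Considerations
Links to relevant docs, existing modules, and project Cursor rules.

## Success Metrics
- Exporting a typical project completes in under 10 seconds.

## Open Questions
- Do we need progress reporting for very large projects?
```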

Task List Generation

After I have a PRD, I analyze it to generate a detailed list of steps that would lead the project to completion. This step is a lot more iterative in nature because many implementation details will depend on the specifics of your particular project and the quality of the context captured in the PRD document.

First, I start a new chat in Cursor (with a big model again!), reference the PRD file the model generated in the previous step and the “Generate Task List” prompt I have stored as a separate Cursor rule.

The model will generate a new file with a short description of the problem and a set of top-level tasks needed to execute the project to completion. I carefully review and manually edit the list until I believe it completely covers all the things I want the model to do (top-level steps/phases only, not too specific). This usually takes ~5-10 minutes.

After I am happy with the task list, I tell the model to continue to the problem breakdown phase, where it takes the list and generates a very detailed step-by-step plan for executing the project. The model is explicitly guided to keep the tasks at a granularity where each one can be executed by an AI agent operating at the level of a junior engineer.

At the end of the process I end up with a detailed task list that I review and commit into my repository.
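To give a sense of the format, here is a heavily trimmed sketch of what such a task list can look like (same made-up feature as above; real ones are much longer and keep accumulating notes as the work progresses):

```markdown
# Tasks: Offline Export (made-up example)

## Relevant Files
- `src/export/index.ts` – export module entry point (created in 1.1)

## Notes / Learnings
- The feature flag system requires a dev-server restart to pick up new flags.

## Tasks
- [x] 1.0 Set up the export module skeleton
  - [x] 1.1 Create `src/export/` and wire it into the build
  - [x] 1.2 Add a feature flag for the export UI
- [ ] 2.0 Implement client-side archive generation
  - [ ] 2.1 Implement `exportProject()` returning a ZIP blob
  - [ ] 2.2 Add unit tests for empty and very large projects
- [ ] 3.0 Hook up the UI and ship behind the feature flag
```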

Step-by-Step Task List Execution

From this point forward, I operate in a loop:

  • Ensure the git status is clean for the project – I want to be able to reset to this point at will or ask the model to look at the diff from the last known stable state.
  • I open a new chat and reference the “Process Task List” prompt stored as a Cursor rule (a sketch of that rule follows this list). Then I either ask the model to execute a specific portion of the task list or just tell it to do the next item on the list.
  • From this point forward, all work is focused on executing the selected scope and verifying that it works. It could take up to an hour to finish the item with me guiding the agent to completion, but in the majority of cases it produces a working solution on the first attempt (given a good set of Cursor rules for the project).
  • After the work is done, the model marks the item as complete in the task list and we commit the results.
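The “Process Task List” rule referenced above is what keeps each of these short sessions disciplined. Again, an abbreviated illustration rather than the exact text:

```markdown
# Rule: Process Task List

- Work on ONE sub-task at a time; do not start the next one unless asked.
- Before starting, re-read the PRD and the task list to restore context.
- After finishing a sub-task: run the relevant checks, mark the item as `[x]`
  in the task list, and update the “Relevant Files” section.
- If something unexpected comes up, record it in the task list instead of
  silently working around it, then stop for review.
```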

Adapting the Plan During Execution

Sometimes, while executing tasks, I notice that a task uncovers some piece of context that I myself was missing, or I remember or realize a detail that we want to add to the project. In those cases, I just ask the model to adjust the plan in the task list and generate new sections as needed, describing the additional steps we want to take.

And it goes the other way as well. Sometimes I would notice that a specific step that I thought was important becomes irrelevant as the definition of the problem changes during development. Then I just abandon those parts of the task list and remove them from the file. The task list is a living document, not a rigid contract.

Important: If I had to intervene during the execution of a given task, I always follow up with the following prompt:

I am going to close this chat session soon and you will lose all memory of this conversation. Please reflect on your progress so far and update the task list document (@tasks-prd-my-feature.md) with any details that would be helpful for you to perform the next steps in our plan more effectively. Anything that surprised you, anything that prevented your solution from working and required debugging or troubleshooting – include it all. Do not go into specifics of the current task, no need for a progress report, focus on distilling your experience into a set of general learnings for the future.

Continuous Improvement

As I work through the task list, the document fills with all kinds of useful details that make the work on the project easier for the agent. When I notice something that would be helpful for all agents to know when working on this repository, I ask the model to update the Cursor rules with the relevant information. Similarly, I sometimes ask the model to update the prompts used to generate the PRD and the task list if I find myself adjusting the task list too much along the way and would prefer the agent to do something differently next time. This is the constant agent training and improvement I mentioned in my previous post, and the PRD-based process makes it much easier to execute by providing space for reflection at the end of each step and at the end of a given feature project.
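These distilled learnings usually land as short, blunt statements in a project-level Cursor rule. A made-up but representative excerpt:

```markdown
# Rule: project-conventions (excerpt)

- Run the full typecheck before declaring any task complete; the test suite
  alone does not catch type regressions here.
- Database migrations must be reversible and live in `db/migrations/`.
- Never edit generated files under `src/gen/`; change the schema and
  regenerate instead.
```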

When NOT to Use This Process

The process works best when the task is complex enough that it would take an agent at least a few hours to complete. If you’re dealing with something that can be knocked out in 30 minutes of coding, the overhead of creating a PRD and task list is probably overkill – just go ahead and build it.

There’s also an upper limit to its usefulness. If the feature you want to build or the problem you want to solve is extremely complicated and would require months of work, the model will probably not be able to plan it out effectively in one shot. The context window limitations and the sheer complexity of long-term planning make it nearly impossible for even the best models to create a coherent multi-month plan that won’t fall apart when it hits reality.

For those cases, I would create a top-level PRD, split it into a set of build stages, and then create a separate PRD for each stage and go through the whole process per stage. Think of it as applying the same approach recursively – break the massive problem down into smaller, more manageable chunks that the model can actually handle.
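In practice the recursion does not need to be fancy; a stage breakdown at the end of the top-level PRD, along these lines, is enough (the names and files are made up):

```markdown
## Build Stages (from a top-level PRD)

1. Stage 1 – Data model and storage layer → `tasks/prd-stage-1-data-model.md`
2. Stage 2 – Core sync engine             → `tasks/prd-stage-2-sync-engine.md`
3. Stage 3 – UI and conflict resolution   → `tasks/prd-stage-3-ui.md`

Each stage PRD goes through the full PRD → task list → execution cycle on its own.
```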

Where exactly that upper limit sits is currently unclear to me, but I have successfully used the process on tasks that take me a couple of weeks of full-time work to finish. Beyond that, I start to see the quality of the planning degrade significantly, and the task lists become either too vague to be useful or so detailed that they turn brittle and break as soon as you start implementing.

A Note on Task-Automation Tools

There’s been a wave of tools lately that promise to handle AI task planning and execution automatically – things like Task Master, which has become one of the more popular examples. These tools typically rely on CLI workflows or MCP servers to generate and process task lists end-to-end.

I tried using some of them.

In my experience, they look great in demos and work okay for isolated projects where you don’t care too much about the implementation details—basically “vibe coding” with LLMs on steroids. But when I tried applying them to real projects with rich context (and lots of expectations around structure and quality), they fell short.

The models running these tools didn’t have access to my Cursor rules, project-specific docs, or even a shared understanding of past design decisions. As a result, they’d often hallucinate steps based on their own assumptions rather than actual requirements. Editing or course-correcting those hallucinations ended up being more work than just writing the plan myself.

Also – and maybe this is just me – but remembering the right CLI incantations for single-task execution, in just the right format, was more cognitive load than simply editing a markdown file.

So while those tools are impressive technically, I’ve found a manual PRD + task list process to be much more reliable and controllable, especially when I actually care about what gets built and how.


If you have any suggestions for improvement or comments on the described approach, let me know! If you are interested in content like this, feel free to join my free Telegram channel where I share my thoughts on AI-related topics and relevant content I find interesting.