Shipping Rinkflow solo: my agentic engineering epiphany

Over the past six months I solo built and shipped Rinkflow — a hockey practice planning app that leverages AI to generate age-appropriate plans for youth coaches. It's live on the App Store and Google Play, runs on the web, and is supported by 3 other web apps: a marketing site, a sharing site (so plans can be shared across iOS, Android, and the web at a universal URL), and a back office admin dashboard. It features streaming AI plan generation, a multi-tenant system for hockey associations, a whole onboarding flow, subscriptions on both platforms for B2C, a Stripe payment gateway for B2B sales, and even fairly robust analytics and error monitoring underneath. The project was essentially a 6-month course in how to ship from idea to production using modern AI tooling while applying everything I've learned in my career across software and product.

The experience has been, honestly, incredible. I've been spending most of my working hours for the last several years managing teams, but this latest revolution has sent my mind spinning in the same ways that excited me when I entered this field as an IC a couple decades ago. For years I'd wanted to build something serious as a side project and never quite started. If you've also got stacks of unused domain names that you keep renewing year-over-year, you know what I mean.

For me, the push came in May when I watched a short interview with Ryan Carson on his three-step AI coding workflow. It hit me like a lightning bolt. It was the first time I'd seen someone direct an LLM with real intent, not just pulling the lever on the vibe-coding slot machine and seeing what got spit out. He had slash commands for writing the Product Requirements Document, breaking it into a list of tasks, then processing those tasks step by step. That was my first taste of what people would soon start calling agentic engineering, and I knew the moment mattered. I was motivated enough that I made my first commit to Rinkflow the following week, on June 1st.

The workflow was three slash commands, then two

The tooling behind that video was beautiful in its simplicity. It's just a couple of markdown prompt files, open-sourced as snarktank/ai-dev-tasks, that you drop into your editor as slash commands. I used them for nearly every feature I built in Rinkflow.

The first one, create-prd, turns a one-line idea into a spec. You specify the feature — anything from auth to plan generation to individual drill creation, for example — and instead of running off to write code, it stops and asks you four or five clarifying questions. They're multiple choice, so you can fire back "1A, 2C, 3B" and keep moving. Then it writes a Product Requirements Document into a /tasks folder: the problem, the goals, the user stories, a numbered list of what the feature has to do, and a non-goals section that spells out what it explicitly won't do. The whole thing is written so a junior developer could pick it up, which turns out to be exactly the right level to aim an LLM at.

A huge revelation from this process was discovering what I've seen referred to as "reverse prompting": asking the AI to interview you. After drawing up the PRD, the AI would ask a list of open questions to get my feedback. These questions teased out edge cases I hadn't thought of and raised insightful points about how the new feature connected to the rest of the app's systems and features. I've started telling people that I sometimes spend half a day or more on the spec phase alone, iterating on the AI's feedback long before any code is written. It's a process I've really enjoyed.

The second one, generate-tasks, reads that PRD and turns it into a checklist. It does it in two passes. First it hands you the handful of big parent tasks and stops, so you can check the overall shape before it goes any further. You reply "Go," and it breaks each one down into small, checkable sub-tasks and lists the files it thinks it'll touch. What you're left with is a plain markdown file full of empty checkboxes. After working through 15 or 20 of these checklists, I've loved how each section maps cleanly to a point where I'd logically want to make a git commit. My commit history has never looked so clean!

That was the whole rhythm for Rinkflow: idea, PRD, task list, all written down as files in the repo before I'd touched a line of feature code. My /tasks directory still has dozens of them sitting in it. Writing the plan down first is also how you survive working across many sessions — the agent doesn't remember last Tuesday, but the file does. It also helped me, the human, context-switch and get my bearings back when returning to the project. If I had a particularly busy week working at my day job and shuttling the kids to their sports, I could come back to the codebase five, six, seven days later and pick up the build in exactly the spot I left off. Incredible.

There used to be a third command, process-task-list. Its only job was to babysit the actual work: do one sub-task, check the box, stop, wait for me to look at it, then move to the next. Over the last couple of months, the models have gotten good enough that you don't have to spell that out anymore. Hand a modern agent a task list and it'll work straight down it, checking things off as it goes, without anyone holding its hand. Watching a piece of scaffolding quietly disappear because the thing underneath it grew up is, in a small way, the whole story of this past year.
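
For anyone who hasn't seen one of these task files, here's a minimal sketch of the step that process-task-list used to enforce: find the next unchecked sub-task, do it, tick the box. The checkbox format and the names parseTasks, nextSubTask, and checkOff are my own assumptions for illustration, not code from ai-dev-tasks.

```typescript
// Assumed task-list format (hypothetical): "- [ ] 1.1 Description" for open
// items and "- [x] 1.1 Description" for finished ones. Indented lines are
// sub-tasks; unindented lines are parent tasks.
const CHECKBOX = /^(\s*)- \[( |x)\] (.*)$/;

interface Task {
  line: number;   // zero-based line index in the file
  indent: number; // leading whitespace; 0 means a parent task
  done: boolean;
  title: string;
}

// Collect every checkbox line in the markdown file.
function parseTasks(markdown: string): Task[] {
  const tasks: Task[] = [];
  markdown.split("\n").forEach((text, line) => {
    const match = CHECKBOX.exec(text);
    if (match) {
      tasks.push({
        line,
        indent: match[1].length,
        done: match[2] === "x",
        title: match[3],
      });
    }
  });
  return tasks;
}

// The next unit of work is the first unchecked sub-task.
function nextSubTask(markdown: string): Task | undefined {
  return parseTasks(markdown).find((t) => !t.done && t.indent > 0);
}

// Return a copy of the file with the given task's box checked.
function checkOff(markdown: string, task: Task): string {
  const lines = markdown.split("\n");
  lines[task.line] = lines[task.line].replace("- [ ]", "- [x]");
  return lines.join("\n");
}
```

The old command amounted to running nextSubTask, doing the work, calling checkOff, and then pausing for a human. Current agents tend to run that loop on their own.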

The rules live in a file, not in my head

Letting an agent work down a task list still needs a human in the loop, just a different kind. Every so often the model would make an assumption I didn't expect: naming a thing in a way that cut against the rest of the app, reaching for a pattern I'd already decided against, filling a gap in the spec with its own guess. The fix is almost never to argue with it in the moment. It's to notice the surprise, and then write the missing rule down in CLAUDE.md so it sticks. The first time, it's a one-off correction. The second time you catch yourself making the same correction, that's the signal that a rule belongs in the file, not in your head.

A lot of what's in my CLAUDE.md got there exactly this way. The big one is data fetching: every network call goes through TanStack Query, no exceptions. Left to its own devices an agent will reach for whatever works in the moment, or even just hand-roll its own solution. Each of those is a perfectly defensible little choice, right up until you've stacked a few hundred of them and no two screens in the app fetch data the same way. I learned this when I built favouriting, so coaches could favourite drills: favouriting a drill on one screen wasn't carrying over to another. That's when I realized the TanStack Query (formerly React Query) cache wasn't being shared, because the model had implemented the feature differently on the two screens. Easy enough to fix, and easy enough to prevent going forward by writing it down in CLAUDE.md.
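
One plausible way two screens end up with separate caches is that each builds its own query key. Here's a dependency-free sketch of the usual fix, a single key factory that every screen imports. The names favouriteKeys and TinyCache are hypothetical. TinyCache is a simplified stand-in for a keyed cache, not TanStack Query's implementation. In a real app the keys would be passed as the queryKey option to useQuery.

```typescript
// Hypothetical key factory: the single source of truth for favourite-drill keys.
const favouriteKeys = {
  all: ["favourites"] as const,
  byCoach: (coachId: string) => ["favourites", coachId] as const,
};

// Simplified stand-in for a keyed cache (NOT TanStack Query's implementation):
// two callers share an entry only when their keys serialize identically.
class TinyCache {
  private entries = new Map<string, unknown>();

  set(key: readonly unknown[], value: unknown): void {
    this.entries.set(JSON.stringify(key), value);
  }

  get<T>(key: readonly unknown[]): T | undefined {
    return this.entries.get(JSON.stringify(key)) as T | undefined;
  }
}

const cache = new TinyCache();

// Screen A stores the coach's favourites under the shared key.
cache.set(favouriteKeys.byCoach("coach-1"), ["drill-42"]);

// Screen B uses the same factory, so it reads the same entry.
const shared = cache.get<string[]>(favouriteKeys.byCoach("coach-1"));

// A hand-rolled key on a third screen misses the entry entirely.
const missed = cache.get<string[]>(["favs", "coach-1"]);
```

The point of the factory is that an agent has nothing to invent: there is exactly one place a key can come from, and a diff that builds a key inline is easy to spot in review.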

Other rules were narrower and weirder, like the one that says you can never use Restyle styling components inside a Gorhom bottom sheet, because the sheet runs its own React context and the theme just silently doesn't reach inside it. That's not a thing you'd ever think to say up front. You find it because an agent confidently does the obvious thing, the styling quietly breaks, and you'd rather not explain it a third time. Each line in that file is a small surprise I only wanted to have once.

The reviewer is also an agent

Before anything merges in Rinkflow it goes through a fixed gate: type-check, the full test suite, a check that the PR actually exists, and then an architecture-aware review that reads the diff against exactly the rules in CLAUDE.md — is this component reaching past its hook, is server state leaking into a store, is it hardcoding a colour the theme already owns. And if the change touches the database, a second reviewer takes over with a deliberately grumpy fifteen-year-DBA persona and reads the migration like it's out to get him: are the row-level-security policies sound, is there a rollback, are we writing down why each policy exists.
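
The shape of that gate is simple enough to sketch. This is a minimal model under my own assumptions, not the actual Rinkflow tooling: checks are stubbed as synchronous functions that return a boolean, the gate stops at the first failure, and a change counts as touching the database when any file path contains "migrations/". The names runGate, touchesDatabase, and Check are hypothetical.

```typescript
interface Check {
  name: string;
  run: () => boolean; // stub: a real check would shell out or call an agent
}

interface CheckResult {
  name: string;
  passed: boolean;
}

interface Change {
  files: string[]; // paths touched by the diff
}

// Assumption for illustration: migrations live under a "migrations/" folder.
function touchesDatabase(change: Change): boolean {
  return change.files.some((file) => file.includes("migrations/"));
}

// Run the fixed checks in order, adding the database review only when the
// change touches a migration. Stop at the first failure.
function runGate(change: Change, checks: Check[], dbReview: Check): CheckResult[] {
  const plan = touchesDatabase(change) ? [...checks, dbReview] : checks;
  const results: CheckResult[] = [];
  for (const check of plan) {
    const passed = check.run();
    results.push({ name: check.name, passed });
    if (!passed) break;
  }
  return results;
}
```

Keeping the order fixed and the database step conditional means the expensive, opinionated review only runs when there is a migration for it to be grumpy about.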

What I like about this is the reframing. A team gets consistency from culture — shared taste, code review, the accumulated "we don't do it that way here." On my own, I get a surprising amount of the same thing from a checklist an agent runs without fatigue, every single time. The standards live in version control instead of in somebody's head.

A product is bigger than one repo

Rinkflow isn't one codebase, it's five-ish. Shipping a single feature can touch the app, the admin dashboard, the public share viewer, the landing page, the App Store metadata, even the privacy policy and terms. Running an agent to audit this became its own checked step too. As a solo developer, having this kind of safety net is invaluable for keeping all parts of the system consistent.

The Rinkflow admin dashboard — a separate web app from the coaching app — showing a hockey association's platform usage: seat utilization, practice plans, AI generations, and the most popular plans and drills across the organization.

The job is still the job

The commands gave me a process and the CLAUDE.md file became the memory, but the thing doing the real work was older than either of them. Writing a clear spec before you build. Deciding on one way to do something and refusing to drift. Noticing when the model is about to make a mess and naming the rule before it spreads. That's not some new skill the AI taught me — it's the same thing I've spent years doing with teams, just pointed at an agent that types faster than I can.

That's what surprised me most about the whole six months. I went in thinking I was learning a new tool, and came out realizing I was mostly leaning on everything I already knew — about software, about product, about keeping a thing coherent while a lot of people, or a lot of agents, push on it at once. The tooling changed how much one person can carry. It didn't change what the work actually is. And as I evaluated the productivity gains, I kept trying to figure out: did this make me 5x more productive? 10x? But at the end of the day, I have an entire piece of software, a project that simply never would have existed at all — so what's the productivity gain on something existing versus nothing existing? Infinity?

Rinkflow's mascot: a cheerful snowman in an orange toque driving a blue Zamboni.

Now if you'll excuse me, the ice needs resurfacing.