Furmous: your pet was always the main character

It started out as a dumb idea, and it remains one to this day!

One afternoon I was dropping photos of our cat Dora into an AI image model and asking it to render her in increasingly ridiculous situations to entertain my kids: Dora as a computer programmer, Dora cheering on the Canucks. Soon they wanted to try some of the other family pets, too. Somewhere in all the laughing I had the thought that ruins every quiet weekend: I should build a little app for this.

So I built a simple web app that let the kids run the whole loop themselves: take a picture, upload it, pick a situation for the cat to be in, and see what came back from Nano Banana 2. That was the entire product. A photo in, a ridiculous cat out. It kept two kids busy at the kitchen table for an hour, which at the time was the only metric I cared about.

Then it just kind of snowballed. For such an unserious app, I had to solve a few interesting problems as I built it out.

Photorealism was the wrong goal

My first instinct was to make the renders as realistic as possible, but once the goal turned to potentially printing the output on clothing, that had to change. Switching to a flat, bold, cartoon line art style that can be read from across a room is the single change that made the whole thing look like a product instead of a tech demo.

Anti-slop guards

The model fails constantly, and creatively: a mangled render, an empty frame, a cat with five legs. The first real engineering wasn't clever; it was retrying until something good came back. That grew into a proper guard: a second model now inspects every render, with its own instructions to count the limbs, heads and eyes and to watch for cross-species blends. Anything that fails the check, or that I can't cleanly cut out later, gets thrown away and regenerated, and the rejects get archived so I can see what the prompt keeps getting wrong. I know you're dying to see some examples, so here ya go.

A rejected render of a golden retriever holding a coffee mug.
A rejected render of a Yorkshire terrier in a hoodie holding a laptop.
A rejected render of a Pomeranian roasting a marshmallow at a campfire.

All caught and regenerated before the background ever got keyed out, raw model output and all.
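
For the curious, here is a minimal TypeScript sketch of that generate, inspect, retry loop. It is not the production code: the report shape, the names `findDefects` and `generateWithGuard`, the anatomy limits and the attempt cap are all assumptions made for illustration, and `generate` and `inspect` are injected stand-ins for the image model and the inspector model. It also leaves out the "can I cut it out cleanly" check.

```typescript
// Hypothetical shape of what the inspector model reports for one render.
interface InspectionReport {
  heads: number;
  eyes: number;
  limbs: number;
  crossSpeciesBlend: boolean;
}

// Illustrative limits for a four-legged pet. Eyes and limbs are upper bounds
// because a pose can legitimately hide an eye or a leg.
interface AnatomyLimits {
  heads: number;
  maxEyes: number;
  maxLimbs: number;
}

const FOUR_LEGGED: AnatomyLimits = { heads: 1, maxEyes: 2, maxLimbs: 4 };

// Returns the reasons to reject a render; an empty list means it passes.
function findDefects(report: InspectionReport, limits: AnatomyLimits = FOUR_LEGGED): string[] {
  const defects: string[] = [];
  if (report.heads !== limits.heads) defects.push(`heads=${report.heads}`);
  if (report.eyes > limits.maxEyes) defects.push(`eyes=${report.eyes}`);
  if (report.limbs > limits.maxLimbs) defects.push(`limbs=${report.limbs}`);
  if (report.crossSpeciesBlend) defects.push("cross-species blend");
  return defects;
}

interface Reject<R> {
  attempt: number;
  render: R;
  defects: string[];
}

// Generate, inspect, and retry until a render passes or attempts run out.
// Every rejected render is kept so the failures can be reviewed later.
async function generateWithGuard<R>(
  generate: () => Promise<R>,
  inspect: (render: R) => Promise<InspectionReport>,
  maxAttempts: number = 4,
): Promise<{ render: R | null; rejects: Reject<R>[] }> {
  const rejects: Reject<R>[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const render = await generate();
    const defects = findDefects(await inspect(render));
    if (defects.length === 0) return { render, rejects };
    rejects.push({ attempt, render, defects });
  }
  return { render: null, rejects }; // caller decides what to show the user
}
```

Capping the attempts matters as much as the check itself, since every retry is another paid generation.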

The family becomes the focus group

Once it was stable enough to pass around, I asked some family members to run it on their own pets. The most interesting feedback was that female pets didn't look female, so the render didn't trigger the "That looks just like her!" feeling. So I added an option to choose the pet's pronouns before generation, which lets me tailor the prompt to that choice and get a more true-to-life result.

Show the model the actual photo

Initially, when a user uploaded their pet image, I was using a model to create a text description of the image and using that as the base of the prompt. It worked surprisingly well to get started, but it soon became apparent that sending the uploaded image to the model as a reference once again produced a leap in the "Oh my god, that's him!" feeling.

The prompt became a monster

The prompt started as a single line and grew into a big document that gets assembled fresh for every pet, out of the species and the scene and the chosen pronouns and whatever the photo description turned up. Most of what piled on over time is guardrails. There's a pose hint so a Pomeranian holding a marshmallow stick sits up to hold it instead of rendering on all fours with the prop floating in mid-air, and per-eye handling so a cat with two different coloured eyes keeps both instead of getting averaged into one. There's even a final check that every placeholder in the template actually got filled, so a typo fails loudly instead of quietly sending the model the literal text "[PET_NAME]".
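
That last check is easy to sketch. The version below is an illustration under assumptions, not the real template code: it assumes placeholders are upper-case names in square brackets, like the "[PET_NAME]" mentioned above, and `fillTemplate` and the example keys are hypothetical names.

```typescript
// Assumed placeholder syntax: [PET_NAME], [SPECIES], and so on.
const PLACEHOLDER = /\[([A-Z][A-Z0-9_]*)\]/g;

// Fills every placeholder from `values` and throws if any is missing or blank,
// so a misspelled key in the template fails loudly instead of reaching the model.
function fillTemplate(
  template: string,
  values: Readonly<Record<string, string | undefined>>,
): string {
  const missing = new Set<string>();
  const filled = template.replace(PLACEHOLDER, (match: string, key: string) => {
    const value = values[key];
    if (value === undefined || value.trim() === "") {
      missing.add(key);
      return match; // leave it in place; we throw below anyway
    }
    return value;
  });
  if (missing.size > 0) {
    throw new Error(`Unfilled placeholders: ${Array.from(missing).join(", ")}`);
  }
  return filled;
}

// Usage with made-up values:
const examplePrompt = fillTemplate(
  "Draw [PET_NAME], a [SPECIES], as a [SCENE]. Use [PRONOUNS] pronouns.",
  { PET_NAME: "Dora", SPECIES: "cat", SCENE: "computer programmer", PRONOUNS: "she/her" },
);
```

Tracking the missing keys during substitution, rather than scanning the finished string, means a pet name that happens to contain square brackets does not trigger a false alarm.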

From the screen to a shirt

This is the move that kind of productized the app. I'd been interested for a while in print-on-demand services like Printful and Printify as a way to turn a render into a real, shipped product. I ended up going with Printful, and it sent me on a really interesting side quest: learning which t-shirt to choose (I landed on the Bella+Canvas 3000 in the end), comparing printing techniques like DTG and DTF, and even popping open Figma to design the neck label.

The background problem

A surprising snag came up when I realized that Nano Banana 2 can't output transparent PNGs. ChatGPT does this easily, and so do some other models, but I liked Nano Banana's output best for the line-art style I was going for. Printing on a shirt of any colour requires a genuinely transparent background, and Nano Banana always hands back a full scene.

The catch is that you can't key out a colour your subject already contains, and a cartoon pet is mostly cream and tan, which is exactly what a background tends to be. So I made it a contract: render the pet on a saturated green, forbid green anywhere in the art, then do the cutting myself. All the pixel work runs through Sharp, the Node image library, which samples the image corners to learn the exact green the model used on that run, builds a per-pixel alpha mask from it, and falls back to a more tolerant pass when the corners come back ambiguous. Nano Banana renders "green" as a slightly different lime every single time, so rather than match an exact colour I key on green dominance (whether a pixel is far more green than it is red or blue), which absorbs the drift without biting into the cat. A last sanity check on how much of the image actually went transparent catches the odd render that couldn't be keyed cleanly, and that one gets thrown back for a retry.
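
The keying idea fits in a few dozen lines. This is a sketch under assumptions, not the code that runs on the site: it works on a raw RGBA pixel array (the kind Sharp can hand back as raw output), and every threshold, the soft edge ramp, and the names `sampleCornerColour`, `keyOutGreen` and `looksCleanlyKeyed` are invented for illustration.

```typescript
// How much greener a pixel is than its strongest other channel.
function greenDominance(r: number, g: number, b: number): number {
  return g - Math.max(r, b);
}

// Average a small square in each of the four corners to estimate the backdrop.
function sampleCornerColour(rgba: Uint8Array, width: number, height: number, patch: number = 8) {
  const size = Math.max(1, Math.min(patch, width, height));
  let r = 0, g = 0, b = 0, n = 0;
  for (const y0 of [0, height - size]) {
    for (const x0 of [0, width - size]) {
      for (let y = y0; y < y0 + size; y++) {
        for (let x = x0; x < x0 + size; x++) {
          const i = (y * width + x) * 4;
          r += rgba[i]; g += rgba[i + 1]; b += rgba[i + 2]; n++;
        }
      }
    }
  }
  return { r: r / n, g: g / n, b: b / n };
}

interface KeyResult {
  pixels: Uint8Array;          // copy of the input with alpha rewritten
  transparentFraction: number; // share of pixels made fully transparent
  usedFallback: boolean;       // true when the corners were not clearly green
}

function keyOutGreen(rgba: Uint8Array, width: number, height: number, patch: number = 8): KeyResult {
  const corner = sampleCornerColour(rgba, width, height, patch);
  const cornerDominance = greenDominance(corner.r, corner.g, corner.b);
  // Assumed rule: weakly green corners are "ambiguous", so use a fixed, lower
  // (more tolerant) threshold instead of one derived from this run's green.
  const usedFallback = cornerDominance < 60;
  const hi = usedFallback ? 40 : cornerDominance * 0.6; // at or above: transparent
  const lo = hi / 2;                                    // at or below: untouched
  const pixels = new Uint8Array(rgba);                  // copies the data
  let cleared = 0;
  for (let i = 0; i < pixels.length; i += 4) {
    const d = greenDominance(pixels[i], pixels[i + 1], pixels[i + 2]);
    if (d >= hi) {
      pixels[i + 3] = 0;
      cleared++;
    } else if (d > lo) {
      // Soft edge: fade alpha linearly between the two thresholds.
      pixels[i + 3] = Math.round(pixels[i + 3] * (hi - d) / (hi - lo));
    }
  }
  return { pixels, transparentFraction: cleared / (width * height), usedFallback };
}

// Sanity check: too little or too much transparency suggests a bad key.
function looksCleanlyKeyed(fraction: number, min: number = 0.2, max: number = 0.9): boolean {
  return fraction >= min && fraction <= max;
}
```

Because the test is "much greener than red or blue" rather than "equal to this exact green", a lime that drifts from run to run still clears, while cream and tan fur, where red is the strongest channel, is left alone.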

Watermarks that survive AI

Previews go out to anyone, and anyone can paste an image into ChatGPT and ask it to strip the watermark off. I build the watermark with Sharp, and most of the work was finding the right balance: enough opacity and coverage that the image is clearly a locked preview, but light enough that a customer can still judge whether they love the render. The thing that actually protects it is the pattern. A small "© preview only" mark tiled across the whole image leaves no clean patch for an AI to inpaint from, and the repeated copyright text makes most LLMs refuse to remove it outright.
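
One way to build a tiled mark like this is to generate a single small SVG tile and let the image library repeat it. The sketch below only builds the tile. The function name `buildWatermarkTile`, the option names and every default (tile size, font size, opacity, angle) are assumptions for illustration, not the values used on the site.

```typescript
interface WatermarkOptions {
  tileWidth: number;
  tileHeight: number;
  fontSize: number;
  opacity: number;  // 0 (invisible) to 1 (solid)
  angleDeg: number; // rotation of the text within the tile
}

const WATERMARK_DEFAULTS: WatermarkOptions = {
  tileWidth: 220,
  tileHeight: 120,
  fontSize: 18,
  opacity: 0.35,
  angleDeg: -30,
};

// Escape the characters that would break the SVG markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns one SVG tile containing the rotated watermark text.
function buildWatermarkTile(
  text: string = "© preview only",
  overrides: Partial<WatermarkOptions> = {},
): string {
  const o: WatermarkOptions = { ...WATERMARK_DEFAULTS, ...overrides };
  const opacity = Math.min(1, Math.max(0, o.opacity));
  const cx = o.tileWidth / 2;
  const cy = o.tileHeight / 2;
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" width="${o.tileWidth}" height="${o.tileHeight}">`,
    `<text x="${cx}" y="${cy}" font-family="sans-serif" font-size="${o.fontSize}"`,
    ` fill="#ffffff" fill-opacity="${opacity}" stroke="#000000" stroke-opacity="${opacity}"`,
    ` stroke-width="0.5" text-anchor="middle" dominant-baseline="middle"`,
    ` transform="rotate(${o.angleDeg} ${cx} ${cy})">${escapeXml(text)}</text>`,
    `</svg>`,
  ].join("");
}
```

Sharp's `composite` accepts a `tile: true` option that repeats an overlay across the whole image, so a tile like this can be passed in as a buffer and stamped everywhere in one call. The white fill with a dark outline is a guess at what stays legible over both light and dark art.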

A stranger can run up your cloud bill

Of course each generation costs money, so I spent some time protecting the platform. It took a few layers. Cloudflare Turnstile keeps the bots out. A rate limiter caps generations per visitor on both IP address and a session cookie, so clearing cookies or hopping IPs alone still hits a wall. A cost circuit breaker in Redis tallies spend over a rolling hour and flips a kill switch if it spikes, one I can also throw instantly without a deploy. And hard daily, weekly and monthly caps in the Google Cloud console are the final backstop for when my own code is the thing that's broken. Users could also type unsavoury things into any of the inputs before generation, so a blocklist screens the words and Google's SafeSearch screens the uploaded photo before any of the expensive stuff runs.
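
The circuit breaker is the least obvious of those layers, so here is a minimal in-memory sketch of the idea. The real one lives in Redis; this version keeps the tally in a plain array so it runs on its own, and the class name `CostCircuitBreaker`, its methods and the dollar figures are all assumptions for illustration.

```typescript
interface Charge {
  at: number;  // timestamp in milliseconds
  usd: number; // cost of one generation
}

class CostCircuitBreaker {
  private readonly limitUsd: number;
  private readonly windowMs: number;
  private charges: Charge[] = [];
  private tripped = false;

  constructor(limitUsd: number, windowMs: number = 60 * 60 * 1000) {
    this.limitUsd = limitUsd;
    this.windowMs = windowMs;
  }

  // Total spend inside the rolling window; also drops charges that aged out.
  rollingSpend(now: number = Date.now()): number {
    this.charges = this.charges.filter((c) => now - c.at < this.windowMs);
    return this.charges.reduce((sum, c) => sum + c.usd, 0);
  }

  // Record a charge and flip the kill switch if the window is over budget.
  record(usd: number, now: number = Date.now()): void {
    this.charges.push({ at: now, usd });
    if (this.rollingSpend(now) >= this.limitUsd) this.tripped = true;
  }

  // Check this before every generation.
  allows(): boolean {
    return !this.tripped;
  }

  // Manual kill switch: stop all generations right now.
  trip(): void {
    this.tripped = true;
  }

  // Stays tripped until someone looks at it and resets it by hand.
  reset(): void {
    this.tripped = false;
    this.charges = [];
  }
}

// Usage: allow at most $5 of generations in any rolling hour.
const breaker = new CostCircuitBreaker(5);
if (breaker.allows()) {
  breaker.record(0.04); // hypothetical per-image cost
}
```

The breaker deliberately does not reset itself when the hour rolls over. A spike usually means something is wrong, and a switch that quietly re-arms would just let the same problem spend the next hour's budget too.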

Taking real money

The last step was to wire up Stripe: checkout, payment, and the handoff into fulfillment so an order goes from "pay" to "printed" without me in the loop. Most of the work hid in the gap between "this works for me" and "a stranger can pay for this," like making the webhook idempotent so a retried Stripe event never prints the same shirt twice. Pretty standard stuff for this part of the implementation, but also critical for going all the way and making it real.
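
Idempotency here mostly means remembering which event IDs have already been handled. The sketch below shows the shape of it, not the real handler: the event is trimmed down to three fields (Stripe events do carry an `id` and a `type`, but `orderId` is a hypothetical stand-in for however the order is looked up), the in-memory `Set` stands in for a durable store, and it skips signature verification entirely.

```typescript
// Simplified event: real payloads are much larger.
interface PaymentEvent {
  id: string;      // unique per event, repeated on redelivery
  type: string;    // e.g. "checkout.session.completed"
  orderId: string; // hypothetical: how this sketch finds the order
}

type Outcome = "processed" | "duplicate" | "ignored";

// Returns a handler that calls `fulfil` at most once per event id.
function createWebhookHandler(fulfil: (orderId: string) => void) {
  // Stand-in for a durable store, such as a database table with a unique key.
  const seen = new Set<string>();

  return function handle(event: PaymentEvent): Outcome {
    if (event.type !== "checkout.session.completed") return "ignored";
    if (seen.has(event.id)) return "duplicate";

    // Claim the event before doing the work, so a redelivery that arrives
    // while fulfilment is in progress is treated as a duplicate.
    seen.add(event.id);
    try {
      fulfil(event.orderId);
    } catch (err) {
      // Release the claim so a later retry can try again.
      seen.delete(event.id);
      throw err;
    }
    return "processed";
  };
}
```

Returning normally for a duplicate, rather than throwing, matters in practice: the sender treats an error response as a failed delivery and keeps retrying.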

Where it landed

I set out to make my kids laugh and somehow ended up building furmous.com, a real store where you upload a photo of your pet and get them back as a themed cartoon, printed on a shirt and shipped to your door. I know the whole loop works because one of those shirts turned up at my own door: our Dora reimagined as a hooded computer hacker, and now the one on my own chest.

It's still a dumb idea, but so much fun. And Dora remains completely unimpressed, which feels about right.

Furmous: your pet was always the main character - Kevin Salter