cuesheet.
Open-source Python library that records LLM API calls once and replays them in tests. Drop-in pytest plugin, scrubs API keys, ships with a live web UI. Works with any SDK built on httpx: Anthropic, OpenAI, Gemini, Mistral.

Selected work. Most of my output lives behind login walls or NDAs, so what's here is the part I can actually show.
More on the bench
A handful of backend projects behind NDAs and one e-commerce build that's still curing. I'll post what they teach me as they go.