How Kiro Holds My Project in Its Head: A Sunday Session in About an Hour

June 28, 2026 · 11 min read

CLI Open Source Opinion Tooling TypeScript

A Sunday session that should have eaten the whole day took about an hour, because the AI already knew my project instead of me re-explaining it for the tenth time.

Last Sunday I sat down to clear a backlog on coco: merge a pile of dependency PRs, review a Bitbucket integration, overhaul the screenshot pipeline, rewrite the README, cut a release. That’s normally a full day of work, most of it the boring kind. It took about an hour.

coco is an AI-powered Git CLI I maintain. It started as a commit-message generator and grew into a terminal workstation: 16 views, 128 color themes, multi-forge support across GitHub, GitLab, and Bitbucket. It’s around 130,000 lines of TypeScript counting tests, plus a marketing site and a wiki. Big enough that “just remind the AI how it all fits together” stopped being free a long time ago.

This post isn’t “AI wrote my code.” It’s about what changed when the AI stopped needing me to load the project into its head every single session. The tool I’ve been using for that is Kiro, but the idea is portable; the mechanics matter more than the brand.

The five minutes I paid every time

Every session used to open the same way. Here’s the project structure. Here’s the build system. Here’s the dependency direction I try to hold (lib/ <- git/ <- workstation/ <- commands/). Here’s where the tests live. Five minutes of preamble before anything useful happened.

Five minutes doesn’t sound like much until you do it ten times a day. The annoying part wasn’t the typing; it was the role. I wasn’t thinking about what to build; I was translating “what I want done” into “what the AI needs to know to do it.” You stop being a developer and start being a context-loading service with a pulse.

Steering files: write the context once

Kiro’s steering files are just markdown in .kiro/steering/ that loads into every session automatically. Write them once and every conversation starts already knowing the project. I keep four.

product.md is what coco is: every command, the multi-forge story, which providers it supports. structure.md is the source map: the four top-level concerns, the layering between them (lib/ at the base, up through commands/), how each command is laid out (config.ts + handler.ts + optional prompt.ts), where the tests go. That one quietly catches architectural drift; if I ask for a feature, Kiro already knows a new import from lib/ up into workstation/ is a smell, and the steering file even flags the handful of legacy violations that already exist as things to pay down rather than copy. tech.md is the operational stuff: build commands, the four-job CI pipeline, the screenshot pipeline, coverage thresholds. “Run the tests” stops triggering a follow-up question about which script.
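To make the layering rule concrete, here is a minimal sketch of the check that structure.md lets Kiro apply in its head. It is not code from coco: the layer order comes from the dependency direction above, while `layerOf`, `isAllowedImport`, and the `src/<layer>/...` path layout are assumptions made for illustration.

```typescript
// Layers ordered from base to top, following lib/ <- git/ <- workstation/ <- commands/.
const LAYERS = ["lib", "git", "workstation", "commands"] as const;
type Layer = (typeof LAYERS)[number];

function isLayer(segment: string): segment is Layer {
  return (LAYERS as readonly string[]).indexOf(segment) !== -1;
}

// Hypothetical helper: the first path segment that names a layer decides the layer.
// Assumes paths shaped like "src/git/log.ts".
function layerOf(path: string): Layer | undefined {
  return path.split("/").find(isLayer);
}

// An import is fine when the imported file sits at the same layer or a lower one.
function isAllowedImport(fromFile: string, importedFile: string): boolean {
  const from = layerOf(fromFile);
  const to = layerOf(importedFile);
  if (from === undefined || to === undefined) return true; // outside the layered tree
  return LAYERS.indexOf(to) <= LAYERS.indexOf(from);
}
```

With this shape, a file under workstation/ importing from lib/ passes, and a file under lib/ importing from workstation/ is the smell the steering file describes.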

The fourth one works differently. release-notes-style.md has inclusion: manual in its frontmatter, so it only loads when I pull it in on purpose. It’s a voice guide for release notes: verb-led bullets, grouped PRs, no marketing fluff, no em-dashes, a list of specific things not to do. Loading that every session would just be noise. Loading it the moment I’m drafting notes means the output lands in the right voice on the first try instead of the third.
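The always-versus-manual split can be modeled in a few lines. This is only a sketch of the behavior described above, not Kiro's implementation: `readFrontmatter` and `loadsAutomatically` are hypothetical names, the parser handles nothing beyond flat `key: value` lines, and treating a file with no frontmatter as always-loaded is an assumption based on how my other three files behave.

```typescript
// Reads a leading `---` frontmatter block and returns its flat `key: value` pairs.
function readFrontmatter(markdown: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  const body = match?.[1];
  if (body === undefined) return fields;
  for (const line of body.split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines that are not key/value pairs
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return fields;
}

// Assumption: anything not marked `inclusion: manual` loads into every session.
function loadsAutomatically(markdown: string): boolean {
  return readFrontmatter(markdown)["inclusion"] !== "manual";
}

// Example contents (made up for illustration, not the real steering files).
const releaseNotesStyle = "---\ninclusion: manual\n---\n# Release notes voice\n";
const product = "# Product\ncoco is an AI-powered Git CLI.\n";
```

Here `loadsAutomatically(releaseNotesStyle)` is false and `loadsAutomatically(product)` is true, which is the whole distinction: one file waits to be asked for, the rest are always there.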

The payoff is concrete. During the screenshot work I said “the README GIF shows the raw tsx command path and the commitlint prompt is firing.” No ramp-up. Kiro already knew bin/screenshot/ is the VHS pipeline, that recipes live in recipes.ts, that tape.ts generates the tape files, that tsx cold-starts run 2-3 seconds in a VHS shell. We went straight to the fix instead of spending the first few exchanges establishing what those files even were.

Specs: decide what “done” means before any code

Steering files handle session context. Kiro’s spec workflow handles how a whole feature gets built: requirements (what should this do?), then design (how does it work?), then tasks (what are the steps?). Each layer points back at the one above it.

The marketing-site redesign is the clearest case. It started as a vague goal: make the site dark and terminal-themed, position coco as a toolbelt and not just a commit helper. The first real step wasn’t code, it was writing 12 requirements with acceptance criteria. Things like “the hero section leads with the value of individual commands before it introduces the workstation,” or “the docs system scans the wiki at build time so new pages show up without editing a manifest,” or a hard Lighthouse target on mobile. None of those are implementation details. They’re constraints on the outcome, and writing them forces you to decide what actually matters before you start.

The design doc that followed had the usual route maps and component interfaces, plus the part I liked most: a few formal correctness properties. One of them said, roughly, that merging the manual manifest with the auto-discovered wiki pages must produce a list where every manual entry appears once unchanged, every non-overlapping discovered page appears with defaults, and nothing is duplicated. That property is what drove both the mergeManifests() implementation and its property-based test. Requirement to property to test to code, and you can trace any line back to why it exists.
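Here is a rough sketch of what that property looks like once it becomes code. It is not the version in the repo: the `DocEntry` shape, `DEFAULT_ORDER`, and the helper names other than `mergeManifests` are assumptions, manual slugs are assumed unique, and a small hand-rolled random loop stands in for a real property-based testing library.

```typescript
// Hypothetical entry shape; the real manifest fields are not shown in this post.
interface DocEntry {
  slug: string;
  title: string;
  order: number;
}

const DEFAULT_ORDER = 1000; // assumed default for auto-discovered pages

// Manual entries win; discovered slugs only fill gaps, and each slug appears once.
function mergeManifests(manual: DocEntry[], discoveredSlugs: string[]): DocEntry[] {
  const seen = new Set<string>();
  const merged: DocEntry[] = [];
  for (const entry of manual) {
    if (seen.has(entry.slug)) continue;
    seen.add(entry.slug);
    merged.push(entry); // pushed as-is, so the manual entry stays unchanged
  }
  for (const slug of discoveredSlugs) {
    if (seen.has(slug)) continue;
    seen.add(slug);
    merged.push({ slug, title: slug, order: DEFAULT_ORDER });
  }
  return merged;
}

// The three clauses of the property, checked for one pair of inputs.
function holdsMergeProperty(manual: DocEntry[], discovered: string[]): boolean {
  const merged = mergeManifests(manual, discovered);
  const slugs = merged.map((e) => e.slug);
  const noDuplicates = new Set(slugs).size === slugs.length;
  const manualKept = manual.every((m) => merged.filter((e) => e === m).length === 1);
  const manualSlugs = new Set(manual.map((m) => m.slug));
  const discoveredAdded = discovered
    .filter((s) => !manualSlugs.has(s))
    .every((s) => merged.some((e) => e.slug === s && e.order === DEFAULT_ORDER));
  return noDuplicates && manualKept && discoveredAdded;
}

// Draws slugs from a small pool so overlaps and repeats happen often.
function randomSlugs(count: number): string[] {
  return Array.from({ length: count }, () => `page-${Math.floor(Math.random() * 8)}`);
}

function runPropertyCheck(runs: number): boolean {
  for (let i = 0; i < runs; i++) {
    const manualSlugs = Array.from(new Set(randomSlugs(4))); // manual slugs assumed unique
    const manual = manualSlugs.map((slug, index) => ({ slug, title: slug.toUpperCase(), order: index }));
    if (!holdsMergeProperty(manual, randomSlugs(6))) return false;
  }
  return true;
}
```

The point of writing the property as a function is that the test no longer depends on hand-picked examples; any input that breaks one of the three clauses fails the check.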

That design broke into 50-plus tasks, each one naming the requirements it satisfies. It inverts the usual AI-coding move. Instead of “write me a function that does X,” it’s “here’s what the system has to do; you figure out the components and hand me a build plan.” I stay at the requirements level; the AI does the translation into architecture and tasks.
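Because every task names its requirements, traceability becomes something you can check mechanically. The sketch below assumes the task list has already been read into plain objects; `SpecTask`, `TraceReport`, and `traceRequirements` are hypothetical names, and the real spec files are markdown, not these structures.

```typescript
// Hypothetical shapes for illustration only.
interface SpecTask {
  title: string;
  requirements: string[]; // requirement IDs this task claims to satisfy
}

interface TraceReport {
  uncovered: string[]; // requirements no task points at
  unknown: string[]; // IDs cited by tasks but missing from the requirements list
}

function traceRequirements(requirementIds: string[], tasks: SpecTask[]): TraceReport {
  const known = new Set(requirementIds);
  const cited = new Set<string>();
  for (const task of tasks) {
    for (const id of task.requirements) cited.add(id);
  }
  return {
    uncovered: requirementIds.filter((id) => !cited.has(id)),
    unknown: Array.from(cited).filter((id) => !known.has(id)),
  };
}
```

An empty `uncovered` list means every requirement has at least one task behind it; a non-empty `unknown` list usually means a requirement was renumbered or deleted after the tasks were written.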

None of this is new for me, and that’s kind of the point. I’ve been working this way since well before the current crop of models, back when the agent on the other end was GPT-3.5 and would cheerfully wander off the moment you gave it room. Even then the results held up, because a good spec doesn’t lean on the model being clever. It hands over a small, well-scoped task with the context already attached and not much room to improvise. The structure did the work the model couldn’t do on its own yet. That’s the part I think people miss now that the models are strong: spec-driven development isn’t a way to flatter a smart model, it’s how you get reliable output from whatever model you happen to have.

The other thing the Kiro team got right, and got right early, is that the whole thing lives in files. I’ve used Claude Code’s planning mode and genuinely liked it, but the plan lived in the conversation rather than on disk. I couldn’t persist it the way I wanted. I couldn’t split the execution into the pieces I wanted and run them on my own schedule, couldn’t pick and choose what got built when, and couldn’t open the spec and the task list in my own editor to adjust them before kicking anything off. In Kiro the requirements, design, and tasks are just markdown in the repo. I edit them like any other file, commit them, reorder the task list, tick off what’s done, and start execution on exactly the slice I want. It sounds mundane until you’ve felt the gap between a plan you can version and one that evaporates when the session closes.

The Sunday session, start to finish

Here’s how the hour actually went, with all that context already loaded.

Dependency triage

I asked Kiro to look at the open Dependabot PRs. It pulled all five, checked their ages against my 7-day soak rule, and split them: four safe minor and patch bumps, one major jump (eslint-plugin-react-hooks 4.6 to 7.1) that could break things. The safe ones got merged. For the major one it checked out the branch, merged main in, ran install, lint, and build, confirmed it was clean, then merged. The whole triage took about two minutes instead of the usual click-through-five-changelogs slog.
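The triage rule itself is simple enough to write down. This is a sketch of the decision, not what Kiro ran: `classifyBump`, `triage`, and the `DependencyPr` shape are hypothetical, age is assumed to be measured from when the PR was opened, and the version parsing only handles plain numeric versions (no pre-release tags).

```typescript
type Bump = "major" | "minor" | "patch" | "none";
type Verdict = "merge" | "verify-on-branch" | "wait";

interface DependencyPr {
  name: string;
  from: string; // current version, e.g. "4.6"
  to: string; // proposed version, e.g. "7.1"
  openedAt: Date;
}

const SOAK_DAYS = 7;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Compares dotted numeric versions; missing parts count as 0.
function classifyBump(from: string, to: string): Bump {
  const [fromMajor = 0, fromMinor = 0, fromPatch = 0] = from.split(".").map(Number);
  const [toMajor = 0, toMinor = 0, toPatch = 0] = to.split(".").map(Number);
  if (toMajor !== fromMajor) return "major";
  if (toMinor !== fromMinor) return "minor";
  if (toPatch !== fromPatch) return "patch";
  return "none";
}

// Too young: wait. Major jump: check out, build, and lint before merging. Otherwise merge.
function triage(pr: DependencyPr, now: Date): Verdict {
  const ageDays = (now.getTime() - pr.openedAt.getTime()) / MS_PER_DAY;
  if (ageDays < SOAK_DAYS) return "wait";
  return classifyBump(pr.from, pr.to) === "major" ? "verify-on-branch" : "merge";
}
```

Under these assumptions the eslint-plugin-react-hooks jump from 4.6 to 7.1 classifies as major and lands in the verify-on-branch bucket, while a patch bump past its soak window goes straight to merge.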

Screenshot pipeline

This one started as a visual complaint: the README GIF showed ugly absolute paths and a commitlint prompt that shouldn’t have been there. Kiro traced both (the COCO_CLI variable pointed at the raw tsx path, and --conventional tripped a check in a scenario with no commitlint config) and fixed them across three files.

Then it went further than I asked. It pointed out the 5000ms settle time was there to cover tsx cold-start, and that running the built dist instead would drop it to about 200ms. One change to bin/screenshot.ts cut three seconds of dead air out of every GIF. Kiro then halved the rest of the timing constants, added pngquant optimization and WebP stills, and wrote smaller README-specific recipes. Each step built on the last because it understood the pipeline as a whole, not one file at a time.

README rewrite

I asked for a review first and got a real critique: the feature list ran too long, three blockquote callouts read like changelog scraps, the usage section had eight variations of coco log, the keymap table belonged in the wiki. Then I said “rewrite it” and the README went from 230 lines to 95, with each command linked to its wiki page and the two hero GIFs placed for impact. Because Kiro already knew every command, every wiki slug, and the image URLs, it was right on the first pass instead of needing three rounds of “actually that link is wrong.”

Merge and release

The Bitbucket forge adapter PR had been open a few days. Kiro checked CI (passing), summarized the changes (provider detection, a REST runner, list and detail loaders, PR and issue mutations, unit tests), merged it, then wrote the release notes in coco’s established voice, because that’s the moment I pulled in the manual steering file. Bumped to 0.73.0, pushed, released. Part of that release was updating the steering files themselves: theme count from 109 to 128, the multi-forge line to include Bitbucket, so the next session starts from accurate facts instead of stale ones.

What I added, and what I’m still skipping

Honesty check, because every one of these posts has a gap. Writing this one finally pushed me over the line on hooks, so I added my first two, both small. One runs ESLint on every TypeScript file I save under src/ or bin/ and fixes anything new before it reaches CI. The other watches the config types and regenerates the JSON schema when they change, because CI fails on schema drift and I was tired of finding out the hard way. They live in .kiro/hooks/ as a few lines of JSON each, and they’re not glamorous; they just take a rule the steering files already state and make it automatic. I still haven’t configured any custom MCP servers; the built-in GitHub tooling covers PR and issue work, and I haven’t hit the wall where a custom one would pay for itself.

The 80/20 is still steering files plus specs; that’s most of the value. The hooks are a thin layer of automation on top, and MCP is the one I’ll reach for only when the current setup starts to hurt, and not before.

One thing worth flagging if you go down this road: the same Kiro that holds all this context also has a rough edge I lost most of a separate weekend to, where its agent terminal silently reports success for commands it never ran. I wrote that whole debugging trail up in “When Exit 0 Is a Lie.” The context system is the good part; the terminal integration is the part to watch.

The shift isn’t “the AI codes faster.” It’s that an AI which already knows the project is ready to work the second I am. The steering files were maybe four hours to write well, and that cost is paid once; every session after starts at full speed. The specs cost a requirements phase, which is just forcing yourself to decide what done means before anyone writes a line. I end up thinking about what to build and why, and handing off the how.

The steering files and specs for coco are all in the repo under .kiro/steering/ and .kiro/specs/, MIT-licensed, if you want to see the real thing rather than my description of it. If you’re running a setup like this on a project of your own, I’m curious what you put in your steering files and what you deliberately left out.