Agentic Coding Workflows

A coding agent doesn’t just suggest lines: it takes a task and runs the agentic loop (explore the repo, plan, edit files, run tests, fix what broke). This unlocks real delegation. It also demands a different workflow, in which your job becomes specifying and reviewing, not typing.

The agentic loop, applied to code

The loop has two human checkpoints: reviewing the plan and reviewing the diff. They are where quality is won or lost. Skip them and you’ve automated the writing of code nobody understands.

Scope the task well

Agents succeed or fail on the task you hand them. A good agentic task is:

Well-specified — clear definition of done, expected behavior, constraints.
Verifiable — there’s a concrete way to confirm success, ideally tests.
Bounded — a coherent unit of work, not “build the whole feature.”
Low-ambiguity — few unstated decisions; you’ve made the design calls.

Poor:  "Improve the checkout flow."
Good:  "In services/checkout.py, add a retry (3 attempts, exponential
        backoff) around the payment-gateway call. Only retry network
        and 5xx errors — never on a declined card. Add unit tests for
        both paths. Match the retry helper already in lib/http.py."

The good version made the design decisions for the agent. Agents execute well; they decide poorly. Keep the judgment yours.
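To make the "good" task concrete, here is a minimal sketch of the behavior it asks for. The exception classes, the `charge_with_retry` name, and the 0.5-second base delay are all hypothetical; the source does not show the helper in lib/http.py, so this is not that helper, only an illustration of how precisely the task pins down the result.

```python
import time


class NetworkError(Exception):
    """Hypothetical: the gateway could not be reached."""


class GatewayError(Exception):
    """Hypothetical: the gateway answered with an HTTP error status."""

    def __init__(self, status):
        super().__init__(f"gateway returned {status}")
        self.status = status


class CardDeclined(Exception):
    """Hypothetical: the card was declined. Must never be retried."""


def is_retryable(exc):
    # Retry only network failures and 5xx responses, as the task specifies.
    if isinstance(exc, NetworkError):
        return True
    return isinstance(exc, GatewayError) and 500 <= exc.status < 600


def charge_with_retry(charge, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call charge() up to `attempts` times with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return charge()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_retryable(exc):
                raise  # declined cards and exhausted retries surface as-is
            sleep(base_delay * 2 ** attempt)  # base_delay, then double it
```

Passing `sleep` as a parameter is a design choice for testability: the two paths the task names (retry on a network error, no retry on a declined card) can each be checked in a unit test without waiting on real delays.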

Review the plan before the code

Most agents will outline an approach first. This is the cheapest place to catch a mistake — redirecting a plan costs a sentence; redirecting a finished 500-line diff costs a rewrite. Read the plan: right files? sound approach? missing an edge case? Correct it now.

Review the diff like any other PR

An agent’s output is a pull request, and it gets the same scrutiny as a colleague’s (see Working Effectively). Read every file. Run it. Check the edge cases and security. That the author is an agent does not lower the bar; arguably it raises it, since the agent has no stake in the result.

Keep diffs reviewable: prefer several small, focused agent tasks over one sprawling one. A 200-line diff gets a real review; a 2,000-line diff gets a rubber stamp.
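One way to keep yourself honest about diff size is to measure it before you start reviewing. The sketch below sums the line counts reported by `git diff --numstat`; the function names and the 400-line default limit are assumptions chosen for illustration, not thresholds taken from this guide.

```python
import subprocess


def changed_lines(numstat_text):
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not an "added<TAB>deleted<TAB>path" row
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-" for both counts; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def diff_is_reviewable(numstat_text, limit=400):
    """True when the diff is at or under the (arbitrary) line limit."""
    return changed_lines(numstat_text) <= limit


def numstat_against(base="main"):
    """Return numstat output for the working tree compared with `base`."""
    result = subprocess.run(
        ["git", "diff", "--numstat", base],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

If `diff_is_reviewable(numstat_against())` comes back false, that is a hint to split the work into smaller agent tasks rather than to skim the result.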

Let the agent close its own loop

Agents are far more effective when they can verify their own work and iterate. Give them that loop:

Point them at the test command so they run tests and fix failures.
Ensure linters and type checks run, so mistakes are caught automatically.
Give them a task with a clear pass/fail signal (tests, a build) so they can self-correct before you ever see the result.

This is why a strong test suite is now a force multiplier for AI-assisted development: it’s the agent’s feedback loop and your safety net.
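A single entry point that runs every check and reports a clear pass or fail gives the agent exactly this kind of signal. The sketch below is one possible shape; the `run_checks` name and the `CHECKS` list are hypothetical, and the commands assume a project that happens to use pytest, ruff, and mypy. Substitute whatever your repository actually runs.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each (name, argv) command; return (all_passed, failure_report)."""
    failures = []
    for name, argv in commands:
        result = subprocess.run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            # Keep the output so the agent can see what to fix.
            failures.append(
                f"[{name}] exit {result.returncode}\n"
                f"{result.stdout}{result.stderr}"
            )
    return (not failures, "\n".join(failures))


# Assumed tooling, for illustration only.
CHECKS = [
    ("tests", [sys.executable, "-m", "pytest", "-q"]),
    ("lint", [sys.executable, "-m", "ruff", "check", "."]),
    ("types", [sys.executable, "-m", "mypy", "."]),
]
```

Calling `run_checks(CHECKS)` returns a boolean plus the captured output of any failing command, so the same script serves as the agent’s feedback loop and as your own pre-review gate.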

Pitfalls

Pitfall	Why it happens	Counter
Over-scoping	Vague, sprawling task	Small, bounded, well-specified tasks
Rubber-stamp review	Diff too large to absorb	Keep diffs small; review every line
Plausible-but-wrong	Output reads well, logic is off	Run it; test edges; understand it
Context gaps	Agent didn’t see a key file	Name the relevant files and conventions
Lost-thread looping	Agent flails on a hard task	Stop it; re-specify; or do it yourself
Skill erosion	Delegating the thinking, not the typing	Keep owning design and judgment

Key takeaways

A coding agent runs the explore–plan–edit–test loop, shifting your role to specifying and reviewing. Hand it tasks that are well-specified, verifiable, bounded, and low-ambiguity — make the design decisions yourself. Review the plan before code (the cheapest fix) and the diff like any PR (the real gate). Keep diffs small. Give the agent a test command so it can self-correct. You remain fully accountable for every line that merges.