
Agent Architectures

An agent is built from a simple idea repeated: let the LLM choose an action, run it, observe the result, and decide again. Everything else is refinement of that loop.

[Figure: the agent loop, repeated until done. THINK (LLM picks the next action) → ACT (your code runs the tool) → OBSERVE (feed the result back to context) → DONE? (yes: return; no: loop). The LLM reasons; your code runs the tools and enforces the limits.]

  • Think — the LLM reasons about the goal and picks the next action.
  • Act — your code executes that action: a tool call, an API request, code.
  • Observe — the result is fed back into the context.
  • Repeat until the LLM signals completion or a limit is hit.

The LLM supplies the reasoning; your code supplies the loop, the tools, and the limits. The LLM never executes anything itself — it only requests actions that your code chooses to run.

def run_agent(goal, tools, max_steps=10):
    context = [system_prompt(goal, tools)]
    for step in range(max_steps):  # the limit is non-negotiable
        decision = llm(context, tools=tools)
        if decision.is_final_answer:
            return decision.answer
        result = execute_tool(decision.tool, decision.args)  # your code runs it
        context.append(decision)
        context.append(result)
    return "Stopped: step limit reached."  # always have a fallback
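
The loop above leaves system_prompt, llm, and execute_tool undefined. The following is a minimal runnable sketch of the same loop, under loud assumptions: the Decision class, the scripted_llm stand-in, and the get_orders tool are all hypothetical and exist only so the loop can run without a real model. Here llm is passed in as a parameter, and execute_tool takes the tool registry explicitly so it can refuse anything that is not registered.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """Hypothetical shape of one model turn: a final answer or a tool request."""
    is_final_answer: bool
    answer: str = ""
    tool: str = ""
    args: dict = field(default_factory=dict)


def execute_tool(tools, name, args):
    """Run a requested tool only if it is registered; report errors as text."""
    if name not in tools:
        return f"error: unknown tool {name!r}"
    try:
        return str(tools[name](**args))
    except Exception as exc:  # feed failures back instead of crashing the loop
        return f"error: {exc}"


def run_agent(goal, tools, llm, max_steps=10):
    context = [f"Goal: {goal}. Tools: {sorted(tools)}"]
    for _ in range(max_steps):  # hard cap on iterations
        decision = llm(context)
        if decision.is_final_answer:
            return decision.answer
        result = execute_tool(tools, decision.tool, decision.args)
        context.append(f"call {decision.tool}({decision.args})")
        context.append(f"result {result}")
    return "Stopped: step limit reached."


def scripted_llm(context):
    """Stand-in for a model: requests one lookup, then answers from the result."""
    if not any(item.startswith("result ") for item in context):
        return Decision(False, tool="get_orders", args={"user_id": "u_123"})
    return Decision(True, answer=f"Done after seeing: {context[-1]}")


def get_orders(user_id):
    """Hypothetical tool returning canned data."""
    return [{"id": "o_88", "status": "shipped"}]


print(run_agent("order status", {"get_orders": get_orders}, scripted_llm))
```

Returning tool errors as observations, rather than raising, is one possible design choice: it gives the model a chance to recover, while the step cap still bounds the total work.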

ReAct is the foundational pattern: the model interleaves a Thought with an Action, then reads an Observation, and continues.

Thought: I need the user's latest order status.
Action: get_orders(user_id="u_123")
Observation: [{"id":"o_88","status":"shipped","eta":"May 26"}]
Thought: I have what I need.
Answer: Your order o_88 shipped and should arrive May 26.

Making the reasoning explicit improves decisions and makes the agent debuggable — when it goes wrong, the trace shows exactly which thought was flawed. Modern tool-calling APIs do this in a structured form, but the think/act/observe rhythm is the same.
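
When the trace is plain text, as in the example above, your code has to turn each Action line into a tool name and arguments before it can run anything. The sketch below shows one way to do that, assuming actions are always written as a simple call with keyword arguments and literal values; parse_action is a hypothetical helper, not part of any library. It uses Python's ast module so that the model's text is parsed but never evaluated as code.

```python
import ast


def parse_action(line):
    """Parse 'Action: tool(key=literal, ...)' into (tool_name, kwargs)."""
    prefix = "Action:"
    if not line.startswith(prefix):
        raise ValueError("not an Action line")
    tree = ast.parse(line[len(prefix):].strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a simple call like tool(arg=value)")
    if call.args:
        raise ValueError("only keyword arguments are supported")
    # literal_eval accepts only literals (strings, numbers, lists, dicts, ...),
    # so names and expressions in the arguments are rejected, not executed.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return call.func.id, kwargs


name, kwargs = parse_action('Action: get_orders(user_id="u_123")')
print(name, kwargs)
```

Structured tool-calling APIs make this parsing step unnecessary, since the tool name and arguments arrive as data rather than text.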

For multi-step tasks, asking “what’s the very next action?” at every step is short-sighted. Planning has the agent draft the whole sequence first, then execute it:

  • Plan-and-execute — generate a step list, then work through it. Fewer expensive reasoning calls; clearer to inspect.
  • Replanning — when a step fails or reality differs from the plan, regenerate the remaining plan instead of blindly continuing.

A plan also gives you a natural place to insert a human approval checkpoint before anything irreversible runs.
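
The planning ideas above can be sketched as a small executor. This is an illustration under stated assumptions, not a definitive implementation: execute_plan and its callbacks (run_step, replan, approve) are hypothetical names, a plan is just a list of step labels, and run_step is assumed to return a (succeeded, result) pair. Approval happens once, before any step runs, and the number of replans is capped.

```python
def execute_plan(plan, run_step, replan, approve, max_replans=2):
    """Run steps in order; on failure, replace the remaining steps with a new plan."""
    if not approve(plan):  # human checkpoint before anything runs
        return "Rejected before execution.", []
    done = []
    remaining = list(plan)
    replans = 0
    while remaining:
        step = remaining.pop(0)
        ok, result = run_step(step)
        if ok:
            done.append((step, result))
            continue
        if replans >= max_replans:
            return f"Stopped: step {step!r} failed after {replans} replans.", done
        replans += 1
        # Regenerate only what is left, given what already succeeded and what failed.
        remaining = replan(done, step, result)
    return "Plan completed.", done


def run_step(step):
    """Hypothetical step runner: the live fetch always times out."""
    if step == "fetch_live":
        return False, "timeout"
    return True, f"{step} ok"


def replan(done, failed_step, error):
    """Hypothetical replanner: fall back to a cached fetch."""
    return ["fetch_cached", "summarize"]


status, done = execute_plan(
    ["fetch_live", "summarize"], run_step, replan, approve=lambda plan: True
)
print(status, [step for step, _ in done])
```

In this sketch a regenerated plan is not sent back for approval. If replanning can introduce irreversible steps, calling approve again on the new plan would be the safer variant.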

Reflection lets an agent critique its own work and retry. After producing a result, it evaluates against the goal — or runs a check — and revises if needed.

[Figure: the reflection loop. Generate → self-critique or run a test (an objective check works best) → Pass? (yes: done; no: revise and regenerate, with a cap on retries).]

Reflection is especially effective when there’s an objective signal — code that must compile, output that must satisfy a schema, a sum that must reconcile. With only subjective self-judgment, gains are smaller and the extra calls may not be worth it.
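
A reflection loop with an objective check might look like the sketch below. It assumes the objective signal is "the output must be a JSON object with certain keys"; check_json, generate_with_reflection, and fake_generate are hypothetical names, and fake_generate stands in for a model call that receives the previous failure as feedback.

```python
import json


def check_json(text, required_keys):
    """Objective check: returns (passed, feedback)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return False, "top-level value must be an object"
    missing = [key for key in required_keys if key not in data]
    if missing:
        return False, f"missing keys: {missing}"
    return True, ""


def generate_with_reflection(generate, check, max_attempts=3):
    """Generate, check, and retry with the check's feedback; retries are capped."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(feedback)
        passed, feedback = check(draft)
        if passed:
            return draft, attempt
    return None, max_attempts  # caller decides what to do when every attempt fails


def fake_generate(feedback):
    """Stand-in for a model: the first draft omits a key, the revision fixes it."""
    if not feedback:
        return '{"id": "o_88"}'
    return '{"id": "o_88", "status": "shipped"}'


draft, attempts = generate_with_reflection(
    fake_generate, lambda text: check_json(text, ["id", "status"])
)
print(attempts, draft)
```

The check's feedback is passed back verbatim so the next attempt can target the specific failure, which is what makes an objective signal more useful than a generic "try again".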

More autonomy buys more capability, but also more cost, latency, and unpredictability. Use the least autonomy that works:

Approach | Control flow | Use when
--- | --- | ---
Single LLM call | None (one shot) | The task fits in one prompt
Fixed workflow | You hard-code the steps | The steps are known in advance
ReAct agent | LLM picks each step | Steps depend on intermediate results
Planning agent | LLM plans, then executes | Long multi-step tasks
Multi-agent | Multiple agents collaborate | Truly distinct sub-problems
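
To make the "fixed workflow" row concrete, here is a minimal sketch in which the step order lives entirely in code and the model only fills in each step. The ticket-triage task, the triage_ticket function, and the fake_llm stand-in are all hypothetical, chosen only for illustration; llm is assumed to be any callable that maps a prompt string to a response string.

```python
def triage_ticket(ticket_text, llm):
    """Fixed workflow: classify, then summarize, always in that order."""
    category = llm(
        f"Classify this ticket as 'billing' or 'technical':\n{ticket_text}"
    ).strip().lower()
    if category not in {"billing", "technical"}:
        category = "technical"  # code, not the model, decides the fallback
    summary = llm(f"Summarize in one sentence:\n{ticket_text}").strip()
    return {"category": category, "summary": summary}


def fake_llm(prompt):
    """Stand-in for a model call, returning canned responses."""
    if prompt.startswith("Classify"):
        return "Billing"
    return "Customer was charged twice."


print(triage_ticket("I was charged twice for my subscription.", fake_llm))
```

There is no loop and no model-chosen next step here, so cost and latency are fixed at two calls per ticket. That predictability is the reason to prefer this shape when the steps are known in advance.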

An agent is the think–act–observe loop: the LLM reasons and requests actions, your code executes them and enforces limits. ReAct makes reasoning explicit and debuggable. Planning suits long tasks; reflection works best when there’s an objective check to retry against. Choose the minimum agency the task requires — a fixed workflow beats an autonomous loop whenever the steps are predictable.