In my previous post, we explored Prompt Chaining — the simplest way to break a complex task into sequential steps. But real-world systems rarely follow a straight line. Users don’t come with labels on their foreheads telling you what they need. Sometimes they want to book a flight, sometimes they want a factual answer, sometimes they want something you didn’t even anticipate.
That’s where the Routing pattern comes in. And honestly, this is where things start to feel like you’re building something real.
Pattern #2: Routing
The Problem
Imagine you’re building a customer service system. A user types: “Book me a flight to London.” Another one types: “What’s the capital of Italy?” If you throw both prompts at the same generic LLM pipeline, you get generic results. No specialization, no efficiency, no structure.
What you actually need is a coordinator — something that reads the request, understands the intent, and sends it to the right handler. A booking request goes to the booking specialist. A factual question goes to the information agent. An unclear request gets flagged for clarification.
This is routing. It’s the AI equivalent of the receptionist at a company who listens to your question and says “Let me transfer you to the right department.”
The Solution
The Routing pattern introduces a classification step before any real work happens. The LLM first decides what kind of request it’s looking at, then delegates to the appropriate specialist pipeline. Two main approaches:
1. LCEL Chain Routing — The lightweight approach. A router chain classifies the intent and a RunnableLambda dispatches to the right handler function:
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

router_chain = coordinator_prompt | llm | StrOutputParser()

full_chain = (
    {"decision": router_chain, "request": RunnablePassthrough()}
    | RunnableLambda(route_to_handler)
)
```

The LLM outputs a single word — `booker`, `info`, or `unclear` — and the lambda function routes accordingly. Simple, effective, and surprisingly powerful for most use cases.
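The dispatch function itself can be plain Python. Here's a minimal sketch — the handler functions are stand-ins I invented for illustration; in the real pipeline each would be its own LLM chain:

```python
def booking_handler(request: str) -> str:
    # Stand-in for the flight-booking specialist chain.
    return f"[booker] handling: {request}"

def info_handler(request: str) -> str:
    # Stand-in for the factual-answer specialist chain.
    return f"[info] handling: {request}"

def unclear_handler(request: str) -> str:
    # Fallback when the router's label is unrecognized.
    return f"[unclear] please clarify: {request}"

HANDLERS = {"booker": booking_handler, "info": info_handler}

def route_to_handler(inputs: dict) -> str:
    # "decision" is the single word emitted by the router chain;
    # "request" is the original user text passed through untouched.
    decision = inputs["decision"].strip().lower()
    handler = HANDLERS.get(decision, unclear_handler)
    return handler(inputs["request"])
```

Note the defensive `strip().lower()` and the fallback to the unclear handler: LLM classifiers occasionally return the right label with stray whitespace or casing, and it's better to degrade gracefully than to crash on a malformed label.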
2. LangGraph StateGraph Routing — The structured approach. When you need more control, visibility, and the ability to add complexity later, you model routing as a graph with conditional edges:
```python
from langgraph.graph import StateGraph, START

builder = StateGraph(RouterState)
builder.add_node("classify_intent", classify_intent)
builder.add_node("booking_node", booking_node)
builder.add_node("info_node", info_node)

builder.add_edge(START, "classify_intent")
builder.add_conditional_edges(
    "classify_intent",
    route_by_intent,
    {"booking_node": "booking_node", "info_node": "info_node"},
)
```

Each specialist is a full node in the graph — it can have its own LLM, its own prompt, its own tools. The coordinator classifies, and the graph’s conditional edges do the dispatching. This is the approach I prefer for anything beyond a toy example, because it scales naturally: need a new specialist? Add a node and an edge. Done.
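For completeness, here's what the state and the routing function can look like. The field names (`request`, `intent`, `response`) are my assumption for this sketch, not prescribed by LangGraph — the only contract is that `classify_intent` writes the field that `route_by_intent` reads:

```python
from typing import TypedDict

class RouterState(TypedDict):
    request: str   # the raw user message
    intent: str    # set by classify_intent, e.g. "booking" or "info"
    response: str  # filled in by whichever specialist node runs

def route_by_intent(state: RouterState) -> str:
    # Return the *name* of the next node; the mapping passed to
    # add_conditional_edges translates that name into an actual edge.
    if state["intent"] == "booking":
        return "booking_node"
    return "info_node"
```

Because the routing function is ordinary Python operating on plain state, it's trivially unit-testable — you can verify your dispatch logic without ever calling an LLM.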
Why This Matters
Routing looks simple, but it’s fundamental. Every non-trivial agentic system has routing somewhere, whether it’s explicit or buried inside a framework’s abstractions. When you understand the pattern, you see it everywhere:
- Customer support bots that escalate to a human or resolve automatically.
- Multi-model systems that send simple queries to a fast/cheap model and complex ones to a powerful/expensive model.
- Tool-using agents that decide which tool to call before calling it.
The key insight is that the LLM itself is the router. You don’t need a separate rules engine or a decision tree. You just ask the model to classify, and then you act on the classification. It’s LLMs all the way down.
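In practice, “asking the model to classify” is just a tightly constrained prompt plus a defensive parse of the answer. A sketch — the prompt wording and label set are mine, chosen to match the earlier example:

```python
COORDINATOR_PROMPT = """You are a request router.
Classify the user request into exactly one of: booker, info, unclear.
Reply with the single word only.

Request: {request}
"""

VALID_LABELS = {"booker", "info", "unclear"}

def parse_decision(raw: str) -> str:
    # Models sometimes add punctuation, casing, or whitespace around
    # the label; normalize, and fall back to "unclear" rather than
    # propagating a label no handler recognizes.
    label = raw.strip().lower().rstrip(".")
    return label if label in VALID_LABELS else "unclear"
```

Constraining the output to a closed label set is what makes the LLM usable as a router: the downstream dispatch stays deterministic even though the classifier is probabilistic.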
The Bigger Picture
I’ve been working intensely on agentic AI systems — both in my role as CTO and as a personal deep-dive into what I believe is the most impactful shift in software architecture in years. Understanding these foundational patterns is not optional if you want to build solutions that actually work in production. The gap between “I called an API and got a response” and “I built an intelligent system that reasons, delegates, and adapts” is exactly these patterns.
I’m documenting my learning through this series, following Antonio Gulli’s excellent Agentic Design Patterns book. Full credit goes to him for the concepts — I’m refactoring the examples into pure Python with LangGraph, making them runnable and testable.
All the code from this post (and every other chapter) is available in my repository: carlosprados/Agentic_Design_Patterns. Every example runs with uv run, supports both Google Gemini and local models via Ollama, and is ready to experiment with.
What’s Next
In the next post, we’ll tackle Parallelization — how to run multiple agents simultaneously and merge their results. Think of it as fan-out/fan-in for LLMs. It’s where things start to get fast.
Stay tuned.

