Prompt Engineering Patterns: Prompt Chaining, Decomposition, and Multi-Step Workflows

Learn how to implement structured outputs and multi-step workflows in your prompts. This section covers everything from prompt chaining to schema validation for effective LLM interactions.

Nortren

What is prompt chaining?

Prompt chaining breaks a complex task into a sequence of simpler prompts, where the output of each prompt becomes the input to the next. Instead of asking the model to do everything in one prompt, you explicitly stage the process. Chaining produces better results on complex tasks because each step can be validated, formatted, and optimized independently.
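A minimal sketch of a two-stage chain, using an extract-then-summarize pipeline. `call_llm` here is a hypothetical stand-in (a canned stub so the example is self-contained); in practice you would replace it with your provider's client call.

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM API call; returns canned text per stage.
    if prompt.startswith("Extract"):
        return "Q3 revenue grew 12%; churn fell to 2%."
    return "Summary: a strong quarter with 12% revenue growth."

def chain(document: str) -> str:
    # Stage 1: extract key facts. This output can be validated or logged
    # before the next stage runs.
    facts = call_llm(f"Extract the key facts from:\n{document}")
    # Stage 2: the previous output becomes the next prompt's input.
    return call_llm(f"Summarize in one sentence:\n{facts}")

result = chain("...long report text...")
```

Because each stage is a separate call, you can insert validation, formatting, or retries between them, which is exactly what a single monolithic prompt cannot offer.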

When should you use prompt chaining?

Use prompt chaining when a task has clearly separable stages, when you want to validate intermediate outputs before proceeding, when each stage benefits from a different prompt or model, or when the full task does not fit reliably in a single context. Common examples include extract then summarize, classify then generate, and translate then refine.

What are the advantages of prompt chaining over a single complex prompt?

Chaining makes each step focused and easier to debug, allows validation between steps, enables routing different steps to different models for cost optimization, makes outputs more reliable because each prompt does one thing well, and supports caching of intermediate results. The cost is more total LLM calls and more orchestration logic.

What is the difference between prompt chaining and an LLM agent?

Prompt chaining has a fixed sequence of steps defined by the developer. An LLM agent decides what to do next at each step based on previous results. Chaining is more predictable and cheaper; agents are more flexible but harder to control. Anthropic's research recommends starting with chaining and only adding agent autonomy where it provides clear value.

What is task decomposition in prompting?

Task decomposition is asking the model to break a complex task into smaller subtasks before solving any of them. The model first outputs a list of steps, then executes each step in turn. Decomposition helps because LLMs perform better on small focused subtasks than on monolithic complex requests, and the breakdown itself often reveals what context each step needs.
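The plan-then-execute loop can be sketched as follows. The `call_llm` stub and the numbered-plan format are illustrative assumptions; a real implementation would parse whatever plan format you instruct the model to emit.

```python
def call_llm(prompt: str) -> str:
    # Stub for a real LLM call: returns a canned numbered plan, or echoes
    # the subtask it was asked to complete.
    if "List the steps" in prompt:
        return "1. Gather requirements\n2. Draft outline\n3. Write sections"
    return f"Done: {prompt.splitlines()[-1]}"

def decompose_and_solve(task: str) -> list[str]:
    # First call: ask the model only for a plan, not a solution.
    plan = call_llm(f"List the steps needed to: {task}")
    steps = [line.split(". ", 1)[1] for line in plan.splitlines()]
    # Then solve each subtask with its own focused call.
    return [call_llm(f"Complete this subtask:\n{step}") for step in steps]

results = decompose_and_solve("write a design doc")
```

Each subtask call is small and focused, and the plan itself tells you what context to supply to each step.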

What is the difference between decomposition and chain-of-thought?

Chain-of-thought produces reasoning steps within a single response leading to one final answer. Decomposition produces a structured plan of separate subtasks that may each be solved with their own LLM call. CoT is implicit and inline; decomposition is explicit and produces a workflow. They complement each other: decompose first, then use CoT within each subtask.

What is the map-reduce pattern in LLM prompting?

Map-reduce in LLM prompting splits a large input into chunks, processes each chunk independently with an LLM call (map), then combines the results into a final answer with another LLM call (reduce). It is the standard pattern for summarizing or analyzing documents that exceed the context window. Frameworks like LangChain and LlamaIndex provide built-in map-reduce chains.
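A bare-bones sketch of the pattern, independent of any framework. `call_llm` is a hypothetical stub, and the fixed-size character chunking is a simplifying assumption; real pipelines usually chunk by tokens or document structure.

```python
def call_llm(prompt: str) -> str:
    # Stub for a real LLM call; tags its input so the flow is visible.
    return f"summary({prompt[-20:]})"

def chunk(text: str, size: int) -> list[str]:
    # Naive fixed-size chunking for illustration.
    return [text[i:i + size] for i in range(0, len(text), size)]

def map_reduce_summarize(document: str, chunk_size: int = 1000) -> str:
    # Map: each chunk is summarized independently (these calls could
    # also run in parallel, since they do not depend on each other).
    partials = [call_llm(f"Summarize:\n{c}") for c in chunk(document, chunk_size)]
    # Reduce: a final call combines the partial summaries.
    return call_llm("Combine these summaries:\n" + "\n".join(partials))
```

For very large inputs, the reduce step can itself be applied recursively until the combined summaries fit in one context window.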

What is parallel prompting and when is it useful?

Parallel prompting runs multiple LLM calls simultaneously instead of sequentially. It is useful when subtasks are independent, like analyzing multiple documents, generating multiple variants of content, or running self-consistency over many CoT samples. Parallel calls reduce wall-clock latency but consume the same total tokens, so they improve speed, not cost.
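Since LLM calls are I/O-bound, a thread pool is enough to run them concurrently. A minimal sketch, again with `call_llm` as a hypothetical stub for a real client:

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Stub for a real (network-bound) LLM call.
    return f"analysis of {prompt}"

def analyze_all(documents: list[str]) -> list[str]:
    # Independent calls run concurrently; map() preserves input order
    # even though the calls overlap in time.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(call_llm, documents))

results = analyze_all(["doc1", "doc2", "doc3"])
```

With real API calls, `max_workers` should respect your provider's rate limits.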

What is the difference between sequential and parallel prompt patterns?

Sequential patterns have each step depend on the previous one, requiring strict ordering. Parallel patterns let independent steps run at the same time. Choose sequential when results feed into each other, parallel when they are independent. Many real workflows mix both: parallel where possible, sequential where required by data dependencies.
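A mixed workflow might look like the following sketch: a parallel fan-out over independent inputs, followed by a sequential step that depends on all of their results. `call_llm` is a hypothetical stub.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Stub for a real LLM call.
    return f"out({prompt[:15]})"

def review_articles(articles: list[str]) -> str:
    # Parallel stage: each article is summarized independently.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(lambda a: call_llm(f"Summarize: {a}"), articles))
    # Sequential stage: the report depends on every summary, so it must
    # wait for the fan-out to finish.
    return call_llm("Write a report from:\n" + "\n".join(summaries))
```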