LCEL & Chains · Lesson 10 of 10

Fallbacks, Retries & Streaming

Previous lessons called chain.invoke and waited for the full answer. This lesson adds three tools on the same explain_chain: .stream() yields chunks as they arrive so you can print them immediately, .with_retry() re-runs the chain when a call fails, and .with_fallbacks() tries a backup chain when the primary fails.

Three wrappers

The prompt and parser stay the same. Each of the three is called on the explain_chain from earlier lessons:

Streaming: .stream() prints tokens as they arrive.
Retry: .with_retry() re-runs on transient API errors.
Fallback: .with_fallbacks() switches to a backup chain on failure.

Streaming

Replace .invoke() with .stream() and loop over chunks. Each chunk is a piece of the parsed output string.

.stream() vs .invoke()

invoke waits for the full answer:

chain.invoke({"tag": "a"})
→ "The <a> tag creates a hyperlink…"

stream yields chunks as they are generated, so you can print each one as it arrives:

for chunk in chain.stream(…):
→ "The" → " <a>" → " tag" → " creates" → …

for chunk in explain_chain.stream({"tag": "a"}):
    print(chunk, end="", flush=True)
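If you also need the complete answer once streaming finishes (for logging or caching, say), you can collect the chunks while printing them. The sketch below assumes the chunks are plain strings, which is the case when the chain ends in a string output parser. `fake_stream` and `print_and_collect` are hypothetical names; `fake_stream` stands in for `explain_chain.stream(...)` so the snippet runs without an API key.

```python
from typing import Iterable, Iterator, List


def fake_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for explain_chain.stream(...): yields the text word by word."""
    for i, word in enumerate(text.split(" ")):
        # Re-attach the space that split() removed, except before the first word.
        yield word if i == 0 else " " + word


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the joined full answer."""
    pieces: List[str] = []
    for chunk in chunks:
        print(chunk, end="", flush=True)
        pieces.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(pieces)


full_answer = print_and_collect(fake_stream("The <a> tag creates a hyperlink."))
```

With the real chain you would pass `explain_chain.stream({"tag": "a"})` in place of `fake_stream(...)`.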

Retry

Wrap any chain with .with_retry(). LangChain re-invokes the chain when a call raises, up to stop_after_attempt attempts in total, and re-raises the error if every attempt fails. By default it retries on any exception, which covers rate limits and timeouts; pass retry_if_exception_type to narrow it to specific error classes. It is the same explain_chain underneath; only the wrapper changes.

retry_chain = explain_chain.with_retry(stop_after_attempt=3)
answer = retry_chain.invoke({"tag": "img"})
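To see what an attempt limit means in practice, the loop below imitates the behaviour in plain Python. It is an illustration of the idea, not LangChain's implementation: `call_with_retry` and `flaky_call` are hypothetical names, and the sketch retries immediately rather than waiting between attempts.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], stop_after_attempt: int = 3) -> T:
    """Call fn until it succeeds or the attempt limit is reached (hypothetical helper)."""
    if stop_after_attempt < 1:
        raise ValueError("stop_after_attempt must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(stop_after_attempt):
        try:
            return fn()
        except Exception as exc:  # this sketch retries on any exception
            last_error = exc
    assert last_error is not None
    raise last_error  # every attempt failed, so surface the last error


attempts = {"count": 0}


def flaky_call() -> str:
    """Simulated model call: fails twice with a timeout, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "The <img> tag embeds an image into the page."


answer = call_with_retry(flaky_call, stop_after_attempt=3)
print(answer, "| attempts used:", attempts["count"])
```

With a limit of 2 instead of 3, the same `flaky_call` would exhaust its attempts and the `TimeoutError` would propagate to the caller.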

Fallback

Pass a list of backup chains to .with_fallbacks(). If the primary raises an error, LangChain tries the next runnable in the list.

When the primary fails, the fallback runs:

input {"tag": "button"} → primary (gpt-does-not-exist) ✕ → fallback (gpt-4o-mini) ✓

The demo uses the broken model name gpt-does-not-exist on purpose so you always see the fallback run. In a real app, point the fallback at a working model or a simpler chain.

broken_primary = explain_prompt | ChatOpenAI(model="gpt-does-not-exist") | parser
reliable_chain = broken_primary.with_fallbacks([explain_chain])
answer = reliable_chain.invoke({"tag": "button"})
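The order of evaluation can be pictured with a small plain-Python sketch: try the primary, and move to the next candidate only when the current one raises. Everything here is hypothetical (`invoke_with_fallbacks`, the simulated `broken_primary` and `backup_chain` functions); it mirrors the idea without calling LangChain or a model. Re-raising the first error when every candidate fails is a choice made for this sketch.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

# Hypothetical stand-in for a chain: a function from an input dict to a string.
Chain = Callable[[Dict[str, Any]], str]

calls: List[str] = []  # records which chains were actually tried


def broken_primary(payload: Dict[str, Any]) -> str:
    """Simulates the chain built on a model name that does not exist."""
    calls.append("primary")
    raise ValueError("simulated error: model 'gpt-does-not-exist' not found")


def backup_chain(payload: Dict[str, Any]) -> str:
    """Simulates the working explain_chain used as the fallback."""
    calls.append("backup")
    return f"The <{payload['tag']}> tag creates a clickable button."


def invoke_with_fallbacks(
    primary: Chain, fallbacks: Sequence[Chain], payload: Dict[str, Any]
) -> str:
    """Try primary first, then each fallback in order; return the first success."""
    first_error: Optional[Exception] = None
    for chain in [primary, *fallbacks]:
        try:
            return chain(payload)
        except Exception as exc:
            if first_error is None:
                first_error = exc
    assert first_error is not None
    raise first_error  # nothing succeeded


answer = invoke_with_fallbacks(broken_primary, [backup_chain], {"tag": "button"})
print(calls)
print(answer)
```

The fallback only runs after the primary has raised, so a healthy primary never pays the cost of the backup.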

The demo script

fallbacks_retries_streaming_demo.py runs the three features in order: streaming, retry, then fallback. The key lines:

"""fallbacks_retries_streaming_demo.py"""

for chunk in explain_chain.stream({"tag": "a"}): …

retry_chain = explain_chain.with_retry(stop_after_attempt=3)

reliable_chain = broken_primary.with_fallbacks([explain_chain])

Download the code

Download fallbacks_retries_streaming_demo.py and save it into your langchain-course folder. It needs the venv, .env, and packages from Project Setup.

Run it

Activate the venv from Project Setup, then:

python fallbacks_retries_streaming_demo.py

Sample run in PowerShell with (.venv) active:

(.venv) PS C:\projects\langchain-course> python fallbacks_retries_streaming_demo.py

--- Streaming ---

Answer: The <a> tag creates a hyperlink to another page or resource.

--- With retry ---

Answer: The <img> tag embeds an image into the page.

--- Fallback ---

Answer: The <button> tag creates a clickable button for user actions.

The output has three sections in order: streaming, retry, fallback. The fallback section is always answered by explain_chain, because the broken primary fails first.

Quick reference

.stream() — print output while it generates instead of waiting for invoke to finish.
.with_retry() — up to 3 attempts on rate limits or transient API errors.
.with_fallbacks() — backup chain when primary raises (demo: broken model → explain_chain).
Docs: Streaming · Retry · Fallbacks.

What's Next

This is the last lesson in LCEL & Chains. Next up is Chat History & Memory, which starts with ChatPromptTemplate.
