Last month I rebuilt a chatbot that had grown into a spaghetti monster of API calls, prompt templates, and “temporary” fixes that somehow survived six sprints. The kind of code where developers are afraid to change anything because something will break; they just don’t know what.

Then I rewrote it with LangChain. 29 lines became 12. My weekend suddenly had fewer Slack alerts.

flowchart LR
    subgraph vanilla["Vanilla Python"]
        v29["29 lines<br/>Memory handling<br/>Prompt templates<br/>Retry logic"]
    end
    subgraph langchain["LangChain"]
        l12["12 lines<br/>All handled by<br/>framework"]
    end
    
    vanilla -->|"17 lines saved"| langchain
    
    style vanilla fill:#fed7aa,stroke:#ea580c
    style langchain fill:#bfdbfe,stroke:#2563eb
    style v29 fill:#fed7aa,stroke:#c2410c
    style l12 fill:#bfdbfe,stroke:#1d4ed8

The “I wrote it myself” trap

When building a custom chatbot, developers end up maintaining everything: conversation memory, prompt formatting, retry logic, streaming… It’s like insisting on baking your own bread every morning. Admirable? Sure. Sustainable when also trying to ship features? Not so much.
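For context, the hand-rolled version looks roughly like this. It's a sketch of the shape, not the original 29-line implementation; the model name, system prompt, and function name are placeholders.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_input: str) -> str:
    # Manual memory: append every turn and resend the whole history on each call
    history.append({"role": "user", "content": user_input})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

And that’s before retries, streaming, or prompt templates show up.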

LangChain is basically the bakery. It handles the boring parts—the stuff that’s the same for every chatbot—so developers can focus on what makes their chatbot different.

from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# ConversationBufferMemory stores the full conversation and feeds it
# back into the prompt on every turn, so there's no history list to manage.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
chain = ConversationChain(llm=llm, memory=ConversationBufferMemory())

print(chain.predict(input="Walk me through LangChain in two sentences."))

That’s a working chatbot with memory. Twelve lines. The vanilla Python version needed 29 just to track conversation history without losing context.

For more advanced use cases

Most chatbots eventually need retrieval—answering questions from documentation, not just the model’s training data. Here’s where frameworks really shine. Adding RAG (Retrieval Augmented Generation, for those keeping acronym score at home) is just a few more lines:

from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import RetrievalQA

prompt = ChatPromptTemplate.from_template(
    """Use the snippets below to answer:
    {context}
    Question: {question}
    """
)
# my_vectorstore is whatever vector store already holds the docs;
# RetrievalQA only needs a retriever built from it.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=my_vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)
# RetrievalQA takes its input under "query" and returns the answer under "result"
response = qa.invoke({"query": "How do we tune alpha?"})
print(response["result"])

No need to rewrite memory handling or prompt logic. It’s just snapping a new piece onto the existing chain.

The mental model

Think of LangChain like LEGO for LLM apps:

  • Chains = pre-built sequences. Great for predictable workflows like FAQs or onboarding wizards (a minimal sketch follows this list).
  • Agents = the model picks what to do next. More flexible, but guardrails are recommended unless surprise API bills are welcome.
  • Memory = conversation history that just works. No more passing around growing lists of messages.
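
To make the chains bullet concrete, here is a minimal sketch composing a prompt, the model from earlier, and an output parser with LangChain's pipe syntax; the prompt text and input are made up for illustration.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Prompt -> model -> parser, composed left to right with the | operator
summarize = (
    ChatPromptTemplate.from_template("Summarize this for a new hire: {notes}")
    | llm
    | StrOutputParser()
)

print(summarize.invoke({"notes": "Deploys happen on Tuesdays and need two approvals."}))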

A few things I learned the hard way

  1. Start boring. Begin with a simple chain. Add agents and tools only when the use case genuinely needs them.
  2. Log everything. LangChain’s callback system makes this easy (one concrete example follows this list). When (not if) something weird happens, receipts will be needed.
  3. Tokens still cost money. Abstractions hide complexity, but they don’t hide costs. An abstraction that makes five API calls is still five API calls.
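
On the logging and cost points: LangChain ships a callback that tallies OpenAI token usage and estimated cost for anything run inside its context manager. A minimal sketch, assuming langchain_community is installed and reusing the chain from earlier:

from langchain_community.callbacks import get_openai_callback

# Every LLM call made inside the context manager is recorded on cb
with get_openai_callback() as cb:
    chain.predict(input="Remind me what LangChain handles for us.")

print(f"{cb.total_tokens} tokens, ~${cb.total_cost:.4f}")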

The real win isn’t the line count—though that’s nice. It’s that six months from now, when I need to change something, I’ll actually understand what’s happening. And maybe, just maybe, I’ll get to keep my weekend.