Running Open Source Models Locally
LangChain Basics sent your question to OpenAI over the internet. This lesson points the same script shape at Ollama on your PC — same invoke call, no API bill.
What changes
OpenAI scripts use ChatOpenAI and read OPENAI_API_KEY. Local scripts use ChatOllama and talk to localhost:11434 — the URL you already saved in .env during Project Setup.
You installed Ollama and pulled llama3.2 in Ollama setup. LangChain does not replace Ollama — your Python file sends HTTP requests to the Ollama app in the background.
Before you run
Start the Ollama app (or keep it running from earlier). Confirm llama3.2 is on disk and the API responds:
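If you would rather run the check from Python than from the terminal, here is a minimal sketch. It assumes the default address from this lesson and uses Ollama's /api/tags endpoint, which lists the models on disk. The helper names has_model and check_ollama are made up for this example.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address used in this lesson


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Return True if `wanted` appears in an /api/tags response body."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    # Ollama reports names with a tag suffix, e.g. "llama3.2:latest".
    return any(n == wanted or n.split(":")[0] == wanted for n in names)


def check_ollama(base_url: str = OLLAMA_URL, wanted: str = "llama3.2") -> str:
    """Ask the local server which models are on disk and summarize the result."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError) as exc:
        # Typically "connection refused" when the Ollama app is not running.
        return f"Ollama not reachable at {base_url}: {exc}"
    if has_model(payload, wanted):
        return f"OK: {wanted} is on disk and the API responds"
    return f"API responds, but {wanted} is missing"


if __name__ == "__main__":
    print(check_ollama())
```

This uses only the standard library, so it works before LangChain is involved at all.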
ollama list confirms llama3.2 is on disk. The local API listens on port 11434 by default.
The script
Create hello_ollama.py in your project folder. Same HTML question as hello_langchain.py so you can compare the two outputs side by side.
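A minimal sketch of what hello_ollama.py could look like, not necessarily the exact file from this course. It assumes the langchain-ollama and python-dotenv packages are installed in your venv, and that your .env stores the URL under OLLAMA_BASE_URL (an assumed name; use whatever you saved in Project Setup). The question string is a placeholder for the one in your hello_langchain.py.

```python
import os

DEFAULT_BASE_URL = "http://localhost:11434"
MODEL_NAME = "llama3.2"  # the one spot to change when you switch models

# Placeholder: reuse the exact question from your hello_langchain.py here.
QUESTION = "What is HTML? Answer in two sentences."


def ollama_base_url() -> str:
    """Read the server URL from the environment, falling back to the default.

    OLLAMA_BASE_URL is an assumed variable name; use the one in your .env.
    """
    return os.getenv("OLLAMA_BASE_URL", DEFAULT_BASE_URL)


def main() -> None:
    # Imported here so the helper above can be used without LangChain installed.
    from dotenv import load_dotenv
    from langchain_core.messages import HumanMessage
    from langchain_ollama import ChatOllama

    load_dotenv()  # loads .env from the project folder into os.environ
    llm = ChatOllama(model=MODEL_NAME, base_url=ollama_base_url())
    reply = llm.invoke([HumanMessage(content=QUESTION)])
    print(reply.content)


if __name__ == "__main__":
    main()
```

No API key is read anywhere: ChatOllama only needs the model name and the local URL.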
Compared with hello_langchain.py, only the client class and the connection change.
Run it
Activate the venv, then:
python hello_ollama.py
OpenAI vs Ollama in code
| Piece | OpenAI (cloud) | Ollama (local) |
|---|---|---|
| Client class | ChatOpenAI | ChatOllama |
| Name in code | gpt-4o-mini | llama3.2 |
| Auth | OPENAI_API_KEY | none (local) |
| Call | llm.invoke([HumanMessage(...)]) | llm.invoke([HumanMessage(...)]) |
Later lessons switch models by changing the model name in one spot. See the LangChain Ollama docs for other Ollama options.
If it fails
- Connection refused: Ollama is not running. Open the app or run ollama serve.
- model not found: run ollama pull llama3.2 again.
- Very slow or hangs: common on the first run while Ollama loads llama3.2. Close other heavy apps if your laptop struggles.
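The first two fixes above can be turned into a small lookup that you call from an except block. This is only a sketch: hint_for is a made-up helper, and it assumes the error text contains the phrases "connection refused" or "model ... not found", which may differ between versions.

```python
def hint_for(error_text: str) -> str:
    """Map an error message to the matching fix from the list above."""
    text = error_text.lower()
    if "connection refused" in text:
        return "Ollama is not running. Open the app or run: ollama serve"
    if "model" in text and "not found" in text:
        return "Model missing. Run: ollama pull llama3.2"
    return "No known hint. If the first run is just slow, wait for llama3.2 to load."


if __name__ == "__main__":
    # Sample message for illustration; in a script, pass str(exc) from an except block.
    print(hint_for("[Errno 111] Connection refused"))
```

In hello_ollama.py you could wrap the invoke call in try/except Exception as exc and print hint_for(str(exc)) before re-raising.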
What's Next
Ollama is wired up. Next: tune temperature and max_tokens.