Vector Databases
Embeddings left the float lists in a Python variable. A vector store writes them to a folder and hands back matching lines when you call similarity_search. We use local Chroma in the demo.
Before you run
Activate the venv from Project Setup. Install Chroma's database package:
pip install chromadb
Keep OPENAI_API_KEY in .env from OpenAI Account Setup, because from_texts calls OpenAI to build the vectors.
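Because the embedding call needs the key, it can help to fail early with a readable message instead of waiting for an AuthenticationError. The following is a minimal sketch, not part of the demo script: require_env is a hypothetical helper, and it assumes the .env file has already been loaded into the process environment (for example by python-dotenv). It is shown with a plain dict so no real key is involved.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the variable's value, or raise a readable error if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env and load it before running")
    return value


# Demonstrated with a made-up mapping instead of the real environment.
fake_env = {"OPENAI_API_KEY": "sk-example-not-a-real-key"}
key = require_env("OPENAI_API_KEY", env=fake_env)
print("key loaded:", bool(key))
```

In a real script you would call require_env("OPENAI_API_KEY") with no second argument, before building the embeddings object.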
Demo flow:
Vector stores in LangChain
A vector store pairs each text line with its float list. You load the lines once, then pass a query string to pull the best matches. Chroma, FAISS, Pinecone, and Postgres all expose the same LangChain methods.
| Store | Runs on |
|---|---|
| Chroma | Local folder |
| FAISS | Local file |
| Pinecone | Cloud |
| PostgreSQL | Your DB server |
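To make the idea of pairing each text with its vector concrete, here is a toy in-memory store written with the standard library only. It is a sketch under loud assumptions: toy_embed (word counts) stands in for a real embedding model, ToyVectorStore is a hypothetical class, and its method names only imitate the LangChain ones. None of this is how Chroma works internally.

```python
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(text.lower().split())


class ToyVectorStore:
    """Keeps each text next to its vector, the way a real store does on disk."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    @classmethod
    def from_texts(cls, texts: list[str]) -> "ToyVectorStore":
        store = cls()
        for text in texts:
            store.rows.append((text, toy_embed(text)))
        return store

    def similarity_search(self, query: str, k: int = 1) -> list[str]:
        q = toy_embed(query)
        # Score = number of shared word occurrences (higher is closer in this toy).
        scored = [(sum((q & vec).values()), text) for text, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


store = ToyVectorStore.from_texts([
    "the a tag creates a hyperlink",
    "the title tag sets the browser tab title",
])
top = store.similarity_search("which tag sets the tab title", k=1)
print(top)
```

The shape is the point: load once, then query by string. A real store swaps the word counts for model-generated floats and the overlap score for a distance metric.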
You need two methods: from_texts to load data and similarity_search to read it back.
Store chunks with Chroma
Chroma.from_texts embeds each string and writes a chroma_demo_db folder in the directory you run the script from. The stored text and vectors stay on your PC; only the embedding request goes to OpenAI.
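from dotenv import load_dotenv  # assumes python-dotenv is installed
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Read OPENAI_API_KEY from .env into the environment.
load_dotenv()

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
texts = [
    "The <a> tag creates a hyperlink.",
    "The <title> tag sets the browser tab title.",
]
# Embed each string via OpenAI and write the results to ./chroma_demo_db.
vectorstore = Chroma.from_texts(
    texts=texts,
    embedding=embeddings,
    persist_directory="./chroma_demo_db",
)
After first run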
The demo script deletes chroma_demo_db at the start so each run starts clean. Drop that line when you want data to persist between runs.
Read matching lines
similarity_search returns documents only. similarity_search_with_score adds a distance float per row — lower number means a closer match in Chroma.
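To see why a lower number means a closer match, it helps to compute a distance by hand. The sketch below assumes squared Euclidean (L2) distance, which Chroma documents as its default; a collection configured for cosine or inner product would produce different numbers. The three-number vectors are made up for illustration, and squared_l2 is a hypothetical helper, not a Chroma function.

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance: 0.0 for identical vectors, larger when further apart."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Made-up 3-number vectors; real embeddings have hundreds or thousands of floats.
query_vec = [1.0, 0.0, 0.0]
link_vec = [0.9, 0.1, 0.0]   # points nearly the same way as the query
title_vec = [0.0, 1.0, 0.0]  # points somewhere else

close = squared_l2(query_vec, link_vec)
far = squared_l2(query_vec, title_vec)
print(f"[{close:.4f}] close match")
print(f"[{far:.4f}] weaker match")
```

The vector that nearly overlaps the query gets the smaller score, which is the same ordering the store applies when it ranks results.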
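query = "How do I make a link on a page?"
# Ask for the two closest texts along with their distance scores.
results = vectorstore.similarity_search_with_score(query, k=2)
for doc, score in results:
    # Lower score means a closer match.
    print(f"[{score:.4f}] {doc.page_content}")
FAISS, Pinecone, PostgreSQL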
FAISS: install faiss-cpu, build with FAISS.from_texts, save with save_local. Stays on one machine.
Pinecone: runs on Pinecone's servers. Set PINECONE_API_KEY in your environment and pass your index name to PineconeVectorStore.
PostgreSQL: turn on the pgvector extension, then use PGVector from langchain-postgres (installed in PostgreSQL Chat Message History). A good fit if you already keep app data in Postgres.
Chroma docs: docs.trychroma.com.
Run the demo
Download the script (vector_databases_demo.py), unzip if needed, then run:
python vector_databases_demo.py
The script creates chroma_demo_db/ on first run. It needs OPENAI_API_KEY in .env, because from_texts calls OpenAI before anything is written to Chroma.
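The clean-start behaviour mentioned earlier (clearing chroma_demo_db so each run starts clean) can be done with a few lines of standard library code. This is a sketch of one way to do it, not the demo script's actual code: reset_store is a hypothetical helper name, and the example runs against a throwaway temporary directory rather than a real store.

```python
import shutil
import tempfile
from pathlib import Path


def reset_store(folder: Path) -> None:
    """Delete the store folder if it exists so the next run starts empty."""
    if folder.exists():
        # Errors are not suppressed, so a locked folder raises PermissionError.
        shutil.rmtree(folder)


# Demonstrate on a temporary directory instead of a real chroma_demo_db.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "chroma_demo_db"
    demo_dir.mkdir()
    (demo_dir / "placeholder.bin").write_bytes(b"stale data")
    reset_store(demo_dir)
    folder_gone = not demo_dir.exists()
    reset_store(demo_dir)  # safe to call again when nothing is there
print(folder_gone)
```

Leaving errors unsuppressed is deliberate: if another program holds the folder open, the failure shows up at the reset step instead of later as a confusing half-written store.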
If it fails
- ModuleNotFoundError (chromadb): run pip install chromadb.
- AuthenticationError: check OPENAI_API_KEY in .env.
- Permission error on chroma_demo_db: close other programs using the folder, or delete chroma_demo_db by hand.
More detail: LangChain vector stores docs.
What's Next
Chroma store is working. Next up: Retrieval-Augmented Generation (RAG).