PostgreSQL Hybrid Search
After Hybrid Search, swap Chroma.from_texts for PGVector. The BM25Retriever and EnsembleRetriever code stays as-is.
Before you run
Venv from Project Setup. Reuse POSTGRES_URL from PostgreSQL Chat Message History. You need pgvector on the server — the image below includes it:
docker run -e POSTGRES_PASSWORD=postgres -p 5432:5432 pgvector/pgvector:pg16
Then install:
pip install langchain-postgres psycopg[binary] rank_bm25
What changes from Chroma:
Swap Chroma for PGVector
PGVector wants a postgresql+psycopg:// connection string. The demo rewrites POSTGRES_URL with a one-line .replace(). Load chunks with add_documents, not from_texts.
import os
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector
postgres_url = os.getenv(
"POSTGRES_URL",
"postgresql://postgres:postgres@localhost:5432/chatdb",
)
connection = postgres_url.replace("postgresql://", "postgresql+psycopg://", 1)
texts = [
"The <a> tag creates a hyperlink. Set the href attribute to the URL.",
"The <title> tag sets the browser tab title.",
"The <h1> tag marks the main heading on a page.",
]
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = PGVector(
embeddings=embeddings,
collection_name="hybrid_search_demo",
connection=connection,
use_jsonb=True,
)
vectorstore.add_documents([Document(page_content=t) for t in texts])
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
bm25_retriever = BM25Retriever.from_texts(texts)
bm25_retriever.k = 2
hybrid_retriever = EnsembleRetriever(
retrievers=[bm25_retriever, vector_retriever],
weights=[0.5, 0.5],
)
docs = hybrid_retriever.invoke("href attribute URL")
for doc in docs:
print(doc.page_content)pre_delete_collection=True in the demo script clears the collection on each run so you start fresh.
Run the demo
postgresql_hybrid_search_demo.py
Same query as hybrid_search_demo.py
POSTGRES_URL and OPENAI_API_KEY in .env.python postgresql_hybrid_search_demo.py
If it fails
- extension "vector" is not available — use
pgvector/pgvector, not plainpostgres. - Connection refused — check
POSTGRES_URLand that the container is running. - No module named langchain_postgres — run the install line above.
BM25 errors: see Hybrid Search. PGVector docs: langchain-postgres.
What's Next
Same hybrid setup, vectors in Postgres. Next: Indexing API.