This gist contains the code used for Tembo's blog post on pgvector.
Please install the required Python packages:
python3 -m venv ./venv
source ./venv/bin/activate
pip install -r ./requirements.txt
To run Postgres with pgvector, you can simply use a container:
docker pull ankane/pgvector
docker run --name pgvector -e POSTGRES_PASSWORD=password -p 5432:5432 ankane/pgvector
See pgvector for more details.
Then you can connect to the pgvector container:
docker exec -it pgvector /bin/bash
and create the required database and enable pgvector on that database:
psql -h localhost -U postgres
postgres=# create database vector_db;
psql -h localhost -U postgres vector_db
vector_db=# create extension vector;
Then download the data:
sh get_data.sh
The script downloads the blogs from Tembo.io, places them in /tmp, and then removes the markdown tags. The resulting files will be in the corpus directory under your current directory.
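The cleanup logic inside get_data.sh is not shown here. As a rough illustration only, removing markdown tags could look like the Python sketch below. The function name strip_markdown and the regular expressions are assumptions, not the script's actual logic, and they handle only a few common constructs (links, images, headings, emphasis, inline code).

```python
import re


def strip_markdown(text: str) -> str:
    """Remove a few common markdown constructs (illustrative, not exhaustive)."""
    # Replace links [label](url) and images ![alt](url) with their label/alt text.
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Drop leading heading markers such as "## " at the start of a line.
    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
    # Remove emphasis and inline-code markers (asterisks and backticks).
    text = re.sub(r"[*`]+", "", text)
    return text


if __name__ == "__main__":
    sample = "# Title\nSome **bold** text with a [link](https://tembo.io)."
    print(strip_markdown(sample))
```

A regex pass like this is lossy by design: it keeps the readable text and discards the markup, which is usually what you want before computing embeddings.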
Next, load the embeddings:
python3 ./load_embeddings.py
Finally, generate the query vector:
python3 ./generate_query_vector.py
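The scripts themselves are not reproduced in this README. As a hedged sketch of how a query vector might be used against pgvector afterwards: the table name documents, its columns, and the helper names below are hypothetical, and the %s placeholders assume a psycopg-style DB-API driver. The <=> operator is pgvector's cosine-distance operator.

```python
def to_pgvector_literal(vec):
    """Format a sequence of numbers in pgvector's text form, e.g. [1.0,0.5]."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


# Hypothetical schema: documents(title text, embedding vector(n)).
NEAREST_SQL = (
    "SELECT title, embedding <=> %s::vector AS distance "
    "FROM documents ORDER BY distance LIMIT %s"
)


def nearest_documents(conn, query_vec, k=5):
    """Return the k rows closest to query_vec by cosine distance.

    conn is assumed to be an open DB-API connection (for example from psycopg2).
    """
    cur = conn.cursor()
    try:
        cur.execute(NEAREST_SQL, (to_pgvector_literal(query_vec), k))
        return cur.fetchall()
    finally:
        cur.close()
```

With the container started as described above, a connection could be opened with the same settings used earlier (host localhost, user postgres, password password, database vector_db) and passed to nearest_documents together with the generated query vector.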