Skip to content

Instantly share code, notes, and snippets.

@janakiramm
Last active July 27, 2023 10:00
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save janakiramm/55250aa0db4819e7c6f889e342abdb20 to your computer and use it in GitHub Desktop.
Save janakiramm/55250aa0db4819e7c6f889e342abdb20 to your computer and use it in GitHub Desktop.
Chroma VectorDB for Word Embeddings
import chromadb
phrases=[
"Amanda baked cookies and will bring Jerry some tomorrow.",
"Olivia and Olivier are voting for liberals in this election.",
"Sam is confused, because he overheard Rick complaining about him as a roommate. Naomi thinks Sam should talk to Rick. Sam is not sure what to do.",
"John's cookies were only half-baked but he still carries them for Mary.",
]
ids=["001","002","003","004"]
metadatas=[{"source": "pdf-1"},{"source": "doc-1"},{"source": "pdf-2"},{"source": "txt-1"}]
chroma_client = chromadb.Client()
collection = chroma_client.get_or_create_collection(name="tns_tutorial")
collection.add(
documents=phrases,
metadatas=metadatas,
ids=ids
)
collection.peek()
results = collection.query(
query_texts=["Mary got half-baked cake from John"],
n_results=2
)
print(results['documents'][0][0])
results = collection.query(
query_texts=["cookies"],
where={"source": "pdf-1"},
n_results=1
)
print(results)
collection.delete()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment