@linuxandchill
Last active May 5, 2024 17:37
Create and retrieve embeddings using Langchain and Supabase
////////////////////////////////////
/* "dependencies": {
"@supabase/supabase-js": "^2.13.1",
"langchain": "^0.0.44"
} */
////////////////////////////////////
import { OpenAI } from "langchain/llms";
import {
  RetrievalQAChain,
  ConversationalRetrievalQAChain,
} from "langchain/chains";
import { SupabaseVectorStore } from "langchain/vectorstores";
import { OpenAIEmbeddings } from "langchain/embeddings";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { ChatOpenAI } from "langchain/chat_models";
import * as fs from "fs";
import * as dotenv from "dotenv";
import { createClient } from "@supabase/supabase-js";
dotenv.config();
const openAIKey = process.env.OPENAI_API_KEY;
const client = createClient(
  process.env.SUPABASE_URL || "",
  process.env.SUPABASE_PRIVATE_KEY || ""
);
export const createEmbeddings = async () => {
  // Initialize the LLM of choice to answer the question.
  const model = new OpenAI({ openAIApiKey: openAIKey });
  const text = fs.readFileSync("pizzareview.txt", "utf8");
  // Split the raw text into overlapping chunks and wrap them as documents.
  const textSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: 200,
    chunkOverlap: 50,
  });
  const docs = await textSplitter.createDocuments([text]);
  // Embed the chunks and store them in the Supabase "documents" table.
  const vectorStore = await SupabaseVectorStore.fromDocuments(
    docs,
    new OpenAIEmbeddings(),
    {
      client,
      tableName: "documents",
      queryName: "match_documents",
    }
  );
  // Create a chain that uses the OpenAI LLM and the Supabase vector store as retriever.
  const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever());
  const res = await chain.call({
    query: "What did he rate the pizza?",
  });
  console.log({ res });
};
// createEmbeddings()
export const query = async () => {
  const chat = new ChatOpenAI({
    modelName: "gpt-3.5-turbo",
    openAIApiKey: openAIKey,
  });
  // Reconnect to the existing Supabase index; nothing is re-embedded here.
  const vectorStore = await SupabaseVectorStore.fromExistingIndex(
    new OpenAIEmbeddings(),
    {
      client,
      tableName: "documents",
      queryName: "match_documents",
    }
  );
  const chain = ConversationalRetrievalQAChain.fromLLM(
    chat,
    vectorStore.asRetriever(),
    { returnSourceDocuments: true }
  );
  console.log(chain.questionGeneratorChain.prompt.template);
  const query = "What did he rate the pizza?";
  const res = await chain.call({ question: query, chat_history: [] });
  console.log(res.text);
  /* Ask it a follow-up question, passing the previous exchange as chat history:
  const chatHistory = query + res.text;
  const followUpRes = await chain.call({
    question: "How was the char on the pizza?",
    chat_history: chatHistory,
  });
  */
};
// query();
@rrubio

rrubio commented May 21, 2023

This worked perfectly! Thank you!

@tehfailsafe

When you call new OpenAIEmbeddings() in fromExistingIndex inside the query function, are you paying for that on each query? I currently have a similar setup working using the MemoryVectorStore, but it regenerates my document embeddings on every query. I'm looking for a way to reduce costs by using the Supabase vector DB to store all my docs and then only generating an embedding for the question itself.

Or am I misunderstanding something?
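As far as I understand it, fromExistingIndex doesn't re-embed anything that's already stored: the OpenAIEmbeddings instance is just a client, and per query only the question string is embedded (one short embedding call) before the match_documents RPC runs against the rows already in the table. So the per-query cost is embedding a single question, not the whole corpus. A rough sketch of that, reusing the same client and env setup as the gist above (the searchOnly name is just for illustration, not part of the gist):

// Sketch only: reuses the already-populated "documents" table; no documents are re-embedded.
export const searchOnly = async () => {
  const store = await SupabaseVectorStore.fromExistingIndex(
    new OpenAIEmbeddings(),
    {
      client,
      tableName: "documents",
      queryName: "match_documents",
    }
  );
  // similaritySearch embeds only the query string, then searches the stored vectors.
  const matches = await store.similaritySearch("What did he rate the pizza?", 4);
  console.log(matches);
};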

@Jakisundays

@tehfailsafe Hey, I'm dealing with embedding costs too, just like you were. Did you find a smart way to handle it?

@mzafarr

mzafarr commented Nov 20, 2023

@Jakisundays Have you figured it out? I'm facing the same issue.

@niikkhilsharma

Thanks, man! Really appreciate this.
