💭
All I know is that I know nothing. And even with that, I'm not sure.

Gustav von Zitzewitz gustavz

@gustavz
gustavz / langchain_chroma_openai_vectorstore.py
Created April 17, 2024 06:44
Creating a LangChain vector store with Chroma and OpenAI embeddings from loaded PDFs
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import PDFMinerLoader, PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
pdf_path = "https://www.barclaycard.co.uk/content/dam/barclaycard/documents/personal/existing-customers/terms-and-conditions-barclaycard-core-2019.pdf"
# Alternative: PDFMinerLoader loads all text into a single document
# loader = PDFMinerLoader(pdf_path)
# PyMuPDFLoader loads each page as a separate document
loader = PyMuPDFLoader(pdf_path)
documents = loader.load()
@gustavz
gustavz / launch.json
Created February 28, 2024 09:58
VS Code debug pytest current file and test
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "cwd": "${fileDirname}",
@gustavz
gustavz / compare_gpt_responses_async.py
Created February 28, 2024 08:58
Async Compare GPT Responses to Dataframe
import numpy as np
import pandas as pd
import asyncio
import openai
from sentence_transformers import SentenceTransformer
openai.api_key = "YOUR-API-KEY"
model = SentenceTransformer('all-MiniLM-L6-v2')
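The preview above stops after the setup lines, so the rest of the gist is not shown here. As a rough, standard-library-only sketch of the two ideas its title suggests (collecting responses concurrently, then comparing them by embedding similarity), something like the following could work. `fake_completion` is a hypothetical stand-in for a real API call, and `cosine_similarity` is an illustrative helper. Neither is taken from the gist.

```python
import asyncio
import math


async def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for an async chat-completion request."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return prompt.upper()


async def gather_responses(prompts: list[str]) -> list[str]:
    # asyncio.gather runs the coroutines concurrently and keeps input order.
    return await asyncio.gather(*(fake_completion(p) for p in prompts))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


responses = asyncio.run(gather_responses(["first prompt", "second prompt"]))
```

In a real script the vectors passed to `cosine_similarity` would presumably come from the sentence-transformer model loaded above, and the stub would be replaced by an actual client call.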
@gustavz
gustavz / launch.json
Created July 26, 2023 08:36
VS Code debug config for streamlit apps
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Streamlit",
            "type": "python",
            "request": "launch",
            "module": "streamlit",
            "env": {
                // any .env variables you may have
@gustavz
gustavz / list_deeplake_datasets.py
Last active July 26, 2023 07:25
List all available Deep Lake cloud datasets for a given user / organization.
from deeplake.util.bugout_reporter import deeplake_reporter
from deeplake.client.client import DeepLakeBackendClient
def list_deeplake_datasets(
    org_id: str = "",
    token: str = None,
) -> None:
    """List all available Deep Lake cloud datasets.

    Removed from deeplake in: https://github.com/activeloopai/deeplake/pull/2182/files
    """
@gustavz
gustavz / data_class_with_decision_log.py
Last active December 8, 2022 11:43
data class with decision log
import trafaret as t
class DecisionLog:
    def __init__(self, reason_key=''):
        self.init()
        self._reason_key = reason_key
        self._len_reason_key = len(reason_key)

    def init(self):
        self.logs = []