Andreas Motl amotl

@amotl
amotl / cratedb_heap_exchaust_weird_error.py
Last active February 5, 2024 17:00
Attempt to trip a low-memory condition in CrateDB, resulting in a weird error message.
"""
Attempt to trip a low-memory condition in CrateDB, resulting in a weird error message
like `SQLParseException[ShardCollectContext for 0 already added]`.
This program tries to emulate the MLflow test case `test_search_runs_returns_expected_results_with_large_experiment`,
followed by a `DELETE FROM` table truncation operation.
Remark: It did not work out well. This program trips `OutOfMemoryError[Java heap space]`
right away. Please use the MLflow test case reproducer demonstrated at:
https://github.com/crate-workbench/mlflow-cratedb/issues/53#issuecomment-1927234463
@amotl
amotl / cratedb_regression_15488.py
Created January 30, 2024 00:38
CrateDB error: The assembled list of ParameterSymbols is invalid. Missing parameters.
"""
## About
Regression with CrateDB nightly-5.7.0-2024-01-26-00-02.
The assembled list of ParameterSymbols is invalid. Missing parameters.
-- https://github.com/crate/crate/issues/15488
## Setup
@amotl
amotl / update-from-merges.sql
Created December 16, 2023 22:05
Demo: SQL "UPDATE ... FROM" for upsert/merge operations. Reflecting Meltano's PostgreSQL data loader (target) adapter.
-- Demo: SQL "UPDATE ... FROM" for upsert/merge operations.
-- Reflecting Meltano's PostgreSQL data loader (target) adapter.
--
-- Usage:
--
-- psql postgresql://postgres@localhost:5432/ < update-from-merges.sql
-- crash --host http://crate@localhost:4200/ < update-from-merges.sql
-- Make a blank slate.
DROP TABLE IF EXISTS main;
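The "UPDATE ... FROM" upsert/merge idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the gist's own statements: SQLite (through the standard `sqlite3` module, which needs SQLite 3.33.0 or later for `UPDATE ... FROM`) stands in for PostgreSQL, and the table names `target` and `staging`, the columns `id` and `name`, and the helper names are hypothetical.

```python
import sqlite3


def merge_staging_into_target(conn):
    """Merge rows from `staging` into `target` (hypothetical table names)."""
    # Step 1: update rows that already exist in the target table.
    conn.execute(
        """
        UPDATE target
        SET name = staging.name
        FROM staging
        WHERE target.id = staging.id
        """
    )
    # Step 2: insert rows that exist only in the staging table.
    conn.execute(
        """
        INSERT INTO target (id, name)
        SELECT staging.id, staging.name
        FROM staging
        LEFT JOIN target ON target.id = staging.id
        WHERE target.id IS NULL
        """
    )
    conn.commit()


def demo():
    """Run the merge on a throwaway in-memory database and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "foo"), (2, "bar")])
    conn.executemany("INSERT INTO staging VALUES (?, ?)", [(2, "baz"), (3, "qux")])
    merge_staging_into_target(conn)
    rows = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
    conn.close()
    return rows
```

Calling `demo()` leaves row 1 untouched, overwrites row 2 with the staged value, and appends row 3. Splitting the merge into an update step and an insert step avoids relying on a database-specific `ON CONFLICT` or `MERGE` clause.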
@amotl
amotl / cratedb-vector-knn-exercise.sql
Created December 12, 2023 21:53
CrateDB: Exercise storing and searching by vectors using its "FLOAT_VECTOR" and "KNN_MATCH".
/**
* CrateDB: Exercise storing and searching by vectors using its "FLOAT_VECTOR" and "KNN_MATCH".
* The example uses euclidean distance for vector similarity search.
*
* Synopsis::
*
* docker run --rm -it --publish=4200:4200 crate/crate:nightly -Cdiscovery.type=single-node
* crash < cratedb-vector-knn-exercise.sql
*
* Resources:
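To make the notion of a nearest-neighbour search by euclidean distance concrete, here is a minimal brute-force sketch in plain Python. It does not use CrateDB's `FLOAT_VECTOR` or `KNN_MATCH` and says nothing about how the database evaluates such queries internally; the function names and the sample data are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Return the euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(query, vectors, k):
    """Return the keys of the k vectors closest to `query`, nearest first.

    `vectors` maps an arbitrary key to a vector (a list of floats).
    This is an exhaustive scan, suitable only for small data sets.
    """
    ranked = sorted(
        vectors.items(),
        key=lambda item: euclidean_distance(query, item[1]),
    )
    return [key for key, _ in ranked[:k]]
```

For example, with `vectors = {"a": [1, 0], "b": [0, 1], "c": [5, 5]}`, the call `knn([1, 0.1], vectors, 2)` ranks `"a"` before `"b"` and leaves out `"c"`.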
@amotl
amotl / pgvector-vector-knn-exercise.sql
Created December 12, 2023 21:53
PostgreSQL/pgvector: Exercise storing and searching by vectors using its operators.
/**
* PostgreSQL/pgvector: Exercise storing and searching by vectors using its operators.
* The example uses euclidean distance for vector similarity search.
*
* Synopsis::
*
* docker run --rm -it --publish=5432:5432 --env "POSTGRES_HOST_AUTH_METHOD=trust" ankane/pgvector postgres -c log_statement=all
* psql postgresql://postgres@localhost < pgvector-vector-knn-exercise.sql
*
* Resources:
"""
pip install requests 'requests-cache<2'
"""
import os
import requests_cache
import typing as t
from langchain.embeddings import OpenAIEmbeddings
import typing as t
import pytest
def monkeypatch_pytest_notebook_treat_cell_exit_as_notebook_skip():
    """
    Patch `pytest-notebook`, in fact `nbclient.client.NotebookClient`,
    to propagate cell-level `pytest.exit()` invocations as signals
    to mark the whole notebook as skipped.
import logging
import sys
import requests
from langchain.schema import Document
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import CrateDBVectorSearch
from langchain.embeddings import OpenAIEmbeddings
from crate.client import connect
def main():
    conn = connect("localhost:4200")
    cursor = conn.cursor()
    sql = """
INSERT INTO "testdrive" ("meetingId","agency","createdAt","meetingDate","sourceType","sourceUrl","steps","title","updatedAt","videoURLs","timeline","transcription","id") VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?)
DROP TABLE IF EXISTS "testdrive";
CREATE TABLE IF NOT EXISTS "testdrive" (
  "id" TEXT NOT NULL,
  "steps" OBJECT(STRICT) AS (
    "transcribe" TEXT,
    "parse" TEXT,
    "load" TEXT
  ),
  "timeline" OBJECT(STRICT) AS (