Skip to content

Instantly share code, notes, and snippets.

@peterc
peterc / embedding_store.rb
Last active December 28, 2023 06:27
Using SQLite to store OpenAI vector embeddings from Ruby
# Example of using SQLite VSS with OpenAI's text embedding API
# from Ruby.
# Note: Install/bundle the sqlite3, sqlite_vss, and ruby-openai gems first
# OPENAI_API_KEY must also be set in the environment
# Other embeddings can be used, but this is the easiest for a quick demo
# More on the topic at
# https://observablehq.com/@asg017/introducing-sqlite-vss
# https://observablehq.com/@asg017/making-sqlite-extension-gem-installable
@rain-1
rain-1 / 000-crossword-solving.md
Last active April 23, 2024 05:25
crossword-solving-with-gpt.md

Solving crosswords with GPT

This is my research report. I've included a lot of the code and chat interactions for people to read through if interested. I worked on this crossword https://www.theguardian.com/crosswords/quick/16553

I had a vision for a GPT powered crossword solver. My idea is that it would do a tree search over GPT generated guesses that would include the knowns so far, like:

givens

I didn't end up doing that because ChatGPT and GPT-4 are terrible at questions involving the length of words, or guessing words that contain specific letters at specific locations. It can sometimes do them but usually fails. I think this is because it's token based. I am curious whether a character based LLM would be better at such tasks.

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you heard of chatGPT, maybe played with it a little, and was imressed by it (or tried very hard not to be). And that you also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts of this (and similar) models, and where we stand with respect to language understanding.

Intro

Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

@peterc
peterc / get_sizes.sql
Last active February 2, 2021 02:52
Get the size of different tables and other relations in a PostgreSQL database
SELECT
schema_name, rel_name, table_size,
pg_size_pretty(table_size) AS size
FROM (
SELECT
nspname AS schema_name,
relname AS rel_name,
pg_table_size(pg_class.oid) AS table_size
FROM pg_class, pg_namespace
WHERE pg_class.relnamespace = pg_namespace.oid
DROP DATABASE IF EXISTS 📚;
CREATE DATABASE 📚;
USE 📚;
CREATE TABLE 👤(
🔑 INTEGER PRIMARY KEY,
🗣 varchar(64), -- name
🗓 DATE -- date of registration
require 'rake/tasklib'
require 'rake/clean'
##
# Creates a cross-compiled library that mruby-cli can use to link with
# each cross-built command.
#
# CrossLibrary.new "curl" do |cross|
# cross.release_name = "curl-7.28.0"
# cross.url = "https://curl.example/download/#{cross.release}.tar.gz"
@rowanmanning
rowanmanning / README.md
Last active February 18, 2024 21:12
Writing a Friendly README. This a companion-gist to the post: http://rowanmanning.com/posts/writing-a-friendly-readme/
@tenderlove
tenderlove / mt_complete.rb
Last active December 11, 2020 19:56
tab completion for minitest tests
#!/usr/bin/env ruby --disable-gems
# Tab completion for minitest tests.
#
# INSTALLATION:
#
# 1. Put this file in a directory in your $PATH. Make sure it's executable
# 2. Run this:
#
# $ complete -o bashdefault -f -C /path/to/this/file.rb ruby
@zerowidth
zerowidth / sql.md
Last active April 11, 2017 18:31
GitHub::SQL - a helping hand for SQL in a rails app

DEPRECATED

GitHub::SQL has been released as an officially-maintained project and ruby gem: github/github-ds.

lass KeenHttpOutput < Fluent::Output
Fluent::Plugin.register_output('out-keen-http', self)
def initialize
super
require 'rubygems'
require 'eventmachine'
require 'em-http-request'
require 'cgi'
require 'json'