Skip to content

Instantly share code, notes, and snippets.

View theturtle32's full-sized avatar

Brian McKelvey theturtle32

View GitHub Profile
@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active November 3, 2024 12:36
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@jaretburkett
jaretburkett / person-terms
Created June 27, 2023 02:30
Terms for tagging pictures of humans
aboriginal
above average
abstract composition
abusive
accessories
accountant
acid wash
acne-prone skin
acne scars
@jaretburkett
jaretburkett / Humans v1 - Token Counts
Created June 27, 2023 02:28
Humans v1 - Token Counts
This file has been truncated, but you can view the full file.
smiling mouth revealing white straight teeth - 24426
anxious expression with biting lower lip - 17012
shallow depth of field - 16806
early childhood age - 14067
social worker - 12566
smiling mouth revealing slightly crooked teeth - 12329
broad grin revealing straight white teeth - 11336
pediatrician - 11212
preschooler age - 10873
headshot - 10462
@syntaqx
syntaqx / cloud-init.yaml
Last active October 21, 2024 18:35
cloud init / cloud config to install Docker on Ubuntu
#cloud-config
# Option 1 - Full installation using cURL
package_update: true
package_upgrade: true
groups:
- docker
system_info:
@ipbastola
ipbastola / clean-up-boot-partition-ubuntu.md
Last active August 16, 2024 13:39
Safest way to clean up boot partition - Ubuntu 14.04LTS-x64, Ubuntu 16.04LTS-x64

Safest way to clean up boot partition - Ubuntu 14.04LTS-x64, Ubuntu 16.04LTS-x64

Reference

Case I: if /boot is not 100% full and apt is working

1. Check the current kernel version

$ uname -r 
@JunichiIto
JunichiIto / alias_matchers.md
Last active October 21, 2024 00:54
List of alias matchers in RSpec 3

This list is based on aliases_spec.rb.

You can see also Module: RSpec::Matchers API.

matcher aliased to description
a_truthy_value be_truthy a truthy value
a_falsey_value be_falsey a falsey value
be_falsy be_falsey be falsy
a_falsy_value be_falsey a falsy value
@jkreps
jkreps / benchmark-commands.txt
Last active September 15, 2024 11:37
Kafka Benchmark Commands
Producer
Setup
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3
Single thread, no replication
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196
@eskim
eskim / redis.markdown
Created August 18, 2012 14:03 — forked from bdotdub/redis.markdown
Running redis using upstart on Ubuntu

Running redis using upstart on Ubuntu

I've been trying to understand how to setup systems from the ground up on Ubuntu. I just installed redis onto the box and here's how I did it and some things to look out for.

To install: