https://svelte.dev/docs/kit/creating-a-project
brew upgrade npm
https://svelte.dev/docs/kit/creating-a-project
brew upgrade npm
- tool: fd | |
url: https://github.com/sharkdp/fd | |
devlang: rust | |
supercedes: | |
- find | |
tagline: A simple, fast and user-friendly alternative to 'find' | |
description: | | |
A simple, fast and user-friendly alternative to 'find' | |
references: |
One of the most incomprehensible errors I have ever run into, with Microsoft forums and ChatGPT/Claude being almost totally useless. Hopefully anyone else running into this situation will come across this gist and save themselves hours of frustration.
Huge thanks to Mr. Excel for the solution, with a major assist by r/excel
tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.
OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.
For example, given a Congressional financial disclosure report, with assets defined in a table like this:
#!/usr/bin/env python3 | |
""" | |
skimschema.py | |
============== | |
Create an excel file of transposed data rows, for easy browsing of | |
a data file's contents (csvs only for now) | |
Longer description |
SELECT | |
unique_key | |
, pddistrict AS pd_district | |
, DATE(timestamp) AS incident_date | |
, category | |
, descript AS description | |
, dayofweek AS day_of_week | |
, resolution | |
, UPPER(address) AS address | |
, longitude |
wrangled.csv
Example usage:
Note: This gist refers this older gist that shows the AWS transcribe API: https://gist.github.com/dannguyen/9b8c51f5bb853209f19f1a0f18f0f74c
I went into the AWS console for Transcription, which has an interface for real-time transcription here: https://console.aws.amazon.com/transcribe/home?region=us-east-1#realTimeTranscription
Then I used my phone to play out this snippet of the 2008 VP presidential debate, featuring speech from Biden and Palin: https://twitter.com/dancow/status/1313951588428517385
fieldname | value |
---|---|
act | 1 |
scene | 5 |
speaker | Horatio |
lines | Propose the oath, my lord. |
~~~~~~~~~ | |
act | 1 |
scene | 5 |
speaker | Hamlet |
I wrote these instructions on how to install and use xsv – a powerful CSV-handling command-line tool, because someone asked how to deal with a data file that was too big to open in Excel or even Notepad. I didn't know how familiar the person was with installing/running downloadable .exe files or with Powershell, so I've tried to include some general instructions that hopefully are useful to even novices.
This mini-guide is not at all meant to be exhaustive as it basically shows just one of xsv's many useful functions. But if you're new to the idea of using command-line tools to do things, hopefully this can be a friendly intro to it.
Here's an example of a CSV that, at 3 million rows, is too big for Excel to open: https://burntsushi.net/stuff/worldcitiespop.csv